memU Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

memU

From workspace to agent memory

System Overview & Architecture

Related topics: Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Runtime Operations

Continue reading this section for the full explanation and source context.

Section Preprocessing Stage

Continue reading this section for the full explanation and source context.

System Overview & Architecture

Purpose and Scope

memU is an agent memory framework that turns raw sources (chat logs, documents, deployment logs, agent traces) into a structured, self-organizing compiled workspace that can be queried on demand. The framework exposes two runtime operations: memorize() for writing new sources into the workspace, and retrieve() for serving only the relevant layers back to an agent. Source: README.md.

The framework targets three concrete needs of long-running agents:

Context — inject the right facts, preferences, and source material instead of a cold prompt.
Continuity — the workspace persists and self-organizes across sessions, sources, and tasks.
Control — every record is structured and inspectable, tracing back to its raw source for auditing and editing.

Source: README.md.

The framework is delivered as a Python package (with both Linux x86_64 and ARM64 wheel targets added in v1.1.0) and offers a pluggable storage layer supporting in-memory, SQLite, and Postgres backends behind a shared repository contract. Source: README.md, v1.1.0 release notes.

The Compiled Workspace

The primary output of memU is a navigable workspace composed of three layered records. Source: README.md.

Record	Role	Contents
`MemoryCategory`	Folder — a topic with an evolving summary	`name`, `description`, `summary`, `embedding`, child `MemoryItem[]`
`MemoryItem`	File — a typed atomic memory	`memory_type` ∈ {profile, event, knowledge, behavior, skill, tool}, `summary`, `extra`, `happened_at`, `embedding`
`Resource`	Source — the raw artifact behind the memory	`url`, `modality`, `local_path`, `caption`, `embedding`

The on-disk projection of this workspace is a tree of INDEX.md, MEMORY.md, and per-skill SKILL.md files, persisted through the configured storage backend. Source: README.md.

Runtime Operations

WRITE — memorize()                                   READ — retrieve()
─────────────────────────────────────────            ─────────────────────────────────────────
raw files → extract → files + folders               query → walk folders → ranked files
persist via repository contracts                     return scoped, ranked context

Source: README.md.

Both operations are exposed through the MemoryService class. The memorize() method accepts a resource_url and a modality (e.g. conversation, document) along with scoping metadata such as user_id. The retrieve() method accepts a list of query items and a where filter for scoped lookup. Source: README.md.

Typed Memory Extraction

memU classifies extracted memory into six canonical types, each with a dedicated prompt module that enforces type-specific rules. Source: README.md.

flowchart LR
    A[Raw Resource] --> B[Preprocess]
    B --> C{Type Router}
    C -->|profile| P[profile.py]
    C -->|event| E[event.py]
    C -->|knowledge| K[knowledge.py]
    C -->|behavior| Bp[behavior.py]
    C -->|skill| S[skill.py]
    C -->|tool| T[tool.py]
    P --> M[MemoryItem]
    E --> M
    K --> M
    Bp --> M
    S --> M
    T --> M
    M --> Cat[MemoryCategory]

Knowledge — declarative facts, concepts, and explanations; forbids opinions, personal experiences, and user-specific traits. Items must be self-contained, under ~50 words, and merged when redundant. Source: src/memu/prompts/memory_type/knowledge.py.
Behavior — recurring patterns, routines, and solutions; forbids one-time actions unless they reveal a significant pattern, and forbids assistant-only turns. Items must use the word "user" consistently to attribute the subject. Source: src/memu/prompts/memory_type/behavior.py.
Skill — comprehensive skill profiles with YAML frontmatter (name, description, category, demonstrated-in) and full sections (Core Principles, When to Use, Implementation Guide, Success Patterns, Common Pitfalls, Key Takeaways). Each profile must be at least 300 words to ensure depth and actionability. Source: src/memu/prompts/memory_type/skill.py.
Tool — tool usage patterns including tool name, scenario, outcome, and a when_to_use retrieval hint, capturing both successful patterns and failure lessons. Source: src/memu/prompts/memory_type/tool.py.

Preprocessing Stage

Before typed extraction, document inputs are condensed by a dedicated preprocess prompt that produces two outputs: a processed_content block that preserves all key information while removing verbosity, and a single-sentence caption summarizing the document. Source: src/memu/prompts/preprocess/document.py.

Category Organization and Retrieval

After extraction, memory items are sorted into MemoryCategory folders, cross-linked, embedded, and summarized into a browsable tree. Source: README.md.

Two summary prompt variants are used to keep the folder summaries current:

Plain category summary — merges original content with new memory items, preserves Markdown hierarchy, and excludes one-off actions that lack long-term value (e.g. "ate Malatang today"). Source: src/memu/prompts/category_summary/category.py.
Reference-annotated category summary — adds inline [ref:ITEM_ID] references pointing back to the specific memory items that contributed each fact, enabling auditability and inline citation. Source: src/memu/prompts/category_summary/category_with_refs.py.

During retrieve(), the framework navigates these folders and returns only the files relevant to the current user, agent, session, or task, returning scoped, ranked context that can be injected into any agent workflow. Source: README.md.

Community Notes and Known Limitations

Several issues reported by the community are worth noting for anyone reading this overview:

Storage column bug — the memory_items.happened_at column has been observed to store NULL even when proper timestamps are provided in conversational input. This affects the optional temporal metadata on memory items added in v1.3.0. Source: issue #428.
SQLite embedding storage — the SQLite backend has had issues storing embedding fields as native lists, raising ValueError: <class 'list'> has no matching SQLAlchemy type in affected test scripts. Source: issue #382.
Server entry point — the memu-server console script has been reported to point to a missing memu.server.cli module on main; users packaging the server should verify the entry point exists in their installed version. Source: issue #354.

These limitations are tracked in the issue tracker and the release notes (currently at v1.5.1, which includes an alembic URL interpolation fix). Source: v1.5.1 release notes.

Storage Backends & Data Model

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section In-Memory

Continue reading this section for the full explanation and source context.

Section SQLite

Continue reading this section for the full explanation and source context.

Section Postgres

Continue reading this section for the full explanation and source context.

Storage Backends & Data Model

The storage layer is the persistence core of memU. It defines the canonical data model — Resource, MemoryItem, MemoryCategory, and the join record CategoryItem — and provides pluggable repository implementations behind a shared contract. The runtime is decoupled from any specific engine: callers go through DatabaseFactory, which returns a backend that satisfies the same interfaces used during memorize() and retrieve(). The README explicitly advertises "in-memory, SQLite, or Postgres backends with the same repository contracts" as a core feature. Source: README.md

Data Model

The model treats memory as a workspace hierarchy rather than a flat table.

Record	Role	Key Fields
`Resource`	Raw source artifact (chat log, document, image caption)	`url`, `modality`, `local_path`, `caption`, `embedding`
`MemoryItem`	Typed atomic memory extracted from a resource	`memory_type`, `summary`, `extra`, `happened_at`, `embedding`
`MemoryCategory`	Folder grouping items by topic with an evolving summary	`name`, `description`, `summary`, `embedding`
`CategoryItem`	Many-to-many relation linking items to categories	`category_id`, `item_id`

memory_type is an enum-like string drawn from a fixed set: profile, event, knowledge, behavior, skill, tool. The README and the per-type prompt modules (src/memu/prompts/memory_type/*.py) confirm this set. Each prompt module defines extraction rules for its type — for example, event.py requires a declarative sentence with a timestamp, while behavior.py insists on recurring patterns rather than one-off actions.

flowchart TB
    R[Resource] -->|extract| MI[MemoryItem]
    MI -->|belongs to| CI[CategoryItem]
    MC[MemoryCategory] -->|groups| CI
    MC -.embedding.-> VS[(Vector Store)]
    MI -.embedding.-> VS
    R -.embedding.-> VS

All three primary records carry an embedding field, which the retrieve pipeline uses for semantic ranking across both category-level (broad context) and item-level (precise facts) queries.

Backend Implementations

In-Memory

The default backend stores everything in Python dictionaries and is intended for tests, demos, and ephemeral agent runs. Its repositories (src/memu/database/inmemory/repo.py) cache Resource, MemoryItem, MemoryCategory, and CategoryItem records in instance attributes and write through synchronously. No process restart can recover state.

SQLite

Introduced in v1.2.0 (release notes), the SQLite backend uses SQLAlchemy models and a per-record repository layout. The SQLiteMemoryStore defined in src/memu/database/sqlite/sqlite.py composes four repositories — ResourceRepo, MemoryCategoryRepo, MemoryItemRepo, and CategoryItemRepo — and caches each in a dict for read access. The constructor accepts a Pydantic scope_model for user scoping and optional custom model overrides, so application code can extend Resource or MemoryItem without subclassing the store. Source: src/memu/database/sqlite/sqlite.py:23-58

The DSN is passed as a plain SQLAlchemy URL (sqlite:///path/to/db.sqlite). As of v1.5.1 the alembic migration URL interpolation was fixed (commit fd87ceb), so URL templating with environment variables now behaves correctly.

Postgres

The Postgres backend shares the same repository contracts and is the recommended option for production deployments where embeddings and high-volume writes need to coexist with ACID guarantees.

Configuration

The DatabaseConfig block in src/memu/app/settings.py selects the backend and its connection parameters. The BlobConfig controls where raw resources are materialized on disk (resources_dir, default ./data/resources), which is independent of the relational store.

The MemoryFilesConfig block — disabled by default — toggles the on-disk "memory file system" rendering that writes INDEX.md, MEMORY.md, and per-skill SKILL.md files under output_dir. When synthesize=True, those markdown files are generated via an LLM call using the configured synthesis_llm_profile; otherwise they are rendered deterministically from already-extracted records. Source: src/memu/app/settings.py:25-60

The retrieve-side options live in RetrieveCategoryConfig and RetrieveItemConfig: top_k controls how many categories (default 5) and how many items are returned per query. Both are enabled by default.

Common Failure Modes

The community has surfaced three recurring issues that map directly onto the storage layer:

happened_at NULL on conversational input — When MemoryService.memorize() ingests chat messages with explicit timestamps, the resulting memory_items.happened_at column is sometimes stored as NULL even though the source carried a valid time. This indicates a propagation gap between the message envelope and the typed event item writer. Tracked in issue #428.

SQLite embedding serialization — Earlier snapshots of the SQLite backend raised ValueError: <class 'list'> has no matching SQLAlchemy type when persisting embeddings, because the embedding column was declared without a SQLAlchemy type that understands Python list. Reported in issue #382.

memu-server entry point — pyproject.toml configured memu-server = "memu.server.cli:main", but src/memu/server/cli.py was missing from the tree, so the installed console script failed at import time. Reported in issue #354; fixed in a later release.

For all three, the mitigation is the same: pin to a release at or after v1.5.1, where the alembic interpolation fix and prompt/item fallback improvements have landed.

LLM, Embedding & VLM Providers and Routing

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section 3.1 The backend contract

Continue reading this section for the full explanation and source context.

Section 3.2 Transports

Continue reading this section for the full explanation and source context.

Section 3.3 The LLM gateway

Continue reading this section for the full explanation and source context.

LLM, Embedding & VLM Providers and Routing

1. Purpose and Scope

memU needs to talk to a heterogeneous set of model providers for three distinct capabilities: text chat / summarization (LLM), vector embeddings (Embedding), and vision-language analysis (VLM). The "Providers and Routing" subsystem is the abstraction layer that decouples the rest of the memory pipeline from any single vendor.

Its responsibilities are:

Normalize the request/response shape of every supported provider behind a small, capability-scoped interface.
Let the rest of memU call client.summarize(...) or client.vision(...) without knowing whether the request actually went to OpenAI, Anthropic, OpenRouter, or a self-hosted HTTP endpoint.
Allow new providers to be added by registering a backend module and (optionally) a transport client, without editing the service composition root.

The README states the project is "Profile-Based LLM Routing" — chat, embedding, vision, and transcription work are routed through configurable LLM profiles, which is implemented via the per-capability gateway.py files.

Source: README.md

2. Package Layout

The two text/image capabilities each live in their own sibling package and follow the same internal shape:

src/memu/llm/                  # text / chat / summarization
├── base.py                    # LLMClient interface
├── gateway.py                 # builds a client from settings.LLMConfig
├── wrapper.py                 # higher-level façade used by services
├── openai_client.py           # OpenAI SDK transport
├── anthropic_client.py        # Anthropic SDK transport
├── lazyllm_client.py          # LazyLLM transport
├── http_client.py             # raw HTTP transport
├── defaults.py                # per-provider default model picks
└── backends/                  # per-provider request/response shape
    ├── base.py
    ├── openrouter.py
    └── ...

src/memu/vlm/                  # image / video understanding (mirrors llm/)
├── base.py                    # VLMClient interface + encode_image()
├── gateway.py                 # builds a VLM client from settings.VLMConfig
├── openai_client.py
├── anthropic_client.py
├── http_client.py
├── defaults.py
└── backends/
    ├── base.py
    └── ...

The package docstrings spell this out explicitly: "backends/: per-provider vision request/response shapes (HTTP transport). http_client/openai_client/anthropic_client: transport clients. gateway: build a client from a :class:memu.app.settings.VLMConfig`."

Source: src/memu/vlm/__init__.py:1-20

3. LLM Provider System

3.1 The backend contract

Every LLM provider is described by a subclass of LLMBackend defined in src/memu/llm/backends/base.py. The base class exposes three customization points:

default_headers(api_key) — returns the auth header set. Defaults to OpenAI-style Authorization: Bearer …; Anthropic overrides this to use x-api-key.
build_summary_payload(...) — converts (text, system_prompt, chat_model, max_tokens) into the provider-specific JSON body.
parse_summary_response(data) — extracts the assistant text from the provider's response envelope.
build_vision_payload(...) — builds the vision request body for text+image prompts (reused from the VLM path on providers that share an endpoint).

Concrete backends such as OpenRouterLLMBackend set summary_endpoint = "/api/v1/chat/completions" and emit OpenAI-compatible message arrays, then parse the response with data["choices"][0]["message"]["content"]. This means many OpenAI-shaped providers can be supported by a single small backend module.

Source: src/memu/llm/backends/base.py:1-44, src/memu/llm/backends/openrouter.py:1-25

3.2 Transports

The transport layer is independent of the backend layer. For each capability there is a small set of clients:

openai_client.py — uses the official OpenAI SDK.
anthropic_client.py — uses the official Anthropic SDK; the gateway strips a stale OpenAI base_url default and falls back to https://api.anthropic.com.
http_client.py — a raw HTTP/httpx transport for any OpenAI-compatible endpoint, useful for self-hosted models and proxies.
lazyllm_client.py — LazyLLM transport.

3.3 The LLM gateway

src/memu/llm/gateway.py exposes a build_llm_client(cfg) function that inspects LLMConfig (base URL, API key, provider name, model) and returns a fully-wired LLMClient. Adding a new transport is a single-line registration in the gateway; adding a new provider that reuses an existing transport is a new backend module.

The LLMClient base class in src/memu/llm/base.py defines the methods the rest of memU calls (summarize, vision, etc.); concrete clients implement them by calling the active backend.

Source: src/memu/llm/base.py, src/memu/llm/gateway.py

flowchart LR
    Settings["LLMConfig<br/>(base_url, api_key, model)"] --> Gateway["llm/gateway.py<br/>build_llm_client()"]
    Gateway -->|provider=openai| OpenAI["openai_client.py"]
    Gateway -->|provider=anthropic| Anthropic["anthropic_client.py"]
    Gateway -->|provider=http| HTTP["http_client.py"]
    Gateway -->|provider=lazyllm| Lazy["lazyllm_client.py"]
    OpenAI --> Backend["backends/*<br/>payload + response shape"]
    Anthropic --> Backend
    HTTP --> Backend
    Lazy --> Backend
    Backend --> Memory["MemoryService<br/>memorize / retrieve"]

4. VLM Provider System (Mirror of LLM)

The vision subsystem is intentionally a near-clone of the LLM subsystem. VLMClient exposes a single multimodal capability, vision(prompt, image_path, ...), and encode_image(image_path) base64-encodes a file and infers the MIME type from the extension (.jpg/.jpeg → image/jpeg, .png → image/png, etc.).

The VLM gateway builds the client from a VLMConfig and registers three builders by default — SDK, Anthropic SDK, and raw HTTP — and shares the same OpenAI-default base_url stripping logic as the LLM gateway so an LLM-targeted config does not leak into a vision call.

Source: src/memu/vlm/base.py:1-44, src/memu/vlm/gateway.py:1-40

4.1 Embeddings

Although embedding is a separate logical capability, it is wired into the same routing model: the v1.5.0 release added HTTP proxy support for "LLM & embedding clients" together, and v1.0.1 explicitly fixed "get embedding client," indicating that embedding clients are constructed through the same gateway pattern. Embedding providers therefore plug in by adding a backend module and (if needed) a transport — no service-layer changes are required.

Source: README release notes for v1.5.0 and v1.0.1

5. Routing, Configuration, and Common Failure Modes

5.1 Profile-based routing

The README summarizes the goal: "Profile-Based LLM Routing — Route chat, embedding, vision, and transcription work through configurable LLM profiles." In practice this means:

A LLMConfig (and its VLM / embedding siblings) is declared once in app/settings.py.
MemoryService and friends never import a specific provider — they only depend on LLMClient / VLMClient interfaces.
Swapping providers is a config change, not a code change.

5.2 Provider-customization table

Concern	Where it lives	What you change
Auth header scheme	`LLMBackend.default_headers` / `VLMBackend.default_headers`	Providers that don't use `Authorization: Bearer …` (e.g. Anthropic's `x-api-key`)
Request body shape	`build_summary_payload` / `build_vision_payload`	Providers whose message/tool format diverges from OpenAI
Response parsing	`parse_summary_response` / `parse_vision_response`	Providers that wrap `choices[0].message.content` differently
Endpoint path	`summary_endpoint` / `vision_endpoint` class attribute	Providers whose path differs from `/chat/completions`
Transport (SDK vs HTTP)	`llm/gateway.py` / `vlm/gateway.py`	Self-hosted endpoints, proxies, LazyLLM
Default model	`defaults.py`	Picking the latest recommended model per provider

Source: src/memu/llm/backends/base.py:1-44, src/memu/vlm/backends/base.py:1-40

5.3 Known limitations and community-reported issues

Stale entry point on a release line — memu-server was configured to point to a memu.server.cli module that did not exist on main, breaking startup. The fix lives in the packaging layer (pyproject.toml), not in the LLM/VLM code, but it is the kind of regression users hit immediately and is worth being aware of when upgrading. See issue #354.
SQLite + embeddings — a community-reported bug ("sqlite backend embding issue", #382) raised ValueError: <class 'list'> has no matching SQLAlchemy type because the embedding column is stored as a list/JSON shape. When the embedding provider returns vectors, the persistence layer must serialize them; ensure your backend writes JSON, not Python list, when the target dialect does not support array types natively.
HTTP proxy support — added in v1.5.0 across LLM and embedding clients. If you sit behind a corporate proxy, set the new proxy option on the relevant *Config rather than patching transports.
Embedding client wiring — v1.0.1's "get embedding client" fix is a reminder that embedding construction goes through the gateway; if a custom embedding provider is added, the corresponding builder must be registered there or build_*_client will return None.

Source: community issues #354, #382; release notes v1.0.1, v1.5.0.

Memorize & Retrieve Workflows, Preprocessing and Prompts

Related topics: System Overview & Architecture, Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing

Section Related Pages

Continue reading this section for the full explanation and source context.

Section 2.1 Preprocessing

Continue reading this section for the full explanation and source context.

Section 2.2 Type-Specific Extraction

Continue reading this section for the full explanation and source context.

Section 2.3 Category Routing and Summarization

Continue reading this section for the full explanation and source context.

Memorize & Retrieve Workflows, Preprocessing and Prompts

1. Overview and Compiled Workspace

memU exposes a two-operation API to agents: memorize() and retrieve(). Internally, these operations compile raw sources (chat logs, documents, logs, images) into a navigable workspace and then serve only the layers that match a query. The README frames the design as a "compiled workspace" that emulates a file system: MemoryCategory records act as folders, MemoryItem records act as files, and Resource records preserve the original artifact that produced each memory (README.md).

The compiled workspace follows a fixed shape:

MemoryCategory                       (folder: topic with evolving summary)
├── name, description, summary, embedding
└── MemoryItem[]                     (files: typed, atomic memories)
    ├── memory_type: profile | event | knowledge | behavior | skill | tool
    ├── summary, extra, happened_at, embedding
    └── Resource                     (source: raw file)
        └── url, modality, local_path, caption, embedding

memorize() accepts a resource_url and a modality (e.g. conversation, document) and returns a dictionary describing what was written. retrieve() accepts queries and a where filter (typically user_id) and returns only the folders and files relevant to the request, ranked for prompt injection. The README's "WRITE — memorize()" / "READ — retrieve()" diagram describes the full flow as: raw files → extract → files + folders → persist; and query → walk folders → ranked files (README.md).

2. The Memorize Pipeline

The memorize path lives under src/memu/app/memorize.py, which orchestrates preprocessing, type-specific extraction, category routing, embedding, and persistence through the CRUD layer in src/memu/app/crud.py. The pipeline has five logical stages (README.md):

Preprocess — convert the raw resource into a clean text representation
Extract — run one or more memory-type extractors to produce typed items
Categorize & link — place items into MemoryCategory folders, cross-link, and embed
Summarize — produce/refresh per-category summaries
Persist — write items, relations, embeddings, and summaries through the repository contract

flowchart LR
    A[Raw resource<br/>conversation / document] --> B[Preprocess]
    B --> C[Memory-type extractors<br/>profile · event · knowledge · behavior · skill · tool]
    C --> D[MemoryItem + MemoryCategory]
    D --> E[Embedding + Category Summary]
    E --> F[(Repository<br/>in-memory / SQLite / Postgres)]
    F --> G[retrieve<br/>walk folders → ranked files]

2.1 Preprocessing

src/memu/preprocess/base.py defines the abstract preprocessor contract, and concrete implementations live in src/memu/preprocess/conversation.py and src/memu/preprocess/document.py. The conversation preprocessor normalizes message lists and timestamps (relevant to the happened_at column), while the document preprocessor delegates to the LLM via the prompt in src/memu/prompts/preprocess/document.py to produce a condensed version plus a one-sentence caption (src/memu/prompts/preprocess/document.py).

Community note: Issue #428 reports that the memory_items.happend_at column stores NULL even when messages carry proper timestamps. This is a known data-flow bug in the conversation preprocessing → extract path; verify preprocessor output before relying on happened_at.

2.2 Type-Specific Extraction

The src/memu/prompts/memory_type/ package contains one prompt module per supported memory type. Each module follows a similar block-based structure: an OBJECTIVE block identifying the role (e.g. "professional User Memory Extractor"), a WORKFLOW block, a RULES block, a CATEGORY block listing the target categories, an OUTPUT block specifying the response shape, and an EXAMPLES block.

Memory type	Prompt module	Extraction target	Output shape
`event`	src/memu/prompts/memory_type/event.py	Time-bounded happenings involving the user (with time, place, participants)	One declarative sentence per item
`knowledge`	src/memu/prompts/memory_type/knowledge.py	Objective facts, concepts, definitions, explanations	Single-line plain text, `< 50 words` per item
`behavior`	src/memu/prompts/memory_type/behavior.py	Recurring patterns, routines, solutions (not one-time events)	Single or multi-line record of a pattern
`skill`	src/memu/prompts/memory_type/skill.py	Actionable skill profiles with frontmatter and sectioned markdown	Full SKILL.md body, `≥ 300 words`
`tool`	src/memu/prompts/memory_type/tool.py	Tool usage patterns: name, use case, outcome, retrieval hint	XML `<memory>` with `<when_to_use>`

All five type prompts share these guard rails: items must be in the same language as the source, items must be self-contained without context, identical/similar items must be merged into a single category, and forbidden content includes illegal/harmful topics, opinions without factual basis (in knowledge), and assistant-only turns (in behavior) (src/memu/prompts/memory_type/knowledge.py, src/memu/prompts/memory_type/behavior.py).

The skill prompt is the most demanding: it requires a YAML frontmatter (name, description, category, demonstrated-in) followed by Core Principles, When to Use This Skill, Implementation Guide (with Prerequisites, Techniques and Approaches, Example from Resource), Success Patterns, Common Pitfalls, and Key Takeaways, with a 300-word minimum to guarantee reusability (src/memu/prompts/memory_type/skill.py). The tool prompt produces a lighter record but adds an explicit <when_to_use> field so retrieval can match the right tool to the right task (src/memu/prompts/memory_type/tool.py).

2.3 Category Routing and Summarization

After extraction, items are placed into MemoryCategory folders, and src/memu/prompts/category_summary/category.py is invoked to maintain a per-category user markdown profile. The category prompt implements a three-step workflow: parse the initial profile and new items, perform Update (conflict detection, validity priority, overwrite/supplement) and Add (deduplication, category matching, insertion) operations, and re-render the result as a Markdown hierarchy with H1 category titles and H2 sub-categories (src/memu/prompts/category_summary/category.py).

The prompt also enforces explicit exclusions: vague or non-user items are dropped, one-time events without long-term relevance (e.g. "ate Malatang today") are removed, and assistant-introduced content is rejected. The final output is only the updated Markdown profile — no explanations or operation traces.

3. The Retrieve Pipeline

src/memu/app/retrieve.py consumes a queries list (each a role/content pair) and a where filter (typically {"user_id": "..."}). It walks the MemoryCategory tree, scores items against the query embeddings, and returns only the relevant files and folder summaries as a dictionary suitable for prompt injection (README.md, src/memu/app/retrieve.py). The persisted shape is identical to what memorize() returns, so downstream agents can treat both responses uniformly.

4. Common Failure Modes

Several community-reported issues map directly onto this workflow:

happened_at is NULL for conversation memories — issue #428. The conversation preprocessor must propagate per-message timestamps into the extracted event items; check src/memu/preprocess/conversation.py and the event extractor in src/memu/prompts/memory_type/event.py when diagnosing.
SQLite backend rejects the embedding column — issue #382. The SQLAlchemy type for embedding (a list) is not registered, raising ValueError: <class 'list'> has no matching SQLAlchemy type. The fix typically lives in the SQLite repository under src/memu/app/crud.py.
memu-server entry point points to a missing module — issue #354. pyproject.toml declares memu-server = "memu.server.cli:main" but src/memu/server/cli.py is missing on main; the entry point and the module must be kept in sync.
Alembic URL interpolation — fixed in v1.5.1 (fd87ceb) to use correct interpolation, which affects migration-driven schema updates for the MemoryItem and MemoryCategory tables.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

Doramagic Pitfall Log

Found 10 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/428

2. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/354

3. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/382

4. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/NevaMind-AI/memU

5. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/NevaMind-AI/memU

6. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/NevaMind-AI/memU

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/NevaMind-AI/memU

9. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown。
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

10. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown。
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 12

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using memU with real data or production workflows.

[[BUG] memory_items table's happend_at collumn stores NULL value even tho](https://github.com/NevaMind-AI/memU/issues/428) - github / github_issue
Bug: memu-server entry point points to missing module (memu.server.cli) - github / github_issue
[[BUG] sqlite backend embding issue](https://github.com/NevaMind-AI/memU/issues/382) - github / github_issue
v1.5.1 - github / github_release
v1.5.0 - github / github_release
v1.4.0 - github / github_release
v1.3.0 - github / github_release
v1.2.0 - github / github_release
v1.1.2 - github / github_release
v1.1.1 - github / github_release
v1.1.0 - github / github_release
v1.0.1 - github / github_release

Source: Project Pack community evidence and pitfall evidence

memU

System Overview & Architecture

Related Pages

System Overview & Architecture

Purpose and Scope

The Compiled Workspace

Runtime Operations

Typed Memory Extraction

Preprocessing Stage

Category Organization and Retrieval

Community Notes and Known Limitations

See Also

Storage Backends & Data Model

Related Pages

Storage Backends & Data Model

Data Model

Backend Implementations

In-Memory

SQLite

Postgres

Configuration

Common Failure Modes

See Also

LLM, Embedding & VLM Providers and Routing

Related Pages

LLM, Embedding & VLM Providers and Routing

1. Purpose and Scope

2. Package Layout

3. LLM Provider System

3.1 The backend contract

3.2 Transports

3.3 The LLM gateway

4. VLM Provider System (Mirror of LLM)

4.1 Embeddings

5. Routing, Configuration, and Common Failure Modes

5.1 Profile-based routing

5.2 Provider-customization table

5.3 Known limitations and community-reported issues

See Also

Memorize & Retrieve Workflows, Preprocessing and Prompts

Related Pages

Memorize & Retrieve Workflows, Preprocessing and Prompts

1. Overview and Compiled Workspace

2. The Memorize Pipeline

2.1 Preprocessing

2.2 Type-Specific Extraction

2.3 Category Routing and Summarization

3. The Retrieve Pipeline

4. Common Failure Modes

See Also

Doramagic Pitfall Log

Doramagic Pitfall Log

1. Configuration risk: Configuration risk requires verification

2. Installation risk: Installation risk requires verification

3. Installation risk: Installation risk requires verification

4. Configuration risk: Configuration risk requires verification

5. Capability evidence risk: Capability evidence risk requires verification

6. Maintenance risk: Maintenance risk requires verification

7. Security or permission risk: Security or permission risk requires verification

8. Security or permission risk: Security or permission risk requires verification

9. Maintenance risk: Maintenance risk requires verification

10. Maintenance risk: Maintenance risk requires verification

Community Discussion Evidence

Community Discussion Evidence