Doramagic Project Pack · Human Manual

memU

From workspace to agent memory

System Overview & Architecture

Related topics: Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Runtime Operations

Continue reading this section for the full explanation and source context.

Section Preprocessing Stage

Continue reading this section for the full explanation and source context.

Related topics: Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing, Memorize & Retrieve Workflows, Preprocessing and Prompts

System Overview & Architecture

Purpose and Scope

memU is an agent memory framework that turns raw sources (chat logs, documents, deployment logs, agent traces) into a structured, self-organizing compiled workspace that can be queried on demand. The framework exposes two runtime operations: memorize() for writing new sources into the workspace, and retrieve() for serving only the relevant layers back to an agent. Source: README.md.

The framework targets three concrete needs of long-running agents:

  • Context — inject the right facts, preferences, and source material instead of a cold prompt.
  • Continuity — the workspace persists and self-organizes across sessions, sources, and tasks.
  • Control — every record is structured and inspectable, tracing back to its raw source for auditing and editing.

Source: README.md.

The framework is delivered as a Python package (with both Linux x86_64 and ARM64 wheel targets added in v1.1.0) and offers a pluggable storage layer supporting in-memory, SQLite, and Postgres backends behind a shared repository contract. Source: README.md, v1.1.0 release notes.

The Compiled Workspace

The primary output of memU is a navigable workspace composed of three layered records. Source: README.md.

RecordRoleContents
MemoryCategoryFolder — a topic with an evolving summaryname, description, summary, embedding, child MemoryItem[]
MemoryItemFile — a typed atomic memorymemory_type ∈ {profile, event, knowledge, behavior, skill, tool}, summary, extra, happened_at, embedding
ResourceSource — the raw artifact behind the memoryurl, modality, local_path, caption, embedding

The on-disk projection of this workspace is a tree of INDEX.md, MEMORY.md, and per-skill SKILL.md files, persisted through the configured storage backend. Source: README.md.

Runtime Operations

WRITE — memorize()                                   READ — retrieve()
─────────────────────────────────────────            ─────────────────────────────────────────
raw files → extract → files + folders               query → walk folders → ranked files
persist via repository contracts                     return scoped, ranked context

Source: README.md.

Both operations are exposed through the MemoryService class. The memorize() method accepts a resource_url and a modality (e.g. conversation, document) along with scoping metadata such as user_id. The retrieve() method accepts a list of query items and a where filter for scoped lookup. Source: README.md.

Typed Memory Extraction

memU classifies extracted memory into six canonical types, each with a dedicated prompt module that enforces type-specific rules. Source: README.md.

flowchart LR
    A[Raw Resource] --> B[Preprocess]
    B --> C{Type Router}
    C -->|profile| P[profile.py]
    C -->|event| E[event.py]
    C -->|knowledge| K[knowledge.py]
    C -->|behavior| Bp[behavior.py]
    C -->|skill| S[skill.py]
    C -->|tool| T[tool.py]
    P --> M[MemoryItem]
    E --> M
    K --> M
    Bp --> M
    S --> M
    T --> M
    M --> Cat[MemoryCategory]
  • Knowledge — declarative facts, concepts, and explanations; forbids opinions, personal experiences, and user-specific traits. Items must be self-contained, under ~50 words, and merged when redundant. Source: src/memu/prompts/memory_type/knowledge.py.
  • Behavior — recurring patterns, routines, and solutions; forbids one-time actions unless they reveal a significant pattern, and forbids assistant-only turns. Items must use the word "user" consistently to attribute the subject. Source: src/memu/prompts/memory_type/behavior.py.
  • Skill — comprehensive skill profiles with YAML frontmatter (name, description, category, demonstrated-in) and full sections (Core Principles, When to Use, Implementation Guide, Success Patterns, Common Pitfalls, Key Takeaways). Each profile must be at least 300 words to ensure depth and actionability. Source: src/memu/prompts/memory_type/skill.py.
  • Tool — tool usage patterns including tool name, scenario, outcome, and a when_to_use retrieval hint, capturing both successful patterns and failure lessons. Source: src/memu/prompts/memory_type/tool.py.

Preprocessing Stage

Before typed extraction, document inputs are condensed by a dedicated preprocess prompt that produces two outputs: a processed_content block that preserves all key information while removing verbosity, and a single-sentence caption summarizing the document. Source: src/memu/prompts/preprocess/document.py.

Category Organization and Retrieval

After extraction, memory items are sorted into MemoryCategory folders, cross-linked, embedded, and summarized into a browsable tree. Source: README.md.

Two summary prompt variants are used to keep the folder summaries current:

During retrieve(), the framework navigates these folders and returns only the files relevant to the current user, agent, session, or task, returning scoped, ranked context that can be injected into any agent workflow. Source: README.md.

Community Notes and Known Limitations

Several issues reported by the community are worth noting for anyone reading this overview:

  • Storage column bug — the memory_items.happened_at column has been observed to store NULL even when proper timestamps are provided in conversational input. This affects the optional temporal metadata on memory items added in v1.3.0. Source: issue #428.
  • SQLite embedding storage — the SQLite backend has had issues storing embedding fields as native lists, raising ValueError: <class 'list'> has no matching SQLAlchemy type in affected test scripts. Source: issue #382.
  • Server entry point — the memu-server console script has been reported to point to a missing memu.server.cli module on main; users packaging the server should verify the entry point exists in their installed version. Source: issue #354.

These limitations are tracked in the issue tracker and the release notes (currently at v1.5.1, which includes an alembic URL interpolation fix). Source: v1.5.1 release notes.

See Also

  • Storage backends and repository contracts (in-memory, SQLite, Postgres)
  • LLM profile configuration and routing
  • MemoryService.memorize() / retrieve() API reference
  • Custom memory type prompt registration
  • Category summary reference format and inline citations

Source: https://github.com/NevaMind-AI/memU / Human Manual

Storage Backends & Data Model

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section In-Memory

Continue reading this section for the full explanation and source context.

Section SQLite

Continue reading this section for the full explanation and source context.

Section Postgres

Continue reading this section for the full explanation and source context.

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

Storage Backends & Data Model

The storage layer is the persistence core of memU. It defines the canonical data model — Resource, MemoryItem, MemoryCategory, and the join record CategoryItem — and provides pluggable repository implementations behind a shared contract. The runtime is decoupled from any specific engine: callers go through DatabaseFactory, which returns a backend that satisfies the same interfaces used during memorize() and retrieve(). The README explicitly advertises "in-memory, SQLite, or Postgres backends with the same repository contracts" as a core feature. Source: README.md

Data Model

The model treats memory as a workspace hierarchy rather than a flat table.

RecordRoleKey Fields
ResourceRaw source artifact (chat log, document, image caption)url, modality, local_path, caption, embedding
MemoryItemTyped atomic memory extracted from a resourcememory_type, summary, extra, happened_at, embedding
MemoryCategoryFolder grouping items by topic with an evolving summaryname, description, summary, embedding
CategoryItemMany-to-many relation linking items to categoriescategory_id, item_id

memory_type is an enum-like string drawn from a fixed set: profile, event, knowledge, behavior, skill, tool. The README and the per-type prompt modules (src/memu/prompts/memory_type/*.py) confirm this set. Each prompt module defines extraction rules for its type — for example, event.py requires a declarative sentence with a timestamp, while behavior.py insists on recurring patterns rather than one-off actions.

flowchart TB
    R[Resource] -->|extract| MI[MemoryItem]
    MI -->|belongs to| CI[CategoryItem]
    MC[MemoryCategory] -->|groups| CI
    MC -.embedding.-> VS[(Vector Store)]
    MI -.embedding.-> VS
    R -.embedding.-> VS

All three primary records carry an embedding field, which the retrieve pipeline uses for semantic ranking across both category-level (broad context) and item-level (precise facts) queries.

Backend Implementations

In-Memory

The default backend stores everything in Python dictionaries and is intended for tests, demos, and ephemeral agent runs. Its repositories (src/memu/database/inmemory/repo.py) cache Resource, MemoryItem, MemoryCategory, and CategoryItem records in instance attributes and write through synchronously. No process restart can recover state.

SQLite

Introduced in v1.2.0 (release notes), the SQLite backend uses SQLAlchemy models and a per-record repository layout. The SQLiteMemoryStore defined in src/memu/database/sqlite/sqlite.py composes four repositories — ResourceRepo, MemoryCategoryRepo, MemoryItemRepo, and CategoryItemRepo — and caches each in a dict for read access. The constructor accepts a Pydantic scope_model for user scoping and optional custom model overrides, so application code can extend Resource or MemoryItem without subclassing the store. Source: src/memu/database/sqlite/sqlite.py:23-58

The DSN is passed as a plain SQLAlchemy URL (sqlite:///path/to/db.sqlite). As of v1.5.1 the alembic migration URL interpolation was fixed (commit fd87ceb), so URL templating with environment variables now behaves correctly.

Postgres

The Postgres backend shares the same repository contracts and is the recommended option for production deployments where embeddings and high-volume writes need to coexist with ACID guarantees.

Configuration

The DatabaseConfig block in src/memu/app/settings.py selects the backend and its connection parameters. The BlobConfig controls where raw resources are materialized on disk (resources_dir, default ./data/resources), which is independent of the relational store.

The MemoryFilesConfig block — disabled by default — toggles the on-disk "memory file system" rendering that writes INDEX.md, MEMORY.md, and per-skill SKILL.md files under output_dir. When synthesize=True, those markdown files are generated via an LLM call using the configured synthesis_llm_profile; otherwise they are rendered deterministically from already-extracted records. Source: src/memu/app/settings.py:25-60

The retrieve-side options live in RetrieveCategoryConfig and RetrieveItemConfig: top_k controls how many categories (default 5) and how many items are returned per query. Both are enabled by default.

Common Failure Modes

The community has surfaced three recurring issues that map directly onto the storage layer:

  1. happened_at NULL on conversational input — When MemoryService.memorize() ingests chat messages with explicit timestamps, the resulting memory_items.happened_at column is sometimes stored as NULL even though the source carried a valid time. This indicates a propagation gap between the message envelope and the typed event item writer. Tracked in issue #428.
  1. SQLite embedding serialization — Earlier snapshots of the SQLite backend raised ValueError: <class 'list'> has no matching SQLAlchemy type when persisting embeddings, because the embedding column was declared without a SQLAlchemy type that understands Python list. Reported in issue #382.
  1. memu-server entry pointpyproject.toml configured memu-server = "memu.server.cli:main", but src/memu/server/cli.py was missing from the tree, so the installed console script failed at import time. Reported in issue #354; fixed in a later release.

For all three, the mitigation is the same: pin to a release at or after v1.5.1, where the alembic interpolation fix and prompt/item fallback improvements have landed.

See Also

  • Memory Extraction Prompts — per-type rules for profile, event, knowledge, behavior, skill, tool.
  • Configuration Reference — full list of DatabaseConfig, BlobConfig, and retrieve options.

Source: https://github.com/NevaMind-AI/memU / Human Manual

LLM, Embedding & VLM Providers and Routing

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section 3.1 The backend contract

Continue reading this section for the full explanation and source context.

Section 3.2 Transports

Continue reading this section for the full explanation and source context.

Section 3.3 The LLM gateway

Continue reading this section for the full explanation and source context.

Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts

LLM, Embedding & VLM Providers and Routing

1. Purpose and Scope

memU needs to talk to a heterogeneous set of model providers for three distinct capabilities: text chat / summarization (LLM), vector embeddings (Embedding), and vision-language analysis (VLM). The "Providers and Routing" subsystem is the abstraction layer that decouples the rest of the memory pipeline from any single vendor.

Its responsibilities are:

  • Normalize the request/response shape of every supported provider behind a small, capability-scoped interface.
  • Let the rest of memU call client.summarize(...) or client.vision(...) without knowing whether the request actually went to OpenAI, Anthropic, OpenRouter, or a self-hosted HTTP endpoint.
  • Allow new providers to be added by registering a backend module and (optionally) a transport client, without editing the service composition root.

The README states the project is "Profile-Based LLM Routing" — chat, embedding, vision, and transcription work are routed through configurable LLM profiles, which is implemented via the per-capability gateway.py files.

Source: README.md

2. Package Layout

The two text/image capabilities each live in their own sibling package and follow the same internal shape:

src/memu/llm/                  # text / chat / summarization
├── base.py                    # LLMClient interface
├── gateway.py                 # builds a client from settings.LLMConfig
├── wrapper.py                 # higher-level façade used by services
├── openai_client.py           # OpenAI SDK transport
├── anthropic_client.py        # Anthropic SDK transport
├── lazyllm_client.py          # LazyLLM transport
├── http_client.py             # raw HTTP transport
├── defaults.py                # per-provider default model picks
└── backends/                  # per-provider request/response shape
    ├── base.py
    ├── openrouter.py
    └── ...

src/memu/vlm/                  # image / video understanding (mirrors llm/)
├── base.py                    # VLMClient interface + encode_image()
├── gateway.py                 # builds a VLM client from settings.VLMConfig
├── openai_client.py
├── anthropic_client.py
├── http_client.py
├── defaults.py
└── backends/
    ├── base.py
    └── ...

The package docstrings spell this out explicitly: "backends/: per-provider vision request/response shapes (HTTP transport). http_client/openai_client/anthropic_client: transport clients. gateway: build a client from a :class:memu.app.settings.VLMConfig`."

Source: src/memu/vlm/__init__.py:1-20

3. LLM Provider System

3.1 The backend contract

Every LLM provider is described by a subclass of LLMBackend defined in src/memu/llm/backends/base.py. The base class exposes three customization points:

  • default_headers(api_key) — returns the auth header set. Defaults to OpenAI-style Authorization: Bearer …; Anthropic overrides this to use x-api-key.
  • build_summary_payload(...) — converts (text, system_prompt, chat_model, max_tokens) into the provider-specific JSON body.
  • parse_summary_response(data) — extracts the assistant text from the provider's response envelope.
  • build_vision_payload(...) — builds the vision request body for text+image prompts (reused from the VLM path on providers that share an endpoint).

Concrete backends such as OpenRouterLLMBackend set summary_endpoint = "/api/v1/chat/completions" and emit OpenAI-compatible message arrays, then parse the response with data["choices"][0]["message"]["content"]. This means many OpenAI-shaped providers can be supported by a single small backend module.

Source: src/memu/llm/backends/base.py:1-44, src/memu/llm/backends/openrouter.py:1-25

3.2 Transports

The transport layer is independent of the backend layer. For each capability there is a small set of clients:

  • openai_client.py — uses the official OpenAI SDK.
  • anthropic_client.py — uses the official Anthropic SDK; the gateway strips a stale OpenAI base_url default and falls back to https://api.anthropic.com.
  • http_client.py — a raw HTTP/httpx transport for any OpenAI-compatible endpoint, useful for self-hosted models and proxies.
  • lazyllm_client.py — LazyLLM transport.

3.3 The LLM gateway

src/memu/llm/gateway.py exposes a build_llm_client(cfg) function that inspects LLMConfig (base URL, API key, provider name, model) and returns a fully-wired LLMClient. Adding a new transport is a single-line registration in the gateway; adding a new provider that reuses an existing transport is a new backend module.

The LLMClient base class in src/memu/llm/base.py defines the methods the rest of memU calls (summarize, vision, etc.); concrete clients implement them by calling the active backend.

Source: src/memu/llm/base.py, src/memu/llm/gateway.py
flowchart LR
    Settings["LLMConfig<br/>(base_url, api_key, model)"] --> Gateway["llm/gateway.py<br/>build_llm_client()"]
    Gateway -->|provider=openai| OpenAI["openai_client.py"]
    Gateway -->|provider=anthropic| Anthropic["anthropic_client.py"]
    Gateway -->|provider=http| HTTP["http_client.py"]
    Gateway -->|provider=lazyllm| Lazy["lazyllm_client.py"]
    OpenAI --> Backend["backends/*<br/>payload + response shape"]
    Anthropic --> Backend
    HTTP --> Backend
    Lazy --> Backend
    Backend --> Memory["MemoryService<br/>memorize / retrieve"]

4. VLM Provider System (Mirror of LLM)

The vision subsystem is intentionally a near-clone of the LLM subsystem. VLMClient exposes a single multimodal capability, vision(prompt, image_path, ...), and encode_image(image_path) base64-encodes a file and infers the MIME type from the extension (.jpg/.jpeg → image/jpeg, .png → image/png, etc.).

The VLM gateway builds the client from a VLMConfig and registers three builders by default — SDK, Anthropic SDK, and raw HTTP — and shares the same OpenAI-default base_url stripping logic as the LLM gateway so an LLM-targeted config does not leak into a vision call.

Source: src/memu/vlm/base.py:1-44, src/memu/vlm/gateway.py:1-40

4.1 Embeddings

Although embedding is a separate logical capability, it is wired into the same routing model: the v1.5.0 release added HTTP proxy support for "LLM & embedding clients" together, and v1.0.1 explicitly fixed "get embedding client," indicating that embedding clients are constructed through the same gateway pattern. Embedding providers therefore plug in by adding a backend module and (if needed) a transport — no service-layer changes are required.

Source: README release notes for v1.5.0 and v1.0.1

5. Routing, Configuration, and Common Failure Modes

5.1 Profile-based routing

The README summarizes the goal: "Profile-Based LLM Routing — Route chat, embedding, vision, and transcription work through configurable LLM profiles." In practice this means:

  • A LLMConfig (and its VLM / embedding siblings) is declared once in app/settings.py.
  • MemoryService and friends never import a specific provider — they only depend on LLMClient / VLMClient interfaces.
  • Swapping providers is a config change, not a code change.

5.2 Provider-customization table

ConcernWhere it livesWhat you change
Auth header schemeLLMBackend.default_headers / VLMBackend.default_headersProviders that don't use Authorization: Bearer … (e.g. Anthropic's x-api-key)
Request body shapebuild_summary_payload / build_vision_payloadProviders whose message/tool format diverges from OpenAI
Response parsingparse_summary_response / parse_vision_responseProviders that wrap choices[0].message.content differently
Endpoint pathsummary_endpoint / vision_endpoint class attributeProviders whose path differs from /chat/completions
Transport (SDK vs HTTP)llm/gateway.py / vlm/gateway.pySelf-hosted endpoints, proxies, LazyLLM
Default modeldefaults.pyPicking the latest recommended model per provider
Source: src/memu/llm/backends/base.py:1-44, src/memu/vlm/backends/base.py:1-40

5.3 Known limitations and community-reported issues

  • Stale entry point on a release linememu-server was configured to point to a memu.server.cli module that did not exist on main, breaking startup. The fix lives in the packaging layer (pyproject.toml), not in the LLM/VLM code, but it is the kind of regression users hit immediately and is worth being aware of when upgrading. See issue #354.
  • SQLite + embeddings — a community-reported bug ("sqlite backend embding issue", #382) raised ValueError: <class 'list'> has no matching SQLAlchemy type because the embedding column is stored as a list/JSON shape. When the embedding provider returns vectors, the persistence layer must serialize them; ensure your backend writes JSON, not Python list, when the target dialect does not support array types natively.
  • HTTP proxy support — added in v1.5.0 across LLM and embedding clients. If you sit behind a corporate proxy, set the new proxy option on the relevant *Config rather than patching transports.
  • Embedding client wiring — v1.0.1's "get embedding client" fix is a reminder that embedding construction goes through the gateway; if a custom embedding provider is added, the corresponding builder must be registered there or build_*_client will return None.
Source: community issues #354, #382; release notes v1.0.1, v1.5.0.

See Also

  • Storage Backends (in-memory, SQLite, Postgres)
  • MemoryService: memorize and retrieve
  • Configuration: app/settings.py and *Config types
  • Multimodal Pipeline: image and video ingestion

Source: https://github.com/NevaMind-AI/memU / Human Manual

Memorize & Retrieve Workflows, Preprocessing and Prompts

Related topics: System Overview & Architecture, Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing

Section Related Pages

Continue reading this section for the full explanation and source context.

Section 2.1 Preprocessing

Continue reading this section for the full explanation and source context.

Section 2.2 Type-Specific Extraction

Continue reading this section for the full explanation and source context.

Section 2.3 Category Routing and Summarization

Continue reading this section for the full explanation and source context.

Related topics: System Overview & Architecture, Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing

Memorize & Retrieve Workflows, Preprocessing and Prompts

1. Overview and Compiled Workspace

memU exposes a two-operation API to agents: memorize() and retrieve(). Internally, these operations compile raw sources (chat logs, documents, logs, images) into a navigable workspace and then serve only the layers that match a query. The README frames the design as a "compiled workspace" that emulates a file system: MemoryCategory records act as folders, MemoryItem records act as files, and Resource records preserve the original artifact that produced each memory (README.md).

The compiled workspace follows a fixed shape:

MemoryCategory                       (folder: topic with evolving summary)
├── name, description, summary, embedding
└── MemoryItem[]                     (files: typed, atomic memories)
    ├── memory_type: profile | event | knowledge | behavior | skill | tool
    ├── summary, extra, happened_at, embedding
    └── Resource                     (source: raw file)
        └── url, modality, local_path, caption, embedding

memorize() accepts a resource_url and a modality (e.g. conversation, document) and returns a dictionary describing what was written. retrieve() accepts queries and a where filter (typically user_id) and returns only the folders and files relevant to the request, ranked for prompt injection. The README's "WRITE — memorize()" / "READ — retrieve()" diagram describes the full flow as: raw files → extract → files + folders → persist; and query → walk folders → ranked files (README.md).

2. The Memorize Pipeline

The memorize path lives under src/memu/app/memorize.py, which orchestrates preprocessing, type-specific extraction, category routing, embedding, and persistence through the CRUD layer in src/memu/app/crud.py. The pipeline has five logical stages (README.md):

  1. Preprocess — convert the raw resource into a clean text representation
  2. Extract — run one or more memory-type extractors to produce typed items
  3. Categorize & link — place items into MemoryCategory folders, cross-link, and embed
  4. Summarize — produce/refresh per-category summaries
  5. Persist — write items, relations, embeddings, and summaries through the repository contract
flowchart LR
    A[Raw resource<br/>conversation / document] --> B[Preprocess]
    B --> C[Memory-type extractors<br/>profile · event · knowledge · behavior · skill · tool]
    C --> D[MemoryItem + MemoryCategory]
    D --> E[Embedding + Category Summary]
    E --> F[(Repository<br/>in-memory / SQLite / Postgres)]
    F --> G[retrieve<br/>walk folders → ranked files]

2.1 Preprocessing

src/memu/preprocess/base.py defines the abstract preprocessor contract, and concrete implementations live in src/memu/preprocess/conversation.py and src/memu/preprocess/document.py. The conversation preprocessor normalizes message lists and timestamps (relevant to the happened_at column), while the document preprocessor delegates to the LLM via the prompt in src/memu/prompts/preprocess/document.py to produce a condensed version plus a one-sentence caption (src/memu/prompts/preprocess/document.py).

Community note: Issue #428 reports that the memory_items.happend_at column stores NULL even when messages carry proper timestamps. This is a known data-flow bug in the conversation preprocessing → extract path; verify preprocessor output before relying on happened_at.

2.2 Type-Specific Extraction

The src/memu/prompts/memory_type/ package contains one prompt module per supported memory type. Each module follows a similar block-based structure: an OBJECTIVE block identifying the role (e.g. "professional User Memory Extractor"), a WORKFLOW block, a RULES block, a CATEGORY block listing the target categories, an OUTPUT block specifying the response shape, and an EXAMPLES block.

Memory typePrompt moduleExtraction targetOutput shape
eventsrc/memu/prompts/memory_type/event.pyTime-bounded happenings involving the user (with time, place, participants)One declarative sentence per item
knowledgesrc/memu/prompts/memory_type/knowledge.pyObjective facts, concepts, definitions, explanationsSingle-line plain text, < 50 words per item
behaviorsrc/memu/prompts/memory_type/behavior.pyRecurring patterns, routines, solutions (not one-time events)Single or multi-line record of a pattern
skillsrc/memu/prompts/memory_type/skill.pyActionable skill profiles with frontmatter and sectioned markdownFull SKILL.md body, ≥ 300 words
toolsrc/memu/prompts/memory_type/tool.pyTool usage patterns: name, use case, outcome, retrieval hintXML <memory> with <when_to_use>

All five type prompts share these guard rails: items must be in the same language as the source, items must be self-contained without context, identical/similar items must be merged into a single category, and forbidden content includes illegal/harmful topics, opinions without factual basis (in knowledge), and assistant-only turns (in behavior) (src/memu/prompts/memory_type/knowledge.py, src/memu/prompts/memory_type/behavior.py).

The skill prompt is the most demanding: it requires a YAML frontmatter (name, description, category, demonstrated-in) followed by Core Principles, When to Use This Skill, Implementation Guide (with Prerequisites, Techniques and Approaches, Example from Resource), Success Patterns, Common Pitfalls, and Key Takeaways, with a 300-word minimum to guarantee reusability (src/memu/prompts/memory_type/skill.py). The tool prompt produces a lighter record but adds an explicit <when_to_use> field so retrieval can match the right tool to the right task (src/memu/prompts/memory_type/tool.py).

2.3 Category Routing and Summarization

After extraction, items are placed into MemoryCategory folders, and src/memu/prompts/category_summary/category.py is invoked to maintain a per-category user markdown profile. The category prompt implements a three-step workflow: parse the initial profile and new items, perform Update (conflict detection, validity priority, overwrite/supplement) and Add (deduplication, category matching, insertion) operations, and re-render the result as a Markdown hierarchy with H1 category titles and H2 sub-categories (src/memu/prompts/category_summary/category.py).

The prompt also enforces explicit exclusions: vague or non-user items are dropped, one-time events without long-term relevance (e.g. "ate Malatang today") are removed, and assistant-introduced content is rejected. The final output is only the updated Markdown profile — no explanations or operation traces.

3. The Retrieve Pipeline

src/memu/app/retrieve.py consumes a queries list (each a role/content pair) and a where filter (typically {"user_id": "..."}). It walks the MemoryCategory tree, scores items against the query embeddings, and returns only the relevant files and folder summaries as a dictionary suitable for prompt injection (README.md, src/memu/app/retrieve.py). The persisted shape is identical to what memorize() returns, so downstream agents can treat both responses uniformly.

4. Common Failure Modes

Several community-reported issues map directly onto this workflow:

  • happened_at is NULL for conversation memoriesissue #428. The conversation preprocessor must propagate per-message timestamps into the extracted event items; check src/memu/preprocess/conversation.py and the event extractor in src/memu/prompts/memory_type/event.py when diagnosing.
  • SQLite backend rejects the embedding columnissue #382. The SQLAlchemy type for embedding (a list) is not registered, raising ValueError: <class 'list'> has no matching SQLAlchemy type. The fix typically lives in the SQLite repository under src/memu/app/crud.py.
  • memu-server entry point points to a missing moduleissue #354. pyproject.toml declares memu-server = "memu.server.cli:main" but src/memu/server/cli.py is missing on main; the entry point and the module must be kept in sync.
  • Alembic URL interpolation — fixed in v1.5.1 (fd87ceb) to use correct interpolation, which affects migration-driven schema updates for the MemoryItem and MemoryCategory tables.

See Also

  • Storage Backends (in-memory, SQLite, Postgres)
  • LLM Profiles and Routing
  • Category Summary Generation
  • Memory Types Overview

Source: https://github.com/NevaMind-AI/memU / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

Doramagic Pitfall Log

Found 10 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

  • Severity: high
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/428

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/354

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/382

4. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.host_targets | https://github.com/NevaMind-AI/memU

5. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/NevaMind-AI/memU

6. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/NevaMind-AI/memU

8. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/NevaMind-AI/memU

9. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown。
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

10. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown。
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 12

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using memU with real data or production workflows.

  • [[BUG] memory_items table's happend_at collumn stores NULL value even tho](https://github.com/NevaMind-AI/memU/issues/428) - github / github_issue
  • Bug: memu-server entry point points to missing module (memu.server.cli) - github / github_issue
  • [[BUG] sqlite backend embding issue](https://github.com/NevaMind-AI/memU/issues/382) - github / github_issue
  • v1.5.1 - github / github_release
  • v1.5.0 - github / github_release
  • v1.4.0 - github / github_release
  • v1.3.0 - github / github_release
  • v1.2.0 - github / github_release
  • v1.1.2 - github / github_release
  • v1.1.1 - github / github_release
  • v1.1.0 - github / github_release
  • v1.0.1 - github / github_release

Source: Project Pack community evidence and pitfall evidence