Doramagic Project Pack · Human Manual
memU
From workspace to agent memory
System Overview & Architecture
Related topics: Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing, Memorize & Retrieve Workflows, Preprocessing and Prompts
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing, Memorize & Retrieve Workflows, Preprocessing and Prompts
System Overview & Architecture
Purpose and Scope
memU is an agent memory framework that turns raw sources (chat logs, documents, deployment logs, agent traces) into a structured, self-organizing compiled workspace that can be queried on demand. The framework exposes two runtime operations: memorize() for writing new sources into the workspace, and retrieve() for serving only the relevant layers back to an agent. Source: README.md.
The framework targets three concrete needs of long-running agents:
- Context — inject the right facts, preferences, and source material instead of a cold prompt.
- Continuity — the workspace persists and self-organizes across sessions, sources, and tasks.
- Control — every record is structured and inspectable, tracing back to its raw source for auditing and editing.
Source: README.md.
The framework is delivered as a Python package (with both Linux x86_64 and ARM64 wheel targets added in v1.1.0) and offers a pluggable storage layer supporting in-memory, SQLite, and Postgres backends behind a shared repository contract. Source: README.md, v1.1.0 release notes.
The Compiled Workspace
The primary output of memU is a navigable workspace composed of three layered records. Source: README.md.
| Record | Role | Contents |
|---|---|---|
MemoryCategory | Folder — a topic with an evolving summary | name, description, summary, embedding, child MemoryItem[] |
MemoryItem | File — a typed atomic memory | memory_type ∈ {profile, event, knowledge, behavior, skill, tool}, summary, extra, happened_at, embedding |
Resource | Source — the raw artifact behind the memory | url, modality, local_path, caption, embedding |
The on-disk projection of this workspace is a tree of INDEX.md, MEMORY.md, and per-skill SKILL.md files, persisted through the configured storage backend. Source: README.md.
Runtime Operations
WRITE — memorize() READ — retrieve()
───────────────────────────────────────── ─────────────────────────────────────────
raw files → extract → files + folders query → walk folders → ranked files
persist via repository contracts return scoped, ranked context
Source: README.md.
Both operations are exposed through the MemoryService class. The memorize() method accepts a resource_url and a modality (e.g. conversation, document) along with scoping metadata such as user_id. The retrieve() method accepts a list of query items and a where filter for scoped lookup. Source: README.md.
Typed Memory Extraction
memU classifies extracted memory into six canonical types, each with a dedicated prompt module that enforces type-specific rules. Source: README.md.
flowchart LR
A[Raw Resource] --> B[Preprocess]
B --> C{Type Router}
C -->|profile| P[profile.py]
C -->|event| E[event.py]
C -->|knowledge| K[knowledge.py]
C -->|behavior| Bp[behavior.py]
C -->|skill| S[skill.py]
C -->|tool| T[tool.py]
P --> M[MemoryItem]
E --> M
K --> M
Bp --> M
S --> M
T --> M
M --> Cat[MemoryCategory]- Knowledge — declarative facts, concepts, and explanations; forbids opinions, personal experiences, and user-specific traits. Items must be self-contained, under ~50 words, and merged when redundant. Source: src/memu/prompts/memory_type/knowledge.py.
- Behavior — recurring patterns, routines, and solutions; forbids one-time actions unless they reveal a significant pattern, and forbids assistant-only turns. Items must use the word "user" consistently to attribute the subject. Source: src/memu/prompts/memory_type/behavior.py.
- Skill — comprehensive skill profiles with YAML frontmatter (
name,description,category,demonstrated-in) and full sections (Core Principles, When to Use, Implementation Guide, Success Patterns, Common Pitfalls, Key Takeaways). Each profile must be at least 300 words to ensure depth and actionability. Source: src/memu/prompts/memory_type/skill.py. - Tool — tool usage patterns including tool name, scenario, outcome, and a
when_to_useretrieval hint, capturing both successful patterns and failure lessons. Source: src/memu/prompts/memory_type/tool.py.
Preprocessing Stage
Before typed extraction, document inputs are condensed by a dedicated preprocess prompt that produces two outputs: a processed_content block that preserves all key information while removing verbosity, and a single-sentence caption summarizing the document. Source: src/memu/prompts/preprocess/document.py.
Category Organization and Retrieval
After extraction, memory items are sorted into MemoryCategory folders, cross-linked, embedded, and summarized into a browsable tree. Source: README.md.
Two summary prompt variants are used to keep the folder summaries current:
- Plain category summary — merges original content with new memory items, preserves Markdown hierarchy, and excludes one-off actions that lack long-term value (e.g. "ate Malatang today"). Source: src/memu/prompts/category_summary/category.py.
- Reference-annotated category summary — adds inline
[ref:ITEM_ID]references pointing back to the specific memory items that contributed each fact, enabling auditability and inline citation. Source: src/memu/prompts/category_summary/category_with_refs.py.
During retrieve(), the framework navigates these folders and returns only the files relevant to the current user, agent, session, or task, returning scoped, ranked context that can be injected into any agent workflow. Source: README.md.
Community Notes and Known Limitations
Several issues reported by the community are worth noting for anyone reading this overview:
- Storage column bug — the
memory_items.happened_atcolumn has been observed to storeNULLeven when proper timestamps are provided in conversational input. This affects the optional temporal metadata on memory items added in v1.3.0. Source: issue #428. - SQLite embedding storage — the SQLite backend has had issues storing embedding fields as native lists, raising
ValueError: <class 'list'> has no matching SQLAlchemy typein affected test scripts. Source: issue #382. - Server entry point — the
memu-serverconsole script has been reported to point to a missingmemu.server.climodule onmain; users packaging the server should verify the entry point exists in their installed version. Source: issue #354.
These limitations are tracked in the issue tracker and the release notes (currently at v1.5.1, which includes an alembic URL interpolation fix). Source: v1.5.1 release notes.
See Also
- Storage backends and repository contracts (in-memory, SQLite, Postgres)
- LLM profile configuration and routing
MemoryService.memorize()/retrieve()API reference- Custom memory type prompt registration
- Category summary reference format and inline citations
Source: https://github.com/NevaMind-AI/memU / Human Manual
Storage Backends & Data Model
Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts
Storage Backends & Data Model
The storage layer is the persistence core of memU. It defines the canonical data model — Resource, MemoryItem, MemoryCategory, and the join record CategoryItem — and provides pluggable repository implementations behind a shared contract. The runtime is decoupled from any specific engine: callers go through DatabaseFactory, which returns a backend that satisfies the same interfaces used during memorize() and retrieve(). The README explicitly advertises "in-memory, SQLite, or Postgres backends with the same repository contracts" as a core feature. Source: README.md
Data Model
The model treats memory as a workspace hierarchy rather than a flat table.
| Record | Role | Key Fields |
|---|---|---|
Resource | Raw source artifact (chat log, document, image caption) | url, modality, local_path, caption, embedding |
MemoryItem | Typed atomic memory extracted from a resource | memory_type, summary, extra, happened_at, embedding |
MemoryCategory | Folder grouping items by topic with an evolving summary | name, description, summary, embedding |
CategoryItem | Many-to-many relation linking items to categories | category_id, item_id |
memory_type is an enum-like string drawn from a fixed set: profile, event, knowledge, behavior, skill, tool. The README and the per-type prompt modules (src/memu/prompts/memory_type/*.py) confirm this set. Each prompt module defines extraction rules for its type — for example, event.py requires a declarative sentence with a timestamp, while behavior.py insists on recurring patterns rather than one-off actions.
flowchart TB
R[Resource] -->|extract| MI[MemoryItem]
MI -->|belongs to| CI[CategoryItem]
MC[MemoryCategory] -->|groups| CI
MC -.embedding.-> VS[(Vector Store)]
MI -.embedding.-> VS
R -.embedding.-> VSAll three primary records carry an embedding field, which the retrieve pipeline uses for semantic ranking across both category-level (broad context) and item-level (precise facts) queries.
Backend Implementations
In-Memory
The default backend stores everything in Python dictionaries and is intended for tests, demos, and ephemeral agent runs. Its repositories (src/memu/database/inmemory/repo.py) cache Resource, MemoryItem, MemoryCategory, and CategoryItem records in instance attributes and write through synchronously. No process restart can recover state.
SQLite
Introduced in v1.2.0 (release notes), the SQLite backend uses SQLAlchemy models and a per-record repository layout. The SQLiteMemoryStore defined in src/memu/database/sqlite/sqlite.py composes four repositories — ResourceRepo, MemoryCategoryRepo, MemoryItemRepo, and CategoryItemRepo — and caches each in a dict for read access. The constructor accepts a Pydantic scope_model for user scoping and optional custom model overrides, so application code can extend Resource or MemoryItem without subclassing the store. Source: src/memu/database/sqlite/sqlite.py:23-58
The DSN is passed as a plain SQLAlchemy URL (sqlite:///path/to/db.sqlite). As of v1.5.1 the alembic migration URL interpolation was fixed (commit fd87ceb), so URL templating with environment variables now behaves correctly.
Postgres
The Postgres backend shares the same repository contracts and is the recommended option for production deployments where embeddings and high-volume writes need to coexist with ACID guarantees.
Configuration
The DatabaseConfig block in src/memu/app/settings.py selects the backend and its connection parameters. The BlobConfig controls where raw resources are materialized on disk (resources_dir, default ./data/resources), which is independent of the relational store.
The MemoryFilesConfig block — disabled by default — toggles the on-disk "memory file system" rendering that writes INDEX.md, MEMORY.md, and per-skill SKILL.md files under output_dir. When synthesize=True, those markdown files are generated via an LLM call using the configured synthesis_llm_profile; otherwise they are rendered deterministically from already-extracted records. Source: src/memu/app/settings.py:25-60
The retrieve-side options live in RetrieveCategoryConfig and RetrieveItemConfig: top_k controls how many categories (default 5) and how many items are returned per query. Both are enabled by default.
Common Failure Modes
The community has surfaced three recurring issues that map directly onto the storage layer:
happened_atNULL on conversational input — WhenMemoryService.memorize()ingests chat messages with explicit timestamps, the resultingmemory_items.happened_atcolumn is sometimes stored asNULLeven though the source carried a valid time. This indicates a propagation gap between the message envelope and the typedeventitem writer. Tracked in issue #428.
- SQLite embedding serialization — Earlier snapshots of the SQLite backend raised
ValueError: <class 'list'> has no matching SQLAlchemy typewhen persisting embeddings, because theembeddingcolumn was declared without a SQLAlchemy type that understands Pythonlist. Reported in issue #382.
memu-serverentry point —pyproject.tomlconfiguredmemu-server = "memu.server.cli:main", butsrc/memu/server/cli.pywas missing from the tree, so the installed console script failed at import time. Reported in issue #354; fixed in a later release.
For all three, the mitigation is the same: pin to a release at or after v1.5.1, where the alembic interpolation fix and prompt/item fallback improvements have landed.
See Also
- Memory Extraction Prompts — per-type rules for
profile,event,knowledge,behavior,skill,tool. - Configuration Reference — full list of
DatabaseConfig,BlobConfig, and retrieve options.
Source: https://github.com/NevaMind-AI/memU / Human Manual
LLM, Embedding & VLM Providers and Routing
Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: System Overview & Architecture, Memorize & Retrieve Workflows, Preprocessing and Prompts
LLM, Embedding & VLM Providers and Routing
1. Purpose and Scope
memU needs to talk to a heterogeneous set of model providers for three distinct capabilities: text chat / summarization (LLM), vector embeddings (Embedding), and vision-language analysis (VLM). The "Providers and Routing" subsystem is the abstraction layer that decouples the rest of the memory pipeline from any single vendor.
Its responsibilities are:
- Normalize the request/response shape of every supported provider behind a small, capability-scoped interface.
- Let the rest of memU call
client.summarize(...)orclient.vision(...)without knowing whether the request actually went to OpenAI, Anthropic, OpenRouter, or a self-hosted HTTP endpoint. - Allow new providers to be added by registering a backend module and (optionally) a transport client, without editing the service composition root.
The README states the project is "Profile-Based LLM Routing" — chat, embedding, vision, and transcription work are routed through configurable LLM profiles, which is implemented via the per-capability gateway.py files.
Source: README.md
2. Package Layout
The two text/image capabilities each live in their own sibling package and follow the same internal shape:
src/memu/llm/ # text / chat / summarization
├── base.py # LLMClient interface
├── gateway.py # builds a client from settings.LLMConfig
├── wrapper.py # higher-level façade used by services
├── openai_client.py # OpenAI SDK transport
├── anthropic_client.py # Anthropic SDK transport
├── lazyllm_client.py # LazyLLM transport
├── http_client.py # raw HTTP transport
├── defaults.py # per-provider default model picks
└── backends/ # per-provider request/response shape
├── base.py
├── openrouter.py
└── ...
src/memu/vlm/ # image / video understanding (mirrors llm/)
├── base.py # VLMClient interface + encode_image()
├── gateway.py # builds a VLM client from settings.VLMConfig
├── openai_client.py
├── anthropic_client.py
├── http_client.py
├── defaults.py
└── backends/
├── base.py
└── ...
The package docstrings spell this out explicitly: "backends/: per-provider vision request/response shapes (HTTP transport). http_client/openai_client/anthropic_client: transport clients. gateway: build a client from a :class:memu.app.settings.VLMConfig`."
Source: src/memu/vlm/__init__.py:1-20
3. LLM Provider System
3.1 The backend contract
Every LLM provider is described by a subclass of LLMBackend defined in src/memu/llm/backends/base.py. The base class exposes three customization points:
default_headers(api_key)— returns the auth header set. Defaults to OpenAI-styleAuthorization: Bearer …; Anthropic overrides this to usex-api-key.build_summary_payload(...)— converts(text, system_prompt, chat_model, max_tokens)into the provider-specific JSON body.parse_summary_response(data)— extracts the assistant text from the provider's response envelope.build_vision_payload(...)— builds the vision request body for text+image prompts (reused from the VLM path on providers that share an endpoint).
Concrete backends such as OpenRouterLLMBackend set summary_endpoint = "/api/v1/chat/completions" and emit OpenAI-compatible message arrays, then parse the response with data["choices"][0]["message"]["content"]. This means many OpenAI-shaped providers can be supported by a single small backend module.
Source: src/memu/llm/backends/base.py:1-44, src/memu/llm/backends/openrouter.py:1-25
3.2 Transports
The transport layer is independent of the backend layer. For each capability there is a small set of clients:
openai_client.py— uses the official OpenAI SDK.anthropic_client.py— uses the official Anthropic SDK; the gateway strips a stale OpenAIbase_urldefault and falls back tohttps://api.anthropic.com.http_client.py— a raw HTTP/httpx transport for any OpenAI-compatible endpoint, useful for self-hosted models and proxies.lazyllm_client.py— LazyLLM transport.
3.3 The LLM gateway
src/memu/llm/gateway.py exposes a build_llm_client(cfg) function that inspects LLMConfig (base URL, API key, provider name, model) and returns a fully-wired LLMClient. Adding a new transport is a single-line registration in the gateway; adding a new provider that reuses an existing transport is a new backend module.
The LLMClient base class in src/memu/llm/base.py defines the methods the rest of memU calls (summarize, vision, etc.); concrete clients implement them by calling the active backend.
Source: src/memu/llm/base.py, src/memu/llm/gateway.py
flowchart LR
Settings["LLMConfig<br/>(base_url, api_key, model)"] --> Gateway["llm/gateway.py<br/>build_llm_client()"]
Gateway -->|provider=openai| OpenAI["openai_client.py"]
Gateway -->|provider=anthropic| Anthropic["anthropic_client.py"]
Gateway -->|provider=http| HTTP["http_client.py"]
Gateway -->|provider=lazyllm| Lazy["lazyllm_client.py"]
OpenAI --> Backend["backends/*<br/>payload + response shape"]
Anthropic --> Backend
HTTP --> Backend
Lazy --> Backend
Backend --> Memory["MemoryService<br/>memorize / retrieve"]4. VLM Provider System (Mirror of LLM)
The vision subsystem is intentionally a near-clone of the LLM subsystem. VLMClient exposes a single multimodal capability, vision(prompt, image_path, ...), and encode_image(image_path) base64-encodes a file and infers the MIME type from the extension (.jpg/.jpeg → image/jpeg, .png → image/png, etc.).
The VLM gateway builds the client from a VLMConfig and registers three builders by default — SDK, Anthropic SDK, and raw HTTP — and shares the same OpenAI-default base_url stripping logic as the LLM gateway so an LLM-targeted config does not leak into a vision call.
Source: src/memu/vlm/base.py:1-44, src/memu/vlm/gateway.py:1-40
4.1 Embeddings
Although embedding is a separate logical capability, it is wired into the same routing model: the v1.5.0 release added HTTP proxy support for "LLM & embedding clients" together, and v1.0.1 explicitly fixed "get embedding client," indicating that embedding clients are constructed through the same gateway pattern. Embedding providers therefore plug in by adding a backend module and (if needed) a transport — no service-layer changes are required.
Source: README release notes for v1.5.0 and v1.0.1
5. Routing, Configuration, and Common Failure Modes
5.1 Profile-based routing
The README summarizes the goal: "Profile-Based LLM Routing — Route chat, embedding, vision, and transcription work through configurable LLM profiles." In practice this means:
- A
LLMConfig(and its VLM / embedding siblings) is declared once inapp/settings.py. MemoryServiceand friends never import a specific provider — they only depend onLLMClient/VLMClientinterfaces.- Swapping providers is a config change, not a code change.
5.2 Provider-customization table
| Concern | Where it lives | What you change |
|---|---|---|
| Auth header scheme | LLMBackend.default_headers / VLMBackend.default_headers | Providers that don't use Authorization: Bearer … (e.g. Anthropic's x-api-key) |
| Request body shape | build_summary_payload / build_vision_payload | Providers whose message/tool format diverges from OpenAI |
| Response parsing | parse_summary_response / parse_vision_response | Providers that wrap choices[0].message.content differently |
| Endpoint path | summary_endpoint / vision_endpoint class attribute | Providers whose path differs from /chat/completions |
| Transport (SDK vs HTTP) | llm/gateway.py / vlm/gateway.py | Self-hosted endpoints, proxies, LazyLLM |
| Default model | defaults.py | Picking the latest recommended model per provider |
Source: src/memu/llm/backends/base.py:1-44, src/memu/vlm/backends/base.py:1-40
5.3 Known limitations and community-reported issues
- Stale entry point on a release line —
memu-serverwas configured to point to amemu.server.climodule that did not exist onmain, breaking startup. The fix lives in the packaging layer (pyproject.toml), not in the LLM/VLM code, but it is the kind of regression users hit immediately and is worth being aware of when upgrading. See issue #354. - SQLite + embeddings — a community-reported bug ("sqlite backend embding issue", #382) raised
ValueError: <class 'list'> has no matching SQLAlchemy typebecause theembeddingcolumn is stored as a list/JSON shape. When the embedding provider returns vectors, the persistence layer must serialize them; ensure your backend writes JSON, not Pythonlist, when the target dialect does not support array types natively. - HTTP proxy support — added in v1.5.0 across LLM and embedding clients. If you sit behind a corporate proxy, set the new proxy option on the relevant
*Configrather than patching transports. - Embedding client wiring — v1.0.1's "get embedding client" fix is a reminder that embedding construction goes through the gateway; if a custom embedding provider is added, the corresponding builder must be registered there or
build_*_clientwill returnNone.
Source: community issues #354, #382; release notes v1.0.1, v1.5.0.
See Also
- Storage Backends (in-memory, SQLite, Postgres)
- MemoryService:
memorizeandretrieve - Configuration:
app/settings.pyand*Configtypes - Multimodal Pipeline: image and video ingestion
Source: https://github.com/NevaMind-AI/memU / Human Manual
Memorize & Retrieve Workflows, Preprocessing and Prompts
Related topics: System Overview & Architecture, Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: System Overview & Architecture, Storage Backends & Data Model, LLM, Embedding & VLM Providers and Routing
Memorize & Retrieve Workflows, Preprocessing and Prompts
1. Overview and Compiled Workspace
memU exposes a two-operation API to agents: memorize() and retrieve(). Internally, these operations compile raw sources (chat logs, documents, logs, images) into a navigable workspace and then serve only the layers that match a query. The README frames the design as a "compiled workspace" that emulates a file system: MemoryCategory records act as folders, MemoryItem records act as files, and Resource records preserve the original artifact that produced each memory (README.md).
The compiled workspace follows a fixed shape:
MemoryCategory (folder: topic with evolving summary)
├── name, description, summary, embedding
└── MemoryItem[] (files: typed, atomic memories)
├── memory_type: profile | event | knowledge | behavior | skill | tool
├── summary, extra, happened_at, embedding
└── Resource (source: raw file)
└── url, modality, local_path, caption, embedding
memorize() accepts a resource_url and a modality (e.g. conversation, document) and returns a dictionary describing what was written. retrieve() accepts queries and a where filter (typically user_id) and returns only the folders and files relevant to the request, ranked for prompt injection. The README's "WRITE — memorize()" / "READ — retrieve()" diagram describes the full flow as: raw files → extract → files + folders → persist; and query → walk folders → ranked files (README.md).
2. The Memorize Pipeline
The memorize path lives under src/memu/app/memorize.py, which orchestrates preprocessing, type-specific extraction, category routing, embedding, and persistence through the CRUD layer in src/memu/app/crud.py. The pipeline has five logical stages (README.md):
- Preprocess — convert the raw resource into a clean text representation
- Extract — run one or more memory-type extractors to produce typed items
- Categorize & link — place items into
MemoryCategoryfolders, cross-link, and embed - Summarize — produce/refresh per-category summaries
- Persist — write items, relations, embeddings, and summaries through the repository contract
flowchart LR
A[Raw resource<br/>conversation / document] --> B[Preprocess]
B --> C[Memory-type extractors<br/>profile · event · knowledge · behavior · skill · tool]
C --> D[MemoryItem + MemoryCategory]
D --> E[Embedding + Category Summary]
E --> F[(Repository<br/>in-memory / SQLite / Postgres)]
F --> G[retrieve<br/>walk folders → ranked files]2.1 Preprocessing
src/memu/preprocess/base.py defines the abstract preprocessor contract, and concrete implementations live in src/memu/preprocess/conversation.py and src/memu/preprocess/document.py. The conversation preprocessor normalizes message lists and timestamps (relevant to the happened_at column), while the document preprocessor delegates to the LLM via the prompt in src/memu/prompts/preprocess/document.py to produce a condensed version plus a one-sentence caption (src/memu/prompts/preprocess/document.py).
Community note: Issue #428 reports that thememory_items.happend_atcolumn storesNULLeven when messages carry proper timestamps. This is a known data-flow bug in the conversation preprocessing → extract path; verify preprocessor output before relying onhappened_at.
2.2 Type-Specific Extraction
The src/memu/prompts/memory_type/ package contains one prompt module per supported memory type. Each module follows a similar block-based structure: an OBJECTIVE block identifying the role (e.g. "professional User Memory Extractor"), a WORKFLOW block, a RULES block, a CATEGORY block listing the target categories, an OUTPUT block specifying the response shape, and an EXAMPLES block.
| Memory type | Prompt module | Extraction target | Output shape |
|---|---|---|---|
event | src/memu/prompts/memory_type/event.py | Time-bounded happenings involving the user (with time, place, participants) | One declarative sentence per item |
knowledge | src/memu/prompts/memory_type/knowledge.py | Objective facts, concepts, definitions, explanations | Single-line plain text, < 50 words per item |
behavior | src/memu/prompts/memory_type/behavior.py | Recurring patterns, routines, solutions (not one-time events) | Single or multi-line record of a pattern |
skill | src/memu/prompts/memory_type/skill.py | Actionable skill profiles with frontmatter and sectioned markdown | Full SKILL.md body, ≥ 300 words |
tool | src/memu/prompts/memory_type/tool.py | Tool usage patterns: name, use case, outcome, retrieval hint | XML <memory> with <when_to_use> |
All five type prompts share these guard rails: items must be in the same language as the source, items must be self-contained without context, identical/similar items must be merged into a single category, and forbidden content includes illegal/harmful topics, opinions without factual basis (in knowledge), and assistant-only turns (in behavior) (src/memu/prompts/memory_type/knowledge.py, src/memu/prompts/memory_type/behavior.py).
The skill prompt is the most demanding: it requires a YAML frontmatter (name, description, category, demonstrated-in) followed by Core Principles, When to Use This Skill, Implementation Guide (with Prerequisites, Techniques and Approaches, Example from Resource), Success Patterns, Common Pitfalls, and Key Takeaways, with a 300-word minimum to guarantee reusability (src/memu/prompts/memory_type/skill.py). The tool prompt produces a lighter record but adds an explicit <when_to_use> field so retrieval can match the right tool to the right task (src/memu/prompts/memory_type/tool.py).
2.3 Category Routing and Summarization
After extraction, items are placed into MemoryCategory folders, and src/memu/prompts/category_summary/category.py is invoked to maintain a per-category user markdown profile. The category prompt implements a three-step workflow: parse the initial profile and new items, perform Update (conflict detection, validity priority, overwrite/supplement) and Add (deduplication, category matching, insertion) operations, and re-render the result as a Markdown hierarchy with H1 category titles and H2 sub-categories (src/memu/prompts/category_summary/category.py).
The prompt also enforces explicit exclusions: vague or non-user items are dropped, one-time events without long-term relevance (e.g. "ate Malatang today") are removed, and assistant-introduced content is rejected. The final output is only the updated Markdown profile — no explanations or operation traces.
3. The Retrieve Pipeline
src/memu/app/retrieve.py consumes a queries list (each a role/content pair) and a where filter (typically {"user_id": "..."}). It walks the MemoryCategory tree, scores items against the query embeddings, and returns only the relevant files and folder summaries as a dictionary suitable for prompt injection (README.md, src/memu/app/retrieve.py). The persisted shape is identical to what memorize() returns, so downstream agents can treat both responses uniformly.
4. Common Failure Modes
Several community-reported issues map directly onto this workflow:
happened_atis NULL for conversation memories — issue #428. The conversation preprocessor must propagate per-message timestamps into the extracted event items; checksrc/memu/preprocess/conversation.pyand theeventextractor insrc/memu/prompts/memory_type/event.pywhen diagnosing.- SQLite backend rejects the
embeddingcolumn — issue #382. The SQLAlchemy type forembedding(alist) is not registered, raisingValueError: <class 'list'> has no matching SQLAlchemy type. The fix typically lives in the SQLite repository undersrc/memu/app/crud.py. memu-serverentry point points to a missing module — issue #354.pyproject.tomldeclaresmemu-server = "memu.server.cli:main"butsrc/memu/server/cli.pyis missing onmain; the entry point and the module must be kept in sync.- Alembic URL interpolation — fixed in v1.5.1 (fd87ceb) to use correct interpolation, which affects migration-driven schema updates for the
MemoryItemandMemoryCategorytables.
See Also
- Storage Backends (in-memory, SQLite, Postgres)
- LLM Profiles and Routing
- Category Summary Generation
- Memory Types Overview
Source: https://github.com/NevaMind-AI/memU / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
Doramagic Pitfall Log
Found 10 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.
1. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/428
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/354
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/NevaMind-AI/memU/issues/382
4. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | https://github.com/NevaMind-AI/memU
5. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/NevaMind-AI/memU
6. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/NevaMind-AI/memU
8. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/NevaMind-AI/memU
9. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU
10. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/NevaMind-AI/memU
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Count of project-level external discussion links exposed on this manual page.
Open the linked issues or discussions before treating the pack as ready for your environment.
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using memU with real data or production workflows.
- [[BUG] memory_items table's happend_at collumn stores NULL value even tho](https://github.com/NevaMind-AI/memU/issues/428) - github / github_issue
- Bug: memu-server entry point points to missing module (memu.server.cli) - github / github_issue
- [[BUG] sqlite backend embding issue](https://github.com/NevaMind-AI/memU/issues/382) - github / github_issue
- v1.5.1 - github / github_release
- v1.5.0 - github / github_release
- v1.4.0 - github / github_release
- v1.3.0 - github / github_release
- v1.2.0 - github / github_release
- v1.1.2 - github / github_release
- v1.1.1 - github / github_release
- v1.1.0 - github / github_release
- v1.0.1 - github / github_release
Source: Project Pack community evidence and pitfall evidence