langmem Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

langmem

LangMem helps agents learn and adapt from their interactions over time.

LangMem Overview and Architecture

Related topics: Long-Term Memory: Extraction, Tools, and Store Managers, Prompt Optimization and Learning, Short-Term Memory, Reflection, and Graph Workflows

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Knowledge Extraction and Storage

Continue reading this section for the full explanation and source context.

Section Prompt Optimization

Continue reading this section for the full explanation and source context.

Section Short-Term Memory

Continue reading this section for the full explanation and source context.

LangMem Overview and Architecture

Purpose and Scope

LangMem is a toolkit that helps AI agents learn and adapt from interactions over time. It provides three primary capabilities: extracting important information from conversations, optimizing agent behavior through prompt refinement, and maintaining long-term memory. The library exposes both functional primitives that work with any storage system and native integration with LangGraph's storage layer, allowing agents to continuously improve, personalize responses, and maintain consistent behavior across sessions. Source: README.md:1-10

The high-level feature surface is summarized in the README:

Capability	Description
Core memory API	Storage-agnostic memory extraction primitives
Memory management tools	In-conversation tools agents can call to record/search
Background memory manager	Automatic extraction, consolidation, and updates
LangGraph integration	Native long-term store support in all platform deployments

Source: README.md:13-19

Module Layout and Public API

The package is organized into three top-level subsystems, all re-exported through a single entry point. Source: src/langmem/__init__.py:1-23

langmem.knowledge — long-term / semantic memory: extractors, store managers, and tools.
langmem.prompts — prompt optimization: single and multi-prompt optimizers, reflection executors, and the Prompt TypedDict.
langmem.short_term — within-thread memory: rolling summarization nodes and helpers.
langmem.reflection — orchestration entry point (ReflectionExecutor) for running reflective updates over trajectories.

The public surface from the knowledge subsystem includes create_memory_manager, create_memory_store_manager, create_memory_searcher, create_manage_memory_tool, create_search_memory_tool, and create_thread_extractor, plus the MemoryPhase enum. Source: src/langmem/knowledge/__init__.py:1-31

High-Level Architecture

LangMem treats memory as a layered system: a short-term rolling summary inside an active thread, plus a long-term knowledge base in an external store, with prompt optimizers that act on the resulting trajectories.

flowchart LR
    User[User / App] -->|messages| Agent[LangGraph Agent]
    Agent -->|hot path| Tools[manage_memory / search_memory tools]
    Agent --> STM[short_term.SummarizationNode]
    STM --> Agent
    Background[Background Manager] -->|extract/update| Store[(LangGraph BaseStore)]
    Tools <-->|read/write| Store
    Optimizer[Prompt Optimizer] -->|refined prompts| Agent
    Trajectories[(Annotated Trajectories)] --> Optimizer

This mirrors the README's framing of an in-thread tool path, a background manager, and prompt-level adaptation. Source: README.md:15-18

Knowledge Extraction and Storage

The knowledge module provides two layers: stateless extractors that return structured data, and stateful managers that write into a BaseStore. The create_thread_extractor builds a Runnable around trustcall.create_extractor, prompts the model to call a tool, and returns a schema-typed summary (defaulting to a title/summary pair). Source: src/langmem/knowledge/extraction.py:142-180

Stateful managers such as create_memory_store_manager consume messages, optionally deduplicate or update existing entries, and persist results into a namespaced store. The namespace is templated and resolved at runtime through config["configurable"] placeholders such as {langgraph_user_id}. Source: src/langmem/knowledge/extraction.py:1-120

For in-conversation use, create_manage_memory_tool and create_search_memory_tool wrap the store as StructuredTool instances. The manage tool supports create, update, and delete actions and uses a custom _ToolWithRequired subclass to guarantee a required array appears in the generated JSON schema, which keeps tool-calling providers happy. Source: src/langmem/knowledge/tools.py:1-200

Prompt Optimization

Prompt optimization is expressed through the Prompt TypedDict, which carries the prompt text plus optional update_instructions and when_to_update fields used by the optimizers. Source: src/langmem/prompts/types.py:1-30

Two optimizer strategies ship out of the box:

Metaprompt optimizer — embeds the current prompt, update instructions, and trajectories into a single LLM reflection pass that proposes a revised prompt. Source: src/langmem/prompts/metaprompt.py:1-60
Gradient-style optimizer — uses a structured critique pass to diagnose failures and a metaprompt pass to synthesize updates, both gated by warrants_adjustment. Source: src/langmem/prompts/gradient.py:1-120

A graph-based orchestrator in src/langmem/graphs/prompts.py exposes an optimize node that selects between create_prompt_optimizer (single prompt) and create_multi_prompt_optimizer (multiple interdependent prompts) based on whether when_to_update is set. Source: src/langmem/graphs/prompts.py:1-50

Short-Term Memory

The short-term subsystem compresses a conversation in place using a SummarizationNode plus the summarize_messages / asummarize_messages helpers. The RunningSummary dataclass tracks the current summary and the IDs of messages already summarized, so subsequent calls only summarize new content. Source: src/langmem/short_term/summarization.py:1-80

Key tuning knobs (documented in the function signature) include max_tokens, max_tokens_before_summary, max_summary_tokens, a token_counter, and prompt templates for the initial summary, the running update, and the final composition. The implementation explicitly notes that tool-call continuations are summarized atomically, and that the last max_tokens worth of messages are summarized if input exceeds budget. Source: src/langmem/short_term/summarization.py:1-160

Integration Patterns

A typical deployment wires three pieces together: an InMemoryStore (or any BaseStore) configured with an embedding index, a ReAct agent that exposes create_manage_memory_tool and create_search_memory_tool for the hot path, and a background create_memory_store_manager that runs after each turn to enrich the store asynchronously. Source: README.md:31-58

For optimization, callers pass a list of AnnotatedTrajectory objects (messages plus optional feedback) to either optimizer kind. The result is a new Prompt value that can be persisted and reloaded on the next agent run. Source: src/langmem/prompts/types.py:32-50

Common Failure Modes

Missing store in tool path: tools that cannot resolve a BaseStore raise ConfigurationError; ensure a store is passed explicitly or available via get_store(). Source: src/langmem/knowledge/tools.py:1-120
Non-JSON-serializable memory content: the helper _ensure_json_serializable falls back to model_dump(mode="json") or stringification, so prefer Pydantic models or primitives. Source: src/langmem/knowledge/tools.py:1-160
Token budget overruns in summarization: the summarizer trims to the most recent max_tokens worth of messages to fit the LLM context, so very old context can be lost by design. Source: src/langmem/short_term/summarization.py:1-160
Schema-only updates in create_memory_manager: by default extraction is conservative; explicit enable_inserts, enable_updates, enable_deletes phases are gated by MemoryPhase. Source: src/langmem/knowledge/__init__.py:1-31

Long-Term Memory: Extraction, Tools, and Store Managers

Related topics: LangMem Overview and Architecture, Short-Term Memory, Reflection, and Graph Workflows

Section Related Pages

Continue reading this section for the full explanation and source context.

Section creatememorymanager

Continue reading this section for the full explanation and source context.

Section createthreadextractor

Continue reading this section for the full explanation and source context.

Section creatememorysearcher

Continue reading this section for the full explanation and source context.

Long-Term Memory: Extraction, Tools, and Store Managers

Purpose and Scope

The langmem.knowledge subpackage is the long-term memory core of LangMem. It provides utilities for extracting, consolidating, storing, and retrieving semantic knowledge derived from agent conversations. The package is intentionally split into two layers:

Functional transformations — pure runnables that operate on messages and emit structured Pydantic models. These are storage-agnostic and can be used anywhere a LangChain Runnable is accepted.
Stateful operations — components that wrap a BaseStore (typically LangGraph's persistent store) and combine extraction, search, and write-back into a single unit.

The top-level package re-exports both layers so that downstream code only needs from langmem import ... to access the full memory toolkit. Source: src/langmem/__init__.py:1-31.

The package's own docstring summarizes the public surface as: create_memory_manager, create_thread_extractor, create_memory_store_manager, create_manage_memory_tool, and create_search_memory_tool. Source: src/langmem/knowledge/__init__.py:1-23.

Functional Extraction Primitives

`create_memory_manager`

create_memory_manager(model, schemas=None, ...) returns a MemoryManager — a Runnable[MemoryState, list[ExtractedMemory]] that consumes conversation messages (optionally together with previously stored memories) and emits a deduplicated set of structured memory objects. The extraction loop runs up to max_steps iterations, allowing the model to insert, update, or delete memories in successive passes until it signals completion by calling the Done tool. Source: src/langmem/knowledge/extraction.py:113-194.

The built-in prompt instructs the model to "Attend to novel information that deviates from existing memories and expectations… Consolidate and compress redundant memories to maintain information-density… Remove incorrect or redundant memories while maintaining internal consistency." Source: src/langmem/knowledge/extraction.py:85-101. This is the core MemoryPhase extension point: callers can pass instructions, enable_inserts, enable_updates, enable_deletes, and phases to shape the manager's behavior per memory kind.

`create_thread_extractor`

create_thread_extractor(model, schema=None) produces an asynchronous summarizer that returns a Pydantic object describing the conversation. When no schema is provided, the default SummarizeThread schema is used, with title and summary fields. The function is overloaded on schema so that custom Pydantic models are preserved in the return type. Source: src/langmem/knowledge/extraction.py:42-110. Internally it builds a ChatPromptTemplate with a system instruction and a user message wrapping the merged conversation via utils.get_conversation.

`create_memory_searcher`

create_memory_searcher(model, prompt=..., namespace=...) composes LLM-driven query generation, vector search against the configured namespace, and result ranking into a single pipeline. The namespace defaults to ("memories", "{langgraph_user_id}"), where {langgraph_user_id} is resolved at runtime from config["configurable"]. Source: src/langmem/knowledge/extraction.py:209-263.

Stateful Store Managers and Tools

`create_memory_store_manager`

create_memory_store_manager wraps MemoryManager and attaches it to a BaseStore. Its constructor accepts model, query_model (a cheaper model used for retrieval), query_limit, namespace, store, phases, and an optional default/default_factory providing a baseline memory value. Source: src/langmem/knowledge/extraction.py:197-308. The resulting MemoryStoreManager exposes both .ainvoke(messages) and a .search(config=...) convenience method for inspecting the persisted memories.

The docstring shows the canonical LangGraph entrypoint pattern: the manager runs in the background after the agent replies, while the user-facing call returns immediately. Source: src/langmem/knowledge/extraction.py:230-308. The default parameter is useful for evolving prompt preferences or a "system" memory that is always present even before any conversation occurs.

Memory Tools for In-Conversation Use

Two StructuredTool factories let an agent manage its own memory mid-conversation:

create_manage_memory_tool(namespace=..., store=None, name=..., instructions=...) — exposes a manage_memory tool that creates, updates, or deletes stored memories based on a JSON payload. Source: src/langmem/knowledge/tools.py:1-130.
create_search_memory_tool(namespace=..., store=None, ...) — exposes a search_memory tool that returns both serialized memories and the raw memory objects, supporting query, limit, offset, and filter parameters. Source: src/langmem/knowledge/tools.py:131-200.

Both tools resolve their BaseStore through a private _get_store helper that prefers the caller-supplied store argument and otherwise falls back to get_store() from the active LangGraph context. A _ToolWithRequired subclass guarantees that the tool's JSON schema always carries a required list so that LLMs do not omit mandatory fields. Source: src/langmem/knowledge/tools.py:40-100.

The README demonstrates integrating these tools with create_react_agent: the system prompt is rendered with a <memories> block that contains results from store.search, and the agent is given create_manage_memory_tool to write new entries. Source: README.md:30-90.

Data Flow and Configuration

The diagram below summarizes the end-to-end flow when an agent uses both the in-conversation tools and the background store manager.

sequenceDiagram
    participant U as User
    participant A as Agent (create_react_agent)
    participant T as Memory Tools (manage/search)
    participant S as BaseStore (InMemoryStore or external)
    participant M as MemoryStoreManager
    participant LLM as LLM (extraction/embedding)

    U->>A: send message
    A->>S: search("memories", user_id)
    S-->>A: relevant memories
    A->>LLM: prompt + memories + user message
    LLM-->>A: tool calls (manage_memory, ...)
    A->>T: manage_memory(action=...)
    T->>S: put/update/delete
    A-->>U: response
    par Background
        A->>M: ainvoke(messages)
        M->>S: search existing memories
        S-->>M: candidates
        M->>LLM: extract/update/delete loop (max_steps)
        LLM-->>M: structured memories
        M->>S: put consolidated memories
    end

Key Configuration Parameters

Symbol	Purpose	Default	Source
`namespace`	Tuple organizing memories in `BaseStore`; supports `{langgraph_user_id}` placeholders	`("memories", "{langgraph_user_id}")`	extraction.py:209-263
`query_model`	Cheaper model for search query generation	`None` (uses `model`)	extraction.py:197-230
`query_limit`	Max candidate memories returned by the search step	`5`	extraction.py:197-230
`enable_inserts` / `enable_updates` / `enable_deletes`	Phase-level toggles for the extraction loop	`True / True / False`	extraction.py:85-115
`default` / `default_factory`	Baseline memory returned when no candidates exist	`None`	extraction.py:197-260
`store`	Caller-provided `BaseStore` (overrides `get_store()`)	`None`	tools.py:40-70

Custom-Store Usage

The standalone example shows that create_memory_store_manager does not require LangGraph's runtime — a custom InMemoryStore(index={...}) with OpenAI embeddings can be passed directly via the store= keyword. The example defines a PreferenceMemory Pydantic schema and invokes the manager with ("project", "{langgraph_user_id}") as the namespace. Source: examples/standalone_examples/custom_store_example.py:1-55.

Failure Modes and Common Pitfalls

Missing BaseStore context. The tools raise errors.ConfigurationError("Could not get store") when neither an explicit store is provided nor an active LangGraph context is available. Source: src/langmem/knowledge/tools.py:40-60.
Unresolvable namespace placeholders. If {langgraph_user_id} is not present in config["configurable"], the namespace template fails to render. Always supply a complete config dict when invoking managers outside a graph.
Loop runaway in extraction. The max_steps cap (default 1 for MemoryStoreManagerInput, configurable via max_steps in MemoryState) is the only safety net. A poorly prompted model that never calls Done will simply stop at the cap, which is usually fine, but noisy. Source: extraction.py:113-194.
Schema/JSON incompatibility. The _ensure_json_serializable helper falls back to str(content) if a Pydantic model cannot be dumped, which silently loses structure. Prefer passing Pydantic models whose model_dump(mode="json") succeeds. Source: tools.py:55-75.
Confusion with short-term summarization. summarize_messages (in src/langmem/short_term/summarization.py) compresses the live message list to fit context windows; it does not persist anything to BaseStore. Long-term persistence requires the knowledge module. Source: src/langmem/short_term/summarization.py:1-60.

Prompt Optimization and Learning

Related topics: LangMem Overview and Architecture, Short-Term Memory, Reflection, and Graph Workflows

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Gradient Optimizer

Continue reading this section for the full explanation and source context.

Section Meta-Prompt Optimizer

Continue reading this section for the full explanation and source context.

Section Prompt Memory Optimizer

Continue reading this section for the full explanation and source context.

Prompt Optimization and Learning

Overview and Purpose

langmem's prompt optimization subsystem provides algorithms that automatically improve LLM system prompts from observed conversation trajectories and explicit feedback. The module exposes two top-level factories — create_prompt_optimizer and create_multi_prompt_optimizer — both re-exported from langmem.prompts (src/langmem/prompts/__init__.py:1-26) and from the package root (src/langmem/__init__.py:1-30). The factories dispatch to one of three strategies: gradient, metaprompt, or prompt_memory (src/langmem/prompts/optimization.py:1-120). This makes it possible to bootstrap better prompts over time without hand-tuning, mirroring the same trajectory-plus-feedback loop used for memory extraction.

The subsystem is intentionally built on top of LangChain Runnable objects, so each optimizer can be ainvoked asynchronously, composed in LangGraph nodes, or stored alongside memory tools. Internally, optimizers wrap a model with create_extractor and schema-constrained tool calls so that the LLM must return structured updates rather than free-form text (src/langmem/prompts/metaprompt.py:1-60).

Core Data Types

All optimizers operate on a small, well-defined set of types defined in src/langmem/prompts/types.py:

Prompt — a TypedDict with required name and prompt fields, plus optional update_instructions and when_to_update strings that guide the optimizer per-prompt (src/langmem/prompts/types.py:1-30).
AnnotatedTrajectory — a NamedTuple pairing a list of AnyMessage history with optional feedback (e.g. a score or developer critique) (src/langmem/prompts/types.py:30-60).
OptimizerInput and MultiPromptOptimizerInput — the typed payloads passed to single- and multi-prompt optimizers, each carrying trajectories and a prompt (or list of prompts) to be improved.

These types flow directly into the reflection prompts hard-coded in src/langmem/prompts/prompt.py, which use <current_prompt>, <trajectory>, <feedback>, and <instructions> XML-tag delimiters to constrain the LLM's reasoning (src/langmem/prompts/prompt.py:1-30).

Optimization Strategies

create_prompt_optimizer is a thin dispatcher. The kind keyword selects the algorithm, and the config argument is narrowed by the kind (src/langmem/prompts/optimization.py:60-120). The three strategies differ in cost, interpretability, and how aggressively they reason over failures.

Strategy	LLM Calls	Reflection	Best For	Config Type
`gradient`	4–10	Multi-step critique → apply	Complex failure analysis	`GradientOptimizerConfig`
`metaprompt`	2–5	Single-step reflection per pass	Balanced speed/quality	`MetapromptOptimizerConfig`
`prompt_memory`	1	None — pattern extraction only	Cheap, periodic updates	None

Gradient Optimizer

The gradient strategy separates "what to improve" from "how to apply it." It first runs a critique pass against each trajectory to identify failure modes, then asks a second pass to translate those critiques into a concrete prompt edit (src/langmem/prompts/gradient.py:1-40). The system prompt DEFAULT_GRADIENT_METAPROMPT enumerates failure categories (correctness, completeness, style, tone, alignment) and instructs the model to recommend only minimally invasive changes (src/langmem/prompts/gradient.py:40-60). Iteration is bounded by max_reflection_steps and min_reflection_steps.

Meta-Prompt Optimizer

MetaPromptOptimizer collapses the gradient pipeline into a single LLM call that both thinks and critiques before producing an updated prompt (src/langmem/prompts/metaprompt.py:1-60). It exposes a think static method for scratchpad reasoning and a critique tool that returns an OptimizedPromptOutput schema. The get_prompt_extraction_schema helper in src/langmem/prompts/utils.py ensures that any f-string variables in the original prompt (detected via regex r"\{(.+?)\}") are preserved in the optimized version, using a VarHealer pipeline to repair malformed braces (src/langmem/prompts/utils.py:1-40).

Prompt Memory Optimizer

PromptMemoryMultiple is the lightest strategy, useful for stateless batched updates where a single LLM call must absorb many trajectories at once (src/langmem/prompts/stateless.py:1-60). It serializes each trajectory as a <trajectory i>...</trajectory i> / <feedback i>...</feedback i> block and asks the model to produce a GeneralResponse (a TypedDict with logic, update_prompt, and new_prompt) (src/langmem/prompts/prompt.py:30-50). The default model is Claude 3.5 Sonnet unless the caller passes a model string or instance (src/langmem/prompts/stateless.py:30-50).

Multi-Prompt Optimization

When several prompts are coupled (for example, a planner + executor pair in a multi-agent system), use create_multi_prompt_optimizer. It first classifies which prompts warrant an update using a Classify Pydantic model that validates choices against the supplied prompt names, then dispatches per-prompt updates concurrently with asyncio.gather (src/langmem/prompts/optimization.py:60-120). The MultiPromptOptimizer class wraps a single-prompt optimizer of the same kind and reuses it internally, guaranteeing that all prompts in a chain are updated with the same algorithmic guarantees (src/langmem/prompts/optimization.py:60-90).

A typical usage loop looks like this:

from langmem import create_multi_prompt_optimizer

optimizer = create_multi_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest", kind="metaprompt"
)
trajectories = [(messages, {"feedback": "Response should include a code example"})]
prompts = [
    {"name": "explain", "prompt": "Explain the concept"},
    {"name": "example", "prompt": "Provide a practical example"},
]
better_prompts = await optimizer(trajectories, prompts)

This pattern is documented in the module's docstring (src/langmem/prompts/__init__.py:1-20) and matches the single-prompt example in the same source (src/langmem/prompts/optimization.py:60-120).

Common Failure Modes

Over-eager edits. Both gradient and metaprompt optimizers are explicitly told to recommend changes only when there is evidence of failure; nonetheless, low-quality feedback can cause unnecessary rewrites (src/langmem/prompts/gradient.py:40-60).
Lost template variables. If a prompt contains {var} placeholders, the optimizer must preserve them; the get_prompt_extraction_schema helper in src/langmem/prompts/utils.py enforces this with a regex scan and a model_validator that runs VarHealer on the candidate output.
Invalid prompt names. In multi-prompt mode, the classifier uses a model_validator to reject names outside the supplied set, raising ValueError with the offending entries (src/langmem/prompts/optimization.py:60-90).
Unsupported kind. The dispatcher raises NotImplementedError for any value not in {gradient, metaprompt, prompt_memory} (src/langmem/prompts/optimization.py:60-120).

Short-Term Memory, Reflection, and Graph Workflows

Related topics: LangMem Overview and Architecture, Long-Term Memory: Extraction, Tools, and Store Managers, Prompt Optimization and Learning

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Core data model

Continue reading this section for the full explanation and source context.

Section How summarizemessages works

Continue reading this section for the full explanation and source context.

Section Reflection executor

Continue reading this section for the full explanation and source context.

Short-Term Memory, Reflection, and Graph Workflows

LangMem provides three complementary mechanisms for keeping an agent's context lean, its behavior adaptive, and its execution flow composable: short-term memory (message summarization), reflection (background and prompt optimization), and graph workflows (LangGraph integrations). Together they let an agent forget old context intelligently, refine its own instructions, and run as part of a larger stateful graph.

1. Short-Term Memory: Message Summarization

The langmem.short_term package exposes a single high-level routine, summarize_messages, that compresses a long conversation into a running summary while preserving recent turns verbatim. Source: src/langmem/short_term/summarization.py:30-60.

Core data model

Three dataclasses describe the summarization state:

Dataclass	Purpose
`RunningSummary`	Carries the latest summary text, the set of message IDs already summarized, and the ID of the last message included.
`SummarizationResult`	Returns the trimmed message list (with a system summary) plus a `RunningSummary` for the next call.
`PreprocessedMessages`	Internal container holding messages to summarize, token counts, and any pre-existing system message.

Source: src/langmem/short_term/summarization.py:35-80.

How `summarize_messages` works

The function walks the message list, counts tokens with a pluggable token_counter (defaulting to an approximate counter), and decides which messages to fold into the summary. If the would-be summarized block exceeds max_tokens_before_summary, the routine calls _adjust_messages_before_summarization, which uses LangChain's trim_messages with start_on="human" and strategy="last" to keep only the most recent slice that still fits. Source: src/langmem/short_term/summarization.py:130-165.

A partial trigger condition: if the last message inside the budget is an AI tool call, the corresponding tool result messages are also summarized so the conversation stays coherent. Source: src/langmem/short_term/summarization.py:90-115.

The function returns a SummarizationResult whose messages list is suitable for direct LLM invocation, while running_summary can be threaded into the next call to avoid re-summarizing the same turns. Source: src/langmem/short_term/summarization.py:50-70.

2. Reflection: Background Memory and Prompt Optimization

Reflection in LangMem covers two concerns: (a) long-running background extraction/curation handled by ReflectionExecutor, and (b) prompt-level optimization that rewrites the system prompt from observed trajectories. Both are exported from the top-level package. Source: src/langmem/__init__.py:1-30.

Reflection executor

ReflectionExecutor is exposed as a public class from langmem.reflection and is listed in the package's __all__. Source: src/langmem/__init__.py:17. It is intended to run memory-management logic asynchronously in the background, decoupled from the request-handling hot path of the agent loop.

Prompt optimization strategies

create_prompt_optimizer returns a Runnable that takes trajectories (conversation + feedback) and a candidate prompt, and produces a refined prompt string. It supports three kind values:

"gradient" — separates "find weaknesses" from "recommend a patch," using a GradientOptimizerConfig.
"prompt_memory" — a single-shot meta-prompt; no extra config required.
"metaprompt" — multi-step reflection, configured by MetapromptOptimizerConfig with max_reflection_steps and min_reflection_steps parameters.

Source: src/langmem/prompts/optimization.py:120-180. All three variants share the same OptimizerInput schema (trajectories, prompt) and return a str, so they can be swapped without changing the surrounding graph. Source: src/langmem/prompts/optimization.py:80-100.

flowchart LR
    A[Conversation + Feedback] --> B[Prompt Optimizer]
    C[Current Prompt] --> B
    B --> D[Refined Prompt]
    D --> E[Agent Runtime]
    E --> F[New Trajectories]
    F --> B

3. Graph Workflows

The langmem.graphs package provides LangGraph-native building blocks that tie short-term memory, reflection, and knowledge extraction into a single composable graph.

Module layout

src/langmem/graphs/__init__.py re-exports the graph helpers as part of the public API.
src/langmem/graphs/auth.py contains authentication helpers used when a graph needs to identify a caller (for example, when scoping memory namespaces to langgraph_user_id).
src/langmem/graphs/prompts.py provides prompt templates and node functions suitable for use as LangGraph nodes, including the prompts referenced by the MemoryStoreManager and MemoryManager extraction pipelines.

Source: src/langmem/knowledge/extraction.py:140-170 (shows the namespace template ("memories", "{langgraph_user_id}") consumed by graph nodes).

Composition with the rest of LangMem

A typical workflow wires the short-term summarizer in front of a chat model, attaches create_manage_memory_tool and create_search_memory_tool to the agent, and runs a ReflectionExecutor as a background node that periodically calls create_memory_store_manager against the graph's BaseStore. The summarize_messages call returns a SummarizationResult whose messages list can be fed straight into the model node, and the RunningSummary can be stashed in graph state for the next turn. Source: src/langmem/short_term/summarization.py:50-80, src/langmem/knowledge/tools.py:1-40.

4. Common Failure Modes and Configuration Notes

Token-budget overflow. If n_tokens_to_summarize > max_tokens_to_summarize, _adjust_messages_before_summarization trims to the last slice; if trimming produces an empty list, a warning is emitted via warnings.warn. Source: src/langmem/short_term/summarization.py:155-170.
Missed tool-result pairing. AI tool calls whose tool messages fall outside max_tokens_before_summary may be dropped together with their results, leaving an orphan tool call in the recent context.
max_summary_tokens is advisory only. It estimates the budget; to actually cap the summary length, callers must pre-bind the model: model.bind(max_tokens=max_summary_tokens). Source: src/langmem/short_term/summarization.py:60-75.
Reflection outside LangGraph. When ReflectionExecutor or the store-backed tools cannot resolve a BaseStore (for example, no get_store() context), they raise a ConfigurationError from _get_store. Source: src/langmem/knowledge/tools.py:40-55.
Namespace placeholders. namespace tuples containing "{langgraph_user_id}" are resolved at runtime from config["configurable"]; missing keys will fail at lookup time. Source: src/langmem/knowledge/extraction.py:150-165.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

high Security or permission risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Capability evidence risk requires verification

May increase setup, validation, or first-run risk for the user.

Doramagic Pitfall Log

Found 10 structured pitfall item(s), including 2 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/langchain-ai/langmem/issues/154

2. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/langchain-ai/langmem/issues/156

3. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | github_repo:920242883 | https://github.com/langchain-ai/langmem

4. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | github_repo:920242883 | https://github.com/langchain-ai/langmem

5. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem

6. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | github_repo:920242883 | https://github.com/langchain-ai/langmem

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | github_repo:920242883 | https://github.com/langchain-ai/langmem

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/langchain-ai/langmem/issues/164

9. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown。
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem

10. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown。
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 5

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using langmem with real data or production workflows.

Persistence? - github / github_issue
Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06) - github / github_issue
Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06) - github / github_issue
Enhance error message when summarization fails due to missing HumanMessa - github / github_issue
Configuration risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence

langmem

LangMem Overview and Architecture

Related Pages

LangMem Overview and Architecture

Purpose and Scope

Module Layout and Public API

High-Level Architecture

Knowledge Extraction and Storage

Prompt Optimization

Short-Term Memory

Integration Patterns

Common Failure Modes

See Also

Long-Term Memory: Extraction, Tools, and Store Managers

Related Pages

Long-Term Memory: Extraction, Tools, and Store Managers

Purpose and Scope

Functional Extraction Primitives

`create_memory_manager`

`create_thread_extractor`

`create_memory_searcher`

Stateful Store Managers and Tools

`create_memory_store_manager`

Memory Tools for In-Conversation Use

Data Flow and Configuration

Key Configuration Parameters

Custom-Store Usage

Failure Modes and Common Pitfalls

See Also

Prompt Optimization and Learning

Related Pages

Prompt Optimization and Learning

Overview and Purpose

Core Data Types

Optimization Strategies

Gradient Optimizer

Meta-Prompt Optimizer

Prompt Memory Optimizer

Multi-Prompt Optimization

Common Failure Modes

See Also

Short-Term Memory, Reflection, and Graph Workflows

Related Pages

Short-Term Memory, Reflection, and Graph Workflows

1. Short-Term Memory: Message Summarization

Core data model

How `summarize_messages` works

2. Reflection: Background Memory and Prompt Optimization

Reflection executor

Prompt optimization strategies

3. Graph Workflows

Module layout

Composition with the rest of LangMem

4. Common Failure Modes and Configuration Notes

See Also

Doramagic Pitfall Log

Doramagic Pitfall Log

1. Installation risk: Installation risk requires verification

2. Security or permission risk: Security or permission risk requires verification

3. Configuration risk: Configuration risk requires verification

4. Capability evidence risk: Capability evidence risk requires verification

5. Maintenance risk: Maintenance risk requires verification

6. Security or permission risk: Security or permission risk requires verification

7. Security or permission risk: Security or permission risk requires verification

8. Security or permission risk: Security or permission risk requires verification

9. Maintenance risk: Maintenance risk requires verification

10. Maintenance risk: Maintenance risk requires verification

Community Discussion Evidence

Community Discussion Evidence