Doramagic Project Pack · Human Manual

langmem

Related topics: System Architecture, Core Concepts, Installation and Setup

Home - LangMem Overview


LangMem is a library for memory management and prompt optimization in LLM applications. It provides tools for extracting, storing, and retrieving structured memories, as well as optimizing prompts based on conversation trajectories and feedback.

Core Concepts

LangMem operates across two primary domains: Long-Term Memory (knowledge extraction and storage) and Short-Term Memory (conversation summarization), with a complementary Prompt Optimization system for improving LLM instructions.

Memory Architecture Overview

graph TD
    A[User Conversation] --> B[Memory Manager]
    B --> C[Long-Term Memory Store]
    B --> D[Short-Term Summarization]
    
    E[Search/Retrieval] --> C
    F[Prompt Optimizer] --> G[Optimized Prompts]
    
    C --> H[Structured Memories]
    D --> I[Running Summary]

Components

Memory Types

LangMem provides several specialized memory components for different use cases.

| Component | File Location | Purpose |
| --- | --- | --- |
| MemoryManager | src/langmem/knowledge/extraction.py | Extracts and manages long-term memories from conversations |
| MemoryStoreManager | src/langmem/knowledge/extraction.py | Manages memories with persistent storage (LangGraph BaseStore) |
| SummarizationNode | src/langmem/short_term/summarization.py | Provides running summaries for short-term context |
| GradientPromptOptimizer | src/langmem/prompts/gradient.py | Optimizes prompts using gradient-based reflection |

Data Models

LangMem uses TypedDict and NamedTuple classes for typed data structures.

#### Prompt Structure

class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None

Source: src/langmem/prompts/types.py:7-22

#### Annotated Trajectory

class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, typing.Any] | str

Source: src/langmem/prompts/types.py:24-43

Memory Management

Creating a Memory Manager

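from langmem import create_memory_manager
from pydantic import BaseModel


# Example schema; any Pydantic model can describe a memory
class PreferenceMemory(BaseModel):
    preference: str
    context: str | None = None


manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],  # memory types to extract
    enable_inserts=True,  # allow creating new memories
    enable_updates=True,  # allow updating existing memories
    enable_deletes=True,  # allow deleting memories
)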

Source: src/langmem/knowledge/extraction.py

Memory Store with LangGraph Integration

The MemoryStoreManager integrates with LangGraph's BaseStore for persistent memory storage.

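from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore

# In-memory store with a vector index for semantic search
store = InMemoryStore(
    index={
        "dims": 1536,  # embedding size of text-embedding-3-small
        "embed": "openai:text-embedding-3-small",
    }
)

# The manager reads and writes memories through the store available in the LangGraph context
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    query_model="anthropic:claude-3-5-haiku-latest",  # separate model for search queries
    query_limit=10,  # maximum memories retrieved per search
    namespace=("memories", "{langgraph_user_id}"),  # placeholder filled at runtime
)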

Source: src/langmem/knowledge/extraction.py

Search Flow with Query Model

sequenceDiagram
    participant Client
    participant Manager
    participant QueryLLM
    participant Store
    participant MainLLM

    Client->>Manager: messages
    Manager->>QueryLLM: generate search query
    QueryLLM-->>Manager: optimized query
    Manager->>Store: find memories
    Store-->>Manager: memories
    Manager->>MainLLM: analyze & extract
    MainLLM-->>Manager: memory updates
    Manager->>Store: apply changes
    Manager-->>Client: result

Source: src/langmem/knowledge/extraction.py
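The sequence above can be sketched in plain Python. This is a toy model of the flow, not LangMem's implementation: the query model, the store search, and the main model are all replaced by hypothetical stand-in functions (`generate_query`, `find_memories`, `extract_updates`), and the store is an ordinary dictionary.

```python
def generate_query(messages):
    """Stand-in for the query model: reuse the latest user message as the query."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    return user_turns[-1] if user_turns else ""


def find_memories(store, query, limit=10):
    """Stand-in for semantic search: rank stored memories by shared words."""
    query_words = set(query.lower().split())
    scored = []
    for key, text in store.items():
        overlap = len(query_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, key))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:limit]]


def extract_updates(messages, store, hits):
    """Stand-in for the main model: keep the latest user message if it is new."""
    latest = generate_query(messages)
    if not latest or any(store[key] == latest for key in hits):
        return []
    return [(f"memory-{len(store) + 1}", latest)]


def search_then_extract(messages, store, query_limit=10):
    query = generate_query(messages)                   # Manager -> QueryLLM
    hits = find_memories(store, query, query_limit)    # Manager -> Store (find)
    updates = extract_updates(messages, store, hits)   # Manager -> MainLLM
    for key, text in updates:                          # Manager -> Store (apply)
        store[key] = text
    return hits, updates


store = {"memory-1": "User prefers dark mode"}
messages = [{"role": "user", "content": "I also prefer dark mode in my editor"}]
hits, updates = search_then_extract(messages, store)
```

The point of the two-model split is that the search query can come from a smaller model while the main model only sees the memories that search returned.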

Short-Term Memory

Summarization Node

The SummarizationNode provides running summaries for managing conversation context within a LangGraph workflow.

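from langchain_openai import ChatOpenAI
from langmem.short_term import SummarizationNode, RunningSummary

# Model used for summarization, with its output length capped
summarization_model = ChatOpenAI(model="gpt-4o").bind(max_tokens=128)

summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=256,
    max_tokens_before_summary=256,
    max_summary_tokens=128,
)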

Source: src/langmem/short_term/summarization.py

State Update Format

The summarization node returns updates in this format:

{
    "output_messages_key": "<list of updated messages>",
    "context": {"running_summary": "<RunningSummary object>"}
}

Source: src/langmem/short_term/summarization.py
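To make the idea of a running summary concrete, here is a minimal sketch of threshold-based summarization. It is not the SummarizationNode algorithm: token counting is approximated by word count, `summarize` is a stand-in for the model call, `keep_last` is a hypothetical parameter, and the `"messages"` key stands in for whatever output key the node is configured with.

```python
def count_tokens(message):
    """Crude token estimate: one token per whitespace-separated word."""
    return len(message["content"].split())


def summarize(messages):
    """Stand-in for the summarization model."""
    return "Summary of %d earlier messages" % len(messages)


def maybe_summarize(messages, max_tokens_before_summary, keep_last=2):
    """Replace older messages with a summary once the history grows too long."""
    total = sum(count_tokens(m) for m in messages)
    if total <= max_tokens_before_summary or len(messages) <= keep_last:
        return {"messages": list(messages), "context": {}}

    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(older)
    summary_message = {"role": "system", "content": summary}
    return {
        "messages": [summary_message, *recent],
        "context": {"running_summary": summary},
    }


history = [
    {"role": "user", "content": "I prefer dark mode everywhere"},
    {"role": "assistant", "content": "Noted, dark mode it is"},
    {"role": "user", "content": "Also use a large font"},
    {"role": "assistant", "content": "Large font has been set"},
]

update = maybe_summarize(history, max_tokens_before_summary=10)
unchanged = maybe_summarize(history[:1], max_tokens_before_summary=10)
```

Keeping the most recent messages verbatim preserves the immediate context, while the summary carries forward what was said earlier.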

Prompt Optimization

LangMem provides multiple prompt optimization strategies through the create_prompt_optimizer and create_multi_prompt_optimizer functions.

Optimization Strategies

| Strategy | Description | Configuration |
| --- | --- | --- |
| gradient | Hypothesis-driven optimization with reflection loops | max_reflection_steps, min_reflection_steps |
| metaprompt | Meta-learning based on conversation patterns | Optional reflection step control |
| prompt_memory | Learns from successful conversation patterns | No additional config |

Source: src/langmem/prompts/optimization.py

Single Prompt Optimization

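from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")

# Example trajectory: a conversation paired with feedback on it
conversation = [
    {"role": "user", "content": "Tell me about the solar system"},
    {"role": "assistant", "content": "The solar system consists of..."},
]
feedback = {"clarity": "needs more structure"}

trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)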

Source: src/langmem/prompts/optimization.py

Multi-Prompt Optimization

from langmem import create_multi_prompt_optimizer

optimizer = create_multi_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)

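# Prompts to optimize together (example values)
prompts = [
    {"name": "research", "prompt": "Research the given topic thoroughly"},
    {"name": "summarize", "prompt": "Summarize the research findings"},
]

# `trajectories` is a list of (conversation, feedback) pairs, as in the single-prompt example
better_prompts = await optimizer.ainvoke({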
    "trajectories": trajectories,
    "prompts": prompts
})

Source: src/langmem/prompts/optimization.py

Gradient Optimizer Workflow

graph TD
    A[Current Prompt] --> B[Generate Hypotheses]
    B --> C[Hypothesis Analysis]
    C --> D{Reflection Loop}
    D -->|Within steps| E[Generate Recommendations]
    E --> F[Apply Adjustments]
    F --> D
    D -->|Complete| G[Optimized Prompt]

Source: src/langmem/prompts/gradient.py
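The reflection loop in the diagram can be sketched as a bounded loop. This is an illustration of how `min_reflection_steps` and `max_reflection_steps` could bound such a loop, not the code in gradient.py; `reflect` is a hypothetical stand-in for the model call that proposes a revision and reports whether it is satisfied.

```python
def optimize(prompt, reflect, min_reflection_steps=1, max_reflection_steps=3):
    """Revise a prompt until the reflector is satisfied, within step bounds."""
    steps = 0
    while steps < max_reflection_steps:
        prompt, satisfied = reflect(prompt, steps)
        steps += 1
        # Stop early only once the minimum number of passes has run.
        if satisfied and steps >= min_reflection_steps:
            break
    return prompt, steps


RECOMMENDATIONS = ["Cite sources.", "Keep answers brief."]


def reflect(prompt, step):
    """Stand-in reflector: apply one canned recommendation per pass."""
    if step >= len(RECOMMENDATIONS):
        return prompt, True
    revised = prompt + " " + RECOMMENDATIONS[step]
    return revised, step == len(RECOMMENDATIONS) - 1


final_prompt, steps_used = optimize("You are an astronomy expert", reflect)
capped_prompt, capped_steps = optimize(
    "You are an astronomy expert", reflect, max_reflection_steps=1
)
```

The upper bound caps cost when the reflector never converges; the lower bound forces at least some analysis before a prompt is accepted.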

Memory Tools

LangMem provides standalone tools for memory management in agent workflows.

Create Manage Memory Tool

from langmem import create_manage_memory_tool
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
    ],
    store=store,
)

Source: src/langmem/knowledge/tools.py

Create Search Memory Tool

from langmem import create_search_memory_tool

search_tool = create_search_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)

memories, _ = await search_tool.ainvoke(
    {"query": "Python preferences", "limit": 5}
)

Source: src/langmem/knowledge/tools.py

Tool Configuration Options

| Parameter | Type | Description |
| --- | --- | --- |
| namespace | tuple[str, ...] | Hierarchical path for memory organization |
| actions_permitted | list[str] | Limit actions (create, update, delete) |
| schema | BaseModel | Custom memory schema |
| query_limit | int | Maximum results to retrieve (default: 10) |

Source: src/langmem/knowledge/tools.py
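As a rough illustration of what a hierarchical namespace and a result limit mean, the sketch below scopes a lookup by namespace prefix. It uses a plain dictionary keyed by `(namespace, key)` rather than a LangGraph store, and `search_namespace` is a hypothetical helper, so treat it as a mental model only.

```python
def search_namespace(store, namespace_prefix, query_limit=10):
    """Return up to query_limit items whose namespace starts with the prefix."""
    prefix_len = len(namespace_prefix)
    matches = [
        (namespace, key, value)
        for (namespace, key), value in store.items()
        if namespace[:prefix_len] == namespace_prefix
    ]
    return matches[:query_limit]


store = {
    (("memories", "user123"), "m1"): "Prefers dark mode",
    (("memories", "user123", "user_profile"), "m2"): "Name: Alex",
    (("memories", "user456"), "m3"): "Prefers light mode",
    (("project_memories", "user123"), "m4"): "Uses Python 3.11",
}

# Everything stored for one user, including nested namespaces.
user_items = search_namespace(store, ("memories", "user123"))

# A broader prefix with a cap on the number of results.
limited = search_namespace(store, ("memories",), query_limit=2)
```

Because each user's memories sit under a distinct prefix, a tool configured with `("memories", "{langgraph_user_id}")` never sees another user's entries.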

Usage Patterns

Standalone Usage

LangMem can be used independently of LangGraph:

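from langmem import create_memory_manager
from pydantic import BaseModel


# Example schema describing one kind of memory
class PreferenceMemory(BaseModel):
    preference: str
    context: str | None = None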

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
)

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

memories = await manager.ainvoke({"messages": conversation})

Source: examples/standalone_examples/README.md

LangGraph Integration

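from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager

# Store and manager used by the entrypoint below
store = InMemoryStore()
manager = create_memory_store_manager("anthropic:claude-3-5-sonnet-latest")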

@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response

Source: src/langmem/knowledge/extraction.py

Configuration

Memory Namespaces

Namespaces use runtime configuration with placeholders:

namespace=("memories", "{langgraph_user_id}")

# Runtime config
config = {"configurable": {"langgraph_user_id": "user123"}}
# Results in: ("memories", "user123")

Source: src/langmem/knowledge/extraction.py, src/langmem/knowledge/tools.py
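The substitution described above can be reproduced with ordinary string formatting. The helper below, `resolve_namespace`, is a hypothetical name used for illustration; it shows the effect of the placeholder mechanism and is not the function LangMem uses internally.

```python
def resolve_namespace(template, config):
    """Fill {placeholders} in a namespace template from config["configurable"]."""
    values = config.get("configurable", {})
    # Parts without placeholders pass through str.format unchanged.
    return tuple(part.format(**values) for part in template)


template = ("memories", "{langgraph_user_id}")
config = {"configurable": {"langgraph_user_id": "user123"}}
resolved = resolve_namespace(template, config)

# A missing value surfaces as a KeyError instead of a silently wrong namespace.
try:
    resolve_namespace(template, {"configurable": {}})
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```

Failing loudly on a missing identifier is the safer choice in this sketch, since writing memories to an unresolved namespace would mix data across users.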

Default Memory Values

Provide fallback values when no memories are found:

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    default="Use a concise and professional tone in all responses.",
)

Source: src/langmem/knowledge/extraction.py
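The fallback behaviour amounts to "use what was found, otherwise use the default". The snippet below is a minimal sketch of that rule with a hypothetical helper name; how the library applies `default` internally is not shown in this manual.

```python
def memories_or_default(found, default=None):
    """Return found memories, or a single fallback entry when nothing matched."""
    if found:
        return list(found)
    return [default] if default is not None else []


fallback = "Use a concise and professional tone in all responses."

empty_result = memories_or_default([], default=fallback)
hit_result = memories_or_default(["Prefers bullet points"], default=fallback)
no_default = memories_or_default([])
```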

API Reference

Core Functions

| Function | Return Type | Purpose |
| --- | --- | --- |
| create_memory_manager | MemoryManager | Extract memories from conversations |
| create_memory_searcher | Runnable | Search for relevant memories |
| create_memory_store_manager | MemoryStoreManager | Memory with persistent storage |
| create_prompt_optimizer | Runnable | Optimize single prompts |
| create_multi_prompt_optimizer | Runnable | Optimize multiple prompts |
| create_manage_memory_tool | BaseTool | Memory management tool for agents |
| create_search_memory_tool | BaseTool | Memory search tool for agents |

Source: src/langmem/knowledge/extraction.py, src/langmem/prompts/optimization.py, src/langmem/knowledge/tools.py

Installation and Setup

uv venv
source .venv/bin/activate
uv sync

Set your API key:

export OPENAI_API_KEY=your_api_key_here

Source: examples/standalone_examples/README.md

Summary

LangMem provides a comprehensive toolkit for managing both long-term and short-term memory in LLM applications:

  • Long-Term Memory: Extract, store, search, and update structured memories using memory managers and tools
  • Short-Term Memory: Summarize conversations with the SummarizationNode for efficient context management
  • Prompt Optimization: Improve prompts using gradient, metaprompt, or memory-based strategies
  • Agent Integration: Tools work seamlessly with LangGraph's prebuilt agents and store infrastructure

Source: https://github.com/langchain-ai/langmem / Human Manual

Installation and Setup

Related topics: Home - LangMem Overview, LangGraph Integration


Overview

LangMem is a Python library for memory management and prompt optimization in LLM applications. The library provides components for short-term summarization, long-term memory storage, and prompt optimization. This page covers the complete installation process, dependencies, environment configuration, and setup for both basic and advanced usage scenarios.

Prerequisites

Before installing LangMem, ensure your environment meets the following requirements:

| Requirement | Minimum Version | Notes |
| --- | --- | --- |
| Python | 3.10+ | Required for modern typing features |
| pip/uv | Latest recommended | Package manager for installation |
| API Keys | Provider-specific | OpenAI, Anthropic, or other LLM providers |

LangMem depends on the LangChain and LangGraph ecosystems. The library is designed to integrate seamlessly with LangGraph's state management and memory store abstractions. Source: src/langmem/knowledge/extraction.py

Installation Methods

Using pip

Install LangMem directly from PyPI:

pip install langmem

Using uv (Recommended)

For faster dependency resolution and better workspace management, install with uv:

uv pip install langmem

Development Installation

For contributors or those wanting the latest unreleased features:

# Clone the repository
git clone https://github.com/langchain-ai/langmem.git
cd langmem

# Create virtual environment
uv venv
source .venv/bin/activate

# Install with all dependencies
uv sync

Source: examples/standalone_examples/README.md

Core Dependencies

LangMem relies on several key packages from the Python AI ecosystem:

| Package | Purpose | Import Usage |
| --- | --- | --- |
| langchain-core | Base chat models and message types | from langchain_core.messages import AnyMessage |
| langgraph | State management and store abstractions | from langgraph.store.memory import InMemoryStore |
| pydantic | Data validation and schema definitions | class UserProfile(BaseModel) |
| typing_extensions | Enhanced typing support | from typing_extensions import Required, TypedDict |

The library uses TypedDict with Required for typed prompt definitions and a NamedTuple for trajectories. Source: src/langmem/prompts/types.py

Optional Dependencies

Depending on your use case, you may need additional packages:

# For OpenAI integration
pip install langchain-openai

# For Anthropic integration
pip install langchain-anthropic

# For vector store with embeddings
pip install langchain-openai  # includes embedding support

Environment Configuration

API Key Setup

LangMem requires API access to language model providers. Set your API keys as environment variables:

# For OpenAI
export OPENAI_API_KEY=your_api_key_here

# For Anthropic
export ANTHROPIC_API_KEY=your_api_key_here

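Models are selected with a provider-prefixed identifier, and the matching key is read from that provider's environment variable:

from langmem import create_memory_manager

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",  # provider:model, uses ANTHROPIC_API_KEY
    schemas=[PreferenceMemory],  # a Pydantic model describing the memory
)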

Source: examples/standalone_examples/README.md

Runtime Configuration

LangMem uses runtime configuration through RunnableConfig for namespace and store management:

from langgraph.config import get_config, get_store

# Configure namespace with user-specific identifiers
config = {"configurable": {"langgraph_user_id": "user123"}}

# Access the store within LangGraph context
store = get_store()

Project Structure

Understanding the module organization helps with imports and customization:

langmem/
├── knowledge/           # Long-term memory management
│   ├── extraction.py     # Memory extraction and management
│   └── tools.py          # Memory tools for agents
├── prompts/             # Prompt optimization
│   ├── types.py         # TypedDict definitions
│   ├── optimization.py  # Prompt optimization logic
│   ├── gradient.py      # Gradient-based optimization
│   └── prompt.py        # Prompt templates
└── short_term/          # Short-term memory
    └── summarization.py # Conversation summarization

Source: src/langmem/prompts/types.py

Quick Start Setup

1. Basic Memory Manager Setup

from langmem import create_memory_manager
from pydantic import BaseModel

# Define your memory schema
class PreferenceMemory(BaseModel):
    preference: str
    context: str | None = None

# Create the memory manager
manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
)

# Process a conversation
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

memories = await manager.ainvoke({"messages": conversation})

Source: src/langmem/knowledge/extraction.py

2. Memory Store with Vector Embeddings

from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore

# Create store with embedding configuration
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

# Create store manager with namespace
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    query_model="anthropic:claude-3-5-haiku-latest",
    query_limit=10,
    namespace=("memories", "{langgraph_user_id}"),
)

Source: src/langmem/knowledge/extraction.py

3. Standalone Example Setup

For use outside of LangGraph:

# custom_store_example.py
from langmem import create_memory_manager
from pydantic import BaseModel

class PreferenceMemory(BaseModel):
    category: str
    preference: str
    context: str

manager = create_memory_manager(
    "openai:gpt-4o",
    schemas=[PreferenceMemory],
)

# Process and store memories
conversation = [
    {"role": "user", "content": "User prefers dark mode in all applications."},
]
memories = await manager.ainvoke({"messages": conversation})

Source: examples/standalone_examples/README.md

Integration Setup

LangGraph Agent Integration

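from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager, create_manage_memory_tool

# Store shared by the agent and its memory tool
store = InMemoryStore()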

# Create memory manager
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
)

# Create agent with memory tool
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(
            namespace=("memories", "{langgraph_user_id}"),
            actions_permitted=["create", "update"],
        ),
    ],
    store=store,
)

Source: src/langmem/knowledge/tools.py

Prompt Optimizer Setup

from langmem import create_prompt_optimizer

# Initialize optimizer
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")

# Optimize a prompt with conversation history
# `conversation` is a list of messages; `feedback` is a string or dict describing how it went
trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)

Source: src/langmem/prompts/optimization.py

Summarization Node Setup

from langmem.short_term import SummarizationNode, RunningSummary
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o")
summarization_model = model.bind(max_tokens=128)

summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=256,
    max_tokens_before_summary=256,
    max_summary_tokens=128,
)

Source: src/langmem/short_term/summarization.py

Configuration Options

Memory Manager Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | `str \| BaseChatModel` | Required | Language model for memory processing |
| schemas | list[type] | Required | Pydantic models for memory structure |
| instructions | str | None | Custom instructions for extraction |
| enable_inserts | bool | True | Allow creating new memories |
| enable_updates | bool | True | Allow updating existing memories |
| enable_deletes | bool | True | Allow deleting memories |
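
The three `enable_*` flags can be read as gates on the kinds of change the manager may make. The sketch below applies a list of proposed operations to a dictionary of memories and skips any operation whose kind is disabled. The operation format and the function name are assumptions made for this example, not LangMem's internal representation.

```python
def apply_operations(
    memories,
    operations,
    enable_inserts=True,
    enable_updates=True,
    enable_deletes=True,
):
    """Apply permitted operations to a copy of the memories and return it."""
    result = dict(memories)
    for op in operations:
        kind, key = op["op"], op["id"]
        if kind == "insert" and enable_inserts and key not in result:
            result[key] = op["content"]
        elif kind == "update" and enable_updates and key in result:
            result[key] = op["content"]
        elif kind == "delete" and enable_deletes and key in result:
            del result[key]
        # Disabled or inapplicable operations are skipped.
    return result


memories = {"m1": "Prefers light mode"}
operations = [
    {"op": "update", "id": "m1", "content": "Prefers dark mode"},
    {"op": "insert", "id": "m2", "content": "Uses Python"},
    {"op": "delete", "id": "m1"},
]

everything = apply_operations(memories, operations)
no_deletes = apply_operations(memories, operations, enable_deletes=False)
```

Disabling deletes, for example, gives an append-and-revise memory that never loses entries, which can be useful when memories serve as an audit trail.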

Memory Store Manager Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query_model | `str \| BaseChatModel` | None | Separate model for search queries |
| query_limit | int | 10 | Number of memories to retrieve |
| default | Any | None | Default memory value if none found |
| default_factory | Callable | None | Factory for default memory creation |
| namespace | tuple[str, ...] | ("memories", "{langgraph_user_id}") | Storage namespace |

Prompt Optimizer Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| kind | Literal["gradient", "metaprompt", "prompt_memory"] | "gradient" | Optimization strategy |
| max_reflection_steps | int | 3 | Maximum reflection iterations |
| min_reflection_steps | int | 1 | Minimum reflection iterations |

Verification and Testing

After installation, verify your setup:

# Check the installed version
python -c "from importlib.metadata import version; print(version('langmem'))"

# Run standalone examples
cd examples/standalone_examples
uv run custom_store_example.py

Expected output:

Starting custom store example...
Processing conversation...
Stored memories:
Memory 31cf472f-3491-4f0c-82ec-09b4fe409cfd:
Content: {'category': 'User Preference', 'preference': 'Dark Mode', ...}
Example completed.

Source: examples/standalone_examples/README.md

Troubleshooting

Common Installation Issues

Missing dependencies:

# Reinstall with all dependencies
uv sync
# or
pip install --upgrade langmem

API key not found:

# Verify the environment variable is set without printing the key itself
import os
print("OPENAI_API_KEY" in os.environ)

LangGraph store not initialized:

# Ensure store is passed to agent
agent = create_react_agent(
    model,
    tools=[...],
    store=store,  # Must be provided
)

Import Errors

If you encounter import errors, ensure all required packages are installed:

pip install langchain-core langgraph pydantic typing-extensions

Next Steps

After completing installation and setup:

  1. Review the Memory Management guide
  2. Explore Prompt Optimization
  3. Try the Standalone Examples
  4. Integrate with your existing LangGraph application


System Architecture

Related topics: Core Concepts, Memory Tools - Hot Path Management, Prompt Optimization


LangMem is a library designed to enhance AI agents with memory capabilities and prompt optimization. The system architecture consists of three primary modules: Knowledge (long-term memory), Short-term (session summarization), and Prompts (optimization). These modules work together to enable AI agents to store, retrieve, and optimize information over time.

Overview

LangMem provides a layered architecture that separates concerns across memory management, prompt optimization, and state summarization. The library integrates with LangGraph's store infrastructure and supports both synchronous and asynchronous operations.

graph TD
    A[AI Agent] --> B[LangMem Core]
    B --> C[Prompts Module]
    B --> D[Knowledge Module]
    B --> E[Short-term Module]
    C --> F[Prompt Optimization]
    C --> G[Multi-Prompt Optimization]
    D --> H[Memory Manager]
    D --> I[Memory Tools]
    D --> J[Store Manager]
    E --> K[Summarization Node]
    J --> L[LangGraph BaseStore]
    H --> L

Core Modules

1. Prompts Module

The Prompts module handles prompt management and optimization strategies. It defines core types and provides factories for creating prompt optimizers.

#### Key Components

| Component | File | Purpose |
| --- | --- | --- |
| Prompt | types.py | TypedDict for structured prompt management |
| AnnotatedTrajectory | types.py | NamedTuple for conversation history with feedback |
| OptimizerInput | types.py | Input schema for single prompt optimization |
| MultiPromptOptimizerInput | types.py | Input schema for multi-prompt optimization |
| INSTRUCTION_REFLECTION_PROMPT | prompt.py | Template for prompt reflection |
| create_prompt_optimizer | optimization.py | Factory for single prompt optimizer |
| create_multi_prompt_optimizer | optimization.py | Factory for multi-prompt optimizer |

Source: src/langmem/prompts/types.py:1-94, src/langmem/prompts/optimization.py

#### Data Flow: Prompt Optimization

sequenceDiagram
    participant U as User
    participant O as Prompt Optimizer
    participant M as Memory
    participant P as Prompt Store

    U->>O: Trajectories + Current Prompt
    O->>M: Extract Patterns
    M-->>O: Success Patterns
    O->>P: Apply Optimization
    P-->>O: Optimized Prompt
    O-->>U: Updated Prompt

2. Knowledge Module

The Knowledge module implements long-term memory management using LangGraph's BaseStore. It supports extraction, storage, search, and manipulation of memories.

#### Key Components

| Component | File | Purpose |
| --- | --- | --- |
| create_memory_manager | extraction.py | Creates a memory manager for extraction and synthesis |
| create_memory_searcher | extraction.py | Creates a search pipeline with automatic query generation |
| create_memory_store_manager | extraction.py | Creates a store-based memory manager |
| create_manage_memory_tool | tools.py | Creates a LangGraph tool for memory CRUD operations |
| create_search_memory_tool | tools.py | Creates a search tool for memory retrieval |

Source: src/langmem/knowledge/extraction.py, src/langmem/knowledge/tools.py

#### Memory Manager Architecture

graph TD
    A[Input: Messages + Existing Memories] --> B[Memory Manager]
    B --> C[Extract Tool Calls]
    C --> D{Done?}
    D -->|No| E[Invoke Extractor]
    E --> F[Process Responses]
    F --> G{More Steps?}
    G -->|Yes| D
    G -->|No| H[Update Memories]
    D -->|Yes| H
    H --> I[Return Updated Memories]
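
One way to read the diagram is as a loop that keeps invoking the extractor until it reports completion or a step budget runs out. The sketch below follows that reading with a hypothetical stand-in extractor; it is a simplified model of the control flow, not the code in extraction.py.

```python
def run_extraction(messages, memories, extractor, max_steps=3):
    """Invoke the extractor repeatedly, applying its tool calls each round."""
    memories = dict(memories)
    for step in range(max_steps):
        tool_calls, done = extractor(messages, memories, step)
        for call in tool_calls:
            memories[call["id"]] = call["content"]
        if done:
            break
    return memories


def one_fact_per_step(messages, memories, step):
    """Stand-in extractor: record one user statement per step."""
    facts = [m["content"] for m in messages if m["role"] == "user"]
    if step >= len(facts):
        return [], True
    call = {"id": f"memory-{step + 1}", "content": facts[step]}
    return [call], step == len(facts) - 1


messages = [
    {"role": "user", "content": "I prefer dark mode"},
    {"role": "assistant", "content": "Noted"},
    {"role": "user", "content": "I write mostly Python"},
]

full_run = run_extraction(messages, {}, one_fact_per_step)
one_step = run_extraction(messages, {}, one_fact_per_step, max_steps=1)
```

The step budget plays the same role as the `max_steps` value that can be passed when invoking a manager: it bounds how many rounds of extraction a single call may take.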

#### Factory Functions

| Function | Return Type | Description |
| --- | --- | --- |
| create_memory_manager | MemoryManager | Core extraction and synthesis with configurable schemas |
| create_memory_searcher | Runnable[MessagesState, Awaitable[list[SearchItem]]] | Search pipeline with query generation |
| create_memory_store_manager | MemoryStoreManager | Direct store operations with search |
| create_manage_memory_tool | Tool | LangGraph tool for CRUD operations |
| create_search_memory_tool | Tool | LangGraph tool for memory search |

Source: src/langmem/knowledge/extraction.py

3. Short-term Module

The Short-term module provides session-level summarization to compress conversation history into maintainable state.

#### Key Components

| Component | File | Purpose |
| --- | --- | --- |
| SummarizationNode | summarization.py | LangGraph node for message summarization |
| RunningSummary | summarization.py | State container for running summaries |

Source: src/langmem/short_term/summarization.py

Type System

Prompt TypedDict

class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | Unique identifier for the prompt |
| prompt | str | Yes | The actual prompt content |
| update_instructions | `str \| None` | No | Guidelines for modifying the prompt |
| when_to_update | `str \| None` | No | Dependencies between prompts during optimization |

Source: src/langmem/prompts/types.py:9-38

AnnotatedTrajectory

class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage] | str
    feedback: str | None = None

| Field | Type | Description |
| --- | --- | --- |
| messages | `Sequence[AnyMessage] \| str` | Conversation history |
| feedback | `str \| None` | Optional feedback for optimization |

Source: src/langmem/prompts/types.py:40-65

Memory Management Workflow

Extraction Pipeline

The memory manager uses a multi-step extraction process that iteratively invokes an extractor tool until completion:

sequenceDiagram
    participant C as Client
    participant M as Memory Manager
    participant E as Extractor
    participant S as Store

    C->>M: Messages + Existing Memories
    M->>E: Invoke with tools
    E-->>M: Response with tool calls
    M->>M: Process results
    M->>S: Apply changes
    M-->>C: Updated memories

Source: src/langmem/knowledge/extraction.py

Search Pipeline

The searcher generates optimized queries and retrieves semantically similar memories:

sequenceDiagram
    participant C as Client
    participant S as Searcher
    participant Q as Query LLM
    participant T as Store

    C->>S: Query context
    S->>Q: Generate search query
    Q-->>S: Optimized query
    S->>T: Search memories
    T-->>S: Results
    S-->>C: Ranked memories

Tool Integration

LangMem provides LangGraph-native tools that connect to the BaseStore:

Manage Memory Tool

create_manage_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
    schema=PreferenceMemory,
    actions_permitted=["create", "update", "delete"],
    instructions="Update user preferences based on shared information."
)

Source: src/langmem/knowledge/tools.py

Search Memory Tool

create_search_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
)

Namespace Configuration

Memories are organized using hierarchical namespaces that support runtime configuration:

| Pattern | Description |
| --- | --- |
| ("memories", "{langgraph_user_id}") | User-specific memories |
| ("memories", "{langgraph_user_id}", "user_profile") | User profile memories |
| ("project_memories", "{langgraph_user_id}") | Project-scoped memories |

Source: src/langmem/knowledge/extraction.py

Entry Points

The library exposes a clean public API through __init__.py files in each module:

| Module | Exports |
| --- | --- |
| langmem | Memory creation, extraction, and optimization |
| langmem.prompts | Prompt types and optimization |
| langmem.knowledge | Memory managers and tools |
| langmem.short_term | Summarization components |

Source: src/langmem/__init__.py, src/langmem/prompts/__init__.py, src/langmem/knowledge/__init__.py, src/langmem/short_term/__init__.py

Integration with LangGraph

LangMem is designed to work seamlessly with LangGraph through:

  1. BaseStore Integration: Memory operations use LangGraph's BaseStore interface
  2. Tool Protocol: All tools follow LangGraph's tool conventions
  3. Runnable Interface: Managers implement Runnable for composable pipelines
  4. Checkpoint Compatibility: Summarization nodes integrate with LangGraph's state management

from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

@entrypoint(store=store)
async def my_agent(message: str):
    manager = create_memory_store_manager("anthropic:claude-3-5-sonnet-latest")
    response = {"role": "assistant", "content": "I'll remember that"}
    await manager.ainvoke({"messages": [{"role": "user", "content": message}, response]})
    return response

Source: src/langmem/knowledge/extraction.py


Core Concepts

Related topics: System Architecture, Memory Tools - Hot Path Management, Background Memory Manager


LangMem is a library for memory management and prompt optimization in LLM applications. This page explains the foundational concepts that underpin the library's architecture, including type systems, memory management strategies, and prompt optimization approaches.

Overview

LangMem provides two primary capabilities:

  1. Memory Management - Storing, retrieving, and managing conversation context and user preferences
  2. Prompt Optimization - Improving LLM prompts based on conversation trajectories and feedback

The library is designed to integrate with LangGraph while also supporting standalone usage in custom applications. Source: src/langmem/knowledge/extraction.py:1-50

Type System

LangMem defines a robust type system for managing prompts and conversation data. These types serve as the foundation for all optimization and memory operations.

The Prompt Type

The Prompt TypedDict represents a structured prompt with metadata for optimization control.

class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | Unique identifier for the prompt |
| prompt | str | Yes | The actual prompt content |
| update_instructions | `str \| None` | No | Guidelines for modifying the prompt |
| when_to_update | `str \| None` | No | Dependencies or triggers for updates |

Source: src/langmem/prompts/types.py:10-40

Example usage:

from langmem import Prompt

prompt = Prompt(
    name="extract_entities",
    prompt="Extract key entities from the text:",
    update_instructions="Make minimal changes, only address where errors have occurred.",
    when_to_update="If there seem to be errors in recall of named entities.",
)

AnnotatedTrajectory

The AnnotatedTrajectory NamedTuple captures conversation history with optional feedback for optimization.

class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, str | int | bool] | str | None = None

| Field | Type | Description |
|---|---|---|
| messages | `Sequence[AnyMessage]` | List of conversation messages |
| feedback | `dict \| str \| None` | Optional feedback for analysis |

Source: src/langmem/prompts/types.py:56-70
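
The three feedback shapes (a dict of scores, a free-text string, or nothing) can be illustrated with a standard-library stand-in. `TrajectorySketch` is a hypothetical mirror of AnnotatedTrajectory, and plain dicts are assumed here in place of LangChain message objects.

```python
from typing import NamedTuple, Sequence, Union

Feedback = Union[dict, str, None]


class TrajectorySketch(NamedTuple):
    # Hypothetical stand-in: messages are plain dicts rather than AnyMessage.
    messages: Sequence[dict]
    feedback: Feedback = None


conversation = [
    {"role": "user", "content": "Tell me about Mars"},
    {"role": "assistant", "content": "Mars is the fourth planet from the Sun."},
]

scored = TrajectorySketch(conversation, {"score": 1, "helpful": True})
commented = TrajectorySketch(conversation, "Too brief; add more detail.")
unlabeled = TrajectorySketch(conversation)

# A NamedTuple is still a tuple, so it unpacks and compares like one.
messages, feedback = commented
```

Because a NamedTuple compares equal to an ordinary tuple, a bare `(conversation, feedback)` pair has the same shape as a trajectory.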

OptimizerInput Types

LangMem provides two input types for prompt optimization:

#### Single Prompt Optimization

class OptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompt: str | Prompt

#### Multi-Prompt Optimization

class MultiPromptOptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompts: list[Prompt]

Source: src/langmem/prompts/types.py:73-120

Memory Management

LangMem's memory management components extract memories from conversations, persist them in a store, and retrieve them when composing later responses.

Architecture Overview

graph TD
    A[Conversation Messages] --> B[Memory Manager]
    B --> C[Memory Store]
    D[User Query] --> E[Memory Searcher]
    E --> F[Retrieved Memories]
    C --> F
    F --> G[LLM Response]

MemoryManager

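The MemoryManager class extracts and updates memories without a persistent store: the caller supplies the conversation (and any existing memories) and receives the resulting memories back.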

Creation via factory function:

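from langmem import create_memory_manager
from pydantic import BaseModel


class PreferenceMemory(BaseModel):
    """Illustrative schema; adapt the fields to your application."""

    category: str
    preference: str


manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    enable_inserts=True,
    enable_updates=True,
    enable_deletes=True,
)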

Source: src/langmem/knowledge/extraction.py:200-250

Supported Operations:

| Operation | Description |
|---|---|
| `ainvoke` | Asynchronously process messages and update memories |
| `ainvoke({"messages": conversation, "max_steps": 3})` | Set max reflection steps for extraction |

MemoryStoreManager

The MemoryStoreManager extends memory capabilities with persistent storage integration using LangGraph's BaseStore.

Creation:

from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    query_model="anthropic:claude-3-5-haiku-latest",
    query_limit=10,
    namespace=("memories", "{langgraph_user_id}"),
    store=store,
)

Source: src/langmem/knowledge/extraction.py:50-100

Namespace Configuration:

Namespaces use runtime configuration with placeholders:

| Format | Description |
|---|---|
| `("memories", "{langgraph_user_id}")` | User-specific memories |

Memory Search Pipeline

The create_memory_searcher function creates a search pipeline with automatic query generation.

from langmem import create_memory_searcher

searcher = create_memory_searcher(
    "anthropic:claude-3-5-sonnet-latest",
    prompt="Search for distinct memories relevant to different aspects of the provided context.",
    namespace=("memories", "{langgraph_user_id}"),
)

Source: src/langmem/knowledge/extraction.py:280-320

Memory Search Flow

sequenceDiagram
    participant Client
    participant Manager
    participant QueryLLM
    participant Store
    participant MainLLM

    Client->>Manager: messages
    Manager->>QueryLLM: generate search query
    QueryLLM-->>Manager: optimized query
    Manager->>Store: find memories
    Store-->>Manager: memories
    Manager->>MainLLM: analyze & extract
    MainLLM-->>Manager: memory updates
    Manager->>Store: apply changes
    Manager-->>Client: result
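
The sequence above can be sketched as a single function with the two LLM calls and the store replaced by plain callables. This is a simplified illustration under stated assumptions, not LangMem's implementation: `run_memory_cycle` and its parameters are hypothetical, memories are plain strings, and only insert and delete actions are modeled.

```python
from typing import Callable, List, Set, Tuple


def run_memory_cycle(
    messages: List[dict],
    generate_query: Callable[[List[dict]], str],
    search: Callable[[str], List[str]],
    analyze: Callable[[List[dict], List[str]], List[Tuple[str, str]]],
    store: Set[str],
) -> Set[str]:
    """Run one query -> search -> analyze -> apply pass over store."""
    query = generate_query(messages)       # QueryLLM: build a search query
    existing = search(query)               # Store: find related memories
    updates = analyze(messages, existing)  # MainLLM: decide on changes
    for action, memory in updates:         # Store: apply the changes
        if action == "insert":
            store.add(memory)
        elif action == "delete":
            store.discard(memory)
    return store


# Stubbed run: the "models" are lambdas so the control flow is visible.
store = {"User prefers light mode"}
result = run_memory_cycle(
    [{"role": "user", "content": "I prefer dark mode"}],
    generate_query=lambda msgs: msgs[-1]["content"],
    search=lambda query: [m for m in store if "mode" in m],
    analyze=lambda msgs, found: [("delete", m) for m in found]
    + [("insert", "User prefers dark mode")],
    store=store,
)
```

Passing the models in as callables keeps the query step and the analysis step independently replaceable, which is the property the separate query model relies on.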

Memory Tools for LangGraph

LangMem provides pre-built tools for integration with LangGraph's create_react_agent.

#### Manage Memory Tool

from langmem import create_manage_memory_tool

tool = create_manage_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
    schema=PreferenceMemory,
    actions_permitted=["create", "update", "delete"],
)

Source: src/langmem/knowledge/tools.py:50-100

#### Search Memory Tool

from langmem import create_search_memory_tool

search_tool = create_search_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)

Source: src/langmem/knowledge/tools.py:200-250

Memory Layer

The MemoryLayer class provides a declarative API for composing memory capabilities in prompts.

class MemoryLayer(Runnable):
    __slots__ = (
        "name",
        "namespace",
        "kind",
        "update_instructions",
        "schemas",
        "limit",
        "_manager_tool",
        "_search_tool",
    )

Source: src/langmem/prompts/_layers.py:20-35

Prompt Optimization

LangMem provides multiple strategies for optimizing prompts based on conversation history and feedback.

Optimization Strategies

| Strategy | Description |
|---|---|
| metaprompt | Uses reflection-based optimization with configurable steps |
| prompt_memory | Learns from past successful patterns |
| instruction_reflection | Directly modifies prompts based on instructions |

Single Prompt Optimizer

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)

# Usage
trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)

Source: src/langmem/prompts/optimization.py:80-120

Multi-Prompt Optimizer

For optimizing multiple related prompts together:

from langmem import create_multi_prompt_optimizer

optimizer = create_multi_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="prompt_memory",
)

prompts = [
    {"name": "explain", "prompt": "Explain the concept"},
    {"name": "example", "prompt": "Provide a practical example"},
]

better_prompts = await optimizer(trajectories, prompts)

Source: src/langmem/prompts/optimization.py:150-200

Meta-Prompt Optimization Flow

graph TD
    A[Current Prompt + Trajectory] --> B[Reflection Steps]
    B --> C{More iterations?}
    C -->|Yes| D[Apply Instructions]
    D --> B
    C -->|No| E[Final Prompt]
    
    F[Max Steps Config] --> B
    G[Min Steps Config] --> B
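
The loop in the diagram can be sketched as follows. This is a hedged illustration of how minimum and maximum step bounds might interact, not the library's code: `reflect` and the `propose` callback are hypothetical, and `propose` stands in for an LLM reflection call.

```python
from typing import Callable, Tuple


def reflect(
    prompt: str,
    propose: Callable[[str, int], Tuple[str, bool]],
    min_steps: int = 1,
    max_steps: int = 3,
) -> Tuple[str, int]:
    """Refine prompt; return the final prompt and the number of steps run.

    propose(prompt, step) returns (candidate_prompt, is_satisfied).
    Assumes min_steps <= max_steps.
    """
    steps = 0
    while steps < max_steps:
        prompt, satisfied = propose(prompt, steps)
        steps += 1
        # Stop early only once the minimum number of steps has run.
        if satisfied and steps >= min_steps:
            break
    return prompt, steps


# A stub that is always satisfied still runs until min_steps is reached.
final, used = reflect(
    "You are an astronomy expert.",
    lambda p, i: (p + " Be concise.", True),
    min_steps=2,
)

# A stub that is never satisfied stops at max_steps.
_, capped = reflect("x", lambda p, i: (p, False), max_steps=3)
```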

Instruction Reflection Prompt

The instruction reflection mechanism uses structured prompts:

INSTRUCTION_REFLECTION_PROMPT = """You are helping an AI agent improve. You can do this by changing their system prompt.

These is their current prompt:
<current_prompt>
{current_prompt}
</current_prompt>

Here was the agent's trajectory:
<trajectory>
{trajectory}
</trajectory>

Here is the user's feedback:

<feedback>
{feedback}
</feedback>

Here are instructions for updating the agent's prompt:

<instructions>
{instructions}
</instructions>


Based on this, return an updated prompt"""

Source: src/langmem/prompts/prompt.py:1-30

Response Schema

The optimization returns a structured response:

class GeneralResponse(TypedDict):
    logic: str
    update_prompt: bool
    new_prompt: str

Source: src/langmem/prompts/prompt.py:35-40
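
One plausible way to consume such a response is to keep the current prompt unless the model explicitly asks for an update. The sketch below assumes that behavior; `GeneralResponseSketch` and `apply_response` are hypothetical names, and how LangMem itself applies the response is not shown in this section.

```python
from typing import TypedDict


class GeneralResponseSketch(TypedDict):
    logic: str           # the model's reasoning
    update_prompt: bool  # whether a change is warranted
    new_prompt: str      # the proposed replacement


def apply_response(current_prompt: str, response: GeneralResponseSketch) -> str:
    """Keep the current prompt unless the response asks for a non-empty update."""
    if response["update_prompt"] and response["new_prompt"].strip():
        return response["new_prompt"]
    return current_prompt


keep = apply_response(
    "You are an astronomy expert",
    {"logic": "Feedback was positive.", "update_prompt": False, "new_prompt": ""},
)
change = apply_response(
    "You are an astronomy expert",
    {
        "logic": "User asked for simpler language.",
        "update_prompt": True,
        "new_prompt": "You are an astronomy expert. Explain in plain language.",
    },
)
```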

Integration Patterns

Standalone Usage

LangMem can be used independently of LangGraph:

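from langmem import create_memory_manager, create_prompt_optimizer

# Sample inputs (illustrative values)
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]
trajectories = [(conversation, "Responses should be more concise")]
base = "You are a helpful assistant"

# Memory management
manager = create_memory_manager("anthropic:claude-3-5-sonnet-latest")
memories = await manager.ainvoke({"messages": conversation})

# Prompt optimization
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")
improved = await optimizer.ainvoke({"trajectories": trajectories, "prompt": base})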

Source: examples/standalone_examples/README.md:1-50

LangGraph Integration

LangMem integrates with LangGraph's agent and store infrastructure:

from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langmem import create_manage_memory_tool

store = InMemoryStore()

agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}"))],
    store=store,
)

Source: src/langmem/knowledge/tools.py:60-80

Configuration Options

#### Memory Manager Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Required | Language model for memory processing |
| schemas | `list[type[BaseModel]]` | None | Pydantic schemas for memory validation |
| instructions | `str` | None | Custom instructions for the manager |
| enable_inserts | `bool` | True | Allow creating new memories |
| enable_updates | `bool` | True | Allow updating existing memories |
| enable_deletes | `bool` | True | Allow deleting memories |
| query_model | `str \| BaseChatModel` | None | Separate model for search queries |
| query_limit | `int` | 10 | Maximum memories to retrieve |

Source: src/langmem/knowledge/extraction.py:50-120

#### Prompt Optimizer Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| kind | `Literal["metaprompt", "prompt_memory", "instruction_reflection"]` | Required | Optimization strategy |
| config | `dict` | None | Strategy-specific configuration |
| max_reflection_steps | `int` | 3 | Maximum reflection iterations |
| min_reflection_steps | `int` | 1 | Minimum reflection iterations |

Source: src/langmem/prompts/optimization.py:80-120

Summary

LangMem's core concepts provide a comprehensive framework for:

  1. Structured Prompt Management - Using TypedDict types for prompts with metadata for optimization control
  2. Memory Storage and Retrieval - Persisting conversation context with namespace-based organization
  3. Automatic Memory Extraction - Using LLMs to extract and synthesize memories from conversations
  4. Multi-Strategy Prompt Optimization - Improving prompts through reflection, memory patterns, or instruction following

These concepts work together to enable intelligent, self-improving LLM applications that maintain context and continuously refine their behavior.

Source: https://github.com/langchain-ai/langmem / Human Manual

Memory Tools - Hot Path Management

Related topics: Core Concepts, Background Memory Manager, LangGraph Integration

Overview

Memory Tools in LangMem provide real-time, interactive capabilities for managing persistent memories during conversation execution. Unlike the background extraction pipeline (which processes conversation history asynchronously), Hot Path Management enables direct manipulation and retrieval of memories within the active conversation flow.

The hot path refers to the synchronous execution path where memories are created, updated, deleted, or searched in real-time as part of agent/tool interactions. This approach allows AI assistants to:

  • Persist newly discovered user preferences immediately
  • Update outdated memories when corrections occur
  • Delete irrelevant or incorrect memories
  • Search and retrieve relevant context on-demand

Source: src/langmem/knowledge/tools.py
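
The four hot-path operations can be illustrated with a toy namespaced store. `TinyStore` is a hypothetical stand-in written for this example: it is not LangGraph's BaseStore, and it uses substring matching where a real store would typically use semantic (vector) search.

```python
import uuid
from typing import Dict, List, Optional, Tuple

Namespace = Tuple[str, ...]


class TinyStore:
    """Toy namespaced key-value store (illustrative only)."""

    def __init__(self) -> None:
        self._data: Dict[Namespace, Dict[str, str]] = {}

    def put(self, namespace: Namespace, content: str, key: Optional[str] = None) -> str:
        """Create a memory, or overwrite it when an existing key is given."""
        key = key or str(uuid.uuid4())
        self._data.setdefault(namespace, {})[key] = content
        return key

    def delete(self, namespace: Namespace, key: str) -> None:
        """Remove a memory; missing keys are ignored."""
        self._data.get(namespace, {}).pop(key, None)

    def search(self, namespace: Namespace, query: str) -> List[str]:
        """Case-insensitive substring match within one namespace."""
        q = query.lower()
        return [c for c in self._data.get(namespace, {}).values() if q in c.lower()]


store = TinyStore()
ns = ("memories", "user123")
key = store.put(ns, "Prefers light mode")    # persist a new preference
store.put(ns, "Prefers dark mode", key=key)  # correct it in place
found = store.search(ns, "dark")             # retrieve on demand
other_user = store.search(("memories", "user456"), "dark")
store.delete(ns, key)                        # remove it again
after_delete = store.search(ns, "dark")
```

Keying every operation by namespace is what keeps one user's memories out of another user's search results.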

Architecture

The Memory Tools system consists of three primary components that operate on the hot path:

graph TD
    A[Agent / Workflow] --> B[Manage Memory Tool]
    A --> C[Search Memory Tool]
    A --> D[Memory Searcher]
    B --> E[LangGraph BaseStore]
    C --> E
    D --> E
    E --> F["Namespace: memories, {user_id}"]
    E --> G["Namespace: project_memories, {user_id}"]

Component Responsibilities

| Component | Purpose | Sync/Async |
|---|---|---|
| create_manage_memory_tool | CRUD operations for memories | Both |
| create_search_memory_tool | Query-based memory retrieval | Both |
| create_memory_searcher | LLM-powered query generation + search | Async |

Source: src/langmem/knowledge/tools.py and src/langmem/knowledge/extraction.py

Background Memory Manager

Related topics: Core Concepts, Memory Tools - Hot Path Management, Short-term Memory and Summarization

The Background Memory Manager is a core component of langmem that enables asynchronous, non-blocking memory extraction and storage within agent workflows. It allows memory operations to be executed in background threads or on separate servers, ensuring that the primary agent response latency is not impacted by memory processing overhead.

Overview

The Background Memory Manager is primarily implemented through the ReflectionExecutor class, which orchestrates memory enrichment operations asynchronously. This approach decouples the memory management from the main agent thread, enabling production-ready applications where memory persistence happens transparently after user interactions are acknowledged. Source: src/langmem/knowledge/extraction.py

The system is designed to work seamlessly with LangGraph's BaseStore abstraction, allowing memories to be stored, retrieved, updated, and deleted through semantic search capabilities. The background execution model ensures that even complex memory extraction and synthesis operations do not block the agent's response to users.

Architecture

High-Level Architecture

The Background Memory Manager follows a producer-consumer pattern where the main agent produces memory enrichment tasks and the ReflectionExecutor consumes them asynchronously.

graph TD
    A[User Message] --> B[Agent Processing]
    B --> C[User Response]
    C -->|Schedule Enrichment| D[ReflectionExecutor]
    D -->|after_seconds=0| E[Background Thread]
    E --> F[MemoryStoreManager]
    F --> G[BaseStore]
    F --> H[Query LLM]
    G --> I[Vector Search]
    I --> F
    H -->|Generate Query| F
    F --> J[Memory Updates]
    J --> G
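
The producer-consumer hand-off in the diagram can be sketched with the standard library. This is an illustration of the pattern only: `submit_later` and `extract` are hypothetical, and the real executor's scheduling logic is not reproduced here.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# One worker thread plays the role of the background consumer.
_pool = ThreadPoolExecutor(max_workers=1)


def submit_later(task: Callable[[Any], Any], payload: Any, after_seconds: float = 0.0) -> Future:
    """Queue task(payload) to run in the background after an optional delay."""

    def _run() -> Any:
        time.sleep(after_seconds)
        return task(payload)

    return _pool.submit(_run)


def extract(payload: dict) -> int:
    """Stand-in for memory enrichment: report how many messages were seen."""
    return len(payload["messages"])


# The producer (the agent) returns immediately; the future resolves later.
future = submit_later(
    extract,
    {"messages": [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]},
    after_seconds=0.05,
)
processed = future.result(timeout=5)
```

Returning a future lets the caller ignore the result on the hot path while still being able to wait on it in tests or during shutdown.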

Component Interactions

The system consists of several interconnected components that work together to provide background memory management:

| Component | Role | Source |
|---|---|---|
| ReflectionExecutor | Schedules and executes memory operations in background | extraction.py |
| MemoryStoreManager | Orchestrates memory CRUD operations with search | extraction.py |
| MemoryManager | Core extraction logic with multi-step synthesis | extraction.py |
| BaseStore | LangGraph's storage abstraction for memories | tools.py |

Core Components

ReflectionExecutor

The ReflectionExecutor class is the primary mechanism for background memory processing. It wraps a memory manager and provides async execution capabilities.

reflection = ReflectionExecutor(manager, store=store)

Source: src/langmem/knowledge/extraction.py

#### Key Responsibilities

  • Decoupling memory operations from the main agent thread
  • Managing store configuration for memory persistence
  • Providing async invoke capabilities for memory enrichment
  • Supporting background scheduling with configurable delays

MemoryStoreManager

The MemoryStoreManager extends MemoryManager with additional search capabilities and tighter integration with LangGraph's BaseStore. Source: src/langmem/knowledge/extraction.py

#### Factory Function

The primary entry point for creating a memory store manager is create_memory_store_manager:

def create_memory_store_manager(
    model: str | BaseChatModel,
    schemas: list[type[BaseModel]] | None = None,
    default: str | BaseModel | None = None,
    default_factory: typing.Callable[[], BaseModel] | None = None,
    instructions: str | None = None,
    enable_inserts: bool = True,
    enable_deletes: bool = True,
    query_model: str | BaseChatModel | None = None,
    query_limit: int = 3,
    namespace: tuple[str, ...] = ("memories", "{langgraph_user_id}"),
    store: BaseStore | None = None,
    phases: tuple[str, ...] | None = None,
) -> MemoryStoreManager

Source: src/langmem/knowledge/extraction.py

MemoryManager

The foundational class that handles multi-step memory extraction and synthesis. It supports configurable extraction phases and tool-based memory operations.

return MemoryManager(
    model,
    schemas=schemas,
    instructions=instructions,
    enable_inserts=enable_inserts,
    enable_updates=enable_updates,
    enable_deletes=enable_deletes,
)

Source: src/langmem/knowledge/extraction.py

Configuration Options

MemoryStoreManager Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Required | Main language model for memory processing |
| schemas | `list[type[BaseModel]] \| None` | None | Pydantic models defining memory structure |
| default | `str \| BaseModel \| None` | None | Default memory when none found |
| default_factory | `Callable[[], BaseModel]` | None | Factory function for default memory |
| instructions | `str \| None` | None | Custom instructions for memory management |
| enable_inserts | `bool` | True | Allow creating new memories |
| enable_deletes | `bool` | True | Allow deleting memories |
| query_model | `str \| BaseChatModel \| None` | None | Separate model for search query generation |
| query_limit | `int` | 3 | Maximum memories to retrieve per search |
| namespace | `tuple[str, ...]` | `("memories", "{langgraph_user_id}")` | Storage namespace structure |
| store | `BaseStore \| None` | None | LangGraph BaseStore instance |
| phases | `tuple[str, ...] \| None` | None | Custom extraction phases |

Source: src/langmem/knowledge/extraction.py

Namespace Configuration

Memory namespaces use runtime configuration with placeholders for dynamic values:

namespace=("memories", "{langgraph_user_id}")

The {langgraph_user_id} placeholder is populated from the LangGraph config at runtime. This enables per-user memory isolation while using a single store. Source: src/langmem/knowledge/extraction.py
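
As a rough illustration of placeholder substitution, the sketch below fills each namespace segment from the `configurable` section of a config dict. `resolve_namespace` is a hypothetical helper written for this example; LangMem's own resolution logic may differ in its details and error handling.

```python
from typing import Tuple


def resolve_namespace(template: Tuple[str, ...], config: dict) -> Tuple[str, ...]:
    """Fill {placeholders} in each segment from config['configurable'].

    Raises KeyError when a placeholder has no matching configurable value.
    """
    values = config.get("configurable", {})
    return tuple(segment.format(**values) for segment in template)


template = ("memories", "{langgraph_user_id}")
alice = resolve_namespace(template, {"configurable": {"langgraph_user_id": "alice"}})
bob = resolve_namespace(template, {"configurable": {"langgraph_user_id": "bob"}})
```

The same template yields a different concrete namespace per request, which is how a single store can hold isolated memories for many users.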

Workflows

Standard Background Enrichment Workflow

sequenceDiagram
    participant Agent
    participant Background
    participant Store
    participant QueryLLM
    participant MainLLM

    Agent->>Agent: Process user message
    Agent-->>User: Send response
    Agent->>Background: Schedule enrichment (after_seconds=0)
    Note over Background: Memory processing happens<br/>in background thread
    Background->>QueryLLM: Generate search query
    QueryLLM-->>Background: Optimized query
    Background->>Store: Find relevant memories
    Store-->>Background: Existing memories
    Background->>MainLLM: Analyze conversation + memories
    MainLLM-->>Background: Memory updates (insert/update/delete)
    Background->>Store: Apply changes

Multi-Step Extraction Workflow

The memory manager supports multi-step extraction and synthesis for complex memory scenarios:

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    query_model="anthropic:claude-3-5-haiku-latest",
    query_limit=10,
)

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

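# Invoke the manager directly; max_steps caps the extraction/synthesis passes,
# and the user ID fills the {langgraph_user_id} namespace placeholder.
config = {"configurable": {"langgraph_user_id": "user123"}}
await manager.ainvoke(
    {"messages": conversation, "max_steps": 3},
    config=config,
)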

Source: src/langmem/knowledge/extraction.py

Query Model Architecture

When a separate query model is configured, the system uses a two-model approach for efficient memory retrieval:

graph LR
    A[Messages] --> B[QueryModel]
    B --> C[Search Query]
    C --> D[Vector Store]
    D --> E[Retrieved Memories]
    E --> F[MainModel]
    F --> G[Memory Analysis]
    G --> H[Store Updates]

This architecture separates the lightweight query generation from the heavy analysis workload, optimizing cost and latency.

Integration Patterns

With LangGraph create_react_agent

from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{user_id}"),
)

reflection = ReflectionExecutor(manager, store=store)

agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[...],
    store=store,
)

Source: src/langmem/knowledge/extraction.py

With @entrypoint Decorator

from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
)

reflection = ReflectionExecutor(manager, store=store)

@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that"}
    # Schedule enrichment in the background instead of awaiting it inline.
    reflection.submit(
        {"messages": [{"role": "user", "content": message}, response]},
        after_seconds=0,
    )
    return response

Source: src/langmem/knowledge/extraction.py

Memory Tools

The background memory system also provides complementary tools for explicit memory management within agent conversations.

create_manage_memory_tool

Creates a tool for direct memory CRUD operations by agents:

def create_manage_memory_tool(
    namespace: tuple[str, ...] | str,
    *,
    instructions: str = "Proactively call this tool when you:\n\n"
    "1. Identify a new USER preference.\n"
    "2. Receive an explicit USER request to remember something.\n",
    schema: typing.Type = str,
    actions_permitted: tuple[Literal["create", "update", "delete"], ...] = ("create", "update", "delete"),
    store: BaseStore | None = None,
    name: str = "manage_memory",
)

Source: src/langmem/knowledge/tools.py

create_search_memory_tool

Creates a tool for semantic memory search within conversations:

def create_search_memory_tool(
    namespace: tuple[str, ...] | str,
    *,
    instructions: str = _MEMORY_SEARCH_INSTRUCTIONS,
    store: BaseStore | None = None,
    response_format: Literal["content", "content_and_artifact"] = "content",
    name: str = "search_memory",
)

Source: src/langmem/knowledge/tools.py

Data Models

Prompt Structure

Memory management uses structured prompts defined in src/langmem/prompts/types.py:

class Prompt(TypedDict, total=False):
    """TypedDict for structured prompt management and optimization."""
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None

Source: src/langmem/prompts/types.py

Annotated Trajectory

Conversation histories with feedback for prompt optimization:

class AnnotatedTrajectory(typing.NamedTuple):
    """Conversation history with optional feedback for prompt optimization."""
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, str | int | bool] | str | None = None

Source: src/langmem/prompts/types.py

Best Practices

  1. Use Separate Query Models: Configure query_model with a faster, cheaper model (e.g., Haiku) to reduce costs while keeping the main model for quality analysis.
  2. Namespace Isolation: Always use user-specific namespaces like ("memories", "{langgraph_user_id}") to ensure proper data isolation.
  3. Background Scheduling: Schedule enrichment with after_seconds=0 in production to maintain responsive user interactions.
  4. Schema Definition: Define explicit Pydantic schemas for memories to ensure consistent structure and enable better extraction quality.
  5. Default Values: Provide default or default_factory values for critical memory types to ensure graceful handling when no memories are found.

Example: Standalone Usage

The Background Memory Manager can be used outside a LangGraph graph by passing a store explicitly:

from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager
from pydantic import BaseModel


class PreferenceMemory(BaseModel):
    """Illustrative schema; adapt the fields to your application."""

    category: str
    preference: str


# Configure store
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

# Create manager
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    namespace=("memories", "user123"),
    store=store,
)

# Process conversation
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

await manager.ainvoke({"messages": conversation})

# Search for memories
results = manager.search(query="app preferences")
print(results)

Source: examples/standalone_examples/README.md

Summary

The Background Memory Manager provides a robust framework for asynchronous memory operations in agent applications. Key features include:

  • Decoupled Processing: Memory operations execute in background threads without blocking agent responses
  • Flexible Storage: Integration with LangGraph's BaseStore for vector-based memory retrieval
  • Multi-Step Extraction: Configurable phases for complex memory synthesis
  • Semantic Search: Automatic query generation with optional dedicated query models
  • Tool Integration: Complementary tools for explicit memory management within conversations

The system is production-ready and scales from simple single-user applications to complex multi-tenant deployments through its namespace-based isolation and background execution model.

Short-term Memory and Summarization

Related topics: Background Memory Manager, Memory Tools - Hot Path Management

Overview

The short-term memory module in LangMem provides utilities for managing conversation history through summarization. As conversations grow longer, they can exceed the LLM's context window. This module compresses message histories by generating summaries while preserving critical information. Source: src/langmem/short_term/__init__.py:1-12.

The module exposes a functional API (summarize_messages, asummarize_messages) for quick integration and a class-based API (SummarizationNode) for use within LangGraph workflows. Both interfaces ultimately produce a SummarizationResult containing the summarized messages and a RunningSummary tracking the compressed state. Source: src/langmem/short_term/summarization.py.

Architecture

The summarization system operates on a sliding window principle: messages accumulate until a token threshold is reached, at which point older messages are summarized while recent messages pass through unchanged. Source: src/langmem/short_term/summarization.py.

graph TD
    A[Input Messages] --> B{Tokens > max_tokens_before_summary?}
    B -->|No| C[Pass through unchanged]
    B -->|Yes| D[Identify summarization window]
    D --> E[Generate summary via LLM]
    E --> F[Replace window with summary message]
    F --> G[Update RunningSummary in context]
    G --> H[Output updated messages]
    C --> H
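
The threshold check and replacement step can be sketched in a few lines. This is a simplified illustration, not the module's algorithm: `count_tokens`, `maybe_summarize`, and the `keep_last` parameter are hypothetical, messages are plain strings, tokens are approximated by whitespace-separated words, and `summarize` stands in for the LLM call.

```python
from typing import Callable, List


def count_tokens(messages: List[str]) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return sum(len(m.split()) for m in messages)


def maybe_summarize(
    messages: List[str],
    max_tokens_before_summary: int,
    keep_last: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Summarize all but the last keep_last messages once over the threshold.

    keep_last must be at least 1.
    """
    if count_tokens(messages) <= max_tokens_before_summary:
        return messages  # under the threshold: pass through unchanged
    older, recent = messages[:-keep_last], messages[-keep_last:]
    return [summarize(older)] + recent


history = ["hello there friend", "how are you", "tell me about Mars"]
stub = lambda old: "summary of %d messages" % len(old)

unchanged = maybe_summarize(history, 100, 1, stub)
compressed = maybe_summarize(history, 5, 1, stub)
```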

Core Components

SummarizationNode

The SummarizationNode class implements a LangGraph-compatible node that summarizes message histories. It can be integrated directly into a LangGraph workflow. Source: src/langmem/short_term/summarization.py.

#### Constructor Parameters

| Parameter | Type | Description |
|---|---|---|
| model | `BaseChatModel` | The language model used for generating summaries |
| max_tokens | `int` | Maximum tokens in the final output; enforced after summarization |
| max_tokens_before_summary | `int \| None` | Token threshold to trigger summarization; defaults to max_tokens |
| max_summary_tokens | `int` | Token budget allocated for the summary itself |
| token_counter | `Callable \| None` | Custom function to count tokens; defaults to approximate counting |
| initial_summary_prompt | `str \| None` | Prompt template for generating the first summary |
| existing_summary_prompt | `str \| None` | Prompt template for updating an existing running summary |
| final_prompt | `str \| None` | Prompt template combining summary with remaining messages |
| input_messages_key | `str` | Key in state containing messages to summarize |
| output_messages_key | `str` | Key for output messages after summarization |
| name | `str` | Name identifier for this node |

Source: src/langmem/short_term/summarization.py.

#### State Update Format

The node returns a LangGraph state update in the following structure:

{
    "output_messages_key": "<list of updated messages ready to be input to the LLM after summarization, including a message with a summary (if any)>",
    "context": {"running_summary": "<RunningSummary object>"}
}

Source: src/langmem/short_term/summarization.py.

RunningSummary

The RunningSummary class maintains a cumulative summary of conversation history. It is stored in the graph's context state and updated incrementally as new summaries are generated. Source: src/langmem/short_term/summarization.py.

SummarizationResult

A result object containing the summarized messages and updated running summary after processing. Source: src/langmem/short_term/__init__.py:1-12.

Token Management Behavior

Threshold Triggers

Summarization is triggered when the token count of accumulated messages exceeds max_tokens_before_summary. This parameter defaults to the same value as max_tokens if not explicitly provided, allowing the summarization LLM to process the full token budget. Source: src/langmem/short_term/summarization.py.

Token Budget Enforcement

When the number of tokens to be summarized exceeds max_tokens, only the last max_tokens are summarized. This prevents exceeding the context window of the summarization LLM, which is assumed to be capped at max_tokens. Source: src/langmem/short_term/summarization.py.

Tool Call Handling

If the last message within the summarization window is an AI message containing tool calls, all subsequent corresponding tool result messages are also included in the summarization. This ensures tool call and result pairs are summarized together as logical units. Source: src/langmem/short_term/summarization.py.
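
One way to picture this rule is as a cutoff index that moves forward past the matching tool results. The sketch below assumes dict-shaped messages with `role`, `tool_calls`, and `tool_call_id` keys; `extend_window_for_tool_results` is a hypothetical helper, not a LangMem function.

```python
from typing import List


def extend_window_for_tool_results(messages: List[dict], cutoff: int) -> int:
    """Return a cutoff moved past the tool results of a trailing AI tool call.

    messages[:cutoff] is the summarization window.
    """
    if cutoff == 0 or cutoff >= len(messages):
        return cutoff
    last = messages[cutoff - 1]
    if last.get("role") != "assistant" or not last.get("tool_calls"):
        return cutoff
    pending = {call["id"] for call in last["tool_calls"]}
    # Absorb consecutive tool results that answer the pending calls.
    while (
        cutoff < len(messages)
        and messages[cutoff].get("role") == "tool"
        and messages[cutoff].get("tool_call_id") in pending
    ):
        cutoff += 1
    return cutoff


chat = [
    {"role": "user", "content": "Weather in Paris?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "c1"}]},
    {"role": "tool", "tool_call_id": "c1", "content": "18C, cloudy"},
    {"role": "assistant", "content": "It is 18C and cloudy."},
]

moved = extend_window_for_tool_results(chat, 2)  # window ends on the tool call
same = extend_window_for_tool_results(chat, 1)   # window ends on a user message
```

Keeping the call and its result on the same side of the cutoff avoids leaving an orphaned tool result at the start of the retained messages.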

Summary Token Budget

The max_summary_tokens parameter controls the token budget for the summary itself. Critically, this parameter is not passed to the summary-generating LLM to limit output length. It is used solely for estimating the maximum allowed token budget during processing. To enforce a length limit, bind the model directly: model.bind(max_tokens=max_summary_tokens). Source: src/langmem/short_term/summarization.py.

Usage Patterns

Basic Integration in LangGraph

from typing import Any, TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import AnyMessage
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.checkpoint.memory import InMemorySaver
from langmem.short_term import SummarizationNode, RunningSummary

model = ChatOpenAI(model="gpt-4o")
summarization_model = model.bind(max_tokens=128)

class State(MessagesState):
    context: dict[str, Any]

class LLMInputState(TypedDict):
    summarized_messages: list[AnyMessage]
    context: dict[str, Any]

summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=256,
    max_tokens_before_summary=256,
    max_summary_tokens=128,
)

def call_model(state: LLMInputState):
    response = model.invoke(state["summarized_messages"])
    return {"messages": [response]}

checkpointer = InMemorySaver()
workflow = StateGraph(State)
workflow.add_node(call_model)
workflow.add_node("summarize", summarization_node)
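workflow.add_edge(START, "summarize")
# Route the summarized messages to the model node, then compile the graph.
workflow.add_edge("summarize", "call_model")
graph = workflow.compile(checkpointer=checkpointer)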

Source: src/langmem/short_term/summarization.py.

Functional API

For simpler use cases outside of LangGraph, the module provides synchronous and asynchronous functions:

from langmem.short_term import summarize_messages, asummarize_messages

# Synchronous usage (pass the previous RunningSummary, or None on the first call)
result = summarize_messages(
    messages=conversation_history,
    running_summary=None,
    model=summarization_model,
    max_tokens=256,
)

# Asynchronous usage
result = await asummarize_messages(
    messages=conversation_history,
    running_summary=None,
    model=summarization_model,
    max_tokens=256,
)

Source: src/langmem/short_term/__init__.py:1-12.

Workflow Integration

The following diagram illustrates how SummarizationNode integrates into a typical LangGraph workflow:

graph LR
    A[User Messages] --> B[MessagesState]
    B --> C[LLM Node]
    C --> D[Model Response]
    D --> E{Summarization Needed?}
    E -->|Yes| F[SummarizationNode]
    E -->|No| G[Return Response]
    F --> H[Update RunningSummary]
    H --> I[Compressed Messages]
    I --> J[Next Turn]
    G --> J

Configuration Recommendations

| Scenario | max_tokens | max_tokens_before_summary | max_summary_tokens |
|---|---|---|---|
| Aggressive compression | 512 | 768 | 128 |
| Balanced | 1024 | 1536 | 256 |
| High fidelity | 2048 | 3072 | 512 |

When using smaller max_tokens values, set max_tokens_before_summary higher to allow the summarization LLM more content to work with. Source: src/langmem/short_term/summarization.py.
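The effect of the threshold can be illustrated with a rough token estimate. In this sketch the four-characters-per-token ratio, the strict "greater than" trigger, and both function names are assumptions; the real node uses its configured token counter.

```python
import math


def approx_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the characters-per-token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def should_summarize(messages: list[str], max_tokens_before_summary: int) -> bool:
    """True once the accumulated estimate exceeds the trigger threshold."""
    total = sum(approx_tokens(m) for m in messages)
    return total > max_tokens_before_summary


short_chat = ["hello there"] * 10   # about 30 estimated tokens
long_chat = ["x" * 400] * 10        # about 1000 estimated tokens

aggressive_triggers = should_summarize(long_chat, max_tokens_before_summary=768)
balanced_triggers = should_summarize(long_chat, max_tokens_before_summary=1536)
```

The same conversation triggers summarization under the aggressive preset but not under the balanced one, which is the trade-off the table describes.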

Public API Summary

| Symbol | Type | Description |
|---|---|---|
| summarize_messages | Function | Synchronous message summarization |
| asummarize_messages | Function | Asynchronous message summarization |
| SummarizationNode | Class | LangGraph-compatible summarization node |
| SummarizationResult | Class | Result container for summarization output |
| RunningSummary | Class | Cumulative summary state tracker |

Source: src/langmem/short_term/__init__.py:1-12.

Source: https://github.com/langchain-ai/langmem / Human Manual

Prompt Optimization

Related topics: Reflection Executor, Core Concepts

Overview

Prompt Optimization in LangMem is a system for automatically improving AI prompts based on conversation history and feedback. It analyzes trajectories (user-assistant conversations) and feedback to generate enhanced prompts that produce better responses.

The optimization system supports three distinct approaches:

| Approach | Complexity | LLM Calls | Best For |
|---|---|---|---|
| Prompt Memory | Simplest | 1 | Quick improvements, learning basic patterns |
| Metaprompt | Moderate | 2-5 | Balanced speed and quality |
| Gradient | Highest | 4-10 | Thorough analysis, complex patterns |

Source: src/langmem/prompts/optimization.py:1-50

Architecture

System Components

graph TD
    A[User Input] --> B[Optimizer Factory]
    B --> C{Select Kind}
    C -->|gradient| D[Gradient Prompt Optimizer]
    C -->|metaprompt| E[Metaprompt Optimizer]
    C -->|prompt_memory| F[Prompt Memory Optimizer]
    
    D --> G[Reflection Loop]
    G --> H[Extract Hypotheses]
    H --> I[Generate Recommendations]
    I --> J[Apply Updates]
    
    E --> K[Meta Prompt Processing]
    K --> J
    
    F --> L[Memory Pattern Extraction]
    L --> J
    
    J --> M[Optimized Prompt Output]

Class Hierarchy

The system is built on LangChain's Runnable interface, providing both sync and async invocation patterns:

  • PromptOptimizer - Single prompt optimization (returns str)
  • MultiPromptOptimizer - Multiple prompt optimization (returns list[Prompt])

Source: src/langmem/prompts/optimization.py:150-180

Optimizer Types

1. Prompt Memory Optimizer

The simplest optimization approach that learns from conversation history:

  1. Extracts successful patterns from past interactions
  2. Identifies improvement areas from feedback
  3. Applies learned patterns to new prompts

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="prompt_memory"
)

trajectories = [
    {
        "messages": [
            {"role": "user", "content": "Tell me about the solar system"},
            {"role": "assistant", "content": "The solar system consists of..."},
        ],
        "feedback": {"clarity": "needs more structure"},
    }
]

better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)

Source: src/langmem/prompts/optimization.py:100-130

2. Metaprompt Optimizer

A balanced approach using reflection-based prompt generation:

Configuration Options:

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_reflection_steps | int | 3 | Maximum meta-learning steps |
| min_reflection_steps | int | 1 | Minimum meta-learning steps |
| metaprompt | str | See default | Custom meta-prompt template |

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)

Source: src/langmem/prompts/optimization.py:60-80

3. Gradient Prompt Optimizer

The most thorough optimization approach, using a hypothesis-recommendation cycle:

graph LR
    A[Current Prompt] --> B[Generate Hypotheses]
    B --> C[Extract Recommendations]
    C --> D{Sufficient Analysis?}
    D -->|No| B
    D -->|Yes| E[Apply Updates]
    E --> F[Optimized Prompt]

Process Flow:

  1. Hypothesis Generation: Analyzes trajectory to identify why the prompt underperforms
  2. Recommendation Extraction: Generates specific adjustment recommendations
  3. Reflection Loop: Iterates up to max_reflection_steps for deeper analysis
  4. Prompt Update: Applies minimal, necessary changes to the prompt

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="gradient",
    config={
        "max_reflection_steps": 5,
        "min_reflection_steps": 2,
    }
)

Source: src/langmem/prompts/gradient.py:1-80
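The interaction between min_reflection_steps and max_reflection_steps can be sketched as a bounded loop. This is an illustration under stated assumptions, not the gradient optimizer's code: `reflect` stands in for an LLM call, and `run_reflection` is a hypothetical name.

```python
from typing import Callable


def run_reflection(
    reflect: Callable[[list[str]], tuple[str, bool]],
    min_steps: int = 1,
    max_steps: int = 5,
) -> list[str]:
    """Collect hypotheses until the analysis reports that it is sufficient.

    `reflect` receives the hypotheses so far and returns
    (new_hypothesis, is_sufficient).
    """
    if not 0 <= min_steps <= max_steps:
        raise ValueError("expected 0 <= min_steps <= max_steps")
    hypotheses: list[str] = []
    for step in range(1, max_steps + 1):
        hypothesis, sufficient = reflect(hypotheses)
        hypotheses.append(hypothesis)
        # Stop early only after the minimum number of steps has run.
        if sufficient and step >= min_steps:
            break
    return hypotheses


def eager_reflect(so_far: list[str]) -> tuple[str, bool]:
    # Pretends the analysis is "sufficient" right away.
    return f"hypothesis {len(so_far) + 1}", True


eager = run_reflection(eager_reflect, min_steps=2, max_steps=5)
never_done = run_reflection(lambda so_far: ("h", False), min_steps=1, max_steps=3)
```

The minimum forces a second pass even when the first one claims to be done, and the maximum caps the cost when the analysis never converges.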

Data Models

Prompt

The Prompt TypedDict defines structured prompt management:

class Prompt(TypedDict, total=False):
    name: Required[str]                    # Unique identifier
    prompt: Required[str]                  # The actual prompt text
    update_instructions: str | None        # Guidelines for modification
    when_to_update: str | None             # Dependencies during optimization

Example:

prompt = Prompt(
    name="extract_entities",
    prompt="Extract key entities from the text:",
    update_instructions="Make minimal changes, only address where errors occurred.",
    when_to_update="If there seem to be errors in recall of named entities.",
)

Source: src/langmem/prompts/types.py:1-50

AnnotatedTrajectory

Represents conversation history with optional feedback:

class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]      # Conversation messages
    feedback: dict[str, str | int | bool] | str | None  # Improvement feedback

Example:

trajectory = AnnotatedTrajectory(
    messages=[
        {"role": "user", "content": "What pizza is good around here?"},
        {"role": "assistant", "content": "Try LangPizza™️"},
        {"role": "user", "content": "Stop advertising to me."},
        {"role": "assistant", "content": "BUT YOU'LL LOVE IT!"},
    ],
    feedback={
        "developer_feedback": "too pushy",
        "score": 0,
    },
)

Source: src/langmem/prompts/types.py:50-100
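Because the trajectory is a two-field named tuple, it can be unpacked and rendered as text before being placed in an optimizer prompt. The sketch below uses a local stand-in class and a hypothetical `render_trajectory` helper; the output layout is an assumption, not the format langmem produces.

```python
from typing import NamedTuple, Optional, Sequence, Union


class Trajectory(NamedTuple):
    """Local stand-in mirroring the two fields shown above."""

    messages: Sequence[dict]
    feedback: Optional[Union[dict, str]] = None


def render_trajectory(trajectory: Trajectory) -> str:
    """Render messages line by line, followed by the feedback if present."""
    messages, feedback = trajectory  # plain tuple unpacking also works
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    if feedback is None:
        return "\n".join(lines)
    if isinstance(feedback, dict):
        feedback = "; ".join(f"{k}={v}" for k, v in feedback.items())
    return "\n".join(lines + [f"feedback: {feedback}"])


example = Trajectory(
    messages=[
        {"role": "user", "content": "What pizza is good around here?"},
        {"role": "assistant", "content": "Try LangPizza"},
    ],
    feedback={"developer_feedback": "too pushy", "score": 0},
)
rendered = render_trajectory(example)
```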

OptimizerInput

Input structure for single-prompt optimization:

class OptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompt: str | Prompt

Source: src/langmem/prompts/types.py:100-150

MultiPromptOptimizerInput

Input structure for optimizing multiple prompts together:

class MultiPromptOptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompts: list[Prompt]

This maintains consistency across related prompts during optimization.

Source: src/langmem/prompts/types.py:150-200

API Reference

Factory Functions

#### create_prompt_optimizer

Creates a single-prompt optimizer.

def create_prompt_optimizer(
    model: str | BaseChatModel,
    /,
    *,
    kind: typing.Literal["gradient", "prompt_memory", "metaprompt"] = "gradient",
    config: typing.Optional[dict] = None,
) -> Runnable[prompt_types.OptimizerInput, str]

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Yes | Model identifier or instance |
| kind | Literal | No | Optimization strategy (default: "gradient") |
| config | `dict \| None` | No | Optimization configuration |

Source: src/langmem/prompts/optimization.py:50-80

#### create_multi_prompt_optimizer

Creates an optimizer for managing multiple prompts together.

def create_multi_prompt_optimizer(
    model: str | BaseChatModel,
    /,
    *,
    kind: typing.Literal["gradient", "prompt_memory", "metaprompt"] = "gradient",
    config: typing.Optional[dict] = None,
) -> MultiPromptOptimizer

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Yes | Model identifier or instance |
| kind | Literal | No | Optimization strategy (default: "gradient") |
| config | `dict \| None` | No | Optimization configuration |

Source: src/langmem/prompts/optimization.py:130-160

Gradient Optimizer Config

class GradientOptimizerConfig(TypedDict, total=False):
    gradient_prompt: str        # Custom gradient analysis prompt
    metaprompt: str             # Custom update application prompt
    max_reflection_steps: int   # Maximum iteration count
    min_reflection_steps: int   # Minimum iteration count

Source: src/langmem/prompts/gradient.py:40-60

Prompt Templates

Instruction Reflection Prompt

Used by the prompt memory optimizer for basic reflection:

INSTRUCTION_REFLECTION_PROMPT = """You are helping an AI agent improve. You can do this by changing their system prompt.

These is their current prompt:
<current_prompt>
{current_prompt}
</current_prompt>

Here was the agent's trajectory:
<trajectory>
{trajectory}
</trajectory>

Here is the user's feedback:

<feedback>
{feedback}
</feedback>

Here are instructions for updating the agent's prompt:

<instructions>
{instructions}
</instructions>

Based on this, return an updated prompt"""

Source: src/langmem/prompts/prompt.py:1-40
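The template has four named placeholders, so it can be filled with Python's str.format. The sketch below uses a shortened stand-in template with the same placeholder names; the sample values are illustrative, and how langmem itself fills the template is not shown here.

```python
# Shortened stand-in with the same four placeholders as the template above.
REFLECTION_TEMPLATE = (
    "<current_prompt>\n{current_prompt}\n</current_prompt>\n\n"
    "<trajectory>\n{trajectory}\n</trajectory>\n\n"
    "<feedback>\n{feedback}\n</feedback>\n\n"
    "<instructions>\n{instructions}\n</instructions>"
)

filled = REFLECTION_TEMPLATE.format(
    current_prompt="You are an astronomy expert",
    trajectory="user: Tell me about the solar system",
    feedback="needs more structure",
    instructions="Make minimal changes.",
)
```

str.format raises KeyError when a placeholder has no matching argument, which makes a missing field visible immediately instead of leaving a literal `{feedback}` in the prompt.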

Gradient Metaprompt

Used by the gradient optimizer to apply updates to the prompt once hypotheses and recommendations have been generated:

DEFAULT_GRADIENT_METAPROMPT = """You are optimizing a prompt to handle its target task more effectively.

<current_prompt>
{current_prompt}
</current_prompt>

We hypothesize the current prompt underperforms for these reasons:

<hypotheses>
{hypotheses}
</hypotheses>

Based on these hypotheses, we recommend the following adjustments:

<recommendations>
{recommendations}
</recommendations>

Respond with the updated prompt. Remember to ONLY make changes that are clearly necessary."""

Source: src/langmem/prompts/gradient.py:15-50

Usage Examples

Single Prompt Optimization with Feedback

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")

conversation = [
    {"role": "user", "content": "How do I write a bash script?"},
    {"role": "assistant", "content": "Let me explain bash scripting..."},
]
feedback = "Response should include a code example"

trajectories = [(conversation, {"feedback": feedback})]
better_prompt = await optimizer(trajectories, "You are a coding assistant")

Multi-Prompt Optimization

from langmem import create_multi_prompt_optimizer

optimizer = create_multi_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="prompt_memory"
)

conversation = [
    {"role": "user", "content": "Tell me about this image"},
    {"role": "assistant", "content": "I see a dog playing in a park"},
]

trajectories = [(conversation, "Vision model wasn't used for breed detection")]

prompts = [
    {
        "name": "vision_extract",
        "prompt": "Extract visual details from the image",
    },
    {
        "name": "vision_classify",
        "prompt": "Classify specific attributes in the image",
    },
]

better_prompts = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompts": prompts}
)

Choosing Optimization Strategy

| Use Case | Recommended Kind | Rationale |
|---|---|---|
| Quick prototyping | prompt_memory | Single LLM call, minimal cost |
| Production with moderate traffic | metaprompt | 2-5 calls, balanced improvement |
| High-stakes, complex tasks | gradient | 4-10 calls, thorough analysis |

Source: src/langmem/prompts/optimization.py:20-45
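One way to turn the table into a decision rule is to pick the most thorough kind whose worst-case call count fits a per-run budget. The call ranges come from the table; the selection heuristic and the `pick_kind` name are assumptions for illustration.

```python
# (min_calls, max_calls) per optimizer kind, taken from the table above.
CALL_RANGES = {
    "prompt_memory": (1, 1),
    "metaprompt": (2, 5),
    "gradient": (4, 10),
}


def pick_kind(max_calls: int) -> str:
    """Pick the most thorough kind whose worst-case call count fits the budget."""
    if max_calls < 1:
        raise ValueError("at least one LLM call is required")
    best = "prompt_memory"
    for kind in ("metaprompt", "gradient"):  # ordered from cheaper to costlier
        if CALL_RANGES[kind][1] <= max_calls:
            best = kind
    return best


choices = {budget: pick_kind(budget) for budget in (1, 5, 7, 10)}
```

The returned string can be passed as the kind argument of create_prompt_optimizer.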

Reflection Executor

Related topics: Prompt Optimization, Core Concepts

The ReflectionExecutor is a core component in LangMem that enables asynchronous, background processing of memory enrichment operations. It decouples the memory management workflow from the main agent execution thread, allowing AI agents to respond to users immediately while memory processing occurs in the background.

Overview

The ReflectionExecutor class serves as a bridge between a MemoryManager or MemoryStoreManager and the LangGraph BaseStore. It provides a mechanism to schedule and execute memory enrichment after the main agent has already produced a response, ensuring that users receive immediate feedback while the system continuously improves its understanding of conversation context.

Source: src/langmem/__init__.py

Architecture

The ReflectionExecutor operates within a broader architecture that separates concerns between agent execution and memory processing:

graph TD
    A[User Input] --> B[Main Agent]
    B --> C[User Response]
    C --> D[ReflectionExecutor]
    D --> E[Memory Manager / Store Manager]
    E --> F[BaseStore]
    
    B -.->|processes immediately| C
    D -.->|background processing| F

Components

| Component | Type | Purpose |
|---|---|---|
| ReflectionExecutor | Class | Schedules and executes background memory enrichment |
| MemoryManager | Class | Extracts, updates, and deletes memories from conversations |
| MemoryStoreManager | Class | Manages memory storage with vector search capabilities |
| BaseStore | Interface | LangGraph's persistence layer for memories |

Source: src/langmem/knowledge/extraction.py

Usage Patterns

Basic Setup with InMemoryStore

The most common pattern initializes a ReflectionExecutor with a memory store manager and a configured store:

from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{user_id}"),
)

reflection = ReflectionExecutor(manager, store=store)

Source: src/langmem/knowledge/extraction.py

Integration with LangGraph Agent

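The ReflectionExecutor can run alongside a LangGraph create_react_agent that shares the same store. The agent below manages memories in the hot path through create_manage_memory_tool:

from langgraph.prebuilt import create_react_agent
from langmem import create_manage_memory_tool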

agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
    ],
    store=store,
)

Source: src/langmem/knowledge/tools.py

Execution Flow

The following sequence diagram illustrates how ReflectionExecutor interacts with other components during background enrichment:

sequenceDiagram
    participant Agent
    participant Background
    participant Store
    participant Manager

    Agent->>Agent: process message
    Agent-->>User: response
    Agent->>Background: schedule enrichment<br/>(after_seconds=0)
    Note over Background,Store: Memory processing happens<br/>in background thread
    Background->>Manager: invoke with messages
    Manager->>Manager: extract & analyze memories
    Manager->>Store: store/update memories
    Store-->>Manager: confirmation
    Manager-->>Background: enrichment complete

Source: src/langmem/knowledge/extraction.py

Configuration

Memory Store Manager Configuration

When creating the memory manager for use with ReflectionExecutor, several configuration options control memory behavior:

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Required | Language model for memory processing |
| schemas | list[type] | None | Pydantic schemas defining memory structure |
| namespace | tuple[str, ...] | ("memories", "{langgraph_user_id}") | Hierarchical path for memory storage |
| enable_inserts | bool | True | Allow creating new memories |
| enable_updates | bool | True | Allow modifying existing memories |
| enable_deletes | bool | True | Allow removing outdated memories |
| query_model | `str \| BaseChatModel` | None | Separate model for search query generation |
| query_limit | int | 5 | Maximum memories to retrieve |

Source: src/langmem/knowledge/extraction.py

Namespace Template

Namespaces support runtime placeholders that are resolved from the LangGraph configuration:

namespace=("memories", "{langgraph_user_id}")

This resolves to ("memories", "user123") when config["configurable"]["langgraph_user_id"] equals "user123".

Source: src/langmem/knowledge/extraction.py

Memory Processing Phases

The MemoryStoreManager processes memories through distinct phases, each callable independently or combined:

graph LR
    A[messages] --> B[Recall Phase]
    B --> C[Enrich Phase]
    C --> D[Update Phase]

| Phase | Purpose |
|---|---|
| Recall | Retrieve relevant existing memories using semantic search |
| Enrich | Extract new information from the conversation |
| Update | Apply changes to the store (insert, update, delete) |

Source: src/langmem/knowledge/types.py
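The three phases can be sketched with a plain dict as the store. Everything here is a stand-in: keyword overlap replaces semantic search, a simple rule replaces the LLM extraction step, and the function names and operation tuples are hypothetical.

```python
from typing import Optional

Op = tuple[str, str, Optional[str]]  # (action, key, content)


def recall(store: dict[str, str], query: str, limit: int = 5) -> dict[str, str]:
    """Recall: keyword overlap stands in for semantic search."""
    words = set(query.lower().split())
    hits = {k: v for k, v in store.items() if words & set(v.lower().split())}
    return dict(list(hits.items())[:limit])


def enrich(messages: list[dict], existing: dict[str, str]) -> list[Op]:
    """Enrich: propose an insert for each user statement not already known."""
    known = set(existing.values())
    return [
        ("insert", f"mem-{i}", m["content"])
        for i, m in enumerate(messages)
        if m["role"] == "user" and m["content"] not in known
    ]


def update(store: dict[str, str], ops: list[Op]) -> None:
    """Update: apply insert, update, and delete operations to the store."""
    for action, key, content in ops:
        if action == "delete":
            store.pop(key, None)
        else:
            store[key] = content


store = {"mem-a": "prefers dark mode"}
messages = [
    {"role": "user", "content": "prefers dark mode"},
    {"role": "user", "content": "uses vim keybindings"},
]
existing = recall(store, "dark mode preferences")
ops = enrich(messages, existing)
update(store, ops)
```

Recalling first lets the enrich step skip facts that are already stored, so only the new statement results in a write.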

Background Execution Strategy

The ReflectionExecutor supports immediate background execution via the after_seconds parameter:

await reflection.ainvoke(
    {"messages": conversation, "existing": memories},
    after_seconds=0,
)

Setting after_seconds=0 schedules the enrichment to start in the background without any added delay, so the main agent response is not held up. For less time-sensitive applications, a positive value defers execution, which reduces resource contention during peak load periods.
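The idea of deferring work with a delay can be sketched with the standard library. This is not langmem's implementation: `DelayedRunner` is a hypothetical class built on threading.Timer, and the behavior where a newer request replaces a pending one is an assumption about a common debouncing pattern.

```python
import threading


class DelayedRunner:
    """Runs a callable in a background thread after a delay.

    A new submission cancels the one that is still waiting, so only the
    latest payload is processed.
    """

    def __init__(self, func):
        self._func = func
        self._timer = None
        self._lock = threading.Lock()

    def submit(self, payload, after_seconds: float = 0.0) -> threading.Timer:
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # the newer request supersedes the pending one
            self._timer = threading.Timer(after_seconds, self._func, args=(payload,))
            self._timer.daemon = True
            self._timer.start()
            return self._timer


processed = []
runner = DelayedRunner(processed.append)
runner.submit({"messages": ["first"]}, after_seconds=60)  # cancelled by the next call
last = runner.submit({"messages": ["second"]}, after_seconds=0)
last.join(timeout=5)  # wait for the background thread in this demo only
```

In a real application the caller would return the response immediately instead of joining the thread; the join here only makes the demo deterministic.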

Class Signature

The ReflectionExecutor class implements the following interface:

class ReflectionExecutor:
    def __init__(
        self,
        manager: MemoryManager | MemoryStoreManager,
        *,
        store: BaseStore | None = None,
    ) -> None:
        ...

| Parameter | Type | Description |
|---|---|---|
| manager | `MemoryManager \| MemoryStoreManager` | The memory processing component |
| store | `BaseStore \| None` | Optional explicit store; otherwise uses context |

Source: src/langmem/reflection.py

Async Support

The ReflectionExecutor provides full async support through its ainvoke method, making it compatible with LangGraph's async entrypoints and workflows:

@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await reflection.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response

Source: src/langmem/knowledge/extraction.py

Best Practices

  1. Use with persistent stores in production: While InMemoryStore is suitable for development, production deployments should use persistent stores like PostgreSQL or Redis with vector search capabilities.
  2. Separate query models for efficiency: When working with large memory stores, use a faster, cheaper model for query generation and a more capable model for memory analysis.
  3. Configure appropriate namespaces: Always include user-specific namespaces to ensure memory isolation between users.
  4. Set reasonable query limits: Balance recall completeness against processing speed by tuning query_limit for your use case.
  5. Leverage background execution: Schedule memory enrichment with minimal delay (after_seconds=0) to keep the system responsive while continuously improving memory quality.

LangGraph Integration

Related topics: Memory Tools - Hot Path Management, System Architecture

LangMem provides comprehensive integration with LangGraph, enabling memory management capabilities within agentic workflows. This integration allows AI applications to store, retrieve, search, and manage conversational memories using LangGraph's BaseStore architecture.

Overview

LangMem's LangGraph integration serves as a bridge between LangGraph's store infrastructure and memory management functionality. It provides:

  • Tools for Agents: Pre-built tools that agents can invoke to search and manage memories
  • Store Managers: Components that handle automatic memory extraction and storage
  • Configuration Support: Runtime namespace resolution using configurable parameters
  • Async/Await Support: Full support for both synchronous and asynchronous operations

Source: src/langmem/knowledge/tools.py:1-50

Architecture

The integration follows a layered architecture where LangMem tools and managers connect to LangGraph's BaseStore implementation:

graph TD
    A[LangGraph Agent] --> B[LangMem Tools/Managers]
    B --> C[BaseStore]
    C --> D[InMemoryStore]
    C --> E[Persistent Store]
    
    F[Config] --> B
    F --> C

Component Overview

| Component | Purpose | File Location |
|---|---|---|
| create_search_memory_tool | Search for memories within agent context | src/langmem/knowledge/tools.py |
| create_manage_memory_tool | CRUD operations for memories | src/langmem/knowledge/tools.py |
| create_memory_store_manager | Automatic memory extraction and storage | src/langmem/knowledge/extraction.py |
| Prebuilt Graphs | optimize_prompts, extract_memories | src/langmem/graphs/ |

Source: src/langmem/graphs/__init__.py

Memory Tools

LangMem provides two primary tools that agents can invoke during execution.

Search Memory Tool

The create_search_memory_tool function creates a tool that searches for relevant memories based on a query.

from langmem import create_search_memory_tool
from langgraph.store.memory import InMemoryStore

search_tool = create_search_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)

Parameters:

| Parameter | Type | Description | Default |
|---|---|---|---|
| namespace | tuple[str, ...] | Hierarchical path for organizing memories | ("memories", "{langgraph_user_id}") |
| prompt | str | Custom prompt for search behavior | Context-based default |

Source: src/langmem/knowledge/tools.py:80-120

Manage Memory Tool

The create_manage_memory_tool function creates a tool that supports creating, updating, and deleting memories:

from langmem import create_manage_memory_tool

memory_tool = create_manage_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)

Supported Operations:

| Operation | Description |
|---|---|
| insert | Store new memories with generated keys |
| update | Modify existing memory content |
| upsert | Insert or update based on key existence |
| delete | Remove memories from the store |

Source: src/langmem/knowledge/tools.py:200-280
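The difference between the four operations can be shown on a plain dict. This is a sketch of the semantics described in the table, not the tool's code; the `MemoryOps` class and the choice to raise KeyError when updating a missing key are assumptions.

```python
import uuid


class MemoryOps:
    """Dict-backed illustration of insert, update, upsert, and delete."""

    def __init__(self) -> None:
        self.items: dict[str, str] = {}

    def insert(self, content: str) -> str:
        key = str(uuid.uuid4())  # generated key, as in the table above
        self.items[key] = content
        return key

    def update(self, key: str, content: str) -> None:
        if key not in self.items:
            raise KeyError(key)  # update requires an existing memory
        self.items[key] = content

    def upsert(self, key: str, content: str) -> None:
        self.items[key] = content  # creates or overwrites

    def delete(self, key: str) -> None:
        self.items.pop(key, None)


memories = MemoryOps()
key = memories.insert("User prefers dark mode")
memories.update(key, "User prefers light mode")
memories.upsert("theme-note", "Asked about themes twice")
memories.delete(key)
```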

Namespace Configuration

Namespaces in LangMem use a template system that allows runtime population of values from LangGraph's configuration.

Template Syntax

Placeholders use curly brace notation that maps to configurable values:

namespace=("memories", "{langgraph_user_id}")

Runtime Resolution

At runtime, these placeholders are resolved from the configurable section:

config = {"configurable": {"langgraph_user_id": "user-123"}}
# Results in namespace: ("memories", "user-123")

Namespace Examples

| Use Case | Namespace Template | Runtime Config |
|---|---|---|
| Per-user memories | ("memories", "{langgraph_user_id}") | {"langgraph_user_id": "user-123"} |
| Team memories | ("memories", "{team_id}") | {"team_id": "team-x"} |
| Project memories | ("project_memories", "{project_id}") | {"project_id": "proj-1"} |

Source: src/langmem/knowledge/tools.py:140-180
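Template resolution can be sketched with the standard library's string formatting. This is a minimal illustration, not langmem's resolver: `resolve_namespace` is a hypothetical name, and raising KeyError for a missing value is an assumption.

```python
import string


def resolve_namespace(template: tuple[str, ...], config: dict) -> tuple[str, ...]:
    """Fill `{placeholder}` segments from config["configurable"]."""
    configurable = config.get("configurable", {})
    resolved = []
    for segment in template:
        # Collect the placeholder names used in this segment.
        fields = [name for _, name, _, _ in string.Formatter().parse(segment) if name]
        missing = [name for name in fields if name not in configurable]
        if missing:
            raise KeyError(f"missing configurable values: {missing}")
        resolved.append(segment.format(**configurable))
    return tuple(resolved)


ns = resolve_namespace(
    ("memories", "{langgraph_user_id}"),
    {"configurable": {"langgraph_user_id": "user-123"}},
)
```

Failing loudly on a missing value is a deliberate choice in this sketch: silently writing to a literal "{langgraph_user_id}" namespace would mix memories from different users.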

Store Configuration

LangMem requires a BaseStore implementation to be configured in the LangGraph entrypoint or graph.

InMemoryStore Example

from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

@entrypoint(store=store)
async def workflow(state: dict, *, previous=None):
    # Store is automatically available via get_store()
    ...

Configuration in langgraph.json

The langgraph.json file defines the default store configuration for deployed graphs:

{
  "store": {
    "index": {
      "embed": "openai:text-embedding-3-small",
      "dims": 1536,
      "fields": ["$"]
    }
  }
}

Source: langgraph.json:1-20

Prebuilt Graphs

LangMem includes prebuilt LangGraph graphs for common memory operations.

Extract Memories Graph

Located at src/langmem/graphs/semantic.py, this graph combines memory storage with automatic extraction:

from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
)

@entrypoint(store=store)
async def graph(message: str):
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response

Graph Endpoint: ./src/langmem/graphs/semantic.py:graph

Source: src/langmem/graphs/semantic.py:1-40

Optimize Prompts Graph

Used for prompt optimization workflows with memory-backed feedback.

Graph Endpoint: ./src/langmem/graphs/prompts.py:optimize_prompts

Source: src/langmem/graphs/__init__.py

Integration with create_react_agent

LangMem tools integrate seamlessly with LangGraph's prebuilt create_react_agent:

from langgraph.prebuilt import create_react_agent
from langgraph.config import get_config, get_store
from langmem import create_manage_memory_tool

def prompt(state):
    config = get_config()
    memories = get_store().search(
        ("memories", config["configurable"]["langgraph_user_id"]),
    )
    system_prompt = f"""You are a helpful assistant.
<memories>
{memories}
</memories>
"""
    system_message = {"role": "system", "content": system_prompt}
    return [system_message, *state["messages"]]

agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    prompt=prompt,  # inject stored memories into the system message
    tools=[
        create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
    ],
    store=store,
)

Source: src/langmem/knowledge/tools.py:300-350

Memory Store Manager

The create_memory_store_manager function creates a manager that handles automatic memory extraction and storage using LangGraph's store infrastructure.

Query Model Architecture

The manager supports using a separate (faster) model for search query generation:

sequenceDiagram
    participant Client
    participant Manager
    participant QueryLLM
    participant Store
    participant MainLLM

    Client->>Manager: messages
    Manager->>QueryLLM: generate search query
    QueryLLM-->>Manager: optimized query
    Manager->>Store: find memories
    Store-->>Manager: memories
    Manager->>MainLLM: analyze & extract
    MainLLM-->>Manager: memory updates
    Manager->>Store: apply changes
    Manager-->>Client: result

Configuration Options

| Parameter | Type | Description | Default |
|---|---|---|---|
| model | `str \| BaseChatModel` | Main model for memory processing | Required |
| query_model | `str \| BaseChatModel` | Faster model for search queries | Same as model |
| query_limit | int | Number of memories to retrieve | 5 |
| namespace | tuple[str, ...] | Memory namespace template | ("memories", "{langgraph_user_id}") |
| schemas | list[type[BaseModel]] | Pydantic schemas for memories | None |
| enable_inserts | bool | Allow creating new memories | True |
| enable_updates | bool | Allow updating existing memories | True |
| enable_deletes | bool | Allow deleting memories | True |

Source: src/langmem/knowledge/extraction.py:200-300

Authentication

LangMem graphs support authentication via the auth endpoint defined in langgraph.json:

{
  "auth": {
    "path": "./src/langmem/graphs/auth.py:auth"
  }
}

The auth function handles authentication for deployed LangGraph applications.

Source: src/langmem/graphs/auth.py

Complete Workflow Example

graph LR
    A[User Input] --> B[Agent]
    B --> C[Memory Search Tool]
    C --> D[BaseStore]
    D --> E[Vector Index]
    B --> F[Response]
    B --> G[Memory Manage Tool]
    G --> D
    F --> H[User]

Full Implementation

from langmem import create_search_memory_tool, create_manage_memory_tool
from langgraph.store.memory import InMemoryStore
from langgraph.prebuilt import create_react_agent

# Configure store
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

# Create tools
search_tool = create_search_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
)
manage_tool = create_manage_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
)

# Create agent with memory tools
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[search_tool, manage_tool],
    store=store,
)

# Invoke with user context
config = {"configurable": {"langgraph_user_id": "user-123"}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "I prefer dark mode"}]},
    config=config,
)

Summary

LangMem's LangGraph integration provides a complete solution for memory management in agentic applications:

  1. Tools enable agents to search and manage memories during conversation
  2. Managers automate memory extraction and storage
  3. Namespace templates allow flexible per-user/per-conversation organization
  4. Store abstraction supports multiple storage backends
  5. Prebuilt graphs accelerate common use cases

All integration points are designed to work seamlessly with LangGraph's configuration system, enabling production-ready deployments with proper authentication, store configuration, and multi-tenant support.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 11 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: GRAPH_RECURSION_LIMIT

  • Severity: high
  • Finding: Installation risk is backed by a source signal: GRAPH_RECURSION_LIMIT. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/133

2. Installation risk: Persistence?

  • Severity: high
  • Finding: Installation risk is backed by a source signal: Persistence?. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/154

3. Security or permission risk: Enhance error message when summarization fails due to missing HumanMessage in trimmed window

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: Enhance error message when summarization fails due to missing HumanMessage in trimmed window. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/156

4. Configuration risk: Configuration risk needs validation

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.host_targets | github_repo:920242883 | https://github.com/langchain-ai/langmem | host_targets=claude, chatgpt

5. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:920242883 | https://github.com/langchain-ai/langmem | README/documentation is current enough for a first validation pass.

6. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | last_activity_observed missing

7. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:920242883 | https://github.com/langchain-ai/langmem | no_demo; severity=medium

8. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:920242883 | https://github.com/langchain-ai/langmem | no_demo; severity=medium

9. Security or permission risk: Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06)

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06). Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/164

10. Maintenance risk: issue_or_pr_quality=unknown

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | issue_or_pr_quality=unknown

11. Maintenance risk: release_recency=unknown

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | release_recency=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 6 (the count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using langmem with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence