Doramagic Project Pack · Human Manual
langmem
Related topics: System Architecture, Core Concepts, Installation and Setup
Home - LangMem Overview
LangMem is a library for memory management and prompt optimization in LLM applications. It provides tools for extracting, storing, and retrieving structured memories, as well as optimizing prompts based on conversation trajectories and feedback.
Core Concepts
LangMem operates across two primary domains: Long-Term Memory (knowledge extraction and storage) and Short-Term Memory (conversation summarization), with a complementary Prompt Optimization system for improving LLM instructions.
Memory Architecture Overview
graph TD
A[User Conversation] --> B[Memory Manager]
B --> C[Long-Term Memory Store]
B --> D[Short-Term Summarization]
E[Search/Retrieval] --> C
F[Prompt Optimizer] --> G[Optimized Prompts]
C --> H[Structured Memories]
D --> I[Running Summary]
Components
Memory Types
LangMem provides several specialized memory components for different use cases.
| Component | File Location | Purpose |
|---|---|---|
| MemoryManager | src/langmem/knowledge/extraction.py | Extracts and manages long-term memories from conversations |
| MemoryStoreManager | src/langmem/knowledge/extraction.py | Manages memories with persistent storage (LangGraph BaseStore) |
| SummarizationNode | src/langmem/short_term/summarization.py | Provides running summaries for short-term context |
| GradientPromptOptimizer | src/langmem/prompts/gradient.py | Optimizes prompts using gradient-based reflection |
Data Models
LangMem uses TypedDict classes for type-safe data structures.
#### Prompt Structure
class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None
Source: src/langmem/prompts/types.py:7-22
#### Annotated Trajectory
class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, typing.Any] | str
Source: src/langmem/prompts/types.py:24-43
Memory Management
Creating a Memory Manager
from langmem import create_memory_manager
manager = create_memory_manager(
"anthropic:claude-3-5-sonnet-latest",
schemas=[PreferenceMemory],
enable_inserts=True,
enable_updates=True,
enable_deletes=True,
)
Source: src/langmem/knowledge/extraction.py
Memory Store with LangGraph Integration
The MemoryStoreManager integrates with LangGraph's BaseStore for persistent memory storage.
from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint
store = InMemoryStore(
index={
"dims": 1536,
"embed": "openai:text-embedding-3-small",
}
)
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
query_model="anthropic:claude-3-5-haiku-latest",
query_limit=10,
namespace=("memories", "{langgraph_user_id}"),
)
Source: src/langmem/knowledge/extraction.py
Search Flow with Query Model
sequenceDiagram
participant Client
participant Manager
participant QueryLLM
participant Store
participant MainLLM
Client->>Manager: messages
Manager->>QueryLLM: generate search query
QueryLLM-->>Manager: optimized query
Manager->>Store: find memories
Store-->>Manager: memories
Manager->>MainLLM: analyze & extract
MainLLM-->>Manager: memory updates
Manager->>Store: apply changes
Manager-->>Client: result
Source: src/langmem/knowledge/extraction.py
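The search flow in the sequence diagram can be sketched with plain Python stand-ins. This is a minimal illustration, not LangMem's implementation: `ToyStore`, `generate_query`, `extract_updates`, and `run_flow` are hypothetical names, the "query model" and "main model" are stubs, and ranking uses word overlap instead of embeddings.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for a memory store: a dict keyed by memory id."""

    items: dict[str, str] = field(default_factory=dict)

    def search(self, query: str, limit: int) -> list[tuple[str, str]]:
        # Rank by the number of words shared with the query (no embeddings here).
        words = set(query.lower().split())
        scored = [
            (len(words & set(text.lower().split())), key, text)
            for key, text in self.items.items()
        ]
        hits = [(k, t) for score, k, t in sorted(scored, reverse=True) if score > 0]
        return hits[:limit]


def generate_query(messages: list[dict]) -> str:
    # Stub for the query model: reuse the latest user message as the search query.
    return next(m["content"] for m in reversed(messages) if m["role"] == "user")


def extract_updates(messages: list[dict], existing: list[tuple[str, str]]) -> list[str]:
    # Stub for the main model: propose each user statement that is not stored yet.
    known = {text for _, text in existing}
    return [
        m["content"]
        for m in messages
        if m["role"] == "user" and m["content"] not in known
    ]


def run_flow(messages: list[dict], store: ToyStore, query_limit: int = 10) -> list[str]:
    query = generate_query(messages)                    # Manager -> QueryLLM
    existing = store.search(query, limit=query_limit)   # Manager -> Store
    updates = extract_updates(messages, existing)       # Manager -> MainLLM
    for text in updates:                                # Manager -> Store (apply)
        store.items[f"mem-{len(store.items)}"] = text
    return updates
```

The point of the separate query step is that retrieval happens before extraction, so the extractor can see what is already stored and avoid writing duplicates.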
Short-Term Memory
Summarization Node
The SummarizationNode provides running summaries for managing conversation context within a LangGraph workflow.
from langmem.short_term import SummarizationNode, RunningSummary
summarization_node = SummarizationNode(
model=summarization_model,
max_tokens=256,
max_tokens_before_summary=256,
max_summary_tokens=128,
)
Source: src/langmem/short_term/summarization.py
State Update Format
The summarization node returns updates in this format:
{
"output_messages_key": "<list of updated messages>",
"context": {"running_summary": "<RunningSummary object>"}
}
Source: src/langmem/short_term/summarization.py
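The idea behind a running summary can be sketched without any model call. The sketch below is an assumption-laden illustration, not the SummarizationNode's logic: `count_tokens`, `maybe_summarize`, and `toy_summarize` are hypothetical helpers, tokens are approximated by whitespace-separated words, and the choice to keep only the latest message verbatim is made up for the example. What it does mirror is the shape of the result: updated messages plus a summary that is carried forward.

```python
from typing import Callable


def count_tokens(messages: list[str]) -> int:
    # Crude approximation: one token per whitespace-separated word.
    return sum(len(m.split()) for m in messages)


def maybe_summarize(
    messages: list[str],
    running_summary: str,
    max_tokens_before_summary: int,
    summarize: Callable[[str, list[str]], str],
) -> tuple[list[str], str]:
    """Return (messages to send to the model, running summary to carry forward)."""
    if count_tokens(messages) <= max_tokens_before_summary:
        return messages, running_summary
    # Fold everything except the latest message into the running summary.
    older, latest = messages[:-1], messages[-1]
    running_summary = summarize(running_summary, older)
    return [f"Summary so far: {running_summary}", latest], running_summary


def toy_summarize(previous: str, older: list[str]) -> str:
    # Stub for the summarization model: keep the first three words of each message.
    parts = [previous] if previous else []
    parts += [" ".join(m.split()[:3]) for m in older]
    return "; ".join(parts)
```

Carrying the summary in state rather than recomputing it lets each later call extend the previous summary instead of re-reading the whole history.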
Prompt Optimization
LangMem provides multiple prompt optimization strategies through the create_prompt_optimizer and create_multi_prompt_optimizer functions.
Optimization Strategies
| Strategy | Description | Configuration |
|---|---|---|
| gradient | Hypothesis-driven optimization with reflection loops | max_reflection_steps, min_reflection_steps |
| metaprompt | Meta-learning based on conversation patterns | Optional reflection step control |
| prompt_memory | Learns from successful conversation patterns | No additional config |
Source: src/langmem/prompts/optimization.py
Single Prompt Optimization
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")
trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
{"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)
Source: src/langmem/prompts/optimization.py
Multi-Prompt Optimization
from langmem import create_multi_prompt_optimizer
optimizer = create_multi_prompt_optimizer(
"anthropic:claude-3-5-sonnet-latest",
kind="metaprompt",
config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)
better_prompts = await optimizer.ainvoke({
"trajectories": trajectories,
"prompts": prompts
})
Source: src/langmem/prompts/optimization.py
Gradient Optimizer Workflow
graph TD
A[Current Prompt] --> B[Generate Hypotheses]
B --> C[Hypothesis Analysis]
C --> D{Reflection Loop}
D -->|Within steps| E[Generate Recommendations]
E --> F[Apply Adjustments]
F --> D
D -->|Complete| G[Optimized Prompt]
Source: src/langmem/prompts/gradient.py
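The reflection loop in the workflow diagram can be sketched as a bounded loop. This is a reading of the diagram under stated assumptions, not the gradient optimizer's code: `reflect_and_optimize`, `critique`, and `apply` are hypothetical names, and the two callables stand in for model calls. Only the parameter names `min_reflection_steps` and `max_reflection_steps` come from the strategy table above.

```python
from typing import Callable


def reflect_and_optimize(
    prompt: str,
    critique: Callable[[str, int], tuple[str, bool]],
    apply: Callable[[str, str], str],
    min_reflection_steps: int = 1,
    max_reflection_steps: int = 3,
) -> str:
    """Run between min and max reflection steps, then return the adjusted prompt."""
    for step in range(1, max_reflection_steps + 1):
        # "Generate Recommendations": the critic returns advice and a done flag.
        recommendation, satisfied = critique(prompt, step)
        # "Apply Adjustments": fold the advice into the prompt.
        prompt = apply(prompt, recommendation)
        # Early exit is only allowed once the minimum number of steps has run.
        if satisfied and step >= min_reflection_steps:
            break
    return prompt
```

The minimum bound forces at least some reflection even when the critic is satisfied immediately, and the maximum bound caps cost when it never is.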
Memory Tools
LangMem provides standalone tools for memory management in agent workflows.
Create Manage Memory Tool
from langmem import create_manage_memory_tool
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(
"anthropic:claude-3-5-sonnet-latest",
tools=[
create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
],
store=store,
)
Source: src/langmem/knowledge/tools.py
Create Search Memory Tool
from langmem import create_search_memory_tool
search_tool = create_search_memory_tool(
namespace=("project_memories", "{langgraph_user_id}"),
)
memories, _ = await search_tool.ainvoke(
{"query": "Python preferences", "limit": 5}
)
Source: src/langmem/knowledge/tools.py
Tool Configuration Options
| Parameter | Type | Description |
|---|---|---|
| namespace | tuple[str, ...] | Hierarchical path for memory organization |
| actions_permitted | list[str] | Limit actions (create, update, delete) |
| schema | BaseModel | Custom memory schema |
| query_limit | int | Maximum results to retrieve (default: 10) |
Source: src/langmem/knowledge/tools.py
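To make the effect of `actions_permitted` concrete, here is a minimal sketch of action gating over a plain dict. It is an illustration only: `apply_action` is a hypothetical helper, and how the real tool rejects a disallowed action is not specified in this manual.

```python
from typing import Optional, Sequence

VALID_ACTIONS = ("create", "update", "delete")


def apply_action(
    store: dict,
    action: str,
    key: str,
    content: Optional[str] = None,
    actions_permitted: Sequence[str] = VALID_ACTIONS,
) -> dict:
    """Apply one memory action to `store`, refusing anything not permitted."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action not in actions_permitted:
        raise PermissionError(f"action not permitted: {action}")
    if action == "create":
        if key in store:
            raise KeyError(f"memory already exists: {key}")
        store[key] = content
    elif action == "update":
        if key not in store:
            raise KeyError(f"no such memory: {key}")
        store[key] = content
    else:  # delete
        del store[key]
    return store
```

Restricting an agent to `("create", "update")` in this way means it can refine what it knows but cannot erase it.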
Usage Patterns
Standalone Usage
LangMem can be used independently of LangGraph:
from langmem import create_memory_manager
from langmem.schemas import PreferenceMemory
manager = create_memory_manager(
"anthropic:claude-3-5-sonnet-latest",
schemas=[PreferenceMemory],
)
conversation = [
{"role": "user", "content": "I prefer dark mode in all my apps"},
{"role": "assistant", "content": "I'll remember that preference"},
]
memories = await manager.ainvoke({"messages": conversation})
Source: examples/standalone_examples/README.md
LangGraph Integration
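from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore

# Assumes `store` (an InMemoryStore) and `manager` (from create_memory_store_manager)
# were created as in "Memory Store with LangGraph Integration" above.
@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that"}
    # Hand the exchange to the manager so it can extract and persist memories.
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response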
Source: src/langmem/knowledge/extraction.py
Configuration
Memory Namespaces
Namespaces use runtime configuration with placeholders:
namespace=("memories", "{langgraph_user_id}")
# Runtime config
config = {"configurable": {"langgraph_user_id": "user123"}}
# Results in: ("memories", "user123")
Source: src/langmem/knowledge/extraction.py, src/langmem/knowledge/tools.py
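The placeholder substitution described above can be sketched in a few lines. This is an illustration of the idea, not LangMem's resolver: `resolve_namespace` is a hypothetical helper that fills each `{name}` placeholder from the `configurable` section of the runtime config.

```python
def resolve_namespace(template: tuple[str, ...], config: dict) -> tuple[str, ...]:
    """Fill `{name}` placeholders in a namespace template from config["configurable"]."""
    configurable = config.get("configurable", {})
    # str.format leaves plain parts untouched and raises KeyError for a missing value.
    return tuple(part.format(**configurable) for part in template)


namespace = resolve_namespace(
    ("memories", "{langgraph_user_id}"),
    {"configurable": {"langgraph_user_id": "user123"}},
)
```

Failing loudly on a missing value is a deliberate choice in this sketch: silently keeping the literal `{langgraph_user_id}` would mix every user's memories into one namespace.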
Default Memory Values
Provide fallback values when no memories are found:
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
default="Use a concise and professional tone in all responses.",
)
Source: src/langmem/knowledge/extraction.py
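The fallback behaviour can be pictured as follows. This is a sketch of the concept only: `with_default` is a hypothetical helper, and whether the real manager also persists the default to the store is not covered here.

```python
from typing import Optional


def with_default(found: list[str], default: Optional[str] = None) -> list[str]:
    """Return the memories that were found, or the default when there are none."""
    if found:
        return list(found)
    return [default] if default is not None else []
```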
API Reference
Core Functions
| Function | Return Type | Purpose |
|---|---|---|
| create_memory_manager | MemoryManager | Extract memories from conversations |
| create_memory_searcher | Runnable | Search for relevant memories |
| create_memory_store_manager | MemoryStoreManager | Memory with persistent storage |
| create_prompt_optimizer | Runnable | Optimize single prompts |
| create_multi_prompt_optimizer | Runnable | Optimize multiple prompts |
| create_manage_memory_tool | BaseTool | Memory management tool for agents |
| create_search_memory_tool | BaseTool | Memory search tool for agents |
Source: src/langmem/knowledge/extraction.py, src/langmem/prompts/optimization.py, src/langmem/knowledge/tools.py
Installation and Setup
uv venv
source .venv/bin/activate
uv sync
Set your API key:
export OPENAI_API_KEY=your_api_key_here
Source: examples/standalone_examples/README.md
Summary
LangMem provides a comprehensive toolkit for managing both long-term and short-term memory in LLM applications:
- Long-Term Memory: Extract, store, search, and update structured memories using memory managers and tools
- Short-Term Memory: Summarize conversations with the SummarizationNode for efficient context management
- Prompt Optimization: Improve prompts using gradient, metaprompt, or memory-based strategies
- Agent Integration: Tools work seamlessly with LangGraph's prebuilt agents and store infrastructure
Source: https://github.com/langchain-ai/langmem / Human Manual
Installation and Setup
Related topics: Home - LangMem Overview, LangGraph Integration
Overview
LangMem is a Python library for memory management and prompt optimization in LLM applications. The library provides components for short-term summarization, long-term memory storage, and prompt optimization. This page covers the complete installation process, dependencies, environment configuration, and setup for both basic and advanced usage scenarios.
Prerequisites
Before installing LangMem, ensure your environment meets the following requirements:
| Requirement | Minimum Version | Notes |
|---|---|---|
| Python | 3.10+ | Required for modern typing features |
| pip/uv | Latest recommended | Package manager for installation |
| API Keys | Provider-specific | OpenAI, Anthropic, or other LLM providers |
LangMem depends on the LangChain and LangGraph ecosystems. The library is designed to integrate seamlessly with LangGraph's state management and memory store abstractions. Source: src/langmem/knowledge/extraction.py
Installation Methods
Using pip
Install LangMem directly from PyPI:
pip install langmem
Using uv (Recommended)
For faster dependency resolution and better workspace management:
uv pip install langmem
Development Installation
For contributors or those wanting the latest unreleased features:
# Clone the repository
git clone https://github.com/langchain-ai/langmem.git
cd langmem
# Create virtual environment
uv venv
source .venv/bin/activate
# Install with all dependencies
uv sync
Source: examples/standalone_examples/README.md
Core Dependencies
LangMem relies on several key packages from the Python AI ecosystem:
| Package | Purpose | Import Usage |
|---|---|---|
| langchain-core | Base chat models and message types | from langchain_core.messages import AnyMessage |
| langgraph | State management and store abstractions | from langgraph.store.memory import InMemoryStore |
| pydantic | Data validation and schema definitions | class UserProfile(BaseModel) |
| typing_extensions | Enhanced typing support | from typing_extensions import Required, TypedDict |
The library uses TypedDict with Required for type-safe prompt and trajectory definitions. Source: src/langmem/prompts/types.py
Optional Dependencies
Depending on your use case, you may need additional packages:
# For OpenAI integration
pip install langchain-openai
# For Anthropic integration
pip install langchain-anthropic
# For vector store with embeddings
pip install langchain-openai # includes embedding support
Environment Configuration
API Key Setup
LangMem requires API access to language model providers. Set your API keys as environment variables:
# For OpenAI
export OPENAI_API_KEY=your_api_key_here
# For Anthropic
export ANTHROPIC_API_KEY=your_api_key_here
With the key exported, select a model by its provider-prefixed identifier. No key appears in the code; the provider integration reads it from the environment:
from langmem import create_memory_manager
manager = create_memory_manager(
"anthropic:claude-3-5-sonnet-latest", # Model identifier
schemas=[PreferenceMemory],
)
Source: examples/standalone_examples/README.md
Runtime Configuration
LangMem uses runtime configuration through RunnableConfig for namespace and store management:
from langgraph.config import get_config, get_store
# Configure namespace with user-specific identifiers
config = {"configurable": {"langgraph_user_id": "user123"}}
# Access the store within LangGraph context
store = get_store()
Project Structure
Understanding the module organization helps with imports and customization:
langmem/
├── knowledge/ # Long-term memory management
│ ├── extraction.py # Memory extraction and management
│ └── tools.py # Memory tools for agents
├── prompts/ # Prompt optimization
│ ├── types.py # TypedDict definitions
│ ├── optimization.py # Prompt optimization logic
│ ├── gradient.py # Gradient-based optimization
│ └── prompt.py # Prompt templates
└── short_term/ # Short-term memory
└── summarization.py # Conversation summarization
Source: src/langmem/prompts/types.py
Quick Start Setup
1. Basic Memory Manager Setup
from langmem import create_memory_manager
from pydantic import BaseModel
# Define your memory schema
class PreferenceMemory(BaseModel):
    preference: str
    context: str | None = None
# Create the memory manager
manager = create_memory_manager(
"anthropic:claude-3-5-sonnet-latest",
schemas=[PreferenceMemory],
)
# Process a conversation
conversation = [
{"role": "user", "content": "I prefer dark mode in all my apps"},
{"role": "assistant", "content": "I'll remember that preference"},
]
memories = await manager.ainvoke({"messages": conversation})
Source: src/langmem/knowledge/extraction.py
2. Memory Store with Vector Embeddings
from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore
# Create store with embedding configuration
store = InMemoryStore(
index={
"dims": 1536,
"embed": "openai:text-embedding-3-small",
}
)
# Create store manager with namespace
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
query_model="anthropic:claude-3-5-haiku-latest",
query_limit=10,
namespace=("memories", "{langgraph_user_id}"),
)
Source: src/langmem/knowledge/extraction.py
3. Standalone Example Setup
For use outside of LangGraph:
# custom_store_example.py
from langmem import create_memory_manager
from pydantic import BaseModel
class PreferenceMemory(BaseModel):
    category: str
    preference: str
    context: str
manager = create_memory_manager(
"openai:gpt-4o",
schemas=[PreferenceMemory],
)
# Process and store memories
conversation = [
{"role": "user", "content": "User prefers dark mode in all applications."},
]
memories = await manager.ainvoke({"messages": conversation})
Source: examples/standalone_examples/README.md
Integration Setup
LangGraph Agent Integration
from langgraph.prebuilt import create_react_agent
from langmem import create_memory_store_manager, create_manage_memory_tool
# Create memory manager
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
namespace=("memories", "{langgraph_user_id}"),
)
# Create agent with memory tool
agent = create_react_agent(
"anthropic:claude-3-5-sonnet-latest",
tools=[
create_manage_memory_tool(
namespace=("memories", "{langgraph_user_id}"),
actions_permitted=["create", "update"],
),
],
store=store,
)
Source: src/langmem/knowledge/tools.py
Prompt Optimizer Setup
from langmem import create_prompt_optimizer
# Initialize optimizer
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")
# Optimize a prompt with conversation history
trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
{"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)
Source: src/langmem/prompts/optimization.py
Summarization Node Setup
from langmem.short_term import SummarizationNode, RunningSummary
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o")
summarization_model = model.bind(max_tokens=128)
summarization_node = SummarizationNode(
model=summarization_model,
max_tokens=256,
max_tokens_before_summary=256,
max_summary_tokens=128,
)
Source: src/langmem/short_term/summarization.py
Configuration Options
Memory Manager Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Required | Language model for memory processing |
| schemas | list[type] | Required | Pydantic models for memory structure |
| instructions | str | None | Custom instructions for extraction |
| enable_inserts | bool | True | Allow creating new memories |
| enable_updates | bool | True | Allow updating existing memories |
| enable_deletes | bool | True | Allow deleting memories |
Memory Store Manager Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_model | `str \| BaseChatModel` | None | Separate model for search queries |
| query_limit | int | 10 | Number of memories to retrieve |
| default | Any | None | Default memory value if none found |
| default_factory | Callable | None | Factory for default memory creation |
| namespace | tuple[str, ...] | ("memories", "{langgraph_user_id}") | Storage namespace |
Prompt Optimizer Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| kind | Literal["gradient", "metaprompt", "prompt_memory"] | "gradient" | Optimization strategy |
| max_reflection_steps | int | 3 | Maximum reflection iterations |
| min_reflection_steps | int | 1 | Minimum reflection iterations |
Verification and Testing
After installation, verify your setup:
# Check version
python -c "import langmem; print(langmem.__version__)"
# Run standalone examples
cd examples/standalone_examples
uv run custom_store_example.py
Expected output:
Starting custom store example...
Processing conversation...
Stored memories:
Memory 31cf472f-3491-4f0c-82ec-09b4fe409cfd:
Content: {'category': 'User Preference', 'preference': 'Dark Mode', ...}
Example completed.
Source: examples/standalone_examples/README.md
Troubleshooting
Common Installation Issues
Missing dependencies:
# Reinstall with all dependencies
uv sync
# or
pip install langmem[all]
API key not found:
# Verify environment variable is set
import os
print(os.environ.get("OPENAI_API_KEY"))
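A slightly more thorough check reports every missing key at once. This is a small convenience sketch: `missing_api_keys` is a hypothetical helper, and the default list covers only the two variables mentioned in this manual.

```python
import os
from typing import Mapping, Optional, Sequence


def missing_api_keys(
    required: Sequence[str] = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"),
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    for name in missing_api_keys():
        print(f"{name} is not set")
```

Pass only the providers you use, for example `missing_api_keys(required=("ANTHROPIC_API_KEY",))`.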
LangGraph store not initialized:
# Ensure store is passed to agent
agent = create_react_agent(
model,
tools=[...],
store=store, # Must be provided
)
Import Errors
If you encounter import errors, ensure all required packages are installed:
pip install langchain-core langgraph pydantic typing-extensions
Next Steps
After completing installation and setup:
- Review the Memory Management guide
- Explore Prompt Optimization
- Try the Standalone Examples
- Integrate with your existing LangGraph application
System Architecture
Related topics: Core Concepts, Memory Tools - Hot Path Management, Prompt Optimization
LangMem is a library designed to enhance AI agents with memory capabilities and prompt optimization. The system architecture consists of three primary modules: Knowledge (long-term memory), Short-term (session summarization), and Prompts (optimization). These modules work together to enable AI agents to store, retrieve, and optimize information over time.
Overview
LangMem provides a layered architecture that separates concerns across memory management, prompt optimization, and state summarization. The library integrates with LangGraph's store infrastructure and supports both synchronous and asynchronous operations.
graph TD
A[AI Agent] --> B[LangMem Core]
B --> C[Prompts Module]
B --> D[Knowledge Module]
B --> E[Short-term Module]
C --> F[Prompt Optimization]
C --> G[Multi-Prompt Optimization]
D --> H[Memory Manager]
D --> I[Memory Tools]
D --> J[Store Manager]
E --> K[Summarization Node]
J --> L[LangGraph BaseStore]
H --> L
Core Modules
1. Prompts Module
The Prompts module handles prompt management and optimization strategies. It defines core types and provides factories for creating prompt optimizers.
#### Key Components
| Component | File | Purpose |
|---|---|---|
| Prompt | types.py | TypedDict for structured prompt management |
| AnnotatedTrajectory | types.py | NamedTuple for conversation history with feedback |
| PromptOptimizerInput | types.py | Input schema for single prompt optimization |
| MultiPromptOptimizerInput | types.py | Input schema for multi-prompt optimization |
| INSTRUCTION_REFLECTION_PROMPT | prompt.py | Template for prompt reflection |
| create_prompt_optimizer | optimization.py | Factory for single prompt optimizer |
| create_multi_prompt_optimizer | optimization.py | Factory for multi-prompt optimizer |
Source: src/langmem/prompts/types.py:1-94 Source: src/langmem/prompts/optimization.py
#### Data Flow: Prompt Optimization
sequenceDiagram
participant U as User
participant O as Prompt Optimizer
participant M as Memory
participant P as Prompt Store
U->>O: Trajectories + Current Prompt
O->>M: Extract Patterns
M-->>O: Success Patterns
O->>P: Apply Optimization
P-->>O: Optimized Prompt
O-->>U: Updated Prompt
2. Knowledge Module
The Knowledge module implements long-term memory management using LangGraph's BaseStore. It supports extraction, storage, search, and manipulation of memories.
#### Key Components
| Component | File | Purpose |
|---|---|---|
| create_memory_manager | extraction.py | Creates a memory manager for extraction and synthesis |
| create_memory_searcher | extraction.py | Creates a search pipeline with automatic query generation |
| create_memory_store_manager | extraction.py | Creates a store-based memory manager |
| create_manage_memory_tool | tools.py | Creates a LangGraph tool for memory CRUD operations |
| create_search_memory_tool | tools.py | Creates a search tool for memory retrieval |
Source: src/langmem/knowledge/extraction.py Source: src/langmem/knowledge/tools.py
#### Memory Manager Architecture
graph TD
A[Input: Messages + Existing Memories] --> B[Memory Manager]
B --> C[Extract Tool Calls]
C --> D{Done?}
D -->|No| E[Invoke Extractor]
E --> F[Process Responses]
F --> G{More Steps?}
G -->|Yes| D
G -->|No| H[Update Memories]
D -->|Yes| H
H --> I[Return Updated Memories]
#### Factory Functions
| Function | Return Type | Description |
|---|---|---|
| create_memory_manager | MemoryManager | Core extraction and synthesis with configurable schemas |
| create_memory_searcher | Runnable[MessagesState, Awaitable[list[SearchItem]]] | Search pipeline with query generation |
| create_memory_store_manager | MemoryStoreManager | Direct store operations with search |
| create_manage_memory_tool | Tool | LangGraph tool for CRUD operations |
| create_search_memory_tool | Tool | LangGraph tool for memory search |
Source: src/langmem/knowledge/extraction.py
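The iterative loop in the Memory Manager Architecture diagram can be sketched as a bounded loop over tool calls. This is a simplified reading of the diagram, not the library's code: `run_extraction` and `toy_extractor` are hypothetical, the extractor is a stub in place of a model, and the `op`/`id`/`content` call shape is invented for the example.

```python
from typing import Callable


def run_extraction(
    messages: list[str],
    memories: dict[str, str],
    extractor: Callable[[list[str], dict[str, str]], list[dict]],
    max_steps: int = 3,
) -> dict[str, str]:
    """Invoke the extractor until it proposes nothing or max_steps is reached."""
    memories = dict(memories)  # work on a copy; the caller's dict stays unchanged
    for _ in range(max_steps):
        tool_calls = extractor(messages, memories)
        if not tool_calls:  # the extractor signals "done" by proposing nothing
            break
        for call in tool_calls:
            if call["op"] == "delete":
                memories.pop(call["id"], None)
            else:  # "insert" and "update" both write the content
                memories[call["id"]] = call["content"]
    return memories


def toy_extractor(messages: list[str], memories: dict[str, str]) -> list[dict]:
    # Stub model: propose one insert per step for the first statement not yet stored.
    missing = [m for m in messages if m not in memories.values()]
    if not missing:
        return []
    return [{"op": "insert", "id": f"m{len(memories)}", "content": missing[0]}]
```

Capping the loop with `max_steps` bounds the number of model calls even if the extractor keeps proposing changes.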
3. Short-term Module
The Short-term module provides session-level summarization to compress conversation history into maintainable state.
#### Key Components
| Component | File | Purpose |
|---|---|---|
| SummarizationNode | summarization.py | LangGraph node for message summarization |
| RunningSummary | summarization.py | State container for running summaries |
Source: src/langmem/short_term/summarization.py
Type System
Prompt TypedDict
class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Unique identifier for the prompt |
| prompt | str | Yes | The actual prompt content |
| update_instructions | `str \| None` | No | Guidelines for modifying the prompt |
| when_to_update | `str \| None` | No | Dependencies between prompts during optimization |
Source: src/langmem/prompts/types.py:9-38
AnnotatedTrajectory
class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage] | str
    feedback: str | None = None
| Field | Type | Description |
|---|---|---|
| messages | `Sequence[AnyMessage] \| str` | Conversation history |
| feedback | `str \| None` | Optional feedback for optimization |
Source: src/langmem/prompts/types.py:40-65
Memory Management Workflow
Extraction Pipeline
The memory manager uses a multi-step extraction process that iteratively invokes an extractor tool until completion:
sequenceDiagram
participant C as Client
participant M as Memory Manager
participant E as Extractor
participant S as Store
C->>M: Messages + Existing Memories
M->>E: Invoke with tools
E-->>M: Response with tool calls
M->>M: Process results
M->>S: Apply changes
M-->>C: Updated memories
Source: src/langmem/knowledge/extraction.py
Search Pipeline
The searcher generates optimized queries and retrieves semantically similar memories:
sequenceDiagram
participant C as Client
participant S as Searcher
participant Q as Query LLM
participant T as Store
C->>S: Query context
S->>Q: Generate search query
Q-->>S: Optimized query
S->>T: Search memories
T-->>S: Results
S-->>C: Ranked memories
Tool Integration
LangMem provides LangGraph-native tools that connect to the BaseStore:
Manage Memory Tool
create_manage_memory_tool(
namespace=("memories", "{langgraph_user_id}"),
schema=PreferenceMemory,
actions_permitted=["create", "update", "delete"],
instructions="Update user preferences based on shared information."
)
Source: src/langmem/knowledge/tools.py
Search Memory Tool
create_search_memory_tool(
namespace=("memories", "{langgraph_user_id}"),
)
Namespace Configuration
Memories are organized using hierarchical namespaces that support runtime configuration:
| Pattern | Description |
|---|---|
| ("memories", "{langgraph_user_id}") | User-specific memories |
| ("memories", "{langgraph_user_id}", "user_profile") | User profile memories |
| ("project_memories", "{langgraph_user_id}") | Project-scoped memories |
Source: src/langmem/knowledge/extraction.py
Entry Points
The library exposes a clean public API through __init__.py files in each module:
| Module | Exports |
|---|---|
| langmem | Memory creation, extraction, and optimization |
| langmem.prompts | Prompt types and optimization |
| langmem.knowledge | Memory managers and tools |
| langmem.short_term | Summarization components |
Source: src/langmem/__init__.py Source: src/langmem/prompts/__init__.py Source: src/langmem/knowledge/__init__.py Source: src/langmem/short_term/__init__.py
Integration with LangGraph
LangMem is designed to work seamlessly with LangGraph through:
- BaseStore Integration: Memory operations use LangGraph's BaseStore interface
- Tool Protocol: All tools follow LangGraph's tool conventions
- Runnable Interface: Managers implement Runnable for composable pipelines
- Checkpoint Compatibility: Summarization nodes integrate with LangGraph's state management
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint
from langmem import create_memory_store_manager

# In-memory store with an embedding index for semantic search
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

@entrypoint(store=store)
async def my_agent(message: str):
    manager = create_memory_store_manager("anthropic:claude-3-5-sonnet-latest")
    response = {"role": "assistant", "content": "I'll remember that"}
    # Extract memories from the exchange and write them to the store
    await manager.ainvoke({"messages": [{"role": "user", "content": message}, response]})
    return response
Core Concepts
Related topics: System Architecture, Memory Tools - Hot Path Management, Background Memory Manager
LangMem is a library for memory management and prompt optimization in LLM applications. This page explains the foundational concepts that underpin the library's architecture, including type systems, memory management strategies, and prompt optimization approaches.
Overview
LangMem provides two primary capabilities:
- Memory Management - Storing, retrieving, and managing conversation context and user preferences
- Prompt Optimization - Improving LLM prompts based on conversation trajectories and feedback
The library is designed to integrate with LangGraph while also supporting standalone usage in custom applications. Source: src/langmem/knowledge/extraction.py:1-50
Type System
LangMem defines a robust type system for managing prompts and conversation data. These types serve as the foundation for all optimization and memory operations.
The Prompt Type
The Prompt TypedDict represents a structured prompt with metadata for optimization control.
class Prompt(TypedDict, total=False):
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Unique identifier for the prompt |
| prompt | str | Yes | The actual prompt content |
| update_instructions | `str \| None` | No | Guidelines for modifying the prompt |
| when_to_update | `str \| None` | No | Dependencies or triggers for updates |
Source: src/langmem/prompts/types.py:10-40
Example usage:
from langmem import Prompt
prompt = Prompt(
name="extract_entities",
prompt="Extract key entities from the text:",
update_instructions="Make minimal changes, only address where errors have occurred.",
when_to_update="If there seem to be errors in recall of named entities.",
)
AnnotatedTrajectory
The AnnotatedTrajectory NamedTuple captures conversation history with optional feedback for optimization.
class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, str | int | bool] | str | None = None
| Field | Type | Description |
|---|---|---|
| messages | Sequence[AnyMessage] | List of conversation messages |
| feedback | `dict \| str \| None` | Optional feedback for analysis |
Source: src/langmem/prompts/types.py:56-70
OptimizerInput Types
LangMem provides two input types for prompt optimization:
#### Single Prompt Optimization
class OptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompt: str | Prompt
#### Multi-Prompt Optimization
class MultiPromptOptimizerInput(TypedDict):
trajectories: typing.Sequence[AnnotatedTrajectory] | str
prompts: list[Prompt]
Source: src/langmem/prompts/types.py:73-120
Memory Management
LangMem provides a hierarchical memory management system for storing and retrieving conversation context.
Architecture Overview
graph TD
A[Conversation Messages] --> B[Memory Manager]
B --> C[Memory Store]
D[User Query] --> E[Memory Searcher]
E --> F[Retrieved Memories]
C --> F
F --> G[LLM Response]
MemoryManager
The MemoryManager class handles in-memory operations for memory extraction and updates.
Creation via factory function:
from langmem import create_memory_manager
manager = create_memory_manager(
"anthropic:claude-3-5-sonnet-latest",
schemas=[PreferenceMemory],
enable_inserts=True,
enable_updates=True,
enable_deletes=True,
)
Source: src/langmem/knowledge/extraction.py:200-250
Supported Operations:
| Operation | Description |
|---|---|
| `ainvoke` | Asynchronously process messages and update memories |
| `ainvoke({"messages": conversation, "max_steps": 3})` | Set max reflection steps for extraction |
MemoryStoreManager
The MemoryStoreManager extends memory capabilities with persistent storage integration using LangGraph's BaseStore.
Creation:
from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore
store = InMemoryStore(
index={
"dims": 1536,
"embed": "openai:text-embedding-3-small",
}
)
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
query_model="anthropic:claude-3-5-haiku-latest",
query_limit=10,
namespace=("memories", "{langgraph_user_id}"),
store=store,
)
Source: src/langmem/knowledge/extraction.py:50-100
Namespace Configuration:
Namespaces use runtime configuration with placeholders:
| Format | Description |
|---|---|
| `("memories", "{langgraph_user_id}")` | User-specific memories |
Memory Search Pipeline
The create_memory_searcher function creates a search pipeline with automatic query generation.
from langmem import create_memory_searcher
searcher = create_memory_searcher(
"anthropic:claude-3-5-sonnet-latest",
prompt="Search for distinct memories relevant to different aspects of the provided context.",
namespace=("memories", "{langgraph_user_id}"),
)
Source: src/langmem/knowledge/extraction.py:280-320
Memory Search Flow
sequenceDiagram
participant Client
participant Manager
participant QueryLLM
participant Store
participant MainLLM
Client->>Manager: messages
Manager->>QueryLLM: generate search query
QueryLLM-->>Manager: optimized query
Manager->>Store: find memories
Store-->>Manager: memories
Manager->>MainLLM: analyze & extract
MainLLM-->>Manager: memory updates
Manager->>Store: apply changes
Manager-->>Client: result
Memory Tools for LangGraph
LangMem provides pre-built tools for integration with LangGraph's create_react_agent.
#### Manage Memory Tool
from langmem import create_manage_memory_tool
tool = create_manage_memory_tool(
namespace=("memories", "{langgraph_user_id}"),
schema=PreferenceMemory,
actions_permitted=["create", "update", "delete"],
)
Source: src/langmem/knowledge/tools.py:50-100
#### Search Memory Tool
from langmem import create_search_memory_tool
search_tool = create_search_memory_tool(
namespace=("project_memories", "{langgraph_user_id}"),
)
Source: src/langmem/knowledge/tools.py:200-250
Memory Layer
The MemoryLayer class provides a declarative API for composing memory capabilities in prompts.
class MemoryLayer(Runnable):
    __slots__ = (
        "name",
        "namespace",
        "kind",
        "update_instructions",
        "schemas",
        "limit",
        "_manager_tool",
        "_search_tool",
    )
Source: src/langmem/prompts/_layers.py:20-35
Prompt Optimization
LangMem provides multiple strategies for optimizing prompts based on conversation history and feedback.
Optimization Strategies
| Strategy | Description |
|---|---|
| `metaprompt` | Uses reflection-based optimization with configurable steps |
| `prompt_memory` | Learns from past successful patterns |
| `gradient` | Reflects on trajectories to form hypotheses, then recommends prompt updates |
Single Prompt Optimizer
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer(
"anthropic:claude-3-5-sonnet-latest",
kind="metaprompt",
config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)
# Usage
trajectories = [(conversation, feedback)]
better_prompt = await optimizer.ainvoke(
{"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)
Source: src/langmem/prompts/optimization.py:80-120
Multi-Prompt Optimizer
For optimizing multiple related prompts together:
from langmem import create_multi_prompt_optimizer
optimizer = create_multi_prompt_optimizer(
"anthropic:claude-3-5-sonnet-latest",
kind="prompt_memory",
)
prompts = [
{"name": "explain", "prompt": "Explain the concept"},
{"name": "example", "prompt": "Provide a practical example"},
]
better_prompts = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompts": prompts}
)
Source: src/langmem/prompts/optimization.py:150-200
Meta-Prompt Optimization Flow
graph TD
A[Current Prompt + Trajectory] --> B[Reflection Steps]
B --> C{More iterations?}
C -->|Yes| D[Apply Instructions]
D --> B
C -->|No| E[Final Prompt]
F[Max Steps Config] --> B
G[Min Steps Config] --> B
Instruction Reflection Prompt
The instruction reflection mechanism uses structured prompts:
INSTRUCTION_REFLECTION_PROMPT = """You are helping an AI agent improve. You can do this by changing their system prompt.
These is their current prompt:
<current_prompt>
{current_prompt}
</current_prompt>
Here was the agent's trajectory:
<trajectory>
{trajectory}
</trajectory>
Here is the user's feedback:
<feedback>
{feedback}
</feedback>
Here are instructions for updating the agent's prompt:
<instructions>
{instructions}
</instructions>
Based on this, return an updated prompt"""
Source: src/langmem/prompts/prompt.py:1-30
Response Schema
The optimization returns a structured response:
class GeneralResponse(TypedDict):
    logic: str
    update_prompt: bool
    new_prompt: str
Source: src/langmem/prompts/prompt.py:35-40
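One way a caller might consume this response is to replace the prompt only when `update_prompt` is set. The sketch below redeclares the schema locally so it is self-contained; `apply_response` is a hypothetical helper, and the fallback to the current prompt is an assumption about sensible handling rather than documented library behavior.

```python
from typing import TypedDict


class GeneralResponse(TypedDict):
    logic: str
    update_prompt: bool
    new_prompt: str


def apply_response(current_prompt: str, response: GeneralResponse) -> str:
    """Return the new prompt only if the model asked for an update and supplied one."""
    if response["update_prompt"] and response["new_prompt"].strip():
        return response["new_prompt"]
    return current_prompt


keep = apply_response(
    "You are an astronomy expert",
    {"logic": "The answer was fine.", "update_prompt": False, "new_prompt": ""},
)
change = apply_response(
    "You are an astronomy expert",
    {
        "logic": "The user wanted more detail.",
        "update_prompt": True,
        "new_prompt": "You are an astronomy expert. Give detailed answers.",
    },
)
```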
Integration Patterns
Standalone Usage
LangMem can be used independently of LangGraph:
from langmem import create_memory_manager, create_prompt_optimizer
# Memory management
manager = create_memory_manager("anthropic:claude-3-5-sonnet-latest")
memories = await manager.ainvoke({"messages": conversation})
# Prompt optimization
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")
improved = await optimizer.ainvoke({"trajectories": trajectories, "prompt": base})
Source: examples/standalone_examples/README.md:1-50
LangGraph Integration
LangMem integrates with LangGraph's agent and store infrastructure:
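from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langmem import create_manage_memory_tool
# Any LangGraph BaseStore works here; InMemoryStore is convenient for local testing
store = InMemoryStore()
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}"))],
    store=store,
)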
Source: src/langmem/knowledge/tools.py:60-80
Configuration Options
#### Memory Manager Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str \| BaseChatModel` | Required | Language model for memory processing |
| `schemas` | `list[type[BaseModel]]` | None | Pydantic schemas for memory validation |
| `instructions` | `str` | None | Custom instructions for the manager |
| `enable_inserts` | `bool` | True | Allow creating new memories |
| `enable_updates` | `bool` | True | Allow updating existing memories |
| `enable_deletes` | `bool` | True | Allow deleting memories |
| `query_model` | `str \| BaseChatModel` | None | Separate model for search queries |
| `query_limit` | `int` | 10 | Maximum memories to retrieve |
Source: src/langmem/knowledge/extraction.py:50-120
#### Prompt Optimizer Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `kind` | `Literal["gradient", "metaprompt", "prompt_memory"]` | Required | Optimization strategy |
| `config` | `dict` | None | Strategy-specific configuration |
| `max_reflection_steps` | `int` | 3 | Maximum reflection iterations |
| `min_reflection_steps` | `int` | 1 | Minimum reflection iterations |
Source: src/langmem/prompts/optimization.py:80-120
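The two step limits can be read as bounds on a loop: reflection continues until the model signals it is done, but never stops before the minimum or runs past the maximum. The sketch below illustrates that reading with a stub in place of the LLM; `run_reflection` and the `reflect` callback are hypothetical, and the exact stopping rule inside LangMem may differ.

```python
from typing import Callable

# reflect(step, notes_so_far) -> (new_note, wants_to_stop)
ReflectFn = Callable[[int, list[str]], tuple[str, bool]]


def run_reflection(
    reflect: ReflectFn,
    max_reflection_steps: int = 3,
    min_reflection_steps: int = 1,
) -> list[str]:
    """Collect reflection notes, honoring both the minimum and maximum step counts."""
    notes: list[str] = []
    for step in range(1, max_reflection_steps + 1):
        note, wants_to_stop = reflect(step, notes)
        notes.append(note)
        # Stopping early is only allowed once the minimum has been reached
        if wants_to_stop and step >= min_reflection_steps:
            break
    return notes


def eager(step: int, notes: list[str]) -> tuple[str, bool]:
    """Stub that always wants to stop immediately."""
    return f"note {step}", True


def never_done(step: int, notes: list[str]) -> tuple[str, bool]:
    """Stub that never signals completion."""
    return f"note {step}", False
```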
Summary
LangMem's core concepts provide a comprehensive framework for:
- Structured Prompt Management - Using TypedDict types for prompts with metadata for optimization control
- Memory Storage and Retrieval - Persisting conversation context with namespace-based organization
- Automatic Memory Extraction - Using LLMs to extract and synthesize memories from conversations
- Multi-Strategy Prompt Optimization - Improving prompts through reflection, memory patterns, or instruction following
These concepts work together to enable intelligent, self-improving LLM applications that maintain context and continuously refine their behavior.
Source: https://github.com/langchain-ai/langmem / Human Manual
Memory Tools - Hot Path Management
Related topics: Core Concepts, Background Memory Manager, LangGraph Integration
Overview
Memory Tools in LangMem provide real-time, interactive capabilities for managing persistent memories during conversation execution. Unlike the background extraction pipeline (which processes conversation history asynchronously), Hot Path Management enables direct manipulation and retrieval of memories within the active conversation flow.
The hot path refers to the synchronous execution path where memories are created, updated, deleted, or searched in real-time as part of agent/tool interactions. This approach allows AI assistants to:
- Persist newly discovered user preferences immediately
- Update outdated memories when corrections occur
- Delete irrelevant or incorrect memories
- Search and retrieve relevant context on-demand
Source: src/langmem/knowledge/tools.py
Architecture
The Memory Tools system consists of three primary components that operate on the hot path:
graph TD
A[Agent / Workflow] --> B[Manage Memory Tool]
A --> C[Search Memory Tool]
A --> D[Memory Searcher]
B --> E[LangGraph BaseStore]
C --> E
D --> E
E --> F[Namespace: memories, {user_id}]
E --> G[Namespace: project_memories, {user_id}]
Component Responsibilities
| Component | Purpose | Sync/Async |
|---|---|---|
| `create_manage_memory_tool` | CRUD operations for memories | Both |
| `create_search_memory_tool` | Query-based memory retrieval | Both |
| `create_memory_searcher` | LLM-powered query generation + search | Async |
Source: src/langmem/knowledge/tools.py and src/langmem/knowledge/extraction.py
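To make the hot-path operations concrete, here is a small standalone sketch of a namespaced store that supports create, update, delete, and search. `TinyStore` is a hypothetical stand-in, not LangGraph's `BaseStore`, and its substring match is only a placeholder for semantic search.

```python
import uuid
from typing import Optional


class TinyStore:
    """Toy namespaced memory store used only to illustrate the tool actions."""

    def __init__(self) -> None:
        self._data: dict[tuple, dict[str, str]] = {}

    def manage(
        self,
        namespace: tuple,
        action: str,
        content: Optional[str] = None,
        key: Optional[str] = None,
    ) -> str:
        bucket = self._data.setdefault(namespace, {})
        if action == "create":
            key = key or str(uuid.uuid4())
            bucket[key] = content
        elif action == "update":
            if key not in bucket:
                raise KeyError(key)
            bucket[key] = content
        elif action == "delete":
            del bucket[key]
        else:
            raise ValueError(f"unsupported action: {action}")
        return key

    def search(self, namespace: tuple, query: str) -> list[tuple[str, str]]:
        # Substring match stands in for embedding-based retrieval
        needle = query.lower()
        bucket = self._data.get(namespace, {})
        return [(k, v) for k, v in bucket.items() if needle in v.lower()]


store = TinyStore()
ns = ("memories", "user123")
key = store.manage(ns, "create", content="Prefers dark mode")
store.manage(ns, "update", content="Prefers light mode", key=key)
hits = store.search(ns, "light")
```

Keying every operation by namespace is what keeps one user's memories out of another user's search results.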
Background Memory Manager
Related topics: Core Concepts, Memory Tools - Hot Path Management, Short-term Memory and Summarization
The Background Memory Manager is a core component of langmem that enables asynchronous, non-blocking memory extraction and storage within agent workflows. It allows memory operations to be executed in background threads or on separate servers, ensuring that the primary agent response latency is not impacted by memory processing overhead.
Overview
The Background Memory Manager is primarily implemented through the ReflectionExecutor class, which orchestrates memory enrichment operations asynchronously. This approach decouples the memory management from the main agent thread, enabling production-ready applications where memory persistence happens transparently after user interactions are acknowledged. Source: src/langmem/knowledge/extraction.py
The system is designed to work seamlessly with LangGraph's BaseStore abstraction, allowing memories to be stored, retrieved, updated, and deleted through semantic search capabilities. The background execution model ensures that even complex memory extraction and synthesis operations do not block the agent's response to users.
Architecture
High-Level Architecture
The Background Memory Manager follows a producer-consumer pattern where the main agent produces memory enrichment tasks and the ReflectionExecutor consumes them asynchronously.
graph TD
A[User Message] --> B[Agent Processing]
B --> C[User Response]
C -->|Schedule Enrichment| D[ReflectionExecutor]
D -->|after_seconds=0| E[Background Thread]
E --> F[MemoryStoreManager]
F --> G[BaseStore]
F --> H[Query LLM]
G --> I[Vector Search]
I --> F
H -->|Generate Query| F
F --> J[Memory Updates]
J --> G
Component Interactions
The system consists of several interconnected components that work together to provide background memory management:
| Component | Role | Source |
|---|---|---|
| `ReflectionExecutor` | Schedules and executes memory operations in background | extraction.py |
| `MemoryStoreManager` | Orchestrates memory CRUD operations with search | extraction.py |
| `MemoryManager` | Core extraction logic with multi-step synthesis | extraction.py |
| `BaseStore` | LangGraph's storage abstraction for memories | tools.py |
Core Components
ReflectionExecutor
The ReflectionExecutor class is the primary mechanism for background memory processing. It wraps a memory manager and provides async execution capabilities.
reflection = ReflectionExecutor(manager, store=store)
Source: src/langmem/knowledge/extraction.py
#### Key Responsibilities
- Decoupling memory operations from the main agent thread
- Managing store configuration for memory persistence
- Providing async invoke capabilities for memory enrichment
- Supporting background scheduling with configurable delays
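The responsibilities above amount to handing work to a background thread after an optional delay. The following standard-library sketch shows that pattern in isolation; `DelayedExecutor` is a hypothetical stand-in and does not reproduce `ReflectionExecutor`'s actual implementation.

```python
import threading
from typing import Any, Callable


class DelayedExecutor:
    """Runs a processing function on a background timer thread."""

    def __init__(self, process: Callable[[dict], Any]) -> None:
        self._process = process
        self._timers: list[threading.Timer] = []
        self.results: list[Any] = []

    def submit(self, payload: dict, after_seconds: float = 0.0) -> threading.Timer:
        """Schedule processing and return immediately so the caller is not blocked."""
        timer = threading.Timer(after_seconds, self._run, args=(payload,))
        timer.daemon = True
        timer.start()
        self._timers.append(timer)
        return timer

    def _run(self, payload: dict) -> None:
        self.results.append(self._process(payload))

    def wait(self) -> None:
        """Block until every scheduled task has finished (useful in tests)."""
        for timer in self._timers:
            timer.join()


executor = DelayedExecutor(lambda payload: len(payload["messages"]))
executor.submit({"messages": ["hi", "hello"]}, after_seconds=0)
executor.wait()
```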
MemoryStoreManager
The MemoryStoreManager extends MemoryManager with additional search capabilities and tighter integration with LangGraph's BaseStore. Source: src/langmem/knowledge/extraction.py
#### Factory Function
The primary entry point for creating a memory store manager is create_memory_store_manager:
def create_memory_store_manager(
    model: str | BaseChatModel,
    schemas: list[type[BaseModel]] | None = None,
    default: str | BaseModel | None = None,
    default_factory: typing.Callable[[], BaseModel] | None = None,
    instructions: str | None = None,
    enable_inserts: bool = True,
    enable_deletes: bool = True,
    query_model: str | BaseChatModel | None = None,
    query_limit: int = 3,
    namespace: tuple[str, ...] = ("memories", "{langgraph_user_id}"),
    store: BaseStore | None = None,
    phases: tuple[str, ...] | None = None,
) -> MemoryStoreManager: ...
Source: src/langmem/knowledge/extraction.py
MemoryManager
The foundational class that handles multi-step memory extraction and synthesis. It supports configurable extraction phases and tool-based memory operations.
return MemoryManager(
model,
schemas=schemas,
instructions=instructions,
enable_inserts=enable_inserts,
enable_updates=enable_updates,
enable_deletes=enable_deletes,
)
Source: src/langmem/knowledge/extraction.py
Configuration Options
MemoryStoreManager Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str \| BaseChatModel` | Required | Main language model for memory processing |
| `schemas` | `list[type[BaseModel]] \| None` | None | Pydantic models defining memory structure |
| `default` | `str \| BaseModel \| None` | None | Default memory when none found |
| `default_factory` | `Callable[[], BaseModel]` | None | Factory function for default memory |
| `instructions` | `str \| None` | None | Custom instructions for memory management |
| `enable_inserts` | `bool` | True | Allow creating new memories |
| `enable_deletes` | `bool` | True | Allow deleting memories |
| `query_model` | `str \| BaseChatModel \| None` | None | Separate model for search query generation |
| `query_limit` | `int` | 3 | Maximum memories to retrieve per search |
| `namespace` | `tuple[str, ...]` | `("memories", "{langgraph_user_id}")` | Storage namespace structure |
| `store` | `BaseStore \| None` | None | LangGraph BaseStore instance |
| `phases` | `tuple[str, ...] \| None` | None | Custom extraction phases |
Source: src/langmem/knowledge/extraction.py
Namespace Configuration
Memory namespaces use runtime configuration with placeholders for dynamic values:
namespace=("memories", "{langgraph_user_id}")
The {langgraph_user_id} placeholder is populated from the LangGraph config at runtime. This enables per-user memory isolation while using a single store. Source: src/langmem/knowledge/extraction.py
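As an illustration of the idea, the sketch below fills namespace placeholders from the `configurable` section of a config dict using ordinary string formatting. `resolve_namespace` is a hypothetical helper; LangMem's own substitution logic is not shown here and may handle errors differently.

```python
def resolve_namespace(template: tuple[str, ...], config: dict) -> tuple[str, ...]:
    """Fill {placeholders} in each namespace segment from config["configurable"]."""
    configurable = config.get("configurable", {})
    # str.format raises KeyError when a placeholder has no matching value
    return tuple(part.format(**configurable) for part in template)


template = ("memories", "{langgraph_user_id}")
alice = resolve_namespace(template, {"configurable": {"langgraph_user_id": "alice"}})
bob = resolve_namespace(template, {"configurable": {"langgraph_user_id": "bob"}})
```

The same template yields a different concrete namespace for each user, which is what provides isolation inside a single store.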
Workflows
Standard Background Enrichment Workflow
sequenceDiagram
participant Agent
participant Background
participant Store
participant QueryLLM
participant MainLLM
Agent->>Agent: Process user message
Agent-->>User: Send response
Agent->>Background: Schedule enrichment (after_seconds=0)
Note over Background: Memory processing happens<br/>in background thread
Background->>QueryLLM: Generate search query
QueryLLM-->>Background: Optimized query
Background->>Store: Find relevant memories
Store-->>Background: Existing memories
Background->>MainLLM: Analyze conversation + memories
MainLLM-->>Background: Memory updates (insert/update/delete)
Background->>Store: Apply changes
Multi-Step Extraction Workflow
The memory manager supports multi-step extraction and synthesis for complex memory scenarios:
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
query_model="anthropic:claude-3-5-haiku-latest",
query_limit=10,
)
conversation = [
{"role": "user", "content": "I prefer dark mode in all my apps"},
{"role": "assistant", "content": "I'll remember that preference"},
]
# Background execution
config = {"configurable": {"langgraph_user_id": "user123"}}
await manager.ainvoke(
{"messages": conversation},
config=config,
)
Source: src/langmem/knowledge/extraction.py
Query Model Architecture
When a separate query model is configured, the system uses a two-model approach for efficient memory retrieval:
graph LR
A[Messages] --> B[QueryModel]
B --> C[Search Query]
C --> D[Vector Store]
D --> E[Retrieved Memories]
E --> F[MainModel]
F --> G[Memory Analysis]
G --> H[Store Updates]
This architecture separates the lightweight query generation from the heavy analysis workload, optimizing cost and latency.
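The data flow can be sketched with plain functions standing in for the two models and the store. Everything below is a stub written for illustration: `cheap_query_model`, `search`, `main_model`, and the word-overlap ranking are assumptions, not LangMem internals.

```python
STORE = ["User prefers dark mode", "User lives in Berlin"]


def cheap_query_model(messages: list[dict]) -> str:
    """Stub query model: use the latest user message as the search query."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    return user_turns[-1]


def search(query: str, limit: int = 10) -> list[str]:
    """Stub retrieval: rank stored memories by the number of shared words."""
    words = set(query.lower().split())
    scored = [(len(words & set(m.lower().split())), m) for m in STORE]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [memory for score, memory in ranked if score > 0][:limit]


def main_model(messages: list[dict], memories: list[str]) -> list[dict]:
    """Stub analysis model: update the best match, or insert when nothing matched."""
    if memories:
        return [{"action": "update", "target": memories[0]}]
    return [{"action": "insert", "content": messages[-1]["content"]}]


def run_pipeline(messages: list[dict]) -> list[dict]:
    query = cheap_query_model(messages)    # lightweight step
    memories = search(query)               # vector store lookup in the real system
    return main_model(messages, memories)  # heavier analysis step


updates = run_pipeline([{"role": "user", "content": "I prefer dark mode in all my apps"}])
```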
Integration Patterns
With LangGraph create_react_agent
from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
store = InMemoryStore(
index={
"dims": 1536,
"embed": "openai:text-embedding-3-small",
}
)
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
namespace=("memories", "{user_id}"),
)
reflection = ReflectionExecutor(manager, store=store)
agent = create_react_agent(
"anthropic:claude-3-5-sonnet-latest",
tools=[...],
store=store,
)
Source: src/langmem/knowledge/extraction.py
With @entrypoint Decorator
from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
store = InMemoryStore(
index={
"dims": 1536,
"embed": "openai:text-embedding-3-small",
}
)
manager = create_memory_store_manager(
"anthropic:claude-3-5-sonnet-latest",
namespace=("memories", "{langgraph_user_id}"),
)
reflection = ReflectionExecutor(manager, store=store)
@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that"}
    # Schedule enrichment in the background so the reply is not delayed
    reflection.submit(
        {"messages": [{"role": "user", "content": message}, response]},
        after_seconds=0,
    )
    return response
Source: src/langmem/knowledge/extraction.py
Memory Tools
The background memory system also provides complementary tools for explicit memory management within agent conversations.
create_manage_memory_tool
Creates a tool for direct memory CRUD operations by agents:
def create_manage_memory_tool(
    namespace: tuple[str, ...] | str,
    *,
    instructions: str = "Proactively call this tool when you:\n\n"
    "1. Identify a new USER preference.\n"
    "2. Receive an explicit USER request to remember something.\n",
    schema: typing.Type = str,
    actions_permitted: tuple[Literal["create", "update", "delete"], ...] = ("create", "update", "delete"),
    store: BaseStore | None = None,
    name: str = "manage_memory",
)
Source: src/langmem/knowledge/tools.py
create_search_memory_tool
Creates a tool for semantic memory search within conversations:
def create_search_memory_tool(
    namespace: tuple[str, ...] | str,
    *,
    instructions: str = _MEMORY_SEARCH_INSTRUCTIONS,
    store: BaseStore | None = None,
    response_format: Literal["content", "content_and_artifact"] = "content",
    name: str = "search_memory",
)
Source: src/langmem/knowledge/tools.py
Data Models
Prompt Structure
Memory management uses structured prompts defined in src/langmem/prompts/types.py:
class Prompt(TypedDict, total=False):
    """TypedDict for structured prompt management and optimization."""
    name: Required[str]
    prompt: Required[str]
    update_instructions: str | None
    when_to_update: str | None
Source: src/langmem/prompts/types.py
Annotated Trajectory
Conversation histories with feedback for prompt optimization:
class AnnotatedTrajectory(typing.NamedTuple):
    """Conversation history with optional feedback for prompt optimization."""
    messages: typing.Sequence[AnyMessage]
    feedback: dict[str, str | int | bool] | str | None = None
Source: src/langmem/prompts/types.py
Best Practices
- Use Separate Query Models: Configure `query_model` with a faster, cheaper model (e.g., Haiku) to reduce costs while keeping the main model for quality analysis.
- Namespace Isolation: Always use user-specific namespaces like `("memories", "{langgraph_user_id}")` to ensure proper data isolation.
- Background Scheduling: Schedule enrichment with `after_seconds=0` in production to maintain responsive user interactions.
- Schema Definition: Define explicit Pydantic schemas for memories to ensure consistent structure and enable better extraction quality.
- Default Values: Provide `default` or `default_factory` values for critical memory types to ensure graceful handling when no memories are found.
Example: Standalone Usage
The Background Memory Manager can be used independently of LangGraph:
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager
from pydantic import BaseModel
# Example schema; the field names are illustrative
class PreferenceMemory(BaseModel):
    category: str
    preference: str
# Configure store
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
# Create manager (the factory takes a list of schemas)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    namespace=("memories", "user123"),
    store=store,
)
# Process conversation
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]
await manager.ainvoke({"messages": conversation})
# Search for memories
results = manager.search(query="app preferences")
print(results)
Source: examples/standalone_examples/README.md
Summary
The Background Memory Manager provides a robust framework for asynchronous memory operations in agent applications. Key features include:
- Decoupled Processing: Memory operations execute in background threads without blocking agent responses
- Flexible Storage: Integration with LangGraph's `BaseStore` for vector-based memory retrieval
- Multi-Step Extraction: Configurable phases for complex memory synthesis
- Semantic Search: Automatic query generation with optional dedicated query models
- Tool Integration: Complementary tools for explicit memory management within conversations
The system is production-ready and scales from simple single-user applications to complex multi-tenant deployments through its namespace-based isolation and background execution model.
Short-term Memory and Summarization
Related topics: Background Memory Manager, Memory Tools - Hot Path Management
Overview
The short-term memory module in LangMem provides utilities for managing conversation history through summarization. As conversations grow longer, they can exceed the LLM's context window. This module compresses message histories by generating summaries while preserving critical information. Source: src/langmem/short_term/__init__.py:1-12.
The module exposes a functional API (summarize_messages, asummarize_messages) for quick integration and a class-based API (SummarizationNode) for use within LangGraph workflows. Both interfaces ultimately produce a SummarizationResult containing the summarized messages and a RunningSummary tracking the compressed state. Source: src/langmem/short_term/summarization.py.
Architecture
The summarization system operates on a sliding window principle: messages accumulate until a token threshold is reached, at which point older messages are summarized while recent messages pass through unchanged. Source: src/langmem/short_term/summarization.py.
graph TD
A[Input Messages] --> B{Tokens > max_tokens_before_summary?}
B -->|No| C[Pass through unchanged]
B -->|Yes| D[Identify summarization window]
D --> E[Generate summary via LLM]
E --> F[Replace window with summary message]
F --> G[Update RunningSummary in context]
G --> H[Output updated messages]
C --> H
Core Components
SummarizationNode
The SummarizationNode class implements a LangGraph-compatible node that summarizes message histories. It can be integrated directly into a LangGraph workflow. Source: src/langmem/short_term/summarization.py.
#### Constructor Parameters
| Parameter | Type | Description |
|---|---|---|
| `model` | `BaseChatModel` | The language model used for generating summaries |
| `max_tokens` | `int` | Maximum tokens in the final output; enforced after summarization |
| `max_tokens_before_summary` | `int \| None` | Token threshold to trigger summarization; defaults to max_tokens |
| `max_summary_tokens` | `int` | Token budget allocated for the summary itself |
| `token_counter` | `Callable \| None` | Custom function to count tokens; defaults to approximate counting |
| `initial_summary_prompt` | `str \| None` | Prompt template for generating the first summary |
| `existing_summary_prompt` | `str \| None` | Prompt template for updating an existing running summary |
| `final_prompt` | `str \| None` | Prompt template combining summary with remaining messages |
| `input_messages_key` | `str` | Key in state containing messages to summarize |
| `output_messages_key` | `str` | Key for output messages after summarization |
| `name` | `str` | Name identifier for this node |
Source: src/langmem/short_term/summarization.py.
#### State Update Format
The node returns a LangGraph state update in the following structure:
{
"output_messages_key": "<list of updated messages ready to be input to the LLM after summarization, including a message with a summary (if any)>",
"context": {"running_summary": "<RunningSummary object>"}
}
Source: src/langmem/short_term/summarization.py.
RunningSummary
The RunningSummary class maintains a cumulative summary of conversation history. It is stored in the graph's context state and updated incrementally as new summaries are generated. Source: src/langmem/short_term/summarization.py.
SummarizationResult
A result object containing the summarized messages and updated running summary after processing. Source: src/langmem/short_term/__init__.py:1-12.
Token Management Behavior
Threshold Triggers
Summarization is triggered when the token count of accumulated messages exceeds max_tokens_before_summary. This parameter defaults to the same value as max_tokens if not explicitly provided, allowing the summarization LLM to process the full token budget. Source: src/langmem/short_term/summarization.py.
Token Budget Enforcement
When the number of tokens to be summarized exceeds max_tokens, only the last max_tokens are summarized. This prevents exceeding the context window of the summarization LLM, which is assumed to be capped at max_tokens. Source: src/langmem/short_term/summarization.py.
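The trigger and the budget cap can be sketched with a crude word-based counter. This is a simplified illustration under stated assumptions: `count_tokens`, `needs_summary`, and `tail_within_budget` are hypothetical helpers, and the real node may cut at a different granularity than whole messages.

```python
def count_tokens(message: dict) -> int:
    """Crude approximation: one token per whitespace-separated word."""
    return len(message["content"].split())


def needs_summary(messages: list[dict], max_tokens_before_summary: int) -> bool:
    """True once the accumulated history exceeds the trigger threshold."""
    return sum(count_tokens(m) for m in messages) > max_tokens_before_summary


def tail_within_budget(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the most recent messages whose combined count fits within max_tokens."""
    kept: list[dict] = []
    total = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    return list(reversed(kept))


history = [
    {"role": "user", "content": "a b c"},
    {"role": "assistant", "content": "d e f g"},
    {"role": "user", "content": "h i j k l"},
]
triggered = needs_summary(history, max_tokens_before_summary=10)
to_summarize = tail_within_budget(history, max_tokens=9)
```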
Tool Call Handling
If the last message within the summarization window is an AI message containing tool calls, all subsequent corresponding tool result messages are also included in the summarization. This ensures tool call and result pairs are summarized together as logical units. Source: src/langmem/short_term/summarization.py.
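A minimal sketch of that rule follows, using OpenAI-style message dicts (`tool_calls` on the assistant message, `tool_call_id` on tool results). The dict format and the `extend_for_tool_results` helper are assumptions for illustration; LangMem operates on LangChain message objects.

```python
def extend_for_tool_results(messages: list[dict], cutoff: int) -> int:
    """Move the window end forward so tool results stay with their tool call.

    `cutoff` is the index of the first message outside the summarization window.
    """
    if cutoff == 0 or cutoff >= len(messages):
        return cutoff
    last = messages[cutoff - 1]
    if last["role"] != "assistant" or not last.get("tool_calls"):
        return cutoff
    pending = {call["id"] for call in last["tool_calls"]}
    # Absorb the tool results that answer the calls made by the last message
    while cutoff < len(messages):
        following = messages[cutoff]
        if following["role"] == "tool" and following.get("tool_call_id") in pending:
            cutoff += 1
        else:
            break
    return cutoff


chat = [
    {"role": "user", "content": "What is the weather in Paris?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "18C and sunny"},
    {"role": "assistant", "content": "It is 18C and sunny in Paris."},
]
new_cutoff = extend_for_tool_results(chat, cutoff=2)
```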
Summary Token Budget
The max_summary_tokens parameter controls the token budget for the summary itself. Critically, this parameter is not passed to the summary-generating LLM to limit output length. It is used solely for estimating the maximum allowed token budget during processing. To enforce a length limit, bind the model directly: model.bind(max_tokens=max_summary_tokens). Source: src/langmem/short_term/summarization.py.
Usage Patterns
Basic Integration in LangGraph
from typing import Any, TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import AnyMessage
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.checkpoint.memory import InMemorySaver
from langmem.short_term import SummarizationNode, RunningSummary
model = ChatOpenAI(model="gpt-4o")
summarization_model = model.bind(max_tokens=128)
class State(MessagesState):
    context: dict[str, Any]
class LLMInputState(TypedDict):
    summarized_messages: list[AnyMessage]
    context: dict[str, Any]
summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=256,
    max_tokens_before_summary=256,
    max_summary_tokens=128,
)
def call_model(state: LLMInputState):
    # Read the summarized history produced by the summarization node
    response = model.invoke(state["summarized_messages"])
    return {"messages": [response]}
checkpointer = InMemorySaver()
workflow = StateGraph(State)
workflow.add_node(call_model)
workflow.add_node("summarize", summarization_node)
workflow.add_edge(START, "summarize")
# Route the summarized messages to the model, then compile the graph
workflow.add_edge("summarize", "call_model")
graph = workflow.compile(checkpointer=checkpointer)
Source: src/langmem/short_term/summarization.py.
Functional API
For simpler use cases outside of LangGraph, the module provides synchronous and asynchronous functions:
from langmem.short_term import summarize_messages, asummarize_messages
# Synchronous usage
result = summarize_messages(
messages=conversation_history,
model=summarization_model,
max_tokens=256
)
# Asynchronous usage
result = await asummarize_messages(
messages=conversation_history,
model=summarization_model,
max_tokens=256
)
Source: src/langmem/short_term/__init__.py:1-12.
Workflow Integration
The following diagram illustrates how SummarizationNode integrates into a typical LangGraph workflow:
graph LR
A[User Messages] --> B[MessagesState]
B --> C[LLM Node]
C --> D[Model Response]
D --> E{Summarization Needed?}
E -->|Yes| F[SummarizationNode]
E -->|No| G[Return Response]
F --> H[Update RunningSummary]
H --> I[Compressed Messages]
I --> J[Next Turn]
G --> J
Configuration Recommendations
| Scenario | max_tokens | max_tokens_before_summary | max_summary_tokens |
|---|---|---|---|
| Aggressive compression | 512 | 768 | 128 |
| Balanced | 1024 | 1536 | 256 |
| High fidelity | 2048 | 3072 | 512 |
When using smaller max_tokens values, set max_tokens_before_summary higher to allow the summarization LLM more content to work with. Source: src/langmem/short_term/summarization.py.
Public API Summary
| Symbol | Type | Description |
|---|---|---|
| `summarize_messages` | Function | Synchronous message summarization |
| `asummarize_messages` | Function | Asynchronous message summarization |
| `SummarizationNode` | Class | LangGraph-compatible summarization node |
| `SummarizationResult` | Class | Result container for summarization output |
| `RunningSummary` | Class | Cumulative summary state tracker |
Source: https://github.com/langchain-ai/langmem / Human Manual
Prompt Optimization
Related topics: Reflection Executor, Core Concepts
Overview
Prompt Optimization in LangMem is a system for automatically improving AI prompts based on conversation history and feedback. It analyzes trajectories (user-assistant conversations) and feedback to generate enhanced prompts that produce better responses.
The optimization system supports three distinct approaches:
| Approach | Complexity | LLM Calls | Best For |
|---|---|---|---|
| Prompt Memory | Simplest | 1 | Quick improvements, learning basic patterns |
| Metaprompt | Moderate | 2-5 | Balanced speed and quality |
| Gradient | Highest | 4-10 | Thorough analysis, complex patterns |
Source: src/langmem/prompts/optimization.py:1-50
Architecture
System Components
graph TD
A[User Input] --> B[Optimizer Factory]
B --> C{Select Kind}
C -->|gradient| D[Gradient Prompt Optimizer]
C -->|metaprompt| E[Metaprompt Optimizer]
C -->|prompt_memory| F[Prompt Memory Optimizer]
D --> G[Reflection Loop]
G --> H[Extract Hypotheses]
H --> I[Generate Recommendations]
I --> J[Apply Updates]
E --> K[Meta Prompt Processing]
K --> J
F --> L[Memory Pattern Extraction]
L --> J
J --> M[Optimized Prompt Output]
Class Hierarchy
The system is built on LangChain's Runnable interface, providing both sync and async invocation patterns:
- PromptOptimizer - Single prompt optimization (returns str)
- MultiPromptOptimizer - Multiple prompt optimization (returns list[Prompt])
Source: src/langmem/prompts/optimization.py:150-180
Optimizer Types
1. Prompt Memory Optimizer
The simplest optimization approach that learns from conversation history:
- Extracts successful patterns from past interactions
- Identifies improvement areas from feedback
- Applies learned patterns to new prompts
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="prompt_memory",
)
trajectories = [
    {
        "messages": [
            {"role": "user", "content": "Tell me about the solar system"},
            {"role": "assistant", "content": "The solar system consists of..."},
        ],
        "feedback": {"clarity": "needs more structure"},
    }
]
better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are an astronomy expert"}
)
Source: src/langmem/prompts/optimization.py:100-130
2. Metaprompt Optimizer
A balanced approach using reflection-based prompt generation:
Configuration Options:
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_reflection_steps | int | 3 | Maximum meta-learning steps |
| min_reflection_steps | int | 1 | Minimum meta-learning steps |
| metaprompt | str | See default | Custom meta-prompt template |
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3, "min_reflection_steps": 1},
)
Source: src/langmem/prompts/optimization.py:60-80
3. Gradient Prompt Optimizer
The most thorough optimization approach, using a hypothesis-recommendation cycle:
graph LR
A[Current Prompt] --> B[Generate Hypotheses]
B --> C[Extract Recommendations]
C --> D{Sufficient Analysis?}
D -->|No| B
D -->|Yes| E[Apply Updates]
E --> F[Optimized Prompt]
Process Flow:
- Hypothesis Generation: Analyzes trajectory to identify why the prompt underperforms
- Recommendation Extraction: Generates specific adjustment recommendations
- Reflection Loop: Iterates up to max_reflection_steps for deeper analysis
- Prompt Update: Applies minimal, necessary changes to the prompt
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="gradient",
    config={
        "max_reflection_steps": 5,
        "min_reflection_steps": 2,
    },
)
Source: src/langmem/prompts/gradient.py:1-80
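The hypothesis-recommendation cycle described in the process flow can be sketched as a bounded loop. This is an illustration under stated assumptions, not the code in gradient.py: `run_reflection_loop`, `fake_propose`, and `fake_apply` are hypothetical names, and the two callables stand in for the LLM calls that produce hypotheses and apply the final update.

```python
from typing import Callable


def run_reflection_loop(
    prompt: str,
    trajectory: str,
    propose: Callable[[str, str, list], tuple],
    apply_update: Callable[[str, list, list], str],
    min_reflection_steps: int = 1,
    max_reflection_steps: int = 3,
) -> str:
    """Collect hypotheses and recommendations, then apply them in one update."""
    hypotheses, recommendations = [], []
    for step in range(1, max_reflection_steps + 1):
        hypothesis, recommendation, done = propose(prompt, trajectory, hypotheses)
        hypotheses.append(hypothesis)
        recommendations.append(recommendation)
        # Early exit is only allowed once the minimum number of steps has run.
        if done and step >= min_reflection_steps:
            break
    return apply_update(prompt, hypotheses, recommendations)


def fake_propose(prompt, trajectory, seen):
    """Stand-in for the analysis LLM call; always reports that it is done."""
    n = len(seen) + 1
    return f"hypothesis {n}", f"recommendation {n}", True


def fake_apply(prompt, hypotheses, recommendations):
    """Stand-in for the update LLM call; appends the recommendations."""
    return prompt + " " + "; ".join(recommendations)


result = run_reflection_loop(
    "You are a coding assistant.",
    "user: How do I write a bash script?",
    fake_propose,
    fake_apply,
    min_reflection_steps=2,
    max_reflection_steps=5,
)
```

Because the stub reports completion immediately, the loop stops after exactly `min_reflection_steps` iterations, which shows how the two bounds interact.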
Data Models
Prompt
The Prompt TypedDict defines structured prompt management:
class Prompt(TypedDict, total=False):
    name: Required[str]  # Unique identifier
    prompt: Required[str]  # The actual prompt text
    update_instructions: str | None  # Guidelines for modification
    when_to_update: str | None  # Dependencies during optimization
Example:
prompt = Prompt(
    name="extract_entities",
    prompt="Extract key entities from the text:",
    update_instructions="Make minimal changes, only address where errors occurred.",
    when_to_update="If there seem to be errors in recall of named entities.",
)
Source: src/langmem/prompts/types.py:1-50
AnnotatedTrajectory
Represents conversation history with optional feedback:
class AnnotatedTrajectory(typing.NamedTuple):
    messages: typing.Sequence[AnyMessage]  # Conversation messages
    feedback: dict[str, str | int | bool] | str | None  # Improvement feedback
Example:
trajectory = AnnotatedTrajectory(
    messages=[
        {"role": "user", "content": "What pizza is good around here?"},
        {"role": "assistant", "content": "Try LangPizza™️"},
        {"role": "user", "content": "Stop advertising to me."},
        {"role": "assistant", "content": "BUT YOU'LL LOVE IT!"},
    ],
    feedback={
        "developer_feedback": "too pushy",
        "score": 0,
    },
)
Source: src/langmem/prompts/types.py:50-100
OptimizerInput
Input structure for single-prompt optimization:
class OptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompt: str | Prompt
Source: src/langmem/prompts/types.py:100-150
MultiPromptOptimizerInput
Input structure for optimizing multiple prompts together:
class MultiPromptOptimizerInput(TypedDict):
    trajectories: typing.Sequence[AnnotatedTrajectory] | str
    prompts: list[Prompt]
This maintains consistency across related prompts during optimization.
Source: src/langmem/prompts/types.py:150-200
API Reference
Factory Functions
#### create_prompt_optimizer
Creates a single-prompt optimizer.
def create_prompt_optimizer(
    model: str | BaseChatModel,
    /,
    *,
    kind: typing.Literal["gradient", "prompt_memory", "metaprompt"] = "gradient",
    config: typing.Optional[dict] = None,
) -> Runnable[prompt_types.OptimizerInput, str]
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Yes | Model identifier or instance |
| kind | Literal | No | Optimization strategy (default: "gradient") |
| config | `dict \| None` | No | Optimization configuration |
Source: src/langmem/prompts/optimization.py:50-80
#### create_multi_prompt_optimizer
Creates an optimizer for managing multiple prompts together.
def create_multi_prompt_optimizer(
    model: str | BaseChatModel,
    /,
    *,
    kind: typing.Literal["gradient", "prompt_memory", "metaprompt"] = "gradient",
    config: typing.Optional[dict] = None,
) -> MultiPromptOptimizer
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Yes | Model identifier or instance |
| kind | Literal | No | Optimization strategy (default: "gradient") |
| config | `dict \| None` | No | Optimization configuration |
Source: src/langmem/prompts/optimization.py:130-160
Gradient Optimizer Config
class GradientOptimizerConfig(TypedDict, total=False):
    gradient_prompt: str  # Custom gradient analysis prompt
    metaprompt: str  # Custom update application prompt
    max_reflection_steps: int  # Maximum iteration count
    min_reflection_steps: int  # Minimum iteration count
Source: src/langmem/prompts/gradient.py:40-60
Prompt Templates
Instruction Reflection Prompt
Used by the prompt memory optimizer for basic reflection:
INSTRUCTION_REFLECTION_PROMPT = """You are helping an AI agent improve. You can do this by changing their system prompt.
These is their current prompt:
<current_prompt>
{current_prompt}
</current_prompt>
Here was the agent's trajectory:
<trajectory>
{trajectory}
</trajectory>
Here is the user's feedback:
<feedback>
{feedback}
</feedback>
Here are instructions for updating the agent's prompt:
<instructions>
{instructions}
</instructions>
Based on this, return an updated prompt"""
Source: src/langmem/prompts/prompt.py:1-40
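As an illustration of how a template with these placeholders gets filled, here is a minimal sketch using plain `str.format`. The shortened `TEMPLATE`, and the helpers `format_messages`, `format_feedback`, and `render_reflection_prompt`, are hypothetical; how LangMem itself serializes trajectories and feedback is not shown in this manual, so the formatting choices below are assumptions.

```python
# Shortened stand-in for the template above; it keeps the same four placeholders.
TEMPLATE = (
    "<current_prompt>\n{current_prompt}\n</current_prompt>\n"
    "<trajectory>\n{trajectory}\n</trajectory>\n"
    "<feedback>\n{feedback}\n</feedback>\n"
    "<instructions>\n{instructions}\n</instructions>"
)


def format_messages(messages: list[dict]) -> str:
    """Assumed serialization: one 'role: content' line per message."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


def format_feedback(feedback) -> str:
    """Feedback may be a dict, a plain string, or None."""
    if feedback is None:
        return "(none)"
    if isinstance(feedback, str):
        return feedback
    return "\n".join(f"{key}: {value}" for key, value in feedback.items())


def render_reflection_prompt(prompt: str, messages, feedback, instructions: str = "") -> str:
    return TEMPLATE.format(
        current_prompt=prompt,
        trajectory=format_messages(messages),
        feedback=format_feedback(feedback),
        instructions=instructions,
    )


rendered = render_reflection_prompt(
    "You are an astronomy expert",
    [
        {"role": "user", "content": "Tell me about the solar system"},
        {"role": "assistant", "content": "The solar system consists of..."},
    ],
    {"clarity": "needs more structure"},
    instructions="Make minimal changes.",
)
```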
Gradient Metaprompt
Used by the gradient optimizer to apply the update once hypotheses and recommendations have been generated:
DEFAULT_GRADIENT_METAPROMPT = """You are optimizing a prompt to handle its target task more effectively.
<current_prompt>
{current_prompt}
</current_prompt>
We hypothesize the current prompt underperforms for these reasons:
<hypotheses>
{hypotheses}
</hypotheses>
Based on these hypotheses, we recommend the following adjustments:
<recommendations>
{recommendations}
</recommendations>
Respond with the updated prompt. Remember to ONLY make changes that are clearly necessary."""
Source: src/langmem/prompts/gradient.py:15-50
Usage Examples
Single Prompt Optimization with Feedback
from langmem import create_prompt_optimizer
optimizer = create_prompt_optimizer("anthropic:claude-3-5-sonnet-latest")
conversation = [
    {"role": "user", "content": "How do I write a bash script?"},
    {"role": "assistant", "content": "Let me explain bash scripting..."},
]
feedback = "Response should include a code example"
# Each trajectory pairs a conversation with its feedback
trajectories = [(conversation, {"feedback": feedback})]
# The optimizer is a Runnable, so pass an OptimizerInput dict to ainvoke
better_prompt = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompt": "You are a coding assistant"}
)
Multi-Prompt Optimization
from langmem import create_multi_prompt_optimizer
optimizer = create_multi_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="prompt_memory",
)
conversation = [
    {"role": "user", "content": "Tell me about this image"},
    {"role": "assistant", "content": "I see a dog playing in a park"},
]
trajectories = [(conversation, "Vision model wasn't used for breed detection")]
prompts = [
    {
        "name": "vision_extract",
        "prompt": "Extract visual details from the image",
    },
    {
        "name": "vision_classify",
        "prompt": "Classify specific attributes in the image",
    },
]
better_prompts = await optimizer.ainvoke(
    {"trajectories": trajectories, "prompts": prompts}
)
Choosing Optimization Strategy
| Use Case | Recommended Kind | Rationale |
|---|---|---|
| Quick prototyping | prompt_memory | Single LLM call, minimal cost |
| Production with moderate traffic | metaprompt | 2-5 calls, balanced improvement |
| High-stakes, complex tasks | gradient | 4-10 calls, thorough analysis |
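One way to read the table is as a budget decision: pick the most thorough strategy whose worst-case call count still fits. The sketch below encodes that reading. `MAX_CALLS` and `pick_kind` are hypothetical helpers, and the call counts are the upper bounds quoted in this manual rather than guarantees from the library.

```python
# Worst-case LLM calls per optimization run, taken from the table above.
MAX_CALLS = {"prompt_memory": 1, "metaprompt": 5, "gradient": 10}


def pick_kind(call_budget: int) -> str:
    """Return the most thorough kind whose worst-case call count fits the budget."""
    affordable = [kind for kind, calls in MAX_CALLS.items() if calls <= call_budget]
    if not affordable:
        raise ValueError("call_budget must allow at least one LLM call")
    return max(affordable, key=MAX_CALLS.get)


prototype_kind = pick_kind(1)
production_kind = pick_kind(7)
high_stakes_kind = pick_kind(10)
```

The returned string can be passed as the `kind` argument of `create_prompt_optimizer`.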
Reflection Executor
Related topics: Prompt Optimization, Core Concepts
The ReflectionExecutor is a core component in LangMem that enables asynchronous, background processing of memory enrichment operations. It decouples the memory management workflow from the main agent execution thread, allowing AI agents to respond to users immediately while memory processing occurs in the background.
Overview
The ReflectionExecutor class serves as a bridge between a MemoryManager or MemoryStoreManager and the LangGraph BaseStore. It provides a mechanism to schedule and execute memory enrichment after the main agent has already produced a response, ensuring that users receive immediate feedback while the system continuously improves its understanding of conversation context.
Source: src/langmem/__init__.py
Architecture
The ReflectionExecutor operates within a broader architecture that separates concerns between agent execution and memory processing:
graph TD
A[User Input] --> B[Main Agent]
B --> C[User Response]
C --> D[ReflectionExecutor]
D --> E[Memory Manager / Store Manager]
E --> F[BaseStore]
B -.->|processes immediately| C
D -.->|background processing| F
Components
| Component | Type | Purpose |
|---|---|---|
| ReflectionExecutor | Class | Schedules and executes background memory enrichment |
| MemoryManager | Class | Extracts, updates, and deletes memories from conversations |
| MemoryStoreManager | Class | Manages memory storage with vector search capabilities |
| BaseStore | Interface | LangGraph's persistence layer for memories |
Source: src/langmem/knowledge/extraction.py
Usage Patterns
Basic Setup with InMemoryStore
The most common pattern initializes a ReflectionExecutor with a memory store manager and a configured store:
from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{user_id}"),
)
reflection = ReflectionExecutor(manager, store=store)
Source: src/langmem/knowledge/extraction.py
Integration with LangGraph Agent
The ReflectionExecutor is designed to work seamlessly with LangGraph's create_react_agent:
from langgraph.prebuilt import create_react_agent
from langmem import create_manage_memory_tool
# `store` is the InMemoryStore configured in the basic setup
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
    ],
    store=store,
)
Source: src/langmem/knowledge/tools.py
Execution Flow
The following sequence diagram illustrates how ReflectionExecutor interacts with other components during background enrichment:
sequenceDiagram
participant Agent
participant Background
participant Store
participant Manager
Agent->>Agent: process message
Agent-->>User: response
Agent->>Background: schedule enrichment<br/>(after_seconds=0)
Note over Background,Store: Memory processing happens<br/>in background thread
Background->>Manager: invoke with messages
Manager->>Manager: extract & analyze memories
Manager->>Store: store/update memories
Store-->>Manager: confirmation
Manager-->>Background: enrichment complete
Source: src/langmem/knowledge/extraction.py
Configuration
Memory Store Manager Configuration
When creating the memory manager for use with ReflectionExecutor, several configuration options control memory behavior:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | `str \| BaseChatModel` | Required | Language model for memory processing |
| schemas | list[type] | Required | Pydantic schemas defining memory structure |
| namespace | tuple[str, ...] | ("memories", "{langgraph_user_id}") | Hierarchical path for memory storage |
| enable_inserts | bool | True | Allow creating new memories |
| enable_updates | bool | True | Allow modifying existing memories |
| enable_deletes | bool | True | Allow removing outdated memories |
| query_model | `str \| BaseChatModel` | None | Separate model for search query generation |
| query_limit | int | 5 | Maximum memories to retrieve |
Source: src/langmem/knowledge/extraction.py
Namespace Template
Namespaces support runtime placeholders that are resolved from the LangGraph configuration:
namespace=("memories", "{langgraph_user_id}")
This resolves to ("memories", "user123") when config["configurable"]["langgraph_user_id"] equals "user123".
Source: src/langmem/knowledge/extraction.py
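The placeholder substitution described above can be sketched with plain string formatting. This is a minimal illustration, not the library's resolver: `resolve_namespace` is a hypothetical helper, and it assumes every placeholder name appears as a key under `config["configurable"]`.

```python
def resolve_namespace(template: tuple, config: dict) -> tuple:
    """Fill {placeholders} in each namespace segment from config["configurable"]."""
    configurable = config.get("configurable", {})
    # str.format raises KeyError if a placeholder has no matching configurable key.
    return tuple(part.format(**configurable) for part in template)


config = {"configurable": {"langgraph_user_id": "user123"}}
namespace = resolve_namespace(("memories", "{langgraph_user_id}"), config)
```

Segments without braces pass through unchanged, so a fixed prefix such as "memories" can be mixed freely with per-user segments.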
Memory Processing Phases
The MemoryStoreManager processes memories through distinct phases, each callable independently or combined:
graph LR
A[messages] --> B[Recall Phase]
B --> C[Enrich Phase]
C --> D[Update Phase]
| Phase | Purpose |
|---|---|
| Recall | Retrieve relevant existing memories using semantic search |
| Enrich | Extract new information from the conversation |
| Update | Apply changes to the store (insert, update, delete) |
Source: src/langmem/knowledge/types.py
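The three phases can be sketched end to end with a plain dictionary as the store. Everything here is a stand-in chosen for illustration: `recall` uses keyword overlap instead of semantic search, `enrich` replaces the LLM extraction step with a simple rule, and none of the function names come from LangMem.

```python
def recall(store: dict, query: str, limit: int = 5) -> list:
    """Recall phase: rank stored memories by word overlap (a stand-in for semantic search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(text.lower().split())), key) for key, text in store.items()]
    return [key for score, key in sorted(scored, reverse=True) if score > 0][:limit]


def enrich(messages: list, existing: dict) -> list:
    """Enrich phase: propose an insert for each user statement not already stored."""
    known = set(existing.values())
    return [
        ("insert", m["content"])
        for m in messages
        if m["role"] == "user" and m["content"] not in known
    ]


def update(store: dict, changes: list) -> dict:
    """Update phase: apply the proposed changes to the store."""
    for op, text in changes:
        if op == "insert":
            store[f"mem-{len(store) + 1}"] = text
    return store


store = {"mem-1": "User prefers dark mode"}
messages = [{"role": "user", "content": "I prefer dark mode in the editor"}]
recalled = recall(store, "dark mode")
changes = enrich(messages, {key: store[key] for key in recalled})
update(store, changes)
```

Keeping the phases as separate functions mirrors the point made above: each one can be called on its own or chained.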
Background Execution Strategy
The ReflectionExecutor supports immediate background execution via the after_seconds parameter:
await reflection.ainvoke(
    {"messages": conversation, "existing": memories},
    after_seconds=0,
)
Setting after_seconds=0 schedules execution on the next event loop iteration, ensuring the main agent response is not delayed. For less time-sensitive applications, a positive value defers execution, reducing resource contention during peak load periods.
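The respond-first, enrich-later ordering can be demonstrated with plain asyncio. This sketch does not use ReflectionExecutor; `schedule`, `enrich_memories`, and `handle_turn` are hypothetical names, and the delay is implemented with `asyncio.sleep` purely to show how a deferred background task behaves.

```python
import asyncio

events = []


async def enrich_memories(messages):
    """Stand-in for the memory manager call."""
    events.append("enriched")


def schedule(coro_fn, *args, after_seconds=0.0):
    """Start a background task that waits after_seconds before running coro_fn."""
    async def runner():
        await asyncio.sleep(after_seconds)
        await coro_fn(*args)
    return asyncio.create_task(runner())


async def handle_turn(message):
    response = {"role": "assistant", "content": "Noted."}
    task = schedule(enrich_memories, [message, response], after_seconds=0)
    events.append("responded")  # the response is ready before enrichment runs
    return response, task


async def main():
    response, task = await handle_turn({"role": "user", "content": "I prefer dark mode"})
    await task  # awaited here only so the example finishes deterministically
    return response


response = asyncio.run(main())
```

The task created inside `handle_turn` does not start until the coroutine yields control, so the response is recorded first even with a zero delay.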
Class Signature
The ReflectionExecutor class implements the following interface:
class ReflectionExecutor:
    def __init__(
        self,
        manager: MemoryManager | MemoryStoreManager,
        *,
        store: BaseStore | None = None,
    ) -> None:
        ...
| Parameter | Type | Description |
|---|---|---|
| manager | `MemoryManager \| MemoryStoreManager` | The memory processing component |
| store | `BaseStore \| None` | Optional explicit store; otherwise uses context |
Source: src/langmem/reflection.py
Async Support
The ReflectionExecutor provides full async support through its ainvoke method, making it compatible with LangGraph's async entrypoints and workflows:
@entrypoint(store=store)
async def my_agent(message: str):
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await reflection.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response
Source: src/langmem/knowledge/extraction.py
Best Practices
- Use with persistent stores in production: While InMemoryStore is suitable for development, production deployments should use persistent stores like PostgreSQL or Redis with vector search capabilities.
- Separate query models for efficiency: When working with large memory stores, use a faster, cheaper model for query generation and a more capable model for memory analysis.
- Configure appropriate namespaces: Always include user-specific namespaces to ensure memory isolation between users.
- Set reasonable query limits: Balance recall completeness against processing speed by tuning query_limit for your use case.
- Leverage background execution: Schedule memory enrichment with minimal delay (after_seconds=0) to keep the system responsive while continuously improving memory quality.
LangGraph Integration
Related topics: Memory Tools - Hot Path Management, System Architecture
LangMem provides comprehensive integration with LangGraph, enabling memory management capabilities within agentic workflows. This integration allows AI applications to store, retrieve, search, and manage conversational memories using LangGraph's BaseStore architecture.
Overview
LangMem's LangGraph integration serves as a bridge between LangGraph's store infrastructure and memory management functionality. It provides:
- Tools for Agents: Pre-built tools that agents can invoke to search and manage memories
- Store Managers: Components that handle automatic memory extraction and storage
- Configuration Support: Runtime namespace resolution using configurable parameters
- Async/Await Support: Full support for both synchronous and asynchronous operations
Source: src/langmem/knowledge/tools.py:1-50
Architecture
The integration follows a layered architecture where LangMem tools and managers connect to LangGraph's BaseStore implementation:
graph TD
A[LangGraph Agent] --> B[LangMem Tools/Managers]
B --> C[BaseStore]
C --> D[InMemoryStore]
C --> E[Persistent Store]
F[Config] --> B
F --> C
Component Overview
| Component | Purpose | File Location |
|---|---|---|
| create_search_memory_tool | Search for memories within agent context | src/langmem/knowledge/tools.py |
| create_manage_memory_tool | CRUD operations for memories | src/langmem/knowledge/tools.py |
| create_memory_store_manager | Automatic memory extraction and storage | src/langmem/knowledge/extraction.py |
| Prebuilt Graphs | optimize_prompts, extract_memories | src/langmem/graphs/ |
Source: src/langmem/graphs/__init__.py
Memory Tools
LangMem provides two primary tools that agents can invoke during execution.
Search Memory Tool
The create_search_memory_tool function creates a tool that searches for relevant memories based on a query.
from langmem import create_search_memory_tool
from langgraph.store.memory import InMemoryStore
search_tool = create_search_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)
Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| namespace | tuple[str, ...] | Hierarchical path for organizing memories | ("memories", "{langgraph_user_id}") |
| prompt | str | Custom prompt for search behavior | Context-based default |
Source: src/langmem/knowledge/tools.py:80-120
Manage Memory Tool
The create_manage_memory_tool function creates a tool that supports creating, updating, and deleting memories:
from langmem import create_manage_memory_tool
memory_tool = create_manage_memory_tool(
    namespace=("project_memories", "{langgraph_user_id}"),
)
Supported Operations:
| Operation | Description |
|---|---|
| insert | Store new memories with generated keys |
| update | Modify existing memory content |
| upsert | Insert or update based on key existence |
| delete | Remove memories from the store |
Source: src/langmem/knowledge/tools.py:200-280
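To pin down how the four operations differ, here is a minimal sketch against an in-memory dictionary. `DictMemoryStore` is a hypothetical class written for this illustration; it mirrors the semantics in the table (generated keys on insert, key-existence check on update) and does not reproduce the tool's actual interface.

```python
import uuid


class DictMemoryStore:
    """Toy store illustrating insert, update, upsert, and delete semantics."""

    def __init__(self):
        self.items = {}

    def insert(self, content: str) -> str:
        key = str(uuid.uuid4())  # new memories get a generated key
        self.items[key] = content
        return key

    def update(self, key: str, content: str) -> None:
        if key not in self.items:
            raise KeyError(f"no memory with key {key!r}")
        self.items[key] = content

    def upsert(self, key: str, content: str) -> None:
        self.items[key] = content  # insert or overwrite, depending on key existence

    def delete(self, key: str) -> None:
        self.items.pop(key, None)


memories = DictMemoryStore()
key = memories.insert("User prefers dark mode")
memories.update(key, "User prefers dark mode in the editor only")
memories.upsert("theme-note", "Asked about high-contrast themes")
memories.delete(key)
```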
Namespace Configuration
Namespaces in LangMem use a template system that allows runtime population of values from LangGraph's configuration.
Template Syntax
Placeholders use curly brace notation that maps to configurable values:
namespace=("memories", "{langgraph_user_id}")
Runtime Resolution
At runtime, these placeholders are resolved from the configurable section:
config = {"configurable": {"langgraph_user_id": "user-123"}}
# Results in namespace: ("memories", "user-123")
Namespace Examples
| Use Case | Namespace Template | Runtime Config |
|---|---|---|
| Per-user memories | ("memories", "{langgraph_user_id}") | {"langgraph_user_id": "user-123"} |
| Team memories | ("memories", "{team_id}") | {"team_id": "team-x"} |
| Project memories | ("project_memories", "{project_id}") | {"project_id": "proj-1"} |
Source: src/langmem/knowledge/tools.py:140-180
Store Configuration
LangMem requires a BaseStore implementation to be configured in the LangGraph entrypoint or graph.
InMemoryStore Example
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
@entrypoint(store=store)
async def workflow(state: dict, *, previous=None):
    # Store is automatically available via get_store()
    ...
Configuration in langgraph.json
The langgraph.json file defines the default store configuration for deployed graphs:
{
  "store": {
    "index": {
      "embed": "openai:text-embedding-3-small",
      "dims": 1536,
      "fields": ["$"]
    }
  }
}
Source: langgraph.json:1-20
Prebuilt Graphs
LangMem includes prebuilt LangGraph graphs for common memory operations.
Extract Memories Graph
Located at src/langmem/graphs/semantic.py, this graph combines memory storage with automatic extraction:
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
)
@entrypoint(store=store)
async def graph(message: str):
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response
Graph Endpoint: ./src/langmem/graphs/semantic.py:graph
Source: src/langmem/graphs/semantic.py:1-40
Optimize Prompts Graph
Used for prompt optimization workflows with memory-backed feedback.
Graph Endpoint: ./src/langmem/graphs/prompts.py:optimize_prompts
Source: src/langmem/graphs/__init__.py
Integration with create_react_agent
LangMem tools integrate seamlessly with LangGraph's prebuilt create_react_agent:
from langgraph.prebuilt import create_react_agent
from langgraph.config import get_config, get_store
from langmem import create_manage_memory_tool
def prompt(state):
    config = get_config()
    memories = get_store().search(
        ("memories", config["configurable"]["langgraph_user_id"]),
    )
    system_prompt = f"""You are a helpful assistant.
<memories>
{memories}
</memories>
"""
    system_message = {"role": "system", "content": system_prompt}
    return [system_message, *state["messages"]]
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(namespace=("memories", "{langgraph_user_id}")),
    ],
    store=store,
)
Source: src/langmem/knowledge/tools.py:300-350
Memory Store Manager
The create_memory_store_manager function creates a manager that handles automatic memory extraction and storage using LangGraph's store infrastructure.
Query Model Architecture
The manager supports using a separate (faster) model for search query generation:
sequenceDiagram
participant Client
participant Manager
participant QueryLLM
participant Store
participant MainLLM
Client->>Manager: messages
Manager->>QueryLLM: generate search query
QueryLLM-->>Manager: optimized query
Manager->>Store: find memories
Store-->>Manager: memories
Manager->>MainLLM: analyze & extract
MainLLM-->>Manager: memory updates
Manager->>Store: apply changes
Manager-->>Client: result
Configuration Options
| Parameter | Type | Description | Default |
|---|---|---|---|
| model | `str \| BaseChatModel` | Main model for memory processing | Required |
| query_model | `str \| BaseChatModel` | Faster model for search queries | Same as model |
| query_limit | int | Number of memories to retrieve | 10 |
| namespace | tuple[str, ...] | Memory namespace template | ("memories", "{langgraph_user_id}") |
| schemas | list[type[BaseModel]] | Pydantic schemas for memories | None |
| enable_inserts | bool | Allow creating new memories | True |
| enable_updates | bool | Allow updating existing memories | True |
| enable_deletes | bool | Allow deleting memories | True |
Source: src/langmem/knowledge/extraction.py:200-300
Authentication
LangMem graphs support authentication via the auth endpoint defined in langgraph.json:
{
  "auth": {
    "path": "./src/langmem/graphs/auth.py:auth"
  }
}
The auth function handles authentication for deployed LangGraph applications.
Source: src/langmem/graphs/auth.py
Complete Workflow Example
graph LR
A[User Input] --> B[Agent]
B --> C[Memory Search Tool]
C --> D[BaseStore]
D --> E[Vector Index]
B --> F[Response]
B --> G[Memory Manage Tool]
G --> D
F --> H[User]
Full Implementation
from langmem import create_search_memory_tool, create_manage_memory_tool
from langgraph.store.memory import InMemoryStore
from langgraph.prebuilt import create_react_agent
# Configure store
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
# Create tools
search_tool = create_search_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
)
manage_tool = create_manage_memory_tool(
    namespace=("memories", "{langgraph_user_id}"),
)
# Create agent with memory tools
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[search_tool, manage_tool],
    store=store,
)
# Invoke with user context
config = {"configurable": {"langgraph_user_id": "user-123"}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "I prefer dark mode"}]},
    config=config,
)
Summary
LangMem's LangGraph integration provides a complete solution for memory management in agentic applications:
- Tools enable agents to search and manage memories during conversation
- Managers automate memory extraction and storage
- Namespace templates allow flexible per-user/per-conversation organization
- Store abstraction supports multiple storage backends
- Prebuilt graphs accelerate common use cases
All integration points are designed to work seamlessly with LangGraph's configuration system, enabling production-ready deployments with proper authentication, store configuration, and multi-tenant support.
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Doramagic extracted 11 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: GRAPH_RECURSION_LIMIT
- Severity: high
- Finding: Installation risk is backed by a source signal: GRAPH_RECURSION_LIMIT. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/133
2. Installation risk: Persistence?
- Severity: high
- Finding: Installation risk is backed by a source signal: Persistence?. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/154
3. Security or permission risk: Enhance error message when summarization fails due to missing HumanMessage in trimmed window
- Severity: high
- Finding: Security or permission risk is backed by a source signal: Enhance error message when summarization fails due to missing HumanMessage in trimmed window. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/156
4. Configuration risk: Configuration risk needs validation
- Severity: medium
- Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.host_targets | github_repo:920242883 | https://github.com/langchain-ai/langmem | host_targets=claude, chatgpt
5. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:920242883 | https://github.com/langchain-ai/langmem | README/documentation is current enough for a first validation pass.
6. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | last_activity_observed missing
7. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:920242883 | https://github.com/langchain-ai/langmem | no_demo; severity=medium
8. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:920242883 | https://github.com/langchain-ai/langmem | no_demo; severity=medium
9. Security or permission risk: Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06)
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06). Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/langchain-ai/langmem/issues/164
10. Maintenance risk: issue_or_pr_quality=unknown
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | issue_or_pr_quality=unknown
11. Maintenance risk: release_recency=unknown
- Severity: low
- Finding: release_recency=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:920242883 | https://github.com/langchain-ai/langmem | release_recency=unknown
Source: Doramagic discovery, validation, and Project Pack records
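Item 3 above concerns summarization failing when the trimmed message window contains no human message. One way to check for that condition during an isolated first run is a pre-flight test on the window before it is summarized. The sketch below is only an illustration under stated assumptions: it uses plain dictionaries with a hypothetical `role` key instead of LangMem or LangChain message classes, and `trim_to_last` and `window_has_human_message` are hypothetical helper names, not part of the library.

```python
from typing import Dict, List

# Hypothetical message shape for illustration: {"role": ..., "content": ...}
Message = Dict[str, str]


def trim_to_last(messages: List[Message], max_messages: int) -> List[Message]:
    """Keep only the most recent max_messages entries (a naive trimming rule)."""
    if max_messages <= 0:
        return []
    return messages[-max_messages:]


def window_has_human_message(window: List[Message]) -> bool:
    """Return True if at least one message in the window has role 'human'."""
    return any(m.get("role") == "human" for m in window)


history = [
    {"role": "human", "content": "Plan a trip to Lisbon."},
    {"role": "ai", "content": "Calling the search tool."},
    {"role": "tool", "content": "3 results found."},
    {"role": "ai", "content": "Here is a draft itinerary."},
]

# Trimming to the last three messages drops the only human message.
window = trim_to_last(history, max_messages=3)
if not window_has_human_message(window):
    print("Trimmed window has no human message; widen it before summarizing.")
```

Whether the current langmem version still fails in this situation should be confirmed against the linked issue; the check only shows how to detect the condition in your own message history.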
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using langmem with real data or production workflows.
- Persistence? - github / github_issue
- Security: OWASP Agent Memory Guard for memory poisoning defense (ASI06) - github / github_issue
- Enhance error message when summarization fails due to missing HumanMessa - github / github_issue
- GRAPH_RECURSION_LIMIT - github / github_issue
- Configuration risk needs validation - GitHub / issue
Source: Project Pack community evidence and pitfall evidence