Doramagic Project Pack · Human Manual

openai-agents-python

Related topics: Installation and Setup, Agents, Tools

OpenAI Agents SDK Overview


Introduction

The OpenAI Agents SDK is a Python framework designed to build multi-agent systems that can interact with users, execute tools, and delegate tasks to specialized sub-agents. The SDK provides a structured approach to orchestrating agent conversations, managing tool execution, handling handoffs between agents, and maintaining conversation state throughout the execution lifecycle.

The SDK's core responsibility is to manage the runtime execution of agents, handling the turn-based conversation flow, tool invocations, guardrail evaluations, and multi-agent handoffs within a single unified execution model. Sources: src/agents/__init__.py

Architecture Overview

The SDK follows a layered architecture that separates concerns between agent definition, runtime execution, and tool/MCP integrations.

graph TD
    A[User Input] --> B[Runner]
    B --> C[Agent]
    C --> D[Handoffs]
    C --> E[Tools]
    C --> F[Guardrails]
    D --> C
    D --> G[Sub-Agent]
    E --> H[MCP Servers]
    F --> I[Input/Output Guards]
    G --> C
    B --> J[Session Persistence]
    B --> K[Tracing]

Core Components

| Component | Purpose | Location |
|---|---|---|
| Agent | Defines agent behavior, tools, handoffs, and instructions | src/agents/__init__.py |
| Runner | Executes agents and manages conversation flow | src/agents/run.py |
| Handoff | Enables transfer of control between agents | src/agents/handoffs/__init__.py |
| MCPServer | Provides Model Context Protocol server abstraction | src/agents/mcp/server.py |
| ItemHelpers | Utility for extracting content from conversation items | src/agents/items.py |

Agent System

Agent Definition

Agents are the fundamental unit of computation in the SDK. An agent encapsulates:

  • Instructions: The system prompt that defines the agent's role and behavior
  • Tools: A list of callable tools the agent can invoke
  • Handoffs: Definitions for transferring control to other agents
  • Input Guardrails: Pre-processing validation before agent execution
  • Output Guardrails: Post-processing validation of agent responses
graph LR
    A[Agent] --> B[Instructions]
    A --> C[Tools]
    A --> D[Handoffs]
    A --> E[Guardrails]
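To make these components concrete, here is a minimal, self-contained sketch of what an agent definition bundles. It uses plain dataclasses rather than the SDK's actual Agent class, so all names here are illustrative:

```python
from dataclasses import dataclass, field
from typing import Callable

# Plain-dataclass sketch of an agent definition; the SDK's real Agent
# class has more fields, so this only mirrors the components listed above.
@dataclass
class AgentSketch:
    name: str
    instructions: str                                   # system prompt
    tools: list[Callable] = field(default_factory=list)
    handoffs: list["AgentSketch"] = field(default_factory=list)
    input_guardrails: list[Callable] = field(default_factory=list)
    output_guardrails: list[Callable] = field(default_factory=list)

def lookup(query: str) -> str:
    """A trivial tool the agent could invoke."""
    return f"results for {query}"

support = AgentSketch(
    name="support",
    instructions="Answer billing questions.",
    tools=[lookup],
)
triage = AgentSketch(
    name="triage",
    instructions="Route the user to the right specialist.",
    handoffs=[support],
)
```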

Agent Execution Flow

The execution follows a turn-based model where each turn processes user input, generates model responses, executes tools, and evaluates handoffs until a final response is produced.

sequenceDiagram
    participant User
    participant Runner
    participant Agent
    participant Tools
    participant Handoffs

    User->>Runner: User Input
    Runner->>Agent: Process Turn
    Agent->>Agent: Generate Response
    alt Tool Call
        Agent->>Tools: Execute Tool
        Tools-->>Agent: Tool Result
    end
    alt Handoff
        Agent->>Handoffs: Request Handoff
        Handoffs->>Agent: Switch Agent
    end
    Agent-->>Runner: Final Output
    Runner-->>User: Response
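The loop in the diagram can be sketched in a few lines. This is a hedged, stdlib-only stand-in for the Runner's internals: the model and tools are stubs, and the action format is invented for illustration.

```python
# Stdlib-only sketch of the turn-based loop described above.
def run_turns(agent_name, user_input, model, tools, max_turns=10):
    history = [("user", user_input)]
    for _ in range(max_turns):
        action = model(agent_name, history)          # stub LLM call
        if action["type"] == "tool_call":
            result = tools[action["name"]](**action["args"])
            history.append(("tool", result))         # feed the result back in
        else:                                        # "final"
            return action["text"]
    raise RuntimeError("max_turns exceeded")

def stub_model(agent_name, history):
    # Call the add tool once, then produce a final answer.
    if history[-1][0] == "user":
        return {"type": "tool_call", "name": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "text": f"The answer is {history[-1][1]}"}

answer = run_turns("demo", "what is 2+3?", stub_model, {"add": lambda a, b: a + b})
```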

Handoffs System

The handoff system enables agents to delegate conversations to other specialized agents while preserving conversation context. Each handoff defines:

| Property | Type | Description |
|---|---|---|
| name | str | Unique identifier for the handoff tool |
| tool_name | str | Name exposed to the model for invoking the handoff |
| tool_description | str | Description shown to the model |
| input_json_schema | dict | JSON schema for handoff arguments |
| on_invoke_handoff | Callable | Function that returns the target agent |
| input_filter | HandoffInputFilter | Optional filter for conversation context |

Sources: src/agents/handoffs/__init__.py

Handoff Input Filtering

By default, the new agent receives the entire conversation history. The input_filter function allows customization of what context is passed to the target agent:

input_filter: HandoffInputFilter | None = None
"""A function that filters the inputs that are passed to the next agent."""
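A hypothetical filter matching that contract might simply truncate the history. The function below is illustrative, not the SDK's implementation: it receives the accumulated conversation items and returns the subset the next agent should see.

```python
# Keep the first (system-style) item plus only the last three messages,
# so the target agent sees recent context without the full transcript.
def trim_history(items: list[dict]) -> list[dict]:
    if len(items) <= 4:
        return items
    return [items[0]] + items[-3:]

history = [{"role": "system", "content": "ctx"}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(6)
]
filtered = trim_history(history)
```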

Turn Resolution

The turn resolution system handles the complexity of multi-step agent interactions within a single turn. This includes managing pre-step items, new step items, tool results, guardrail evaluations, and handoff transitions.

Turn Resolution States

stateDiagram-v2
    [*] --> InputGuardrails: Input Received
    InputGuardrails --> ModelResponse: Passed
    ModelResponse --> ToolExecution: Tool Call
    ModelResponse --> Handoff: Agent Switch
    ModelResponse --> FinalOutput: Direct Response
    ToolExecution --> ModelResponse: More Tools
    ToolExecution --> Handoff: Switch During Tool
    ToolExecution --> FinalOutput: Complete
    Handoff --> InputGuardrails: New Agent
    FinalOutput --> [*]

Key Resolution Functions

The turn resolution process evaluates several conditions:

  1. Tool Input Guardrail Results: Validation before tool execution
  2. Function Results: Output from tool invocations
  3. Tool Output Guardrail Results: Validation after tool execution
  4. Handoff Evaluation: Check for agent transfer requests

Sources: src/agents/run_internal/turn_resolution.py

Tool Execution and Guardrails

Guardrail System

The SDK implements a two-layer guardrail system:

| Guardrail Type | Timing | Purpose |
|---|---|---|
| Input Guardrails | Before agent processes input | Validate and sanitize user input |
| Output Guardrails | After agent generates response | Validate response content |

Tool Use Tracking

Tools are tracked throughout execution to maintain state and enable:

  • Streaming output collection
  • Refusal detection
  • Error handling
  • Output validation
graph TD
    A[Tool Call] --> B{Input Guardrails}
    B -->|Pass| C[Execute Tool]
    B -->|Fail| D[Reject]
    C --> E[Tool Result]
    E --> F{Output Guardrails}
    F -->|Pass| G[Continue]
    F -->|Fail| H[Error Response]
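The flow above can be sketched as a wrapper around a single tool call. The guard signatures here are illustrative, not the SDK's API:

```python
# Two-layer guardrail flow: input guards may reject before execution,
# output guards may turn a result into an error response.
def run_tool_with_guardrails(tool, args, input_guards, output_guards):
    for guard in input_guards:
        ok, reason = guard(args)
        if not ok:
            return {"status": "rejected", "reason": reason}   # Fail -> Reject
    result = tool(**args)
    for guard in output_guards:
        ok, reason = guard(result)
        if not ok:
            return {"status": "error", "reason": reason}      # Fail -> Error Response
    return {"status": "ok", "result": result}                 # Pass -> Continue

no_negative = lambda args: (args["n"] >= 0, "n must be non-negative")
small_output = lambda out: (out < 100, "output too large")

outcome = run_tool_with_guardrails(
    lambda n: n * n, {"n": 7}, [no_negative], [small_output]
)
```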

Model Context Protocol (MCP) Integration

The SDK provides a Python abstraction for MCP servers through the MCPServer base class. This enables agents to interact with external MCP-capable tools and services.

MCPServer Base Class

The MCPServer class provides the foundation for MCP protocol implementation with methods for:

  • Resources: list_resources(), list_resource_templates(), read_resource()
  • Tools: Tool invocation and management
  • Prompts: Server-provided prompt templates

Sources: src/agents/mcp/server.py

Require Approval Settings

MCP tools support granular approval controls:

| Setting | Behavior |
|---|---|
| RequireApprovalSetting.NEVER | Always auto-approve |
| RequireApprovalSetting.ALWAYS | Always require approval |
| RequireApprovalSetting.UNDETERMINED | Use default behavior |
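The gating logic implied by the table is straightforward. The enum values below mirror the documented settings; the helper function is an illustrative sketch of how a runner might apply them, not the SDK's code:

```python
from enum import Enum

class RequireApprovalSetting(Enum):
    NEVER = "never"
    ALWAYS = "always"
    UNDETERMINED = "undetermined"

def needs_approval(setting: RequireApprovalSetting, default: bool = False) -> bool:
    if setting is RequireApprovalSetting.NEVER:
        return False            # always auto-approve
    if setting is RequireApprovalSetting.ALWAYS:
        return True             # always require approval
    return default              # UNDETERMINED: fall back to default behavior
```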

Session and State Management

Run State

The run_state object tracks execution context including:

  • Current agent
  • Conversation history
  • Generated items
  • Original input
  • Turn counters

Persistence

The SDK supports session persistence for maintaining state across multiple interactions:

session_persistence_enabled: bool
store: StoreSetting
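A minimal in-memory stand-in shows the contract such a store satisfies: save items under a session id, load them back on the next run. The real StoreSetting backends differ; this class is purely illustrative.

```python
# Toy session store sketching the save/load contract of persistence.
class InMemorySessionStore:
    def __init__(self):
        self._sessions: dict[str, list[dict]] = {}

    def save(self, session_id: str, items: list[dict]) -> None:
        self._sessions.setdefault(session_id, []).extend(items)

    def load(self, session_id: str) -> list[dict]:
        return list(self._sessions.get(session_id, []))

store = InMemorySessionStore()
store.save("s1", [{"role": "user", "content": "hi"}])
store.save("s1", [{"role": "assistant", "content": "hello"}])
```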

Tracing and Visualization

Agent Visualization

The SDK includes visualization utilities for generating DOT-format diagrams of agent relationships:

| Function | Purpose |
|---|---|
| get_all_nodes() | Generate node definitions for agent graph |
| get_all_edges() | Generate edge definitions for handoff connections |

graph TD
    A[User] --> B[Orchestrator Agent]
    B --> C[Research Agent]
    B --> D[Writer Agent]
    C --> E[Web Search Tool]
    D --> F[File Write Tool]
    B --> G[Analytics Agent]
    G --> H[Data Analysis Tool]

Sources: src/agents/extensions/visualization.py
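Illustrative versions of those two functions would emit DOT statements for an agent graph. The exact strings the SDK produces (shapes, attributes) differ; only the idea is shown here:

```python
# Emit DOT node and edge statements for a simple agent graph.
def get_all_nodes(agents: list[dict]) -> str:
    return "".join(f'"{a["name"]}" [shape=box];\n' for a in agents)

def get_all_edges(agents: list[dict]) -> str:
    return "".join(
        f'"{a["name"]}" -> "{target}";\n'
        for a in agents
        for target in a.get("handoffs", [])
    )

agents = [
    {"name": "orchestrator", "handoffs": ["research", "writer"]},
    {"name": "research"},
    {"name": "writer"},
]
dot = "digraph G {\n" + get_all_nodes(agents) + get_all_edges(agents) + "}"
```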

Item Processing

Message Item Extraction

The SDK provides utilities for extracting content from conversation items:

| Method | Purpose |
|---|---|
| text_message_output() | Extract text from a single message output item |
| text_message_outputs() | Extract concatenated text from multiple items |
| extract_refusal() | Extract refusal content if the model refused to respond |

@classmethod
def extract_refusal(cls, message: TResponseOutputItem) -> str | None:
    """Extracts refusal content from a message, if any."""

Run Configuration

Key Configuration Options

| Parameter | Type | Description |
|---|---|---|
| max_turns | int | Maximum conversation turns |
| tools | list[Function] | Available tools for the run |
| input_guardrails | list[InputGuardrail] | Input validation |
| output_guardrails | list[OutputGuardrail] | Output validation |
| tool_use_tracker | ToolUseTracker | Tracks tool invocations |
| run_state | RunState | Mutable execution state |

Sources: src/agents/run.py

Example Workflow Patterns

Research Bot Architecture

A common pattern involves multiple specialized agents:

  1. Planner Agent: Decomposes user queries into search tasks
  2. Search Agent: Executes web searches in parallel
  3. Writer Agent: Synthesizes research into final reports
graph LR
    A[User Query] --> B[Planner Agent]
    B --> C[Search 1]
    B --> D[Search 2]
    B --> E[Search N]
    C --> F[Writer Agent]
    D --> F
    E --> F
    F --> G[Final Report]

Sandbox Agent Workflow

Sandbox agents provide isolated execution environments:

graph TD
    A[SandboxAgent] --> B[Workspace]
    A --> C[Manifest]
    C --> D[Skill Loading]
    B --> E[Artifact Management]
    E --> F[File System Access]
    D --> G[Tool Execution]

SDK Version

Current SDK version: 1.0.0 (semantic versioning)

Sources: src/agents/version.py

Summary

The OpenAI Agents SDK provides a comprehensive framework for building sophisticated multi-agent applications. Key capabilities include:

  • Multi-Agent Orchestration: Define and coordinate multiple agents with specialized roles
  • Handoff System: Seamlessly transfer control between agents while maintaining context
  • Tool Execution: Integrate tools with guardrail validation at input and output
  • MCP Integration: Connect to external Model Context Protocol servers
  • State Management: Track execution state with persistence support
  • Tracing: Monitor and visualize agent interactions and flows

The SDK abstracts the complexity of turn resolution, tool tracking, and handoff management, allowing developers to focus on defining agent behavior and tool integrations.

Sources: src/agents/handoffs/__init__.py

Installation and Setup

Related topics: OpenAI Agents SDK Overview, Examples Index


Overview

The openai-agents-python library provides a comprehensive multi-agent framework for building AI-powered applications. The installation and setup process involves managing dependencies, configuring environment variables, and optionally setting up sandbox backends for code execution capabilities.

This page covers the complete setup workflow from initial installation through runtime configuration.

Source: https://github.com/openai/openai-agents-python / Human Manual

Examples Index

Related topics: OpenAI Agents SDK Overview, Agents


Overview

The Examples Index serves as a comprehensive guide to the sample applications and demonstrations provided in the openai-agents-python repository. These examples are designed to showcase the capabilities of the Agents SDK across various use cases, from basic agent interactions to complex multi-agent workflows involving sandboxed execution environments, voice interfaces, and external tool integrations.

The examples directory structure organizes demonstrations by functional category, allowing developers to quickly locate relevant implementations for their specific requirements. Each example is designed to be runnable with minimal configuration, serving as both documentation and a starting point for custom implementations.

Example Categories

Basic Examples

The basic examples provide the foundational patterns for building agents with the SDK. These examples demonstrate core concepts with minimal complexity.

| Example | File | Purpose |
|---|---|---|
| Hello World | examples/basic/hello_world.py | Simple agent that responds to user input |
| Agent as Tool | examples/agent_patterns/agents_as_tools.py | Demonstrates wrapping agents as tools for other agents |

Sources: examples/basic/hello_world.py, examples/agent_patterns/agents_as_tools.py

Sandbox Examples

Sandbox examples demonstrate the isolated workspace capabilities of the Agents SDK, enabling agents to execute code and manipulate files in a secure environment.

#### Small API Examples

| Example | Command | Description |
|---|---|---|
| Basic Sandbox | uv run python examples/sandbox/basic.py | Creates a sandbox session from a manifest, runs a SandboxAgent, and streams the result |
| Handoffs | uv run python examples/sandbox/handoffs.py | Uses handoffs with sandbox-backed agents |
| Workspace Capabilities | uv run python examples/sandbox/sandbox_agent_capabilities.py | Configures a sandbox agent with workspace capabilities |
| Sandbox with Tools | uv run python examples/sandbox/sandbox_agent_with_tools.py | Combines sandbox capabilities with host-defined tools |
| Agents as Tools | uv run python examples/sandbox/sandbox_agents_as_tools.py | Exposes sandbox agents as tools for another agent |
| Remote Snapshot | uv run python examples/sandbox/sandbox_agent_with_remote_snapshot.py | Starts from a remote snapshot |

Sources: examples/sandbox/README.md:1-20

#### Sandbox Extensions

Sandbox extensions provide integrations with various cloud sandbox providers:

| Provider | Setup Command | Run Command |
|---|---|---|
| E2B | uv sync --extra e2b | uv run python examples/sandbox/basic.py --backend e2b |
| Modal | uv sync --extra modal | uv run python examples/sandbox/extensions/modal_runner.py --stream |
| Blaxel | uv sync --extra blaxel | uv run python examples/sandbox/extensions/blaxel_runner.py --stream |
| Vercel | uv sync --extra vercel | uv run python examples/sandbox/extensions/vercel_runner.py --stream |
| Daytona | uv sync --extra daytona | uv run python examples/sandbox/extensions/daytona/daytona_runner.py --stream |
| Runloop | uv sync --extra runloop | Platform-specific setup |
| Temporal | Temporal CLI + just | just worker / just tui |

Sources: examples/sandbox/extensions/README.md

Multi-Agent Research Examples

#### Research Bot

The research bot demonstrates a multi-agent system where agents collaborate to perform web research and synthesize findings into reports.

Architecture Flow:

graph TD
    A[User Input] --> B[Planner Agent]
    B --> C[Generate Search Queries]
    C --> D[Search Agent 1]
    C --> E[Search Agent 2]
    C --> F[Search Agent N]
    D --> G[Parallel Execution]
    E --> G
    F --> G
    G --> H[Writer Agent]
    H --> I[Final Report]

Key Components:

  • Planner Agent: Creates a research plan with search terms and rationale
  • Search Agent: Uses Web Search tool to search and summarize results
  • Writer Agent: Synthesizes summaries into a long-form markdown report

Sources: examples/research_bot/README.md

#### Financial Research Agent

The financial research agent demonstrates domain-specific research capabilities with access to specialized analysis tools.

Agent Configuration:

You are a senior financial analyst. You will be provided with the original query
and a set of raw search summaries. Your job is to synthesize these into a
long‑form markdown report with a short executive summary.

Available Tools:

  • fundamentals_analysis - Specialist write-up for fundamental analysis
  • risk_analysis - Specialist write-up for risk assessment

Sources: examples/financial_research_agent/README.md

Healthcare Support Example

A demonstration workflow that combines sandbox execution with human-in-the-loop approvals for healthcare-related tasks.

Workflow Components:

  • Orchestrator Agent: Coordinates the overall workflow
  • Benefits Subagent: Handles benefits-related queries
  • Sandbox Policy Agent: Executes policy validation in sandbox
  • Memory Recap Agent: Maintains conversation context

Key Files:

| File | Purpose |
|---|---|
| main.py | Standalone CLI demo runner |
| workflow.py | Shared workflow execution logic, sandbox setup, artifact copying, tracing |
| support_agents.py | Agent definitions |
| tools.py | Local lookup tools and approval-gated human handoff |
| skills/prior-auth-packet-builder/SKILL.md | Sandbox skill definition |

Available Scenarios:

uv run python examples/sandbox/healthcare_support/main.py --list-scenarios
uv run python examples/sandbox/healthcare_support/main.py --scenario blue_cross_pt_benefits
uv run python examples/sandbox/healthcare_support/main.py --scenario messy_ambiguous_knee_case

Sources: examples/sandbox/healthcare_support/README.md

Voice Examples

Voice examples demonstrate real-time audio interaction capabilities with agents.

Architecture:

graph LR
    A[Audio Input] --> B[Voice Agent]
    B --> C[Streaming Response]
    C --> D[Audio Output]
    B --> E[Tool Calls]
    E --> F[External Services]

Run Command:

uv run python examples/voice/streamed/main.py

Sources: examples/voice/streamed/main.py

MCP Examples

Model Context Protocol (MCP) examples demonstrate integration with external MCP servers for extended tool capabilities.

#### Streamable HTTP Remote Example

Connects to DeepWiki over the Streamable HTTP transport to leverage external tools.

Run Command:

uv run python examples/mcp/streamable_http_remote_example/main.py

Prerequisites:

  • OPENAI_API_KEY set for model calls

Sources: examples/mcp/streamable_http_remote_example/README.md

Model Provider Examples

Model provider examples demonstrate routing models through adapter layers for flexibility in model selection.

| Adapter | Direct Run | Auto Mode |
|---|---|---|
| any-llm | uv run examples/model_providers/any_llm_provider.py | uv run examples/model_providers/any_llm_auto.py |
| LiteLLM | uv run examples/model_providers/litellm_provider.py | uv run examples/model_providers/litellm_auto.py |

Model Override:

uv run examples/model_providers/any_llm_provider.py --model openrouter/openai/gpt-5.4-mini

Sources: examples/model_providers/README.md

Common Configuration

Environment Variables

Most examples require the OPENAI_API_KEY environment variable. Configure it in one of these locations:

  1. Repository-root .env file
  2. Example's local .env file
  3. Shell environment

Running with uv

The project uses uv for dependency management. Run examples with:

uv run python <path-to-example>

Interactive Mode

For examples with prompts, set EXAMPLES_INTERACTIVE_MODE=auto to auto-answer:

EXAMPLES_INTERACTIVE_MODE=auto uv run python examples/sandbox/healthcare_support/main.py --scenario messy_ambiguous_knee_case

Example Selection Guide

graph TD
    A[Use Case] --> B{Basic Interaction?}
    B -->|Yes| C[Basic Examples]
    B -->|No| D{Multi-Agent Workflow?}
    D -->|Yes| E{Research Domain?}
    D -->|No| F{Sandbox Required?}
    E -->|Financial| G[Financial Research Agent]
    E -->|General| H[Research Bot]
    F -->|Yes| I{Specialized Provider?}
    F -->|No| J[Agent Patterns]
    I -->|E2B| K[E2B Examples]
    I -->|Modal| L[Modal Examples]
    I -->|Vercel| M[Vercel Examples]
    I -->|Daytona| N[Daytona Examples]
    I -->|Blaxel| O[Blaxel Examples]

Sandbox Backend Comparison

| Backend | Interface | Workspace Persistence | Cloud Support |
|---|---|---|---|
| E2B | Bash-style | Snapshot files | Yes |
| Modal | Bash-style | Tar, snapshot files/directory | Yes |
| Blaxel | Bash-style + PTY | Drive mount, cloud buckets | Yes (S3, R2, GCS) |
| Vercel | Command execution | Tar, snapshot | Yes |
| Daytona | Bash-style | Yes | Yes |
| Runloop | TBD | Yes | Yes |

Sources: examples/sandbox/extensions/README.md

Sources: examples/basic/hello_world.py

Agents

Related topics: Tools, Handoffs, Guardrails, Run Loop and Execution


Overview

Agents are the core execution units in the OpenAI Agents SDK. An agent encapsulates an LLM with instructions, tools, handoffs, and guardrails that enable autonomous task completion. Agents process user inputs, make decisions about tool usage, transfer control to other agents, and generate responses.

The agent system provides a structured approach to building AI-powered applications by separating concerns between orchestration, tool execution, and response generation. Agents can be composed hierarchically, where one agent can delegate tasks to sub-agents or hand off control entirely to specialized agents.

Architecture

Agent Core Components

An agent consists of several interconnected components that work together to process requests and generate responses.

graph TD
    A[User Input] --> B[Agent]
    B --> C[Instructions/Prompt]
    B --> D[Tools]
    B --> E[Handoffs]
    B --> F[Guardrails]
    C --> G[LLM Decision Engine]
    D --> H[Tool Execution]
    E --> I[Agent Transfer]
    G --> J[Response/Action]
    H --> J
    I --> K[Target Agent]
    K --> G

Agent Types

| Type | Description | Use Case |
|---|---|---|
| Agent[TContext] | Base agent type with generic context | General purpose agents |
| SandboxAgent | Agent with isolated workspace | Code execution, file operations |
| FunctionAgent | Agent for function/tool orchestration | Tool-heavy workflows |

Source File Organization

| File | Purpose |
|---|---|
| src/agents/agent.py | Core agent class definition |
| src/agents/lifecycle.py | Agent lifecycle management |
| src/agents/agent_output.py | Output types and responses |
| src/agents/items.py | Run item definitions and helpers |
| src/agents/function_schema.py | Tool schema generation |

Agent Lifecycle

Agents follow a defined lifecycle from initialization through execution to completion or handoff.

stateDiagram-v2
    [*] --> Initialized: Agent Created
    Initialized --> Running: Input Received
    Running --> ToolExecution: Tool Call
    ToolExecution --> Running: Tool Result
    Running --> Handoff: Transfer Request
    Handoff --> [*]: Complete
    Running --> Response: Final Output
    Response --> [*]: Complete
    Handoff --> Running: New Agent

Lifecycle States

| State | Description | Entry Condition |
|---|---|---|
| Initialized | Agent created but not yet processing | Object instantiation |
| Running | Actively processing input | run() or run_sync() called |
| ToolExecution | Executing one or more tools | LLM requests tool call |
| Handoff | Transferring to another agent | LLM triggers handoff |
| Response | Generating final response | No more actions needed |

Sources: src/agents/lifecycle.py:1-50

Turn Resolution

The turn resolution process handles the core agent loop. Each turn processes input and determines next actions.

sequenceDiagram
    participant U as User
    participant R as Runner
    participant A as Agent
    participant T as Tools
    participant H as Handoffs
    
    U->>R: User Input
    R->>A: Process Turn
    A->>T: Tool Calls?
    T-->>A: Results
    A->>H: Handoff?
    H-->>A: New Agent
    A->>R: Response
    R-->>U: Output

Sources: src/agents/run_internal/turn_resolution.py:1-80

Run Items

Run items represent the atomic units of work within an agent execution. They capture messages, tool calls, tool results, and handoffs.

Item Types

| Type | Description | Source |
|---|---|---|
| MessageOutputItem | LLM generated message | src/agents/items.py:30-60 |
| ToolCallItem | Tool invocation request | src/agents/items.py:61-90 |
| ToolCallOutputItem | Tool execution result | src/agents/items.py:91-120 |
| HandoffItem | Agent transfer | src/agents/items.py:121-150 |
| ToolApprovalItem | Human approval for tools | src/agents/handoffs/history.py:50-70 |

Message Extraction

The ItemHelpers class provides utilities for extracting content from run items:

# Extract text from message output
text = ItemHelpers.text_message_output(message_item)

# Extract refusal if present
refusal = ItemHelpers.extract_refusal(message.raw_item)

# Convert string to input list
input_list = ItemHelpers.input_to_new_input_list("user message")

Sources: src/agents/items.py:40-75

Handoffs

Handoffs enable agent-to-agent transfer, allowing specialized agents to handle specific tasks.

Handoff Configuration

| Parameter | Type | Description |
|---|---|---|
| agent | Agent | Target agent |
| tool_name_override | str | Override for handoff tool name |
| tool_description_override | str | Override for handoff description |
| on_handoff | Callable | Callback when handoff occurs |
| input_type | Type | Type validation for handoff input |
| input_filter | Callable | Filter inputs passed to next agent |
| is_enabled | bool \| Callable | Enable/disable handoff |

Sources: src/agents/handoffs/__init__.py:30-80

Handoff History Management

When an agent hands off to another, the conversation history is summarized to maintain context:

# Nested history processing
nested_history = nest_handoff_history(
    handoff_input_data,
    history_mapper=custom_mapper
)

The history wrapper markers default to <CONVERSATION HISTORY> tags but can be customized:

# Customize history markers
set_conversation_history_wrappers(
    start="<PREVIOUS_CONTEXT>",
    end="</PREVIOUS_CONTEXT>"
)

Sources: src/agents/handoffs/history.py:20-60
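The wrapping behavior itself is simple to picture: the nested history is framed by start/end markers so the next agent can distinguish it from live turns. The sketch below is illustrative, and the exact default marker strings (including whether the end marker uses a closing-tag form) are assumptions, not the SDK's literal values.

```python
# Illustrative history wrapping with configurable markers.
START, END = "<CONVERSATION HISTORY>", "</CONVERSATION HISTORY>"

def wrap_history(summary: str, start: str = START, end: str = END) -> str:
    return f"{start}\n{summary}\n{end}"

wrapped = wrap_history("User asked about billing; support agent answered.")
```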

Tools and Function Schema

Tools extend agent capabilities by providing functions the LLM can call.

Function Schema Generation

The FunctionSchema class converts Python functions into OpenAI-compatible tool schemas:

schema = FunctionSchema.from_fn(my_function)
tool_definition = schema.to_tool_definition()
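An illustrative take on what such schema generation does: derive the parameters block from a Python signature. The real implementation also handles docstring parsing and richer types; this sketch maps only basic annotations and is not the SDK's code.

```python
import inspect

# Map a few basic Python annotations to JSON-schema type names.
_TYPE_MAP = {int: "integer", float: "number", str: "string", bool: "boolean"}

def schema_from_fn(fn) -> dict:
    props: dict = {}
    required: list[str] = []
    for pname, param in inspect.signature(fn).parameters.items():
        props[pname] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(pname)       # no default => required argument
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def get_weather(city: str, units: str = "celsius") -> str:
    """Look up the weather for a city."""
    return f"{city}: 21 {units}"

schema = schema_from_fn(get_weather)
```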

Tool Definition Structure

| Field | Type | Description |
|---|---|---|
| name | str | Tool identifier |
| description | str | Human-readable description |
| parameters | dict | JSON schema for parameters |
| strict | bool | Enable strict parameter validation |

Sources: src/agents/function_schema.py:1-50

Agent Visualization

The SDK provides DOT-format visualization for agent graphs:

graph TD
    subgraph AgentGraph
        A["User Input"] --> B["Agent"]
        B --> C["Tool: search"]
        B --> D["Tool: calculate"]
        B --> E["Handoff: specialist"]
        E --> F["Specialist Agent"]
    end

Graph Components

| Component | Shape | Color | Description |
|---|---|---|---|
| Start | Ellipse | lightblue | Entry point |
| Agent | Box | lightyellow | Agent nodes |
| Tool | Ellipse | lightgreen | Tool definitions |
| Handoff | Box | lightgrey | Agent transfer points |
| End | Ellipse | lightblue | Exit point |

Sources: src/agents/extensions/visualization.py:1-60

Agent Output

Agent execution produces structured output containing messages, tool calls, and metadata.

Output Structure

@dataclass
class AgentOutput:
    messages: list[MessageOutputItem]
    tool_calls: list[ToolCallItem]
    tool_results: list[ToolCallOutputItem]
    handoffs: list[HandoffItem]
    final_response: str | None

Sources: src/agents/agent_output.py:1-40

Response Finalization

After tool execution, the system finalizes responses:

tool_final_output = await _maybe_finalize_from_tool_results(
    public_agent=agent,
    original_input=input,
    new_response=response,
    pre_step_items=pre_items,
    new_step_items=new_items,
    function_results=results
)

Refusals are extracted and converted to errors:

refusal = ItemHelpers.extract_refusal(message_item.raw_item)
if refusal:
    raise ModelRefusalError(refusal)

Sources: src/agents/run_internal/turn_resolution.py:80-120

Runner Integration

The Runner class orchestrates agent execution, managing the turn loop and state transitions.

Run Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_turns | int | 10 | Maximum conversation turns |
| max_tools | int | 100 | Maximum tool calls |
| context_length | int | Model dependent | Context window size |
| tool_choice | str | "auto" | Tool selection strategy |

State Management

The runner maintains RunState throughout execution:

run_state = RunState(
    current_agent=agent,
    model_response=response,
    generated_items=items,
    run_config=config
)

Sources: src/agents/run.py:100-180

Error Handling

Model Refusal

When the LLM refuses to respond, a ModelRefusalError is raised:

if refusal:
    refusal_error = ModelRefusalError(refusal)
    run_error_data = build_run_error_data(...)

Tool Activity Tracking

The system tracks tool usage even when no messages are generated:

has_tool_activity_without_message = not message_items and bool(
    processed_response.tools_used
)
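A toy tracker shows how that check comes together: record every invocation so the runner can detect a "tool activity but no message" turn. The class below is a sketch, not the SDK's ToolUseTracker.

```python
# Minimal tool-use tracker feeding the activity check above.
class ToolUseTracker:
    def __init__(self):
        self.calls: list[tuple[str, dict]] = []

    def record(self, tool_name: str, args: dict) -> None:
        self.calls.append((tool_name, args))

    def tools_used(self) -> list[str]:
        return [name for name, _ in self.calls]

tracker = ToolUseTracker()
tracker.record("search", {"q": "agents"})
message_items: list = []
has_tool_activity_without_message = not message_items and bool(tracker.tools_used())
```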

Multi-Agent Patterns

Hierarchical Agents

graph TD
    O[Orchestrator] --> S[Search Agent]
    O --> A[Analysis Agent]
    O --> W[Writer Agent]
    S --> R[Research Results]
    A --> R
    A --> D[Data Insights]
    W --> R
    W --> D

Parallel Execution

Agents can execute in parallel for independent tasks:

# Multiple search agents running concurrently via the Runner
search_tasks = [Runner.run(search_agent, query) for query in queries]
results = await asyncio.gather(*search_tasks)

Best Practices

  1. Context Management: Use generic Agent[TContext] with custom context classes for type safety
  2. Handoff Design: Create focused agents with clear responsibilities and minimal handoffs
  3. Tool Organization: Group related tools into toolkits for better organization
  4. History Filtering: Use input_filter in handoffs to prevent context overflow
  5. Error Handling: Always handle ModelRefusalError and tool execution failures

Related Components

| Component | File | Relationship |
|---|---|---|
| MCP Server | src/agents/mcp/server.py | Provides external tool access |
| Guardrails | src/agents/guardrails.py | Input/output validation |
| Streaming | src/agents/streaming.py | Real-time output |
| Tracing | src/agents/tracing.py | Execution monitoring |

Sources: src/agents/lifecycle.py:1-50

Tools

Related topics: Agents, Guardrails


Overview

Tools in the OpenAI Agents Python SDK enable AI agents to interact with external systems, execute code, manipulate files, and perform actions in isolated environments. The tools system provides a structured way for agents to extend their capabilities beyond pure text generation by calling functions, accessing resources, and performing complex operations.

The SDK implements a tool abstraction that wraps callable functions with metadata, descriptions, and execution logic. When an agent decides to use a tool, the SDK handles the invocation, manages the context, processes results, and returns responses to the agent for further processing.

The tool system supports many tool types, from simple function calls to complex sandboxed execution environments. Tools can be configured at initialization with options including name, description, and parameter schema, and they integrate with the agent's approval mechanism and guardrail system.

Core Tool Architecture

Tool Base Class

The foundation of the tools system is the Tool class defined in src/agents/tool.py. This abstract base class defines the interface that all tools must implement, ensuring consistent behavior across different tool types.

graph TD
    A[Tool Base Class] --> B[FunctionTool]
    A --> C[FileSearchTool]
    A --> D[ComputerTool]
    A --> E[WebSearchTool]
    A --> F[Sandbox Agent Tools]

Each tool implementation must provide:

  • A unique name identifier
  • A description for the LLM to understand tool purpose
  • Parameter schema for function calling
  • Execution logic in an invoke or acall method

Tool Interface

The tool interface follows a standard pattern where each tool is defined with metadata that allows the LLM to understand when and how to use it. Tools can be synchronous or asynchronous, supporting both simple function calls and complex operations that require I/O operations.

Key tool properties include:

| Property | Type | Description |
|---|---|---|
| name | str | Unique identifier for the tool |
| description | str | Natural language description for the LLM |
| parameters | dict | JSON Schema for tool arguments |
| strict | bool | Whether to enforce parameter validation |

Sources: src/agents/tool.py:1-50
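As a rough mental model of this interface, the properties above can be sketched as a plain dataclass. This is an illustrative stand-in only, not the SDK's actual Tool class:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SketchTool:
    # Illustrative stand-in for the tool interface described above
    name: str                      # unique identifier
    description: str               # natural-language purpose, shown to the LLM
    parameters: dict               # JSON Schema for the tool's arguments
    invoke: Callable[[dict], Any]  # execution logic
    strict: bool = False           # whether parameter validation is enforced

adder = SketchTool(
    name="add_numbers",
    description="Add two integers and return the sum",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    invoke=lambda args: args["a"] + args["b"],
)
```

Real tool implementations additionally carry async execution and context plumbing, but the metadata shape is the same.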

Built-in Tool Types

FunctionTool

FunctionTool is the most common tool type, wrapping a Python function with tool metadata. It allows developers to expose arbitrary Python functions as tools that agents can call.

from agents import FunctionTool

def calculate_budget(items: list[str]) -> float:
    # Look up each item's price and sum the results (placeholder logic)
    prices = {"laptop": 1200.0, "monitor": 300.0}
    return sum(prices.get(item, 0.0) for item in items)

budget_tool = FunctionTool(
    name="calculate_budget",
    description="Calculate the total budget for a list of items",
    params_json_schema={...},
    handle_invoke=calculate_budget
)

File and Editor Tools

The SDK provides specialized tools for file operations. The FileSearchTool enables searching through file contents, while editor tools provide controlled file manipulation capabilities.

Sources: src/agents/editor.py:1-100

#### Editor Tool Capabilities

| Operation | Description |
|-----------|-------------|
| read | Read file contents |
| write | Write content to files |
| edit | Modify existing files |
| glob | Find files by pattern |
| ls | List directory contents |
| mv | Move/rename files |
| rm | Delete files |

Computer Tool

The ComputerTool enables agents to interact with a virtualized computer environment. This is particularly useful for tasks requiring UI automation, screenshot analysis, and keyboard/mouse control.

Sources: src/agents/computer.py:1-100

The Computer Tool provides:

  • Screen Capture: Take screenshots of the virtual display
  • Mouse Control: Move cursor, click, scroll operations
  • Keyboard Control: Type text, press keys and key combinations
  • Process Management: Launch and interact with applications
graph LR
    A[Agent Decision] --> B[Computer Tool Action]
    B --> C{Screen Capture?}
    C -->|Yes| D[Screenshot Analysis]
    C -->|No| E[Execute Action]
    D --> F[Observation Result]
    E --> G[Action Result]
    F --> H[Agent Processing]
    G --> H

ApplyDiff Tool

The ApplyDiff tool provides efficient file modification capabilities using diff-based operations. Instead of replacing entire files, it applies targeted changes, making it more efficient for large files and reducing the risk of unintended modifications.

Sources: src/agents/apply_diff.py:1-100

Tool Context and State Management

Tool Context

Tool context (tool_context) provides runtime information to tools during execution. It encapsulates the current run state, session information, and access to shared resources.

Sources: src/agents/tool_context.py:1-100

graph TD
    A[Tool Execution] --> B[ToolContext]
    B --> C[RunContext]
    B --> D[Session]
    B --> E[Store Settings]
    C --> F[Current Agent]
    C --> G[User Context]

Agent Tool State

The AgentToolState manages tool-related state within an agent's execution context. This includes tracking tool usage, maintaining state across tool calls, and managing tool-specific configurations.

Sources: src/agents/agent_tool_state.py:1-100

Key responsibilities include:

  • Tracking which tools have been invoked
  • Maintaining state between sequential tool calls
  • Managing tool-specific configuration options
  • Handling tool result caching when appropriate
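The responsibilities above can be modeled with a small tracker. This is a pure-Python sketch of the idea, not the SDK's AgentToolState implementation:

```python
from typing import Any, Callable

class ToolStateSketch:
    """Illustrative model of per-agent tool state: invocation tracking plus caching."""

    def __init__(self) -> None:
        self.invocations: list[str] = []    # which tools have run, in order
        self._cache: dict[tuple, Any] = {}  # memoized results keyed by (tool, args)

    def call(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        key = (name, args)
        self.invocations.append(name)       # track every invocation
        if key not in self._cache:          # cache results when appropriate
            self._cache[key] = fn(*args)
        return self._cache[key]

state = ToolStateSketch()
state.call("square", lambda x: x * x, 4)
state.call("square", lambda x: x * x, 4)  # second call served from cache
```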

Tool Configuration

Tool Parameters

Tools are configured with JSON Schema definitions that describe their expected parameters. This schema serves dual purposes:

  1. LLM Understanding: Helps the model generate correct tool calls
  2. Validation: Ensures incoming parameters meet requirements
params_json_schema = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Search query string"
        },
        "limit": {
            "type": "integer",
            "description": "Maximum results to return",
            "default": 10
        }
    },
    "required": ["query"]
}
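The validation half of this dual purpose can be sketched with a minimal checker. This is a simplified illustration using only the stdlib, not the SDK's validation logic, and it handles only the required and type keywords:

```python
from typing import Any

def validate_params(schema: dict[str, Any], params: dict[str, Any]) -> list[str]:
    """Check params against a JSON Schema fragment; return a list of error strings."""
    type_map = {"string": str, "integer": int, "object": dict}
    errors = []
    # 1. Every required parameter must be present
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")
    # 2. Present parameters must match their declared type
    for name, value in params.items():
        spec = schema.get("properties", {}).get(name)
        if spec and not isinstance(value, type_map.get(spec["type"], object)):
            errors.append(f"{name} should be {spec['type']}")
    return errors

schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["query"],
}
```

A production validator would use a full JSON Schema library rather than this hand-rolled check.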

Tool Options

| Option | Description | Default |
|--------|-------------|---------|
| name | Tool identifier | Function name |
| description | LLM-facing description | Docstring |
| params_json_schema | Parameter schema | Auto-generated |
| strict | Enforce schema strictly | False |
| require_approval | Require human approval | None |
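The defaulting behavior in this table (name from the function name, description from the docstring, schema auto-generated from the signature) can be sketched as follows. This is an illustration of the mechanism, not the SDK's actual implementation:

```python
import inspect

def tool_defaults(fn) -> dict:
    """Derive default tool metadata from a plain function."""
    params = {
        name: {"type": "integer" if p.annotation is int else "string"}
        for name, p in inspect.signature(fn).parameters.items()
    }
    return {
        "name": fn.__name__,                        # default: function name
        "description": (fn.__doc__ or "").strip(),  # default: docstring
        "params_json_schema": {"type": "object", "properties": params},
    }

def lookup_order(order_id: int):
    """Look up an order by its numeric id."""

meta = tool_defaults(lookup_order)
```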

Tool Guardrails

Input Guardrails

Input guardrails validate tool parameters before execution. They provide an opportunity to inspect, modify, or reject tool calls based on custom logic.

async def validate_search_params(
    ctx: RunContextWrapper,
    tool: MCPTool,
    params: dict
) -> InputGuardrailResult:
    # Custom validation logic
    if contains_prohibited_terms(params.get("query")):
        return InputGuardrailResult(
            did_pass=False,
            message="Query contains prohibited content"
        )
    return InputGuardrailResult(did_pass=True)

Output Guardrails

Output guardrails validate tool results after execution. They ensure that tool outputs meet safety, formatting, or content requirements before being returned to the agent.

Sources: src/agents/items.py:50-100

Tool Filtering

The SDK supports filtering which tools are exposed to agents. This is particularly useful when:

  • Limiting agent capabilities for security
  • Testing specific tool behaviors
  • Implementing role-based access control

Sources: examples/mcp/tool_filter_example/README.md

# Static tool filter
tool_filter = ["filesystem_read", "filesystem_write"]

# Dynamic tool filter
allowed_tools = {"filesystem_read", "filesystem_write"}

async def dynamic_filter(
    ctx: RunContextWrapper,
    agent: Agent,
    tool: Tool
) -> bool:
    return tool.name in allowed_tools
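To make the two filter styles concrete, here is a self-contained sketch of applying a static allow-list and then an async per-tool predicate to a tool list. The names are illustrative; this is not the SDK's filtering code:

```python
import asyncio

class Tool:
    def __init__(self, name: str) -> None:
        self.name = name

async def filter_tools(tools, static=None, dynamic=None):
    """Apply a static allow-list, then an async per-tool predicate."""
    kept = []
    for tool in tools:
        if static is not None and tool.name not in static:
            continue  # not on the allow-list
        if dynamic is not None and not await dynamic(tool):
            continue  # rejected by the dynamic predicate
        kept.append(tool)
    return kept

tools = [Tool("filesystem_read"), Tool("filesystem_write"), Tool("shell_exec")]

async def no_writes(tool) -> bool:
    return not tool.name.endswith("_write")

allowed = asyncio.run(
    filter_tools(tools, static=["filesystem_read", "filesystem_write"], dynamic=no_writes)
)
```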

Integration with Agents

Adding Tools to Agents

Tools are added to agents through the agent's initialization or configuration:

agent = Agent(
    name="research_agent",
    tools=[
        web_search_tool,
        file_search_tool,
        custom_function_tool
    ],
    instructions="You are a research assistant..."
)

Tool Execution Flow

sequenceDiagram
    participant Agent
    participant SDK
    participant Tool
    participant External

    Agent->>SDK: Request tool execution
    SDK->>Tool: Validate parameters
    Tool->>Tool: Apply input guardrails
    Tool->>External: Execute operation
    External-->>Tool: Return result
    Tool->>Tool: Apply output guardrails
    Tool-->>SDK: Return processed result
    SDK-->>Agent: Provide tool result

Human-in-the-Loop with Tools

Approval Requirements

Tools can be configured to require human approval before execution. When enabled, the SDK pauses tool execution and awaits human confirmation.

tool = FunctionTool(
    name="send_email",
    handle_invoke=send_email,
    require_approval="always"
)

Sources: src/agents/mcp/server.py:100-150

Approval Resume

After human approval or rejection, the SDK resumes execution with the approval result:

await runner.resume(
    run_id=run_id,
    approval_result=ApprovalResult(approved=True)
)

Summary

The Tools system in the OpenAI Agents Python SDK provides a flexible, extensible framework for adding capabilities to AI agents. Key features include:

  • Abstraction: Consistent interface for diverse tool types
  • Composition: Tools can be combined and filtered dynamically
  • Safety: Built-in guardrails and approval mechanisms
  • Context Awareness: Runtime context enables stateful tool interactions
  • Integration: Seamless integration with the agent execution model

By leveraging these tools, developers can create sophisticated agents that can search the web, manipulate files, execute code, interact with computer interfaces, and integrate with external services through protocols like MCP.

Sources: src/agents/tool.py:1-50

Guardrails

Related topics: Agents, Tools


Guardrails provide a security and validation layer in the agents framework, enabling developers to intercept, validate, and control both incoming inputs and outgoing outputs at various stages of agent execution. They serve as programmable checkpoints that can enforce policy compliance, prevent data leakage, block harmful content, and ensure operational safety across the entire agent runtime.

Overview

The guardrail system operates at multiple checkpoints during agent execution:

graph TD
    A[User Input] --> B[Input Guardrails]
    B --> C[Agent Processing]
    C --> D[Tool Call]
    D --> E[Tool Input Guardrails]
    E --> F[Tool Execution]
    F --> G[Tool Output Guardrails]
    G --> H[Response Generation]
    H --> I[Output Guardrails]
    I --> J[Final Output]
    
    B -.->|Block/Modify| A
    E -.->|Block/Modify| D
    G -.->|Block/Modify| F
    I -.->|Block/Modify| H

Guardrails are implemented as pluggable components that can be attached to agents, individual tools, or configured globally. Each guardrail can define one of three behavioral responses when triggered:

| Behavior Type | Description |
|---------------|-------------|
| raise_exception | Throws a tripwire exception, halting execution |
| reject_content | Replaces the content with a custom rejection message |
| filter | Removes or sanitizes the problematic content (planned) |

Sources: src/agents/run_internal/tool_execution.py:1-50
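The dispatch on these behavior types can be sketched as follows. The exception name here is a local stand-in for illustration, not an import from the SDK:

```python
class GuardrailTripwireTriggered(Exception):
    """Stand-in for the SDK's tripwire exceptions."""

def apply_behavior(content: str, behavior: dict) -> str:
    # Dispatch on the behavior type from the table above
    kind = behavior["type"]
    if kind == "raise_exception":
        raise GuardrailTripwireTriggered(behavior.get("message", "guardrail triggered"))
    if kind == "reject_content":
        return behavior["message"]  # replace content with the rejection message
    return content                  # "pass": leave content untouched

safe = apply_behavior("hello", {"type": "pass"})
rejected = apply_behavior("secret data", {"type": "reject_content", "message": "[redacted]"})
```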

Types of Guardrails

Input Guardrails

Input guardrails validate user-provided input before it reaches the agent. They receive the raw input and can inspect, modify, or reject it based on custom logic.

sequenceDiagram
    participant User
    participant Runner
    participant InputGuardrail
    participant Agent
    
    User->>Runner: User Input
    Runner->>InputGuardrail: Run input through guardrails
    alt Guardrail triggers
        InputGuardrail->>Runner: GuardrailOutput with behavior
        alt raise_exception
            Runner-->>User: GuardrailTripwireTriggered Error
        else reject_content
            Runner->>Agent: Modified/Sanitized input
        end
    else Pass through
        InputGuardrail->>Runner: GuardrailOutput with pass behavior
        Runner->>Agent: Original input
    end

Sources: src/agents/run_internal/guardrails.py:1-30

Tool Input Guardrails

Tool input guardrails validate the arguments passed to tool calls before execution. They have access to the tool context, agent information, and the raw tool arguments.

@dataclass
class ToolInputGuardrailData:
    context: ToolContext[Any]
    agent: Agent[Any]
    input: Any  # The raw tool arguments

Sources: src/agents/tool_guardrails.py:1-20

Tool Output Guardrails

Tool output guardrails validate the results returned from tool execution before those results are processed further. They can inspect, filter, or reject tool outputs.

@dataclass
class ToolOutputGuardrailData:
    context: ToolContext[Any]
    agent: Agent[Any]
    output: Any  # The raw tool result

Sources: src/agents/tool_guardrails.py:1-20

Output Guardrails

Output guardrails validate the agent's final response before it is returned to the user. These operate on the completed message stream and can perform final content filtering or policy checks.

GuardrailResult Structure

Each guardrail execution produces a GuardrailOutput result that defines the subsequent action:

@dataclass
class GuardrailOutput:
    content_filtered: bool
    policy_name: str
    policy_version: str
    content: str | None
    behavior: dict[str, Any]

The behavior dictionary must contain at minimum a type key specifying one of the supported behavior types.

Sources: src/agents/guardrail.py:1-50

Configuration

Agent-Level Guardrail Configuration

Guardrails can be attached directly to an agent instance:

from agents import Agent, Guardrail

agent = Agent(
    name="secure_agent",
    instructions="You are a helpful assistant",
    input_guardrails=[
        Guardrail(guardrail_name="content_filter"),
        Guardrail(guardrail_name="pii_detector"),
    ],
    output_guardrails=[
        Guardrail(guardrail_name="safety_check"),
    ],
)

Tool-Level Guardrail Configuration

Individual tools can have their own guardrails:

from agents import function_tool, ToolInputGuardrail, ToolOutputGuardrail

@function_tool(
    tool_input_guardrails=[input_check_guardrail],
    tool_output_guardrails=[output_check_guardrail],
)
def sensitive_operation(x: str) -> str:
    return process(x)

Sources: src/agents/tool.py:1-30

Guardrail Behavior Configuration

Guardrails can be configured with different tripwire behaviors:

| Parameter | Type | Description |
|-----------|------|-------------|
| guardrail_name | str | Unique identifier for the guardrail |
| on_fail | GuardrailFailureMode | Behavior when triggered |
| error_message | str | Custom error message for exceptions |
| log | bool | Whether to log guardrail triggers |

Tracing and Observability

Guardrail execution is automatically traced using the observability framework:

graph LR
    A[Guardrail Trigger] --> B[guardrail_span]
    B --> C[Record triggered status]
    B --> D[Capture span data]
    D --> E[Export to trace provider]
    
    C -->|True| F[Mark span as triggered]
    C -->|False| G[Continue normally]

The guardrail_span function creates spans for monitoring:

def guardrail_span(
    name: str,
    triggered: bool = False,
    span_id: str | None = None,
    parent: Trace | Span[Any] | None = None,
    disabled: bool = False,
) -> Span[GuardrailSpanData]:

Sources: src/agents/tracing/create.py:1-40

Execution Flow

Tool Guardrail Execution

Tool guardrails are executed within the tool execution pipeline:

flowchart TD
    A[Tool Call Invoked] --> B{Input Guardrails exist?}
    B -->|Yes| C[Execute Input Guardrails]
    C --> D{Any trigger raise_exception?}
    D -->|Yes| E[Raise ToolInputGuardrailTripwireTriggered]
    D -->|No| F{Any trigger reject_content?}
    F -->|Yes| G[Replace input with message]
    F -->|No| H[Execute Tool]
    H --> I{Output Guardrails exist?}
    I -->|Yes| J[Execute Output Guardrails]
    J --> K{Any trigger raise_exception?}
    K -->|Yes| L[Raise ToolOutputGuardrailTripwireTriggered]
    K -->|No| M{Any trigger reject_content?}
    M -->|Yes| N[Replace output with message]
    M -->|No| O[Return result]

Sources: src/agents/run_internal/tool_execution.py:50-100

Guardrail Tripwire Exceptions

When a guardrail triggers with raise_exception behavior, specific exception types are raised:

| Exception Type | Triggered By |
|----------------|--------------|
| ToolInputGuardrailTripwireTriggered | Tool input guardrail rejection |
| ToolOutputGuardrailTripwireTriggered | Tool output guardrail rejection |

These exceptions contain both the guardrail reference and the output that triggered it, enabling detailed error handling and debugging.
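The handling pattern looks roughly like the sketch below. The exception class is defined locally to keep the example self-contained; it mirrors the shape described above (carrying the guardrail reference and the triggering output) but is not the SDK's actual class:

```python
class ToolInputGuardrailTripwireTriggered(Exception):
    # Local stand-in: carries the guardrail reference and the output that triggered it
    def __init__(self, guardrail_name: str, output: dict) -> None:
        super().__init__(f"{guardrail_name} tripped")
        self.guardrail_name = guardrail_name
        self.output = output

def run_tool(query: str) -> str:
    # A guardrail check performed before the tool body executes
    if "DROP TABLE" in query:
        raise ToolInputGuardrailTripwireTriggered(
            "sql_injection_check", {"behavior": {"type": "raise_exception"}}
        )
    return f"results for {query!r}"

try:
    run_tool("DROP TABLE users")
except ToolInputGuardrailTripwireTriggered as exc:
    detail = f"blocked by {exc.guardrail_name}"  # detailed error handling
```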

Implementation Pattern

Creating a Custom Guardrail

from agents import Guardrail, RunContextWrapper
from agents.guardrail import (
    GuardrailOutput,
    InputGuardrailOutputData,
    OutputGuardrailOutputData,
)

async def my_guardrail(
    context: RunContextWrapper,
    input_data: InputGuardrailOutputData,
) -> GuardrailOutput:
    text = input_data.agents_input
    if contains_problematic_content(text):
        return GuardrailOutput(
            content_filtered=True,
            policy_name="my_policy",
            policy_version="1.0",
            content="Content filtered due to policy violation",
            behavior={"type": "reject_content", "message": "Content not allowed"},
        )
    return GuardrailOutput(
        content_filtered=False,
        policy_name="my_policy",
        policy_version="1.0",
        content=None,
        behavior={"type": "pass"},
    )

guardrail = Guardrail(
    guardrail_name="my_custom_guardrail",
    guardrail_function=my_guardrail,
)

Using with FunctionTool

from agents import function_tool, ToolInputGuardrail, ToolOutputGuardrail

@function_tool(
    tool_input_guardrails=[
        ToolInputGuardrail(guardrail_function=validate_json_input),
    ],
    tool_output_guardrails=[
        ToolOutputGuardrail(guardrail_function=validate_output_schema),
    ],
)
def process_data(input: str) -> dict:
    # Tool implementation
    pass

Best Practices

  1. Defense in Depth: Layer multiple guardrails at different checkpoints for comprehensive coverage
  2. Fail-Safe Defaults: Configure guardrails to fail closed (reject) rather than open (pass) when uncertain
  3. Logging: Enable guardrail logging for security auditing and debugging
  4. Performance: Keep guardrail logic lightweight to avoid introducing latency
  5. Idempotency: Ensure guardrails produce consistent results for the same input

See Also

  • Agents Overview — General agent architecture
  • Tools — Tool implementation and configuration
  • Tracing — Observability and monitoring
  • Handoffs — Multi-agent handoff mechanisms

Sources: src/agents/run_internal/tool_execution.py:1-50

Handoffs

Related topics: Agents, Agents as Tools, Run Loop and Execution


Overview

Handoffs in the OpenAI Agents Python SDK enable seamless transfer of control and conversation context between different agents. When an agent determines that a task should be handled by another agent, a handoff executes the transition, optionally filtering or transforming the input data before the receiving agent begins processing.

The handoff mechanism serves as the backbone for multi-agent architectures, allowing complex workflows where specialized agents handle specific subtasks while maintaining coherent conversation state across transitions.

Core Concepts

What is a Handoff?

A handoff is a structured mechanism that transfers control from one agent to another. It encapsulates:

  • The destination agent
  • Tool configuration for invoking the handoff
  • Optional input filtering logic
  • Optional type validation for handoff arguments
  • Enable/disable conditions

Sources: src/agents/handoffs/__init__.py:1-100

The Handoff Class

The Handoff class is the primary abstraction for defining agent-to-agent transfers:

class Handoff(Generic[TAgent, TContext]):
    name: str
    description: str
    input_json_schema: dict[str, Any]
    on_invoke_handoff: Callable[[RunContextWrapper[Any], str], Awaitable[TAgent]]
    agent_name: str
    input_filter: HandoffInputFilter | None = None
    is_enabled: bool | Callable[[RunContextWrapper[Any], Agent[TContext]], bool] = True

Sources: src/agents/handoffs/__init__.py:100-130

HandoffInputData

When a handoff is invoked, it receives and processes HandoffInputData:

| Field | Type | Description |
|-------|------|-------------|
| input_history | list[InputItem] | Conversation history up to the handoff point |
| pre_handoff_items | list[RunItem] | Run items generated before handoff |
| input_items | list[InputItem] | Input items to pass to the next agent |
| new_items | list[RunItem] | New items to add to the receiving agent's context |

Sources: src/agents/handoffs/__init__.py:50-80

Architecture

Handoff Flow

graph TD
    A[Current Agent] -->|Determines handoff needed| B[Handoff Tool Call]
    B --> C{is_enabled check}
    C -->|Enabled| D[on_invoke_handoff]
    C -->|Disabled| E[Hide from LLM]
    D --> F[Input Filter Processing]
    F --> G{HandoffInputData}
    G --> H[Next Agent Context]
    H --> I[Receiving Agent]
    
    J[Type Validation] -.->|if input_type provided| F
    K[History Nesting] -.->|if nest_handoff_history enabled| G

Agent Hierarchy with Handoffs

graph TD
    A[Orchestrator Agent] -->|handoff| B[Research Agent]
    A -->|handoff| C[Writer Agent]
    A -->|handoff| D[Review Agent]
    B -->|handoff| E[Web Search Agent]
    B -->|handoff| F[Data Analysis Agent]
    C -->|handoff| D

Configuration Options

Handoff Constructor Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| agent | Agent[TContext] | Yes | - | The destination agent |
| name | str | No | agent.name | Custom name for the handoff tool |
| description | str | No | agent.description | Tool description shown to the model |
| tool_description_override | str | No | None | Override the tool description |
| on_handoff | Callable | No | None | Side effect function executed on handoff |
| input_type | type | No | None | Type for validating handoff arguments |
| input_filter | HandoffInputFilter | No | None | Function to filter/transform inputs |
| nest_handoff_history | bool | No | None | Override run-level history nesting setting |
| is_enabled | bool \| Callable | No | True | Whether the handoff is available |

Sources: src/agents/handoffs/__init__.py:150-200

Input Type Validation

When input_type is provided, the model-generated JSON arguments are validated:

if input_type is not None and on_handoff is None:
    raise UserError("You must provide on_handoff when input_type is provided")

The on_handoff callback must accept two parameters for type-validated inputs:

async def on_handoff(ctx: RunContext, data: ValidatedInputType) -> Agent:
    ...

Sources: src/agents/handoffs/__init__.py:200-220

Enabling/Disabling Handoffs

Handoffs can be conditionally enabled using the is_enabled parameter:

# Static boolean
handoff = Handoff(agent=agent, is_enabled=False)

# Dynamic condition
handoff = Handoff(
    agent=agent,
    is_enabled=lambda ctx, current_agent: ctx.user_id in ADMIN_USERS
)

Disabled handoffs are hidden from the LLM at runtime.

Sources: src/agents/handoffs/__init__.py:180-190

Input Filtering

HandoffInputFilter

The input_filter function receives the entire conversation history and can modify what the next agent receives:

HandoffInputFilter = Callable[
    [HandoffInputData], HandoffInputData | Awaitable[HandoffInputData]
]

Common Filtering Patterns

| Pattern | Use Case |
|---------|----------|
| Remove sensitive data | Strip user credentials before handoff |
| Context summarization | Condense long conversations |
| Tool filtering | Remove tools not needed by next agent |
| History truncation | Keep only recent relevant items |
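As a concrete instance of the history-truncation pattern, the sketch below trims a history field to its most recent items. The dataclass is a simplified stand-in for HandoffInputData so the example stays self-contained:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class HistoryData:
    # Simplified stand-in for HandoffInputData, keeping only the history field
    input_history: tuple

def keep_recent(data: HistoryData, max_items: int = 3) -> HistoryData:
    """History-truncation filter: keep only the most recent items."""
    return replace(data, input_history=data.input_history[-max_items:])

data = HistoryData(input_history=tuple(f"turn-{i}" for i in range(10)))
trimmed = keep_recent(data)
```

Returning a modified copy (rather than mutating in place) matches the dataclasses.replace style used in the filter example later in this section.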

Example Input Filter

def filter_sensitive_inputs(data: HandoffInputData) -> HandoffInputData:
    # Remove tool call outputs containing sensitive info
    filtered_history = [
        item for item in data.input_history
        if not contains_sensitive(item)
    ]
    return dataclasses.replace(data, input_history=filtered_history)

Sources: src/agents/extensions/handoff_filters.py

History Management

Nesting Conversation History

When nest_handoff_history=True, the previous agent's conversation is summarized before being passed to the next agent:

def nest_handoff_history(
    handoff_input_data: HandoffInputData,
    *,
    history_mapper: HandoffHistoryMapper | None = None,
) -> HandoffInputData:
    """Summarize the previous transcript for the next agent."""

This prevents context overflow and provides the new agent with a concise summary rather than full conversation history.

Sources: src/agents/handoffs/history.py:40-60
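The effect of nesting can be pictured with a tiny sketch: the prior transcript is collapsed into a single wrapped block that the next agent receives as one item. This is an illustration of the idea only; the real nest_handoff_history operates on run items, not strings:

```python
START = "<CONVERSATION HISTORY>"
END = "</CONVERSATION HISTORY>"

def summarize_history(turns: list[str]) -> str:
    """Collapse a transcript into one block wrapped in the default markers."""
    body = "\n".join(turns)
    return f"{START}\n{body}\n{END}"

nested = summarize_history(["user: book a flight", "agent: which city?"])
```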

Conversation History Wrappers

Default markers wrap nested conversation summaries:

| Marker | Default Value |
|--------|---------------|
| Start | <CONVERSATION HISTORY> |
| End | </CONVERSATION HISTORY> |

These can be customized:

set_conversation_history_wrappers(
    start="<PREVIOUS AGENT TRANSCRIPT>",
    end="</PREVIOUS AGENT TRANSCRIPT>"
)

Sources: src/agents/handoffs/history.py:20-40

Creating Handoffs

Basic Handoff

from agents import Agent, Handoff, Runner

agent_a = Agent(name="Agent A", instructions="...")
agent_b = Agent(name="Agent B", instructions="...")

# Create handoff
handoff_to_b = Handoff(name="transfer_to_b", agent=agent_b)

# Add to source agent
agent_a.handoffs.append(handoff_to_b)

Handoff with Callbacks

async def on_transfer_to_b(ctx: RunContext, input_data: str) -> Agent:
    # Log the handoff
    logger.info(f"Handoff triggered by user: {ctx.user_id}")
    # Return destination agent
    return agent_b

handoff_to_b = Handoff(
    agent=agent_b,
    name="transfer_to_b",
    on_handoff=on_transfer_to_b
)

Handoff with Type Validation

from pydantic import BaseModel

class TransferData(BaseModel):
    reason: str
    priority: int = 1

async def handle_transfer(ctx: RunContext, data: TransferData) -> Agent:
    if data.priority > 5:
        return urgent_agent
    return standard_agent

handoff = Handoff(
    agent=standard_agent,
    input_type=TransferData,
    on_handoff=handle_transfer
)

Handoffs in the Run Loop

Turn Resolution with Handoffs

When a handoff is triggered during agent execution:

sequenceDiagram
    participant Agent as Current Agent
    participant Run as Run Loop
    participant Handoff as Handoff Handler
    
    Agent->>Run: Generate response with handoff tool call
    Run->>Handoff: Process NextStepHandoff
    Handoff->>Handoff: Validate input_type if provided
    Handoff->>Handoff: Execute input_filter
    Handoff->>Handoff: Call on_handoff callback
    Handoff-->>Run: Return new agent and filtered input
    Run->>Run: Reset current agent
    Run->>Run: Start next turn with new agent

Sources: src/agents/run.py:200-250

Handoff Result Processing

The run loop handles handoff transitions:

elif isinstance(turn_result.next_step, NextStepHandoff):
    current_agent = cast(Agent[TContext], turn_result.next_step.new_agent)
    # Next agent starts with the nested/filtered input
    starting_input = turn_result.original_input
    original_input = turn_result.original_input
    should_run_agent_start_hooks = True

Sources: src/agents/run.py:230-245

Prompt Integration

Handoff Tool Representation

Handoffs appear as tools to the LLM with descriptions generated from the handoff configuration:

Tool names follow the format transfer_to_{agent_name}. The generated tool description includes:

  • The handoff name
  • The agent description
  • The input schema, if defined
  • The custom tool_description_override, if provided

Sources: src/agents/extensions/handoff_prompt.py

Prompt Instructions

The system prompt can include handoff guidance:

- When a task matches another agent's expertise, use the handoff tool
- Explain the reason for handoff in your response
- Preserve relevant context during transfer

Best Practices

Design Principles

  1. Clear Agent Specialization: Each agent should have a distinct responsibility
  2. Minimal Handoff Arguments: Pass only essential data, not entire conversations
  3. Meaningful Handoff Names: Use descriptive names that indicate the destination
  4. Appropriate History Management: Enable nesting for long conversations

Error Handling

| Scenario | Recommended Approach |
|----------|----------------------|
| Handoff to unavailable agent | Check is_enabled before showing to model |
| Invalid input type | Use Pydantic validation with clear error messages |
| Filter failure | Return original input with warning |

Performance Considerations

  • Avoid complex filters that run synchronously on large histories
  • Use is_enabled callbacks to prevent unnecessary tool calls
  • Consider disabling history nesting for high-frequency handoffs

Key Source Files

| Component | File | Purpose |
|-----------|------|---------|
| Handoff class | src/agents/handoffs/__init__.py | Core handoff definition |
| HandoffInputData | src/agents/handoffs/__init__.py | Input data structure |
| nest_handoff_history | src/agents/handoffs/history.py | History summarization |
| HandoffInputFilter | src/agents/extensions/handoff_filters.py | Input filtering utilities |
| Handoff prompt integration | src/agents/extensions/handoff_prompt.py | Prompt rendering |

Summary

Handoffs provide a robust mechanism for multi-agent orchestration in the OpenAI Agents Python SDK. Key capabilities include:

  • Structured Transfer: Defined handoff contracts with optional type validation
  • Flexible Input Management: Filtering and transformation before agent handoff
  • History Control: Nesting or truncating conversation context
  • Conditional Execution: Enable/disable based on runtime conditions
  • Callback Support: Side effects and logging during transitions

These mechanisms enable complex agent workflows while maintaining clean separation of concerns and manageable context sizes.

Sources: src/agents/handoffs/__init__.py:1-100

Agents as Tools

Related topics: Handoffs, Agents


Agents as Tools is a powerful architectural pattern in the openai-agents-python library that enables one agent to be invoked as a callable tool by another agent. This pattern allows for sophisticated multi-agent orchestration where specialized agents can be dynamically called with specific inputs, returning structured results to the calling agent.

Overview

In the traditional agent architecture, agents operate as standalone units that receive input, execute tasks, and return results. The "Agents as Tools" pattern extends this by wrapping agents inside function tool abstractions, enabling:

  • Dynamic Agent Invocation: Agents can be called like functions within other agents' workflows
  • Structured Inputs and Outputs: Typed interfaces ensure consistent data exchange between agents
  • Conditional Execution: Agents can be invoked based on specific conditions or input patterns
  • Parallel Tool Calls: Multiple agents can be called simultaneously as tools
  • Nested Architectures: Complex hierarchies of agents calling sub-agents as tools

This pattern is particularly valuable for building research assistants, customer service systems, and specialized workflow engines where different capabilities need to be composed dynamically.
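Stripped to its essence, the pattern is just wrapping one agent behind a tool-shaped interface. The sketch below models an "agent" as a plain callable so the structure is visible without the SDK; the SDK provides a richer equivalent on real Agent objects:

```python
from typing import Callable

def agent_as_tool(agent_fn: Callable[[str], str], name: str, description: str) -> dict:
    """Wrap an agent-like callable in a tool-style record (illustration only)."""
    return {"name": name, "description": description, "invoke": agent_fn}

def summarizer_agent(text: str) -> str:
    # Pretend agent: return the first sentence as the "summary"
    return text.split(".")[0] + "."

tool = agent_as_tool(summarizer_agent, "summarize", "Summarize a passage")
result = tool["invoke"]("Agents compose well. They can call each other.")
```

The calling agent sees only the tool record, never the wrapped agent's internals, which is what makes nesting and parallel invocation straightforward.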

Architecture

graph TD
    subgraph "Primary Agent"
        PA[Main Agent]
        PA -->|has tools| T1[Agent-as-Tool 1]
        PA -->|has tools| T2[Agent-as-Tool 2]
        PA -->|has tools| Tn[Agent-as-Tool N]
    end
    
    subgraph "Wrapped Agents"
        T1 -->|wraps| A1[Specialized Agent 1]
        T2 -->|wraps| A2[Specialized Agent 2]
        Tn -->|wraps| An[Specialized Agent N]
    end
    
    A1 -->|returns| T1
    A2 -->|returns| T2
    An -->|returns| Tn
    T1 -->|tool result| PA
    T2 -->|tool result| PA
    Tn -->|tool result| PA

Core Components

| Component | Role | Location |
|-----------|------|----------|
| Agent | Base agent with instructions, tools, handoffs | src/agents/agent.py |
| FunctionTool | Wraps callable functions for agent use | Tool infrastructure |
| Runner | Executes agents and manages tool calls | src/agents/run.py |
| Handoff | Enables agent-to-agent transfers | src/agents/handoffs/__init__.py |

Implementation Patterns

Basic Agent-to-Tool Conversion

The simplest form of this pattern converts an existing agent into a callable tool:

from agents import Agent, Runner, function_tool

# Create a specialized agent
search_agent = Agent(
    name="web_searcher",
    instructions="You are a web search expert. Search for the given query and summarize results.",
    tools=[web_search_tool],
)

# Convert to a function tool that the primary agent can use
@function_tool
async def search_tool(query: str) -> str:
    """Search the web for information."""
    result = await Runner.run(search_agent, input=query)
    return result.final_output

AgentTool with Structured Output

For more sophisticated scenarios, agents can be wrapped with explicit input/output schemas:

from agents import Agent
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str
    summary: str

search_agent = Agent(
    name="structured_searcher",
    instructions="Search for information and return structured results.",
    output_type=SearchResult,
)

Conditional Agent Invocation

Agents can be configured to only be available under certain conditions:

from agents import Agent

admin_agent = Agent(
    name="admin_panel",
    instructions="Handle administrative tasks.",
)

# Conditionally expose the agent as a tool; the LLM only sees it when enabled.
def is_admin(context, agent) -> bool:
    # `user_role` on the run context is illustrative
    return context.context.user_role == "admin"

admin_tool = admin_agent.as_tool(
    tool_name="admin_panel",
    tool_description="Handle administrative tasks.",
    is_enabled=is_admin,
)

Usage Examples

Research Assistant Pattern

A common use case is a research bot with specialized sub-agents:

sequenceDiagram
    participant User
    participant Planner as Planner Agent
    participant Search as Search Agent (Tool)
    participant Writer as Writer Agent
    
    User->>Planner: "Research topic: AI trends"
    Planner->>Planner: Generate search queries
    Planner->>Search: tool_call(search_queries[0])
    Planner->>Search: tool_call(search_queries[1])
    Planner->>Search: tool_call(search_queries[n])
    Search-->>Planner: SearchResult
    Planner->>Writer: Pass summaries
    Writer-->>User: Final report

Example: Agent Patterns in Code

The repository includes several agent pattern examples demonstrating this functionality:

Basic Pattern (examples/agent_patterns/agents_as_tools.py):

# Agents are wrapped as tools and called by a primary agent
primary_agent = Agent(
    name="orchestrator",
    instructions="Coordinate specialized agents to answer user queries.",
    tools=[search_agent_as_tool, code_agent_as_tool],
)

Conditional Pattern (examples/agent_patterns/agents_as_tools_conditional.py):

# Agents are conditionally available based on context
if user.is_premium:
    primary_agent.tools.append(premium_agent_tool)

Structured Pattern (examples/agent_patterns/agents_as_tools_structured.py):

# Agents return structured data types. The wrapped agent runs asynchronously,
# and its validated final_output is returned as the tool result.
@function_tool
async def get_weather(location: str) -> WeatherData:
    """Get weather for a location."""
    result = await Runner.run(weather_agent, input=location)
    return result.final_output

Configuration Options

Tool Metadata Configuration

When converting an agent to a tool, you can override the default tool behavior:

| Parameter | Type | Purpose |
|---|---|---|
| name | str | Override the tool name shown to the LLM |
| description | str | Human-readable description of what the tool does |
| input_type | Type[BaseModel] | Pydantic model for input validation |
| output_type | Type[BaseModel] | Pydantic model for output schema |
| is_enabled | bool \| Callable | Condition for tool availability |

Agent Configuration

Agents used as tools support standard agent parameters:

| Parameter | Description |
|---|---|
| instructions | System prompt for the agent |
| tools | Additional tools available to the agent |
| handoffs | Agents the sub-agent can transfer to |
| output_type | Expected output type |
| model | Specific model to use |

Execution Flow

flowchart LR
    A[Primary Agent] -->|decides to call| B[Agent-as-Tool]
    B -->|parses input| C{Input Validation}
    C -->|valid| D[Execute Wrapped Agent]
    C -->|invalid| E[Return Error]
    D -->|run agent| F[Runner.run]
    F -->|collect results| G[Format Output]
    G -->|return| B
    B -->|tool result| A

Integration with Handoffs

The Agents as Tools pattern complements the handoff mechanism:

| Aspect | Agents as Tools | Handoffs |
|---|---|---|
| Control Flow | Agent calls tool, waits for result | Agent transfers control completely |
| State | Shared context | Fresh context for new agent |
| Use Case | Parallel specialized tasks | Sequential role switches |
| Return | Structured result | Handoff message |

Sources: src/agents/handoffs/__init__.py

Best Practices

  1. Clear Tool Descriptions: Provide explicit descriptions so the LLM knows when to invoke the agent
  2. Typed Interfaces: Use Pydantic models for input/output to ensure type safety
  3. Error Handling: Wrap agent executions in try-catch to handle failures gracefully
  4. Context Management: Pass relevant context to sub-agents without overwhelming them
  5. Conditional Enabling: Use is_enabled to control access based on user permissions
Related Patterns

  • Handoffs: Complete agent-to-agent transfer for distinct roles
  • Multi-Agent Orchestration: Coordinated multi-agent workflows
  • Sandbox Agents: Isolated execution environments for agents
  • Guardrails: Input/output validation for agent tool calls

Sources: examples/sandbox/handoffs.py

Source: https://github.com/openai/openai-agents-python / Human Manual

Run Loop and Execution

Related topics: Agents, Sessions and Memory


The Run Loop and Execution system is the core engine of the openai-agents-python SDK. It orchestrates the interaction between agents, language models, tools, and external systems through an iterative turn-based processing architecture.

Overview

The execution model follows a turn-based loop where each turn consists of:

  1. Turn Preparation - Setting up context, hooks, and session state
  2. Model Invocation - Calling the language model with the current input
  3. Response Processing - Parsing and validating model outputs
  4. Tool Execution - Running any tools or side effects requested by the model
  5. Turn Resolution - Determining the next step (continue, handoff, or finish)

Sources: src/agents/run.py:1-50
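The five phases above can be sketched as a minimal, self-contained loop. Everything here is illustrative, not SDK internals: the fake model first requests a tool call and then produces a final message, and the `TOOLS` registry stands in for the SDK's tool execution machinery.

```python
# Minimal sketch of the turn-based loop: prepare -> call model -> process
# response -> execute tools -> resolve next step.
def fake_model(messages, turn):
    """Stand-in model: requests a tool on turn 0, then answers."""
    if turn == 0:
        return {"type": "tool_call", "tool": "add", "args": (2, 3)}
    return {"type": "message", "content": f"The answer is {messages[-1]['content']}"}


TOOLS = {"add": lambda a, b: a + b}


def run_loop(user_input, max_turns=5):
    messages = [{"role": "user", "content": user_input}]
    for turn in range(max_turns):                            # 1. turn preparation
        response = fake_model(messages, turn)                # 2. model invocation
        if response["type"] == "tool_call":                  # 3. response processing
            result = TOOLS[response["tool"]](*response["args"])  # 4. tool execution
            messages.append({"role": "tool", "content": result})
            continue                                         # 5. resolution: run again
        return response["content"]                           # 5. resolution: finish
    raise RuntimeError("max turns exceeded")


final = run_loop("what is 2 + 3?")
```

The real loop adds hooks, guardrails, handoffs, and session persistence around the same skeleton.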

Architecture Components

Core Execution Flow

graph TD
    A[User Input] --> B[Run Loop Entry]
    B --> C[Turn Preparation]
    C --> D[Call Model]
    D --> E{Response Type?}
    E -->|Tool Calls| F[Execute Tools]
    E -->|Handoff| G[Switch Agent]
    E -->|Message| H[Finalize Output]
    F --> C
    G --> C
    H --> I[Return RunResult]

Key Modules

| Module | Purpose | Key Classes/Functions |
|---|---|---|
| run.py | Main entry point | run(), run_sync() |
| run_loop.py | Core loop logic | run_loop() |
| turn_preparation.py | Turn setup | Input filtering, hook invocation |
| turn_resolution.py | Response handling | Tool result processing, output finalization |
| tool_execution.py | Tool runner | execute_tools_and_side_effects() |
| streaming.py | Streaming support | Stream handlers |

Sources: src/agents/run.py:1-30

Run Configuration

RunOptions

The RunOptions TypedDict defines all parameters for running an agent:

class RunOptions(TypedDict, Generic[TContext]):
    context: NotRequired[TContext | None]
    max_turns: NotRequired[int | None]
    hooks: NotRequired[RunHooks[TContext] | None]
    run_config: NotRequired[RunConfig | None]
    previous_response_id: NotRequired[str | None]
    auto_previous_response_id: NotRequired[bool]
    conversation_id: NotRequired[str | None]
    session: NotRequired[Session | None]
    error_handlers: NotRequired[RunErrorHandlers[TContext] | None]

Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_turns | int \| None | None | Maximum turns; None disables limit |
| context | TContext \| None | None | Custom context object |
| hooks | RunHooks[TContext] | None | Lifecycle hooks |
| run_config | RunConfig | None | Runtime configuration |
| session | Session | None | Session for state persistence |
| error_handlers | RunErrorHandlers | None | Error callback handlers |

Sources: src/agents/run_config.py:50-75

Turn Processing

Turn Resolution

The turn_resolution.py module handles processing model responses after tool execution:

tool_final_output = await _maybe_finalize_from_tool_results(
    public_agent=public_agent,
    original_input=original_input,
    new_response=new_response,
    pre_step_items=pre_step_items,
    new_step_items=new_step_items,
    function_results=function_results,
    hooks=hooks,
    context_wrapper=context_wrapper,
    tool_input_guardrail_results=tool_input_guardrail_results,
    tool_output_guardrail_results=tool_output_guardrail_results,
)

Message Output Extraction

The ItemHelpers class provides utilities for extracting content from model responses:

@classmethod
def extract_refusal(cls, message: TResponseOutputItem) -> str | None:
    """Extracts refusal content from a message, if any."""
    if not isinstance(message, ResponseOutputMessage):
        return None
    refusal = ""
    for content_item in message.content:
        if isinstance(content_item, ResponseOutputRefusal):
            refusal += content_item.refusal or ""
    return refusal or None

Refusal Handling

When the model refuses to respond, a ModelRefusalError is raised:

if refusal:
    refusal_error = ModelRefusalError(refusal)
    run_error_data = build_run_error_data(...)

Sources: src/agents/run_internal/turn_resolution.py:25-45

Agent Handoffs

Handoff Processing

The run loop handles agent handoffs through the NextStepHandoff type:

elif isinstance(turn_result.next_step, NextStepHandoff):
    current_agent = cast(Agent[TContext], turn_result.next_step.new_agent)
    if run_state is not None:
        run_state._current_agent = current_agent
    starting_input = turn_result.original_input
    original_input = turn_result.original_input
    current_span.finish(reset_current=True)
    should_run_agent_start_hooks = True

Loop Continuation

For cases requiring another iteration without switching agents:

elif isinstance(turn_result.next_step, NextStepRunAgain):
    await save_turn_items_if_needed(
        session=session,
        run_state=run_state,
        session_persistence_enabled=session_persistence_enabled,
        items=session_items_for_turn(turn_result),
        response_id=turn_result.model_response.response_id,
        store=store_setting,
    )
    continue

Sources: src/agents/run.py:150-180

Result Types

RunResult Structure

| Field | Type | Description |
|---|---|---|
| last_agent | Agent | Final agent that produced output |
| new_items | list[RunItem] | All items from the run |
| final_output | Response | Final model response |
| raw_responses | list[RawResponsesFromModel] | Raw model outputs |

Tool Output Handling

Tool outputs are processed through multiple stages:

  1. Pre-step items - State before tool execution
  2. New step items - State after tool execution
  3. Function results - Structured tool call results

The system tracks tool activity without messages using:

has_tool_activity_without_message = not message_items and bool(
    processed_response.tools_used
)

Sources: src/agents/run_internal/turn_resolution.py:35-40

Input Processing

Input Conversion

The ItemHelpers class handles input normalization:

@classmethod
def input_to_new_input_list(
    cls, input: str | list[TResponseInputItem]
) -> list[TResponseInputItem]:
    """Converts a string or list of input items into a list of input items."""
    if isinstance(input, str):
        return [{"content": input, "role": "user"}]
    return cast(list[TResponseInputItem], _to_dump_compatible(input))
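The normalization behavior can be confirmed with a simplified standalone copy of the helper (dropping the SDK-specific cast and dump conversion):

```python
def input_to_new_input_list(input):
    """Simplified copy of ItemHelpers.input_to_new_input_list, for illustration."""
    if isinstance(input, str):
        # A bare string becomes a single user message
        return [{"content": input, "role": "user"}]
    # A list of input items passes through unchanged
    return list(input)


as_list = input_to_new_input_list("hello")
passthrough = input_to_new_input_list([{"content": "hi", "role": "user"}])
```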

Text Extraction

Concatenate all text content from message output items:

@classmethod
def text_message_outputs(cls, items: list[RunItem]) -> str:
    """Concatenates all the text content from a list of message output items."""
    text = ""
    for item in items:
        if isinstance(item, MessageOutputItem):
            text += cls.text_message_output(item)
    return text

Sources: src/agents/items.py:60-90

Error Handling

Error Flow

graph TD
    A[Error Occurs] --> B{Error Type?}
    B -->|Refusal| C[ModelRefusalError]
    B -->|Tool Failure| D[ToolExecutionError]
    B -->|Max Turns| E[MaxTurnsExceededError]
    B -->|Other| F[Generic Error Handler]
    C --> G[Build Error Data]
    D --> G
    E --> G
    F --> G
    G --> H[Return Error Result]

Error Handlers Configuration

Custom error handlers can be registered per error kind:

error_handlers: RunErrorHandlers[TContext] | None

The system supports typed error handling where handlers are keyed by error category.

Sources: src/agents/run_config.py:60-65
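The keyed-handler idea can be sketched without the SDK. The error classes, handler shape, and dispatch function below are illustrative stand-ins, not the SDK's `RunErrorHandlers` API:

```python
class MaxTurnsExceeded(Exception):
    """Illustrative stand-in for a max-turns error."""


class ModelRefusal(Exception):
    """Illustrative stand-in for a model refusal error."""


# Handlers keyed by error category; unhandled kinds re-raise.
handlers = {
    MaxTurnsExceeded: lambda exc: "stopped: turn limit reached",
    ModelRefusal: lambda exc: f"stopped: model refused ({exc})",
}


def run_with_handlers(fn, handlers):
    try:
        return fn()
    except Exception as exc:
        handler = handlers.get(type(exc))
        if handler is None:
            raise  # no handler registered for this kind
        return handler(exc)


def failing_run():
    raise MaxTurnsExceeded()


outcome = run_with_handlers(failing_run, handlers)
```

Registering a handler per error kind lets a run degrade to a structured result instead of propagating the exception.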

Session Persistence

Save Turn Items

The run loop persists state after each turn when session is enabled:

await save_turn_items_if_needed(
    session=session,
    run_state=run_state,
    session_persistence_enabled=session_persistence_enabled,
    items=session_items_for_turn(turn_result),
    response_id=turn_result.model_response.response_id,
    store=store_setting,
)

Parameters

| Parameter | Type | Description |
|---|---|---|
| session | Session \| None | Active session instance |
| run_state | RunState \| None | Current run state |
| session_persistence_enabled | bool | Whether persistence is active |
| items | list[RunItem] | Items to persist |
| response_id | str | Model response ID |
| store | StoreSetting | Storage configuration |

Sources: src/agents/run.py:160-170

Streaming Support

The system supports streaming model outputs through the streaming module. Streaming is configured via RunConfig and allows real-time output handling without waiting for complete responses.
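The consumption shape resembles iterating an async event stream. The SDK's actual streaming entry point and event types are not reproduced here; this stand-in generator just shows how incremental deltas are accumulated as they arrive:

```python
import asyncio


async def fake_event_stream():
    # Stand-in for a model's incremental output events
    for delta in ["Hel", "lo", ", ", "world"]:
        yield {"type": "output_text_delta", "delta": delta}


async def consume() -> str:
    parts = []
    async for event in fake_event_stream():
        if event["type"] == "output_text_delta":
            parts.append(event["delta"])  # handle each chunk as it arrives
    return "".join(parts)


streamed_text = asyncio.run(consume())
```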

Lifecycle Hooks

Available Hooks

| Hook | Trigger | Purpose |
|---|---|---|
| on_agent_start | Agent turn begins | Initialize agent-specific state |
| on_agent_end | Agent turn ends | Cleanup or logging |
| on_tool_call | Tool invocation | Logging or monitoring |
| on_handoff | Agent switch | Track transitions |

Hooks receive RunContextWrapper and relevant context data, enabling deep customization of the execution flow.

Sources: src/agents/run_config.py:35-45
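A standalone sketch shows the hook-object shape. The class name and synchronous call sites here are illustrative; the SDK's hooks are async methods that also receive a context wrapper:

```python
class LoggingHooks:
    """Illustrative hook object; records lifecycle events for inspection."""

    def __init__(self):
        self.events = []

    def on_agent_start(self, agent_name):
        self.events.append(("agent_start", agent_name))

    def on_tool_call(self, tool_name):
        self.events.append(("tool_call", tool_name))

    def on_agent_end(self, agent_name):
        self.events.append(("agent_end", agent_name))


def run_turn(hooks):
    # A runner would invoke the hooks at each phase of the turn
    hooks.on_agent_start("triage")
    hooks.on_tool_call("search")
    hooks.on_agent_end("triage")


hooks = LoggingHooks()
run_turn(hooks)
```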

Summary

The Run Loop and Execution system provides:

  • Iterative Processing: Turn-based model interaction with tool execution
  • Flexible Configuration: Extensive options via RunOptions and RunConfig
  • Agent Orchestration: Seamless handoff between agents
  • Error Resilience: Typed error handlers and refusal detection
  • Session Management: Persistent state across turns
  • Lifecycle Hooks: Customization at every execution stage

The architecture prioritizes extensibility, allowing developers to hook into any phase of execution while maintaining a clear, predictable flow from input to final output.

Sources: src/agents/run.py:1-50

Sessions and Memory

Related topics: Run Loop and Execution


Overview

The Sessions and Memory system in the openai-agents-python library provides persistent conversation state management for AI agents. This system enables agents to maintain context across multiple interactions, store conversation history, and access previously learned information through a flexible session abstraction layer.

The architecture is built around a protocol-based design that allows different storage backends while maintaining a consistent interface. Sessions track conversation items, manage agent handoffs, and enable memory persistence for sandboxed agent environments.

Sources: src/agents/extensions/memory/__init__.py:1-8

Architecture

Session Protocol

The core of the session system is the Session protocol, which defines the contract for all session implementations. This allows developers to swap storage backends without changing application code.

graph TD
    A[Agent Run] --> B[Session Protocol]
    B --> C[SQLiteSession]
    B --> D[AsyncSQLiteSession]
    B --> E[AdvancedSQLiteSession]
    B --> F[EncryptedSession]
    B --> G[RedisSession]
    B --> H[SQLAlchemySession]
    B --> I[MongoDBSession]
    B --> J[DaprSession]

Sources: src/agents/extensions/memory/__init__.py:1-30

Memory Capability in Sandboxes

Sandbox agents have a dedicated memory capability that provides context from previous sessions. The Memory class in the sandbox capabilities layer enables agents to read and write persistent memory.

graph LR
    A[SandboxAgent] -->|requires| B[Memory Capability]
    B --> C[read: MemoryReadConfig]
    B --> D[generate: MemoryGenerateConfig]
    B --> E[layout: MemoryLayout]

The memory system requires either read or generate configuration to be meaningful. When read.live_update is enabled, the capability requires both filesystem and shell capabilities; otherwise, only shell is required.

Sources: src/agents/sandbox/capabilities/memory.py:1-30

Session Persistence Layer

Session Lifecycle

Sessions manage the persistence of conversation state through a structured workflow:

sequenceDiagram
    participant Agent as Agent Run
    participant Session as Session Store
    participant Sandbox as Sandbox Session
    
    Agent->>Session: Create/Resume Session
    Session-->>Agent: Session ID
    Agent->>Sandbox: Initialize Workspace
    loop Turn Processing
        Agent->>Sandbox: Execute Tool
        Sandbox-->>Agent: Tool Result
        Agent->>Session: Save Turn Items
        Session-->>Agent: Acknowledge
    end
    Agent->>Session: Finalize Session

Turn Item Persistence

During agent execution, each turn generates items that must be persisted:

  • input: Current segment user input
  • generated_items: Memory-relevant assistant and tool items
  • terminal_metadata: Completion/failure state
  • final_output: Final segment output when available

Sources: src/agents/sandbox/memory/prompts/rollout_extraction_user_message.md:1-20

Memory Rollout Extraction

When an agent session completes, the system can extract a structured memory summary for future reference. This process is handled by the rollout extraction prompt system.

JSON Output Schema

The extraction produces JSON with three fields:

| Field | Type | Description |
|---|---|---|
| raw_memory | string | Raw memory content from the session |
| rollout_summary | string | Generated summary of the session |
| rollout_slug | string | Short identifier (empty string if unknown) |

Sources: src/agents/sandbox/memory/prompts/rollout_extraction_user_message.md:1-25

Memory Summary Path

The memory system reads summaries from a configurable path within the sandbox workspace:

memory_summary_path = Path(layout.memories_dir) / "memory_summary.md"

The memory summary is truncated to a maximum token limit (_MEMORY_SUMMARY_MAX_TOKENS) to ensure efficient processing.

Sources: src/agents/sandbox/capabilities/memory.py:50-65
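The truncation step can be approximated with a character budget. The SDK's actual tokenizer and `_MEMORY_SUMMARY_MAX_TOKENS` value are not reproduced here; 4 characters per token is a rough heuristic:

```python
def truncate_summary(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Rough character-budget truncation standing in for a token limit."""
    budget = max_tokens * chars_per_token
    if len(text) <= budget:
        return text
    # Keep the head of the summary and mark the cut point
    return text[:budget].rstrip() + "\n[... truncated ...]"


short = truncate_summary("brief note", max_tokens=100)
clipped = truncate_summary("x" * 1000, max_tokens=10)
```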

Workspace Sink System

The WorkspaceSink class manages buffered writes to the sandbox workspace, providing a layer between agent operations and persistent storage.

Flush Strategy

The sink implements intelligent flushing based on several conditions:

graph TD
    A[Should Flush?] --> B{Seen count % flush_every == 0}
    A --> C{Operation: persist_workspace start}
    A --> D{Operation: stop}
    A --> E{Operation: shutdown start}
    B -->|Yes| F[Flush to workspace]
    C -->|Yes| F
    D -->|Yes| F
    E -->|Yes| F
    B -->|No| G{Check running state}
    G -->|Running| F
    G -->|Not running| H[Defer flush]

Flush conditions include:

  • Periodic flush based on event count
  • Explicit persist workspace operations
  • Session stop and shutdown events

Sources: src/agents/sandbox/session/sinks.py:1-40
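The decision logic reduces to a pure predicate. The operation names come from the diagram above; the running-state fallback is simplified here, and the function is an illustration rather than the sink's actual code:

```python
def should_flush(*, seen: int, flush_every: int, operation: str, running: bool) -> bool:
    """Illustrative flush predicate mirroring the conditions above."""
    # Explicit lifecycle operations always flush
    if operation in {"persist_workspace", "stop", "shutdown"}:
        return True
    # Periodic flush on every Nth event, but only while the session is running
    if flush_every > 0 and seen % flush_every == 0:
        return running
    return False


periodic = should_flush(seen=20, flush_every=10, operation="event", running=True)
deferred = should_flush(seen=20, flush_every=10, operation="event", running=False)
on_stop = should_flush(seen=7, flush_every=10, operation="stop", running=False)
```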

Workspace Persistence

The sink handles reading existing outbox content before writing new data, ensuring append-style semantics for workspace files. If no existing outbox is found, it marks the outbox as loaded and proceeds with new writes.

Sources: src/agents/sandbox/session/sinks.py:60-85

Error Handling

The session system defines specific error types for workspace operations:

Error Hierarchy

| Error Class | Code | Purpose |
|---|---|---|
| WorkspaceIOError | - | Base class for workspace read/write errors |
| ApplyPatchPathError | APPLY_PATCH_INVALID_PATH | Invalid path (absolute, escape root, or empty) |
| ApplyPatchDiffError | - | Malformed patch diff |
| ExecNonZeroError | - | Non-zero exit code from exec operations |
| InvalidManifestPathError | - | Path resolution failed in manifest context |

Path Validation

The system validates relative paths to prevent directory traversal attacks:

def _validate_relative_path(*, name: str, path: Path) -> None:
    if path.is_absolute():
        raise ValueError(f"{name} must be relative")
    if ".." in path.parts:
        raise ValueError(f"{name} must not escape root")
    if path.parts in [(), (".",)]:
        raise ValueError(f"{name} must be non-empty")

Sources: src/agents/sandbox/errors.py:1-50
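A standalone copy of the validator can be exercised to confirm the three rejection cases (absolute path, parent-directory escape, empty path):

```python
from pathlib import Path


def _validate_relative_path(*, name: str, path: Path) -> None:
    # Self-contained copy of the validator shown above
    if path.is_absolute():
        raise ValueError(f"{name} must be relative")
    if ".." in path.parts:
        raise ValueError(f"{name} must not escape root")
    if path.parts in [(), (".",)]:
        raise ValueError(f"{name} must be non-empty")


def rejects(path: Path) -> bool:
    """Return True if the validator raises for the given path."""
    try:
        _validate_relative_path(name="memories_dir", path=path)
    except ValueError:
        return True
    return False


ok = not rejects(Path("memory/summaries"))
```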

Session Handoff History

When agents hand off to other agents, the system can summarize conversation history for the receiving agent. This is managed by the handoff history module.

History Normalization

The system normalizes input history and flattens nested messages before creating summaries. Items like ToolApprovalItem are filtered out as they shouldn't be forwarded.

graph LR
    A[Handoff Input] --> B[Normalize History]
    B --> C[Flatten Nested Messages]
    C --> D[Filter Tool Approvals]
    D --> E[Convert to Plain Inputs]
    E --> F[Generate Transcript Summary]

Sources: src/agents/handoffs/history.py:1-60

History Markers

The conversation history uses customizable markers for wrapping summaries:

| Variable | Default |
|---|---|
| _conversation_history_start | `<CONVERSATION HISTORY>` |
| _conversation_history_end | `</CONVERSATION HISTORY>` |

These can be overridden at runtime using set_conversation_history_wrappers().

Sources: src/agents/handoffs/history.py:1-50
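The wrapping behavior can be sketched with the default markers. The module-level variables stand in for the SDK's configurable state; `wrap_history` is an illustrative helper, not the SDK's function:

```python
# Defaults from the table above; the SDK lets callers override these via
# set_conversation_history_wrappers() (sketched here as plain module state).
_conversation_history_start = "<CONVERSATION HISTORY>"
_conversation_history_end = "</CONVERSATION HISTORY>"


def wrap_history(summary: str) -> str:
    """Wrap a transcript summary in the conversation-history markers."""
    return f"{_conversation_history_start}\n{summary}\n{_conversation_history_end}"


wrapped = wrap_history("User asked about refunds; agent escalated.")
```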

Extension Memory Backends

The library includes several optional session backends that require additional dependencies:

Available Backends

| Backend | Package | Features |
|---|---|---|
| SQLiteSession | Built-in | Basic SQLite persistence |
| AsyncSQLiteSession | Built-in | Async SQLite operations |
| AdvancedSQLiteSession | Built-in | Advanced SQLite features |
| EncryptedSession | cryptography | Encryption at rest |
| RedisSession | redis | Distributed session management |
| SQLAlchemySession | sqlalchemy | ORM integration |
| MongoDBSession | mongodb | Document store backend |
| DaprSession | dapr | Dapr state store integration |

Sources: src/agents/extensions/memory/__init__.py:1-50

Lazy Loading

Extensions use lazy imports to avoid requiring all dependencies when not needed:

_LAZY_EXPORTS: dict[str, tuple[str, tuple[str, str] | None]] = {
    "EncryptedSession": (".encrypt_session", ("cryptography", "encrypt")),
    "RedisSession": (".redis_session", ("redis", "redis")),
    ...
}

This pattern ensures that optional dependencies are only loaded when the specific backend is used.

Sources: src/agents/extensions/memory/__init__.py:1-50
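The pattern can be demonstrated with importlib. This demo maps export names to stdlib targets so it runs anywhere; the real registry maps to optional third-party backends, and the SDK wires the lookup into a module-level `__getattr__`:

```python
import importlib

# Export name -> (module path, attribute). Stdlib targets keep the demo
# runnable; the registry contents are hypothetical.
_LAZY_EXPORTS = {
    "OrderedDict": ("collections", "OrderedDict"),
    "JSONDecoder": ("json", "JSONDecoder"),
}


def lazy_get(name: str):
    """Resolve an export on first access, mimicking module-level __getattr__."""
    try:
        module_path, attr = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(name) from None
    module = importlib.import_module(module_path)  # imported only when needed
    return getattr(module, attr)


decoder_cls = lazy_get("JSONDecoder")
```

Unknown names raise AttributeError, matching normal attribute-lookup semantics.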

Configuration

Session Settings

Sessions are configured through SessionSettings which control:

  • Storage backend selection
  • Connection parameters
  • Persistence strategies
  • Compaction policies (for OpenAI responses backend)

Memory Layout

For sandbox memory, the MemoryLayout class specifies directory structure:

| Setting | Description |
|---|---|
| memories_dir | Directory for stored memories |
| sessions_dir | Directory for session data |

Both paths must be relative to the sandbox workspace root to prevent escape vulnerabilities.

Sources: src/agents/sandbox/capabilities/memory.py:20-35

Usage Patterns

Basic Session Usage

from agents.memory import SQLiteSession

session = SQLiteSession(session_id="user-123")
await session.initialize()

# Run agent with session
result = await Runner.run(agent, input, session=session)

# Session automatically persists turn items

Sandbox Memory Setup

from agents.sandbox.capabilities import Memory, MemoryReadConfig, MemoryLayout

memory = Memory(
    read=MemoryReadConfig(live_update=True),
    layout=MemoryLayout(memories_dir="memory", sessions_dir="sessions"),
    run_as="root"
)

Resume from Session

# Resume a previous session
session = SQLiteSession(session_id="user-123", resume=True)

# Continue the conversation
result = await Runner.run(agent, input, session=session)

Best Practices

  1. Path Validation: Always use relative paths for memory directories to prevent sandbox escape vulnerabilities.
  2. Session Initialization: Check session.is_initialized() before running agent logic.
  3. Error Handling: Catch specific session errors rather than generic exceptions for better recovery.
  4. Turn Item Management: Let the session system manage persistence automatically through the save_turn_items_if_needed() function.
  5. Live Update Trade-offs: Enable live_update only when agents need real-time file system access; otherwise, rely on shell-only mode for better isolation.
  6. Extension Dependencies: Use lazy-loading backends to minimize startup time and avoid unnecessary dependency loading.

Sources: src/agents/extensions/memory/__init__.py:1-8

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.


Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Project risk: Project risk needs validation

  • Severity: medium
  • Finding: Project risk is backed by a source signal: Project risk needs validation. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: identity.distribution | github_repo:946380199 | https://github.com/openai/openai-agents-python | repo=openai-agents-python; install=openai-agents

2. Configuration risk: AdvancedSQLiteSession.delete_branch() leaves branch-only messages in the base table

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: AdvancedSQLiteSession.delete_branch() leaves branch-only messages in the base table. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/issues/3346

3. Configuration risk: Clarify whether retry-after delays should respect retry max_delay

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: Clarify whether retry-after delays should respect retry max_delay. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/issues/3266

4. Configuration risk: OpenAIConversationsSession persists empty reasoning item {"type":"reasoning","summary":[]} and Conversations API reject…

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: OpenAIConversationsSession persists empty reasoning item {"type":"reasoning","summary":[]} and Conversations API reject…. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/issues/3268

5. Configuration risk: Tracing shutdown cannot interrupt exporter retry backoff

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: Tracing shutdown cannot interrupt exporter retry backoff. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/issues/3354

6. Configuration risk: v0.15.2

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: v0.15.2. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.15.2

7. Configuration risk: v0.15.3

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: v0.15.3. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.15.3

8. Configuration risk: v0.16.1

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: v0.16.1. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.16.1

9. Configuration risk: v0.17.0

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: v0.17.0. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.17.0

10. Capability assumption: v0.15.1

  • Severity: medium
  • Finding: Capability assumption is backed by a source signal: v0.15.1. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.15.1

11. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:946380199 | https://github.com/openai/openai-agents-python | README/documentation is current enough for a first validation pass.

12. Project risk: v0.14.8

  • Severity: medium
  • Finding: Project risk is backed by a source signal: v0.14.8. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/openai/openai-agents-python/releases/tag/v0.14.8

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

This manual page exposes 12 project-level external discussion links. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using openai-agents-python with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence