# https://github.com/openai/openai-agents-js Project Documentation

Generated: 2026-05-18 04:44:56 UTC

## Table of Contents

- [Introduction to OpenAI Agents SDK](#introduction)
- [Installation and Setup](#installation)
- [Examples and Patterns](#examples-overview)
- [Creating and Running Agents](#agents)
- [Tools and Tool Use](#tools)
- [Guardrails and Input/Output Validation](#guardrails)
- [Handoffs and Multi-Agent Systems](#handoffs)
- [Sandbox Agents Architecture](#sandbox-architecture)
- [Sandbox Providers and Extensions](#sandbox-providers)
- [Voice Agents and Realtime Communication](#voice-realtime)

## Introduction to OpenAI Agents SDK

### Related Pages

Related topics: [Installation and Setup](#installation), [Creating and Running Agents](#agents), [Examples and Patterns](#examples-overview)

Relevant Source Files

The following source files were used to generate this page:

- [README.md](https://github.com/openai/openai-agents-js/blob/main/README.md)
- [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)
- [packages/agents-core/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/README.md)
- [packages/agents-extensions/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/README.md)
- [packages/agents-extensions/package.json](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/package.json)
- [packages/agents-realtime/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/README.md)
- [examples/basic/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/basic/README.md)
- [examples/agent-patterns/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/agent-patterns/README.md)

# Introduction to OpenAI Agents SDK

The OpenAI Agents SDK is a lightweight framework for building multi-agent workflows in JavaScript/TypeScript. It provides primitives for creating AI agents that use tools, carry multi-turn conversations, hand off to other agents, and validate input and output with guardrails.

## Overview

The SDK simplifies the development of agentic applications by providing:

- **Agent abstraction**: A core `Agent` class that encapsulates instructions, tools, and behaviors
- **Tool system**: Built-in support for various tool types including function calling, web search, file search, and code execution
- **Multi-agent orchestration**: Handoff mechanisms for routing conversations between specialized agents
- **Guardrails**: Input and output validation to ensure safe and appropriate agent responses
- **Tracing**: Span-based telemetry for monitoring agent execution
- **Streaming support**: Real-time response streaming for interactive applications

The framework is modular and extensible, so developers can adopt only the components their use case needs.
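To make the overview concrete, the following is a minimal conceptual sketch of the loop an agent runtime performs: ask the model, execute any requested tool, feed the result back, and stop at a final answer or a turn limit. Every name here (`LoopAgent`, `ModelTurn`, `runSketch`, the stub model) is hypothetical and is not the SDK's API; the real SDK calls a model asynchronously through `run()`.

```typescript
// Conceptual sketch only: these types are NOT the SDK's real API.
type ToolFn = (args: Record<string, unknown>) => string;

// A model turn either requests a tool call or produces the final answer.
type ModelTurn =
  | { kind: 'tool_call'; tool: string; args: Record<string, unknown> }
  | { kind: 'final'; text: string };

interface LoopAgent {
  name: string;
  instructions: string;
  tools: Record<string, ToolFn>;
  // Stub standing in for a real model call; it sees the transcript so far.
  model: (transcript: string[]) => ModelTurn;
}

function runSketch(agent: LoopAgent, input: string, maxTurns = 5): string {
  const transcript: string[] = [`system: ${agent.instructions}`, `user: ${input}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = agent.model(transcript);
    if (step.kind === 'final') return step.text;
    const tool = agent.tools[step.tool];
    // Feed the tool result (or an error) back so the model can continue.
    const result = tool ? tool(step.args) : `error: unknown tool ${step.tool}`;
    transcript.push(`tool(${step.tool}): ${result}`);
  }
  throw new Error(`max turns (${maxTurns}) exceeded`);
}

const weatherAgent: LoopAgent = {
  name: 'Assistant',
  instructions: 'Answer using tools when needed.',
  tools: { get_weather: (args) => `Sunny in ${String(args.city)}` },
  // Scripted stub: call the tool once, then echo its result as the answer.
  model: (transcript): ModelTurn => {
    const last = transcript[transcript.length - 1] ?? '';
    return last.startsWith('tool(')
      ? { kind: 'final', text: last.replace(/^tool\([^)]*\): /, '') }
      : { kind: 'tool_call', tool: 'get_weather', args: { city: 'Tokyo' } };
  },
};

const answer = runSketch(weatherAgent, 'What is the weather in Tokyo?');
```

The turn limit is the important design choice: without it, a model that keeps requesting tools would loop indefinitely.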
## Package Architecture The SDK is organized into multiple packages within a monorepo structure: ```mermaid graph TD A[Application] --> B[@openai/agents] A --> C[@openai/agents-core] A --> D[@openai/agents-extensions] A --> E[@openai/agents-realtime] B --> C B --> D C --> F[agents-openai] D --> G[AI SDK Integration] D --> H[Codex Tool] D --> I[Daytona Sandbox] E --> C ``` ### Package Matrix | Package | Purpose | Installation | |---------|---------|--------------| | `@openai/agents` | Main SDK bundle with all features | `npm install @openai/agents` | | `@openai/agents-core` | Core agent primitives and interfaces | `npm install @openai/agents` | | `@openai/agents-extensions` | Extension features (AI SDK UI, Codex, Sandbox) | `npm install @openai/agents-extensions` | | `@openai/agents-realtime` | Voice agent capabilities for realtime sessions | `npm install @openai/agents` | | `@openai/agents-openai` | OpenAI-specific implementations | `npm install @openai/agents` | 资料来源：[packages/agents-core/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/README.md), [packages/agents-extensions/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/README.md) ## Core Agent Class The `Agent` class is the fundamental building block of the SDK. It extends `AgentHooks` and implements `AgentConfiguration`, providing a comprehensive interface for agent configuration. 
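The "extends a hooks class, implements a configuration interface" shape described above can be illustrated with a small stand-alone sketch. `HookEmitterSketch`, `ConfigSketch`, and `AgentSketch` are hypothetical stand-ins written for this page; they only mirror the structure and are not the classes in `packages/agents-core/src/agent.ts`.

```typescript
// Hypothetical stand-ins that mirror the class shape, not the SDK's code.
type Listener = (...args: unknown[]) => void;

class HookEmitterSketch {
  private listeners = new Map<string, Listener[]>();

  // Register a listener for a named lifecycle event.
  on(event: string, listener: Listener): this {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
    return this;
  }

  // Invoke every listener for the event; returns how many were called.
  emit(event: string, ...args: unknown[]): number {
    const list = this.listeners.get(event) ?? [];
    list.forEach((listener) => listener(...args));
    return list.length;
  }
}

interface ConfigSketch {
  name: string;
  instructions: string;
}

// The agent is both an event source (hooks) and a typed configuration holder.
class AgentSketch extends HookEmitterSketch implements ConfigSketch {
  name: string;
  instructions: string;

  constructor(config: ConfigSketch) {
    super();
    this.name = config.name;
    this.instructions = config.instructions;
  }
}

const events: string[] = [];
const hookedAgent = new AgentSketch({ name: 'Assistant', instructions: 'Be concise.' });
hookedAgent.on('agent_start', () => {
  events.push(`start:${hookedAgent.name}`);
});
const notified = hookedAgent.emit('agent_start');
```

Combining both roles in one class lets callers configure an agent and subscribe to its lifecycle events through the same object.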
资料来源：[packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts) ### Agent Configuration Agents are configured with the following key options: | Parameter | Type | Description | |-----------|------|-------------| | `name` | `string` | The agent's identifier | | `instructions` | `string \| function` | System prompt or dynamic instruction generator | | `tools` | `Tool[]` | Collection of tools the agent can invoke | | `handoffs` | `Agent[] \| Handoff[]` | Agents available for handoff | | `outputType` | `AgentOutputType` | Expected output format | | `model` | `string \| Model` | The model to use for this agent | | `modelSettings` | `ModelSettings` | Model-specific configuration | | `inputGuardrails` | `Guardrail[]` | Validation for incoming messages | | `outputGuardrails` | `Guardrail[]` | Validation for agent responses | 资料来源：[packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts) ### Creating an Agent ```typescript import { Agent } from '@openai/agents'; const agent = new Agent({ name: 'Assistant', instructions: 'You are a helpful assistant that responds in concise sentences.', tools: [/* your tools here */], }); ``` ## Tool System The SDK provides a flexible tool system that allows agents to interact with external systems and perform actions. 
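As a rough illustration of what a function tool involves, the sketch below pairs a description with argument validation and an executor, and returns failures as text so a model could retry. `ToolSketch` and `invokeToolSketch` are hypothetical helpers for this page; the SDK's own tools use schema libraries such as Zod for the validation step rather than a hand-written parser.

```typescript
// Illustrative only: `ToolSketch` and `invokeToolSketch` are hypothetical,
// not SDK exports.
interface ToolSketch<T> {
  name: string;
  description: string;
  parse: (raw: unknown) => T; // validates and narrows raw model arguments
  execute: (input: T) => string;
}

function invokeToolSketch<T>(tool: ToolSketch<T>, rawJson: string): string {
  try {
    return tool.execute(tool.parse(JSON.parse(rawJson)));
  } catch (err) {
    // Report the failure as text so the caller (a model) can fix its arguments.
    return `Tool ${tool.name} failed: ${(err as Error).message}`;
  }
}

const addTool: ToolSketch<{ a: number; b: number }> = {
  name: 'add',
  description: 'Add two numbers',
  parse: (raw) => {
    const { a, b } = (raw ?? {}) as { a?: unknown; b?: unknown };
    if (typeof a !== 'number' || typeof b !== 'number') {
      throw new Error('expected numeric fields "a" and "b"');
    }
    return { a, b };
  },
  execute: ({ a, b }) => String(a + b),
};

const sum = invokeToolSketch(addTool, '{"a": 2, "b": 3}');
const rejected = invokeToolSketch(addTool, '{"a": "two"}');
```

Validating before executing keeps malformed model output from reaching application code.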
### Built-in Tool Categories

| Category | Tools | Use Case |
|----------|-------|----------|
| **Web** | `webSearchTool`, `webSearchToolWithFilters` | General web queries and filtered searches |
| **Files** | `fileSearchTool`, `fileSearchWithReferencesTool` | Vector-based file searching |
| **Code** | `codeInterpreterTool`, `localShellTool`, `containerShellTool` | Code execution and shell commands |
| **Images** | `imageGenerationTool` | AI image generation |
| **Computer** | `computerTool` | Browser automation with Playwright |
| **Codex** | `codexTool` | Code editing and task execution via Codex CLI |

Sources: [examples/tools/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/tools/README.md)

### Tool Configuration Options

When exposing an agent as a tool with `asTool()`, the following options are available:

| Option | Type | Description |
|--------|------|-------------|
| `toolName` | `string` | Custom name for the tool (defaults to agent name) |
| `toolDescription` | `string` | Human-readable description of tool behavior |
| `parameters` | `TParameters` | Zod schema for tool input validation |
| `needsApproval` | `boolean \| function` | Require human approval before execution |
| `customOutputExtractor` | `function` | Custom logic to extract output from tool result |
| `inputBuilder` | `AgentToolInputBuilder` | Build nested agent input from structured data |

Sources: [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)

## Agent Patterns

The SDK supports various agent orchestration patterns demonstrated in the examples directory.
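One of these patterns, human-in-the-loop approval, follows from the `needsApproval` option listed above, which may be a boolean or a function. The sketch below shows one way such a gate could behave; `GatedTool`, `GateResult`, and `callWithGate` are hypothetical names, and the SDK's actual interruption and resume flow is more involved.

```typescript
// Hypothetical sketch of an approval gate; names are illustrative.
type NeedsApproval<T> = boolean | ((input: T) => boolean);

interface GatedTool<T> {
  name: string;
  needsApproval: NeedsApproval<T>;
  execute: (input: T) => string;
}

type GateResult =
  | { status: 'completed'; output: string }
  | { status: 'pending_approval'; tool: string };

function callWithGate<T>(tool: GatedTool<T>, input: T, approved = false): GateResult {
  // A function-valued option decides per call; a boolean applies to every call.
  const required =
    typeof tool.needsApproval === 'function' ? tool.needsApproval(input) : tool.needsApproval;
  if (required && !approved) return { status: 'pending_approval', tool: tool.name };
  return { status: 'completed', output: tool.execute(input) };
}

const refundTool: GatedTool<{ amount: number }> = {
  name: 'refund',
  needsApproval: ({ amount }) => amount > 100, // only large refunds need review
  execute: ({ amount }) => `Refunded $${amount}`,
};

const small = callWithGate(refundTool, { amount: 50 });
const large = callWithGate(refundTool, { amount: 500 });
const largeApproved = callWithGate(refundTool, { amount: 500 }, true);
```

A predicate keeps routine calls automatic while routing only sensitive inputs to a human reviewer.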
资料来源：[examples/agent-patterns/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/agent-patterns/README.md) ### Common Patterns | Pattern | Example | Description | |---------|---------|-------------| | **Agents as Tools** | `agents-as-tools.ts` | Wrap agents as callable tools for orchestration | | **Structured Input** | `agents-as-tools-structured.ts` | Use `Agent.asTool()` with structured input schemas | | **Conditional Tools** | `agents-as-tools-conditional.ts` | Enable/disable tools based on context | | **Deterministic Flow** | `deterministic.ts` | Fixed agent flow with gating and quality checks | | **Human-in-the-Loop** | `human-in-the-loop.ts` | Manual approval for sensitive operations | | **Guardrails** | `input-guardrails.ts`, `output-guardrails.ts` | Validate inputs and block unsafe outputs | | **LLM as Judge** | `llm-as-a-judge.ts` | Use LLM to evaluate agent outputs | ### Workflow Example ```mermaid graph LR A[User Input] --> B[Input Guardrails] B --> C[Agent Processing] C --> D{Tool Call?} D -->|Yes| E[Execute Tool] E --> C D -->|No| F[Output Guardrails] F --> G[Response to User] ``` ## Extensions Package The `@openai/agents-extensions` package provides additional features beyond the core SDK. 资料来源：[packages/agents-extensions/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/README.md) ### Available Exports | Export | Purpose | |--------|---------| | `./ai-sdk` | AI SDK integration utilities | | `./ai-sdk-ui` | UI framework adapters (Next.js, etc.) 
| | `./experimental/codex` | Experimental Codex CLI tool wrapper | 资料来源：[packages/agents-extensions/package.json](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/package.json) ### AI SDK UI Integration For Next.js and similar frameworks, the SDK provides streaming response helpers: ```typescript import { Agent, run } from '@openai/agents'; import { createAiSdkTextStreamResponse } from '@openai/agents-extensions/ai-sdk-ui'; export async function POST() { const agent = new Agent({ name: 'Assistant', instructions: 'Reply with a short answer.', }); const stream = await run(agent, 'Hello there.', { stream: true }); return createAiSdkTextStreamResponse(stream); } ``` 资料来源：[examples/ai-sdk-ui/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/ai-sdk-ui/README.md) ## Basic Examples The SDK includes numerous example scripts demonstrating core functionality: 资料来源：[examples/basic/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/basic/README.md) | Example | Command | Demonstrates | |---------|---------|--------------| | `hello-world.ts` | `pnpm -F basic start:hello-world` | Basic agent that responds in haiku | | `chat.ts` | `pnpm -F basic start:chat` | Interactive CLI chat with weather handoff | | `stream-text.ts` | `pnpm -F basic start:stream-text` | Plain text response streaming | | `stream-items.ts` | `pnpm -F basic start:stream-items` | Event streaming with tool usage | | `stream-ws.ts` | `pnpm -F basic start:stream-ws` | WebSocket streaming with HITL approval | | `dynamic-system-prompt.ts` | `pnpm -F basic start:dynamic-system-prompt` | Dynamic instructions per run | | `lifecycle-example.ts` | `pnpm -F basic start:lifecycle-example` | Detailed lifecycle events and usage | | `local-image.ts` | `pnpm -F basic start:local-image` | Sending local images to agents | | `image-tool-output.ts` | `pnpm -F basic start:image-tool-output` | Image return from tools | ## Running Examples The repository uses 
`pnpm` as its package manager with workspace-based organization. ```bash # Install dependencies pnpm install # Run a specific example pnpm -F basic start:hello-world pnpm -F basic start:chat pnpm -F basic start:stream-text # Run agent pattern examples pnpm examples:agents-as-tools pnpm examples:human-in-the-loop # Run tool integration examples pnpm examples:tools-web-search pnpm examples:tools-file-search pnpm examples:tools-codex ``` ## Installation The SDK can be installed via npm: ```bash # Core package (includes agents-core, agents-openai) npm install @openai/agents # Extensions package (requires agents-core) npm install @openai/agents @openai/agents-extensions ``` 资料来源：[README.md](https://github.com/openai/openai-agents-js/blob/main/README.md), [packages/agents-core/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/README.md) ## Summary The OpenAI Agents SDK provides a comprehensive framework for building agentic applications: - **Lightweight and modular**: Pick only the components you need - **Type-safe**: Full TypeScript support with generic types - **Extensible**: Create custom tools, guardrails, and hooks - **Observable**: Built-in tracing with customizable span types - **Production-ready**: MIT licensed with streaming and error handling support --- ## Installation and Setup ### 相关页面相关主题：[Introduction to OpenAI Agents SDK](#introduction)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-extensions/README.md](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/README.md)
- [examples/basic/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/basic/README.md)
- [examples/agent-patterns/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/agent-patterns/README.md)
- [examples/sandbox/extensions/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/sandbox/extensions/README.md)
- [examples/ai-sdk-ui/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/ai-sdk-ui/README.md)

# Installation and Setup

## Overview

The OpenAI Agents SDK provides a modular JavaScript/TypeScript framework for building AI-powered agent applications. The SDK is organized into multiple packages that can be installed independently or together depending on your use case.

Sources: [packages/agents-extensions/README.md:1-8]()

## Package Architecture

The SDK consists of several packages published to npm:

| Package | Description | Core Dependency |
|---------|-------------|-----------------|
| `@openai/agents` | Main agent runtime and CLI | Yes |
| `@openai/agents-core` | Core agent abstractions and types | Yes |
| `@openai/agents-extensions` | Extension features and integrations | No |
| `@openai/agents-openai` | OpenAI-specific model integrations | No |
| `@openai/agents-realtime` | Realtime voice agent capabilities | No |

Sources: [packages/agents-extensions/README.md:1-12]()

## Installation Methods

### Standard Installation

For most use cases, install the core packages:

```bash
npm install @openai/agents @openai/agents-extensions
```

Sources: [packages/agents-extensions/README.md:6]()

### With Extensions

Integrations such as the AI SDK UI adapters are subpath exports of `@openai/agents-extensions`, not separately installable packages. Install the extensions package once, then import the subpath you need (for example `@openai/agents-extensions/ai-sdk-ui`):

```bash
# One install covers the extension subpaths (AI SDK UI, sandbox integrations)
npm install @openai/agents-extensions
```

Sources: [examples/ai-sdk-ui/README.md:1-15]()

### Running Examples

The repository uses `pnpm` as the package manager for running example scripts:

```bash
# Run a basic example
pnpm -F basic start:hello-world

# Run an agent pattern example
pnpm examples:agents-as-tools

# Run with streaming
pnpm examples:streamed:human-in-the-loop
```

Sources: [examples/basic/README.md:1-30](), [examples/agent-patterns/README.md:1-20]()

## Environment Configuration

### Required Environment Variables

#### OpenAI API Key

Set your OpenAI API key before running any agent:

```bash
export OPENAI_API_KEY=sk-...
``` #### Sandbox Extensions (Optional) For sandbox execution features, additional environment variables are required: ```bash export BL_API_KEY=... export BL_WORKSPACE=... export BL_REGION=us-pdx-1 ``` 资料来源：[examples/sandbox/extensions/README.md:1-60]() ### Environment File Setup Create a `.env.local` file for local development: ```bash cat > .env.local <<'EOF' OPENAI_API_KEY=your-api-key BL_WORKSPACE=your-workspace BL_API_KEY=your-api-key BL_REGION=us-pdx-1 EOF set -a source .env.local set +a ``` 资料来源：[examples/sandbox/extensions/README.md:40-58]() ## Project Structure When using the SDK in your project, the typical structure is: ```mermaid graph TD A[Your Project] --> B[Install SDK] B --> C[Configure Environment] C --> D[Create Agent Instance] D --> E[Add Tools & Guardrails] E --> F[Run Agent] G[agents-core] --> H[Core Abstractions] G --> I[Tool Definitions] G --> J[Runner Logic] K[agents-extensions] --> L[Sandbox Integration] K --> M[AI SDK UI] K --> N[Daytona Support] ``` ## Quick Start Workflow ```mermaid graph LR A[Initialize Agent] --> B[Configure Model] B --> C[Add Instructions] C --> D[Register Tools] D --> E[Execute Run] E --> F[Process Output] A1[Agent Class] --> |config| B1 B1[Model Settings] --> |settings| C1 C1[Instructions] --> |prompt| D1 D1[Tools] --> |functions| E1 E1[Runner.run] --> |stream| F1 ``` ## TypeScript Configuration The SDK is written in TypeScript and ships with type definitions. Ensure your `tsconfig.json` includes: ```json { "compilerOptions": { "strict": true, "module": "NodeNext", "moduleResolution": "NodeNext", "esModuleInterop": true } } ``` ## Verifying Installation After installation, verify the setup by running a basic example: ```bash pnpm -F basic start:hello-world ``` Expected output: A response from the agent in haiku format. 
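Before running an example, it can help to fail fast when a required variable is missing. The sketch below checks the variable names listed in the environment configuration above; the helper functions (`missingVars`, `checkEnv`) are hypothetical and not part of the SDK.

```typescript
// Variable names come from the setup steps above; the helpers are hypothetical.
const REQUIRED_VARS = ['OPENAI_API_KEY'] as const;
const SANDBOX_VARS = ['BL_API_KEY', 'BL_WORKSPACE', 'BL_REGION'] as const;

type Env = Record<string, string | undefined>;

// Returns the names that are unset or blank in the given environment.
function missingVars(env: Env, names: readonly string[]): string[] {
  return names.filter((name) => (env[name] ?? '').trim() === '');
}

// Sandbox variables are only required when sandbox features are in use.
function checkEnv(env: Env, useSandbox: boolean): { ok: boolean; missing: string[] } {
  const names: string[] = useSandbox
    ? [...REQUIRED_VARS, ...SANDBOX_VARS]
    : [...REQUIRED_VARS];
  const missing = missingVars(env, names);
  return { ok: missing.length === 0, missing };
}

// Placeholder values for illustration; never hard-code real keys.
const envReport = checkEnv({ OPENAI_API_KEY: 'sk-example', BL_REGION: 'us-pdx-1' }, true);
```

In a Node.js script the first argument would typically be `process.env`; the environment is passed in as a parameter here so the check stays easy to test.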
资料来源：[examples/basic/README.md:5-7]() ## SDK Package Dependencies ### Core Packages The `agents` package depends on `agents-core`: ```json { "dependencies": { "@openai/agents-core": "workspace:*" } } ``` ### Extension Packages Extensions are designed to be optional and do not require `agents-core` directly: ```mermaid graph TD A[Your App] --> B[@openai/agents] A --> C[@openai/agents-extensions] B --> D[@openai/agents-core] C --> D C --> E[Optional: AI SDK] ``` ## Next Steps After completing installation and setup: 1. Review the [Agent Configuration](./agent-configuration.md) documentation 2. Explore available [Tools](./tools.md) 3. Implement [Guardrails](./guardrails.md) for input/output validation 4. Set up [Handoffs](./handoffs.md) for multi-agent orchestration --- ## Examples and Patterns ### 相关页面相关主题：[Introduction to OpenAI Agents SDK](#introduction), [Creating and Running Agents](#agents), [Tools and Tool Use](#tools)

Relevant Source Files

The following source files were used to generate this page:

- [examples/basic/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/basic/README.md)
- [examples/agent-patterns/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/agent-patterns/README.md)
- [examples/tools/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/tools/README.md)
- [examples/research-bot/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/research-bot/README.md)
- [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)
- [packages/agents-core/src/tool.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/tool.ts)
- [packages/agents-core/src/handoff.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/handoff.ts)
- [packages/agents-core/src/lifecycle.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/lifecycle.ts)
- [packages/agents-realtime/src/realtimeAgent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/realtimeAgent.ts)

# Examples and Patterns This page documents the example applications and architectural patterns demonstrated in the OpenAI Agents SDK for JavaScript/TypeScript. These examples serve as practical guides for building AI-powered applications using the `@openai/agents-core` package. ## Overview The repository organizes examples into distinct categories that demonstrate different aspects of agent development: | Category | Location | Purpose | |----------|----------|---------| | Basic Examples | `examples/basic/` | Core functionality and getting started | | Agent Patterns | `examples/agent-patterns/` | Common architectural patterns | | Tool Integrations | `examples/tools/` | Hosted tool usage | | Research Bot | `examples/research-bot/` | Multi-agent orchestration | | MCP Examples | `examples/mcp/` | Model Context Protocol integration | ## Basic Examples The Basic Examples directory provides foundational examples for understanding the Agents SDK core concepts. ### Hello World A minimal agent that responds with a haiku, demonstrating the simplest possible agent setup. 
```bash
pnpm -F basic start:hello-world
```

This example shows:

- Creating an `Agent` instance with instructions
- Running the agent with `run()`
- Basic text output handling

### Interactive Chat

An interactive CLI chat application that demonstrates:

- Real-time user input handling
- Streaming responses
- Handoff integration with weather queries

```bash
pnpm -F basic start:chat
```

### Streaming Examples

The SDK supports multiple streaming modes for different use cases:

| Example | Command | Description |
|---------|---------|-------------|
| stream-text | `pnpm -F basic start:stream-text` | Plain text streaming |
| stream-items | `pnpm -F basic start:stream-items` | Event streaming with tool usage |
| stream-ws | `pnpm -F basic start:stream-ws` | WebSocket streaming with HITL approval |

```bash
pnpm -F basic start:stream-text
pnpm -F basic start:stream-items
pnpm -F basic start:stream-ws
```

### Dynamic System Prompt

Demonstrates choosing the agent's instructions dynamically on each run. The `instructions` option accepts a function as well as a fixed string, so the system prompt can be computed at run time.

```bash
pnpm -F basic start:dynamic-system-prompt
```

### Lifecycle Hooks

Two examples demonstrate the event hook system:

- `lifecycle-example.ts` - Logs detailed lifecycle events and usage statistics
- `agent-lifecycle-example.ts` - Minimal lifecycle hooks demonstration

```bash
pnpm -F basic start:lifecycle-example
pnpm -F basic start:agent-lifecycle-example
```

### Image Handling

| Example | Purpose |
|---------|---------|
| `local-image.ts` | Send local images to the agent |
| `image-tool-output.ts` | Return images from tools for agent analysis |

```bash
pnpm -F basic start:local-image
pnpm -F basic start:image-tool-output
```

## Agent Patterns

The Agent Patterns directory demonstrates common architectural patterns for building sophisticated AI applications.

### Agents as Tools Pattern

A central pattern: agents can be used as tools by other agents, which enables hierarchical agent architectures.
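The idea of wrapping an agent so another agent can call it like a tool can be modelled in a few lines. This is a conceptual sketch under stated assumptions: `MiniAgent`, `MiniTool`, `asToolSketch`, and `orchestrate` are hypothetical, and the stubbed `respond` function stands in for a model-backed run. In the SDK this role is played by `asTool()` on an `Agent` instance.

```typescript
// Conceptual model only; in the SDK this role is played by `asTool()`.
interface MiniAgent {
  name: string;
  respond: (input: string) => string; // stub for a model-backed run
}

interface MiniTool {
  toolName: string;
  toolDescription: string;
  call: (input: string) => string;
}

function asToolSketch(
  agent: MiniAgent,
  opts: { toolName?: string; toolDescription: string },
): MiniTool {
  return {
    toolName: opts.toolName ?? agent.name, // fall back to the agent's name
    toolDescription: opts.toolDescription,
    call: (input) => agent.respond(input),
  };
}

const spanish: MiniAgent = { name: 'spanish_translator', respond: (t) => `[es] ${t}` };
const french: MiniAgent = { name: 'french_translator', respond: (t) => `[fr] ${t}` };

const translatorTools: MiniTool[] = [
  asToolSketch(spanish, { toolDescription: 'Translate text to Spanish' }),
  asToolSketch(french, { toolName: 'translate_to_french', toolDescription: 'Translate text to French' }),
];

// The orchestrator dispatches by tool name, as a model's tool calls would.
function orchestrate(toolNames: string[], text: string): string[] {
  return toolNames.map((toolName) => {
    const tool = translatorTools.find((candidate) => candidate.toolName === toolName);
    return tool ? tool.call(text) : `unknown tool: ${toolName}`;
  });
}

const translations = orchestrate(['spanish_translator', 'translate_to_french', 'nope'], 'hello');
```

Unlike a handoff, the orchestrator keeps control of the conversation here: each specialist returns a result and the caller decides what to do next.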
```bash
pnpm examples:agents-as-tools
```

#### Key Files

- `agents-as-tools.ts` - Orchestrate translator agents using them as tools
- `agents-as-tools-structured.ts` - Use structured tool input with `Agent.asTool()`
- `agents-as-tools-conditional.ts` - Enable language tools based on user preference

#### Implementation Pattern

```typescript
import { Agent } from '@openai/agents';

// Sketch based on agents-as-tools.ts: agents can be wrapped as tools
const translatorAgent = new Agent({
  name: 'translator',
  instructions: 'Translate text between languages',
});

// Expose the agent to other agents under a custom tool name
const translatorTool = translatorAgent.asTool({
  toolName: 'translate',
  toolDescription: 'Translate text to a target language',
});
```

Sources: [examples/agent-patterns/agents-as-tools.ts]()

The `asTool()` method creates a `FunctionTool` that wraps the agent, allowing other agents to invoke it with typed parameters.

### Deterministic Flow Pattern

Fixed agent flow with gating and quality checks for controlled execution paths.

```bash
pnpm examples:deterministic
```

### Tool Forcing Pattern

Requires specific tools to be called before final output is generated.

```bash
pnpm -F agent-patterns start:forcing-tool-use
```

### Human-in-the-Loop (HITL) Pattern

Human approval integration for sensitive operations:

| Example | Command | Use Case |
|---------|---------|----------|
| `human-in-the-loop.ts` | `pnpm examples:human-in-the-loop` | Manual approval |
| `human-in-the-loop-stream.ts` | `pnpm examples:streamed:human-in-the-loop` | Streaming approval |

```bash
pnpm examples:human-in-the-loop
pnpm examples:streamed:human-in-the-loop
```

### Guardrails Pattern

Input and output validation to control agent behavior:

| Example | Purpose |
|---------|---------|
| `input-guardrails.ts` | Reject unwanted requests before processing |
| `output-guardrails.ts` | Block unsafe output |

```bash
pnpm examples:input-guardrails
pnpm examples:output-guardrails
```

### LLM-as-Judge Pattern

Self-evaluation and iteration using a secondary LLM to judge outputs.
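The judge loop can be sketched as a generate, evaluate, retry cycle with a bounded number of rounds. The functions below are hypothetical stubs standing in for two model calls (a generator and a judge); this is an illustration of the pattern, not code from `llm-as-a-judge.ts`.

```typescript
// Hypothetical sketch: both callbacks stand in for model-backed agents.
interface Verdict {
  pass: boolean;
  feedback: string;
}

function refineWithJudge(
  generate: (task: string, feedback: string | null) => string,
  judge: (draft: string) => Verdict,
  task: string,
  maxRounds = 3,
): { draft: string; rounds: number; accepted: boolean } {
  let feedback: string | null = null;
  let draft = '';
  for (let round = 1; round <= maxRounds; round++) {
    draft = generate(task, feedback); // later rounds see the judge's feedback
    const verdict = judge(draft);
    if (verdict.pass) return { draft, rounds: round, accepted: true };
    feedback = verdict.feedback;
  }
  // Give up after maxRounds and return the last draft unaccepted.
  return { draft, rounds: maxRounds, accepted: false };
}

const outcome = refineWithJudge(
  (_task, feedback) => (feedback === null ? 'short' : 'a longer outline with details'),
  (draft) => ({ pass: draft.length >= 20, feedback: 'too short, add detail' }),
  'Write a story outline',
);
```

Bounding the rounds matters because a strict judge and a weak generator could otherwise iterate forever.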
```bash pnpm -F agent-patterns start:llm-as-a-judge ``` ## Tool Integrations The Tools Examples directory demonstrates hosted tools provided by the SDK. ### Computer Use Tool Browser automation using Playwright for GUI interaction: ```bash pnpm examples:tools-computer-use ``` ### File Search Tool Vector search integration for document retrieval: ```bash pnpm examples:tools-file-search ``` ### Tool Search (Deferred Loading) Demonstrates namespace loading and selective tool loading: ```bash pnpm examples:tools-tool-search ``` Key concepts: - `deferLoading: true` for lazy tool loading - Namespace-based tool organization - Selective tool loading per context ### Codex Tool Code execution and analysis using the Codex CLI: | Example | Purpose | |---------|---------| | `codex.ts` | Streaming event logs and skill usage | | `codex-same-thread.ts` | Reuse Codex thread across turns | ```bash pnpm examples:tools-codex pnpm examples:tools-codex-same-thread ``` Environment variables: - `CODEX_API_KEY` - Primary API key - `OPENAI_API_KEY` - Fallback API key - `CODEX_PATH` - Override Codex CLI path ### Web Search Tool General web queries using `webSearchTool`: ```bash pnpm examples:tools-web-search ``` ### Code Interpreter Secure code execution environment: ```bash pnpm -F tools start:code-interpreter ``` ### Image Generation DALL-E integration for image generation: ```bash pnpm examples:tools-image-generation ``` ## Multi-Agent Orchestration ### Research Bot Example A complete example demonstrating multi-agent coordination to produce research reports. 
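A staged workflow of this kind (plan search terms, summarize each, write a report) reduces to a simple pipeline. The sketch below uses hypothetical stub functions in place of the planner, search, and writer agents and is not taken from the example's `manager.ts`.

```typescript
// Hypothetical stages; each stub stands in for a model-backed agent.
interface ResearchStages {
  plan: (query: string) => string[]; // planner: query -> search terms
  search: (term: string) => string; // search agent: term -> summary
  write: (query: string, summaries: string[]) => string; // writer: report
}

function runResearch(query: string, stages: ResearchStages): string {
  const terms = stages.plan(query);
  const summaries = terms.map((term) => stages.search(term));
  return stages.write(query, summaries);
}

const researchReport = runResearch('agent SDKs', {
  plan: (q) => [`${q} overview`, `${q} comparison`],
  search: (term) => `summary of "${term}"`,
  write: (q, summaries) => `# ${q}\n` + summaries.map((s) => `- ${s}`).join('\n'),
});
```

Keeping the coordination in ordinary code makes the order of stages deterministic, while each stage can still be an agent.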
#### Architecture

```mermaid
graph TD
    A[User Query] --> B[ResearchManager]
    B --> C[Planner Agent]
    C -->|Search Terms| D[Search Agent]
    D --> E[Web Search Tool]
    E --> D
    D -->|Summaries| B
    B --> F[Writer Agent]
    F -->|Research Report| A
```

#### Components

| File | Role |
|------|------|
| `main.ts` | CLI entrypoint |
| `manager.ts` | Coordinates workflow stages |
| `agents.ts` | Agent definitions |

Sources: [examples/research-bot/README.md]()

#### Usage

```bash
pnpm examples:research-bot
```

## Realtime Agents

Specialized agents for voice applications using the Realtime API.

### Configuration Differences

Realtime agents have limited configuration compared to standard agents:

| Supported | Not Supported |
|-----------|---------------|
| `name` | `model` |
| `handoffs` | `modelSettings` |
| `voice` | `toolUseBehavior` |
| | `outputGuardrails` |
| | `inputGuardrails` |

Sources: [packages/agents-realtime/src/realtimeAgent.ts]()

## Lifecycle Events

The SDK provides event hooks for monitoring agent execution:

### Agent-Level Events

| Event | Parameters | Description |
|-------|------------|-------------|
| `agent_start` | `context, agent, turnInput?` | Agent begins execution |
| `agent_end` | `context, agent, output` | Agent completes |
| `agent_handoff` | `context, fromAgent, toAgent` | Agent transfers control |
| `agent_tool_start` | `context, agent, tool, details` | Tool invocation begins |
| `agent_tool_end` | `context, agent, tool, result, details` | Tool invocation completes |

Sources: [packages/agents-core/src/lifecycle.ts]()

### Event Handler Pattern

```typescript
// Subscribe with the event-emitter style `on()` method
agent.on('agent_start', (context, agent) => {
  console.log(`Starting agent: ${agent.name}`);
});
```

## Common Patterns Summary

### Pattern: Agent-as-Tool Hierarchy

```mermaid
graph BT
    A[Orchestrator Agent] -->|calls| B[Specialist Agent 1]
    A -->|calls| C[Specialist Agent 2]
    B -->|uses| D[Tool: Translation]
    C -->|uses| E[Tool: Search]
```

### Pattern: Human-in-the-Loop

```mermaid
graph LR
    A[Agent] -->|tool call| B{Approval Gate}
    B -->|approve| C[Execute Tool]
    B -->|reject| D[Return Error]
    C --> E[Continue Run]
```

### Pattern: Error Handling with Handlers

The SDK supports custom error handlers for:

- `MaxTurnsExceededError` - Turn limit exceeded
- `ModelRefusalError` - Model refused to respond

```typescript
type RunErrorHandler = (input: {
  error: MaxTurnsExceededError | ModelRefusalError;
  context: RunContext;
  runData: RunErrorData;
}) => RunErrorHandlerResult | void;
```

Sources: [packages/agents-core/src/runner/errorHandlers.ts]()

## Running Examples

All examples use `pnpm` for execution:

```bash
# From repository root
pnpm examples:<example-name>

# Or with package-specific commands
pnpm -F <package> start:<example-name>
```

### Example Commands Reference

| Category | Example | Command |
|----------|---------|---------|
| Basic | Hello World | `pnpm -F basic start:hello-world` |
| Basic | Chat | `pnpm -F basic start:chat` |
| Basic | Streaming | `pnpm -F basic start:stream-items` |
| Patterns | Agents as Tools | `pnpm examples:agents-as-tools` |
| Patterns | Human in Loop | `pnpm examples:human-in-the-loop` |
| Patterns | Guardrails | `pnpm examples:input-guardrails` |
| Tools | Web Search | `pnpm examples:tools-web-search` |
| Tools | Code Interpreter | `pnpm -F tools start:code-interpreter` |
| Research | Multi-Agent | `pnpm examples:research-bot` |

## Key APIs Demonstrated

### Agent Creation

```typescript
import { Agent } from '@openai/agents';

// `otherAgent` and `someTool` are assumed to be defined elsewhere
const agent = new Agent({
  name: 'agent-name',
  instructions: 'System prompt or instructions',
  handoffs: [otherAgent],
  tools: [someTool],
  // outputType is omitted here, so the agent returns plain text
});
```

### Tool Creation

```typescript
import { tool } from '@openai/agents';
import { z } from 'zod';

const myTool = tool({
  name: 'tool-name',
  description: 'What this tool does',
  // `param` is an illustrative parameter
  parameters: z.object({ param: z.string() }),
  strict: true,
  execute: async ({ param }) => {
    return 'result';
  },
});
```

### Handoff Configuration

```typescript
import { handoff } from '@openai/agents';

const configuredHandoff = handoff(agent, {
  onHandoff: async (context) => {
    // Custom handoff logic
  },
});
```

Sources: [packages/agents-core/src/handoff.ts]()

## See Also

- [Agent SDK Documentation](https://github.com/openai/openai-agents-js)
- [@openai/agents-core Package](packages/agents-core/)
- [@openai/agents-realtime Package](packages/agents-realtime/)
- [API Reference](packages/agents-core/src/)

---

## Creating and Running Agents

### Related Pages

Related topics: [Tools and Tool Use](#tools), [Guardrails and Input/Output Validation](#guardrails), [Handoffs and Multi-Agent Systems](#handoffs)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)
- [packages/agents-core/src/run.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/run.ts)
- [packages/agents-core/src/lifecycle.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/lifecycle.ts)
- [packages/agents-core/src/handoff.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/handoff.ts)
- [packages/agents-core/src/tool.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/tool.ts)
- [packages/agents-core/src/runner/errorHandlers.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runner/errorHandlers.ts)

# Creating and Running Agents ## Overview Agents are the core building blocks of the OpenAI Agents SDK. An agent is an AI model configured with instructions (system prompt), tools, guardrails, handoffs, and other settings that determine its behavior during execution. The SDK provides a flexible `Agent` class that supports various configuration options for different use cases. Agents are designed to be composable — they can be chained together through handoffs, used as tools by other agents, and integrated into complex workflows. Each agent maintains its own configuration and can emit lifecycle events for monitoring and debugging. 资料来源：[packages/agents-core/src/agent.ts:117-127]() ## Agent Architecture ```mermaid graph TD A[User Input] --> B[Agent Runner] B --> C[Model Execution] C --> D{Tool Calls?} D -->|Yes| E[Tool Executor] E --> F[Tool Result] F --> C D -->|No| G[Output Generation] G --> H[Final Output] subgraph "Agent Components" I[Instructions/System Prompt] J[Tools] K[Handoffs] L[Guardrails] M[Output Type] end B -.->|configures| I B -.->|uses| J B -.->|enables| K B -.->|validates via| L B -.->|produces| M ``` ## Agent Configuration ### Basic Configuration Parameters The `Agent` class accepts a configuration object that defines its behavior. The main configuration interface is `AgentConfiguration`. 
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | `string` | Yes | A unique identifier for the agent |
| `instructions` | `string \| function` | Recommended | The system prompt that guides agent behavior |
| `handoffDescription` | `string` | No | Human-readable description for handoffs |
| `tools` | `Tool[]` | No | Array of tools the agent can use |
| `handoffs` | `Agent[] \| Handoff[]` | No | Other agents the agent can transfer control to |
| `outputType` | `AgentOutputType` | No | Expected output format |
| `model` | `string` | No | The model to use (falls back to the SDK's default model when omitted) |
| `modelSettings` | `ModelSettings` | No | Additional model configuration |
| `inputGuardrails` | `InputGuardrail[]` | No | Validation before agent execution |
| `outputGuardrails` | `OutputGuardrail[]` | No | Validation after agent execution |

Sources: [packages/agents-core/src/agent.ts:1-150]()

### Output Types

Agents support different output types defined by the `AgentOutputType` union:

| Output Type | Description |
|-------------|-------------|
| `TextOutput` | Plain text response (default) |
| Zod schema | Structured JSON response validated against the schema |
| Custom types | User-defined output schemas |

```typescript
import { Agent } from '@openai/agents';
import { z } from 'zod';

// Text output agent (default)
const textAgent = new Agent({
  name: 'text-agent',
  instructions: '...',
});

// Structured output agent: the final output must match the schema
const jsonAgent = new Agent({
  name: 'json-agent',
  instructions: '...',
  outputType: z.object({ answer: z.string() }),
});
```

## Creating an Agent

### Basic Agent Creation

The simplest agent requires only a name and instructions:

```typescript
import { Agent } from '@openai/agents';

const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
});
```

Sources: [packages/agents-core/src/agent.ts:128-150]()

### Agent with Tools

Tools extend an agent's capabilities by allowing it to perform actions:

```typescript
import { Agent, tool } from '@openai/agents';
import { z } from 'zod';

// A function tool: the schema validates the model's arguments before execute runs
const myTool = tool({
  name: 'calculator',
  description: 'Add two numbers',
  parameters: z.object({
    a: z.number().describe('First operand'),
    b: z.number().describe('Second operand'),
  }),
  // Avoid eval() on model-supplied input; compute from validated fields instead
  execute: async ({ a, b }) => String(a + b),
});

const agentWithTools = new Agent({
  name: 'math-assistant',
  instructions: 'Use the calculator for sums.',
  tools: [myTool],
});
```

Sources: [packages/agents-core/src/tool.ts:1-80]()

### Agent with Handoffs

Handoffs enable transfer of control between agents:

```typescript
import { Agent } from '@openai/agents';

const salesAgent = new Agent({
  name: 'sales',
  instructions: 'Handle sales inquiries.',
});

const supportAgent = new Agent({
  name: 'support',
  instructions: 'Handle support requests.',
});

// The router can transfer the conversation to either specialist
const routerAgent = new Agent({
  name: 'router',
  instructions: 'Route to the appropriate department.',
  handoffs: [salesAgent, supportAgent],
});
```

Sources: [packages/agents-core/src/handoff.ts:1-80]()

### Handoff Configuration

The `HandoffConfig` type provides additional options for handoff behavior:

| Parameter | Type | Description |
|-----------|------|-------------|
| `toolNameOverride` | `string` | Custom name for the handoff tool |
| `toolDescriptionOverride` | `string` | Custom description for the handoff |
| `onHandoff` | `OnHandoffCallback` | Callback function executed during handoff |
| `inputJsonSchema` | `JsonSchema` | Schema for handoff input validation |

```typescript
import { handoff } from '@openai/agents';

// Wrap the target agent with the options from the table above
const configuredHandoff = handoff(supportAgent, {
  toolDescriptionOverride: 'Transfer the conversation to the support agent',
  onHandoff: () => {
    console.log('Handing off to support');
  },
});
```

Sources: [packages/agents-core/src/handoff.ts:80-120]()

### Agents as Tools

Agents can be used as tools by other agents using `asTool()`:

```typescript
import { Agent } from '@openai/agents';
import { z } from 'zod';

const translatorAgent = new Agent({
  name: 'translator',
  instructions: 'Translate text to the target language.',
});

const mainAgent = new Agent({
  name: 'main',
  instructions: 'Use the translator for language tasks.',
  tools: [
    translatorAgent.asTool({
      toolName: 'translate',
      toolDescription: 'Translate text between languages',
      // Structured input for the nested agent, expressed as a Zod schema
      parameters: z.object({
        text: z.string(),
        targetLanguage: z.string(),
      }),
    }),
  ],
});
```

Sources: [packages/agents-core/src/agent.ts:150-200]()

## Running Agents

### Basic Run

Execute an agent using the `run()` function:

```typescript
import { Agent, run } from '@openai/agents';

const agent = new Agent({
  name: 'hello-agent',
  instructions: 'Respond in haiku format.',
});

const result = await run(agent, 'Tell me about dogs');
console.log(result.finalOutput);
```

### Run with Options

The `run()` function accepts additional options:

| Option | Type | Description |
|--------|------|-------------|
| `context` | `TContext` | Mutable context object passed to tools |
| `stream` | `boolean` | Enable streaming response |
| `maxTurns` | `number` | Maximum agent turns before stopping |
| `model` | `string` | Override the agent's default model |
| `modelSettings` | `ModelSettings` | Override model settings |
| `tools` | `Tool[]` | Additional tools to include |
| `outputType` | `AgentOutputType` | Override output type |

```typescript
const result = await run(agent, 'Process this request', { context: { userId: 
'123' }, maxTurns: 10, modelSettings: { temperature: 0.7 }, }); ``` ### Streaming Responses Enable streaming for real-time output: ```typescript const stream = await run(agent, 'Generate a long story', { stream: true }); for await (const event of stream) { if (event.type === 'agent_update') { console.log(event.data); } } ``` ## Agent Lifecycle ### Lifecycle Events Agents emit events throughout their execution lifecycle. The `AgentHooks` class extends `EventEmitterDelegate` to provide event handling. ```mermaid stateDiagram-v2 [*] --> start: agent_start start --> tool_use: tool call required start --> end: direct response tool_use --> tool_execution: agent_tool_start tool_execution --> tool_complete: agent_tool_end tool_complete --> start: continue tool_complete --> end: no more tools end --> [*]: agent_end start --> handoff: handoff trigger handoff --> [*]: agent_handoff ``` ### Available Lifecycle Events | Event | Parameters | Description | |-------|------------|-------------| | `agent_start` | `context, agent, turnInput?` | Fired when agent execution begins | | `agent_end` | `context, agent, output` | Fired when agent execution completes | | `agent_handoff` | `context, fromAgent, toAgent` | Fired during agent transfer | | `agent_tool_start` | `context, tool, details` | Fired when a tool begins | | `agent_tool_end` | `context, tool, result, details` | Fired when a tool completes | ```typescript const agent = new Agent({ name: 'monitored-agent', instructions: 'You are helpful.', }); agent.on('agent_start', (context, agent) => { console.log(`Starting agent: ${agent.name}`); }); agent.on('agent_tool_start', (context, tool, details) => { console.log(`Tool called: ${tool.name}`); }); agent.on('agent_end', (context, agent, output) => { console.log(`Agent finished with: ${output}`); }); ``` 资料来源：[packages/agents-core/src/lifecycle.ts:1-100]() ### Run-Level Hooks In addition to agent hooks, you can register hooks at the run level: ```typescript import { Agent, run, 
RunHookEvents } from '@openai/agents'; const hooks: RunHookEvents = { onAgentStart: async (context, agent) => { console.log(`Run starting with agent: ${agent.name}`); }, onAgentEnd: async (context, agent, output) => { console.log(`Run completed with output: ${output}`); }, }; const result = await run(agent, input, { hooks }); ``` ## Context Management ### Run Context The `RunContext` provides a mechanism to pass and share state across tools, guardrails, and handoffs: ```typescript import { Agent, run, RunContext } from '@openai/agents'; interface MyContext { userId: string; sessionData: Record; } const agent = new Agent({ name: 'context-agent', instructions: 'You are a personalized assistant.', }); const result = await run(agent, 'Hello', { context: { userId: 'user-123', sessionData: { lastVisit: Date.now() }, }, }); // Access context in tools const myTool = Tool.from({ name: 'user-info', description: 'Get user information', invoke: async (runContext, input) => { return runContext.context.userId; }, }); ``` ### Tool Input Capture The SDK can capture tool input into the run context for debugging and logging: ```typescript const tool = Tool.from({ name: 'search', description: 'Search the web', parameters: { /* schema */ }, invoke: async (runContext, input, details) => { // The SDK can capture input when shouldCaptureToolInput is enabled const capturedInput = runContext.toolInput; console.log(`Search called with: ${capturedInput}`); return 'results'; }, }); ``` 资料来源：[packages/agents-core/src/agent.ts:200-280]() ## Error Handling ### Error Handlers The SDK provides structured error handling through `RunErrorHandler` types: | Error Type | Description | |------------|-------------| | `MaxTurnsExceededError` | Maximum turns limit reached | | `ModelRefusalError` | Model refused to respond | ```typescript import { Agent, run, RunErrorHandlers } from '@openai/agents'; const errorHandlers: RunErrorHandlers = { onMaxTurnsExceeded: async ({ error, context, runData }) => { 
return { finalOutput: 'The conversation reached its maximum length.', includeInHistory: true, }; }, onModelRefusal: async ({ error, context, runData }) => { return { finalOutput: 'I apologize, but I cannot help with that request.', includeInHistory: false, }; }, default: async ({ error }) => { return { finalOutput: 'An unexpected error occurred.', includeInHistory: true, }; }, }; const result = await run(agent, input, { errorHandlers }); ``` 资料来源：[packages/agents-core/src/runner/errorHandlers.ts:1-80]() ## Agent Factory Method ### Static `create()` Method The `Agent.create()` static method provides type-safe agent creation with automatic output type inference: ```typescript import { Agent } from '@openai/agents'; // Create agent with handoffs - output type is inferred const agent = Agent.create({ name: 'orchestrator', instructions: 'Route requests appropriately.', handoffs: [ salesAgent, // outputs: TextOutput billingAgent, // outputs: JsonOutput ], // TOutput is automatically inferred as: TextOutput | JsonOutput }); ``` 资料来源：[packages/agents-core/src/agent.ts:140-180]() ## Complete Example ```typescript import { Agent, run, Tool } from '@openai/agents'; // Define tools const searchTool = Tool.from({ name: 'web_search', description: 'Search the web for information', parameters: { type: 'object', properties: { query: { type: 'string' }, }, required: ['query'], }, strict: true, invoke: async (runContext, input) => { const { query } = JSON.parse(input); return `Results for: ${query}`; }, }); // Define agents const researchAgent = new Agent({ name: 'researcher', instructions: 'Use web search to gather information.', tools: [searchTool], }); const writerAgent = new Agent({ name: 'writer', instructions: 'Write clear, concise summaries.', }); // Orchestrator agent const orchestrator = new Agent({ name: 'orchestrator', instructions: 'Delegate research and writing tasks.', handoffs: [researchAgent, writerAgent], }); // Register lifecycle hooks orchestrator.on('agent_start', 
(ctx, agent) => { console.log(`Starting: ${agent.name}`); }); orchestrator.on('agent_handoff', (ctx, from, to) => { console.log(`Handoff: ${from.name} → ${to.name}`); }); // Run the agent const result = await run(orchestrator, 'Research and write about AI'); console.log(result.finalOutput); ``` ## Summary The OpenAI Agents SDK provides a flexible and extensible framework for creating and running AI agents: | Feature | Description | |---------|-------------| | **Agent Configuration** | Flexible configuration with instructions, tools, handoffs, and guardrails | | **Tool Integration** | Define custom tools with typed parameters and validation | | **Handoffs** | Transfer control between agents seamlessly | | **Lifecycle Events** | Monitor and debug agent execution | | **Context Management** | Share state across tools and handoffs | | **Error Handling** | Gracefully handle errors with custom handlers | | **Streaming** | Support for real-time streaming responses | All agents are built on the core `Agent` class which implements the `AgentConfiguration` interface, providing consistency and predictability across the SDK. --- ## Tools and Tool Use ### 相关页面相关主题：[Creating and Running Agents](#agents)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-core/src/tool.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/tool.ts)
- [packages/agents-core/src/mcp.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/mcp.ts)
- [packages/agents-core/src/mcpUtil.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/mcpUtil.ts)
- [packages/agents-core/src/runner/toolExecution.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runner/toolExecution.ts)
- [packages/agents-core/src/runner/toolSearch.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runner/toolSearch.ts)

# Tools and Tool Use

## Overview

Tools extend an agent's capabilities by allowing it to interact with external systems, execute code, search for information, and perform actions beyond text generation. In the OpenAI Agents SDK, tools are first-class constructs that agents can invoke during their reasoning loops. When an agent decides to use a tool, the SDK executes it, processes the result, and feeds that result back to the agent for continued decision-making.

The tool system supports multiple tool types, including function tools, hosted tools, computer tools (for browser automation), shell tools, and MCP (Model Context Protocol) tools. Each tool type has specific use cases and integration patterns.

Source: [packages/agents-core/src/tool.ts:1-50]()

## Tool Types

The SDK defines several distinct tool types through a discriminated union. Understanding these types is essential for selecting the appropriate tool for a given task.

| Tool Type | Description | Use Case |
|-----------|-------------|----------|
| `function` | Custom JavaScript functions exposed to the agent | Custom logic, API calls, data processing |
| `hosted_tool` | Tools provided by external services (MCP, web search, etc.) | Third-party integrations |
| `computer` | Browser automation via Playwright | Web interactions, UI testing |
| `shell` | Command-line execution | System operations, script running |
| `apply_patch` | Code modification operations | Automated code editing |

Source: [packages/agents-core/src/tool.ts:60-120]()

### Function Tools

Function tools are the most common tool type. They wrap JavaScript functions with metadata that allows the agent to understand when and how to call them.
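
As a rough illustration of that idea, the sketch below models a function tool as a plain object: metadata the model can read, plus an `invoke` function that receives its arguments as a JSON string. `SketchFunctionTool`, `addNumbers`, and `addTool` are hypothetical names defined locally for this example and are not imported from the SDK. The run context that a real tool receives is omitted here.

```typescript
// Local stand-in for a function tool; NOT the SDK's own type.
interface SketchFunctionTool {
  type: 'function';
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
  strict: boolean;
  invoke: (input: string) => Promise<string>;
}

// Tool logic: parse the JSON-encoded arguments, validate them, return a string.
function addNumbers(rawInput: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawInput);
  } catch {
    return JSON.stringify({ error: 'invalid_json' });
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return JSON.stringify({ error: 'invalid_arguments' });
  }
  const { a, b } = parsed as { a?: unknown; b?: unknown };
  if (typeof a !== 'number' || typeof b !== 'number') {
    return JSON.stringify({ error: 'invalid_arguments' });
  }
  return String(a + b);
}

// The tool object pairs the metadata with the logic above.
const addTool: SketchFunctionTool = {
  type: 'function',
  name: 'add_numbers',
  description: 'Add two numbers and return the sum.',
  parameters: {
    type: 'object',
    properties: { a: { type: 'number' }, b: { type: 'number' } },
    required: ['a', 'b'],
    additionalProperties: false,
  },
  strict: true,
  invoke: async (input) => addNumbers(input),
};
```

Returning an error payload instead of throwing lets the model see what went wrong and retry with corrected arguments.
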
```typescript type FunctionTool< Context = UnknownContext, TParameters extends ToolInputParameters = undefined, Result = unknown, > = { type: 'function'; name: string; description: string; parameters: JsonObjectSchema; strict: boolean; deferLoading?: boolean; invoke: ( runContext: RunContext, input: string, details?: ToolCallDetails, ) => Promise; needsApproval?: boolean | ToolApprovalFunction; }; ``` 资料来源：[packages/agents-core/src/tool.ts:60-85]() ### Hosted Tools Hosted tools represent external services that the agent can invoke. They include a `providerData` field for additional configuration and support for custom executors. ```typescript export type HostedTool = { type: 'hosted_tool'; name: string; providerData?: Record; }; ``` 资料来源：[packages/agents-core/src/tool.ts:140-150]() ## Tool Configuration Options ### Basic Configuration When defining a tool, several configuration options control its behavior: | Option | Type | Default | Description | |--------|------|---------|-------------| | `name` | `string` | (required) | Unique identifier for the tool | | `description` | `string` | (required) | Human-readable description for the agent | | `parameters` | `JsonObjectSchema` | `{ input: string }` | JSON Schema for input validation | | `strict` | `boolean` | `false` | Enforce strict schema following | | `deferLoading` | `boolean` | `false` | Defer tool loading until needed | | `needsApproval` | `boolean \| Function` | `false` | Require human approval before execution | 资料来源：[packages/agents-core/src/tool.ts:70-95]() ### Namespace Support Tools can be grouped into namespaces to organize related functionality: ```typescript const tool = FunctionTool.create({ name: 'search', description: 'Search for information', parameters: z.object({ query: z.string() }), namespace: 'search_tools', namespaceDescription: 'Tools for searching various sources', }); ``` Namespaces are useful when you have many tools and want to help the agent understand logical groupings. 
All tools within a namespace must share the same description. 资料来源：[packages/agents-openai/src/openaiResponsesModel.ts:30-60]() ## Tool Registration with Agents Tools are registered with agents through the agent configuration. The SDK provides flexible patterns for tool registration. ### Using the `tools` Option ```typescript import { Agent, FunctionTool } from '@openai/agents-core'; const searchTool = FunctionTool.create({ name: 'web_search', description: 'Search the web for information', parameters: z.object({ query: z.string(), maxResults: z.number().optional(), }), execute: async (context, args) => { // Implementation return await performSearch(args.query, args.maxResults); }, }); const agent = Agent.make({ name: 'Research Agent', instructions: 'You are a research assistant.', tools: [searchTool], }); ``` ### Using `Agent.asTool()` Agents can expose other agents as tools, enabling hierarchical agent architectures: ```typescript const translatorAgent = Agent.make({ name: 'Translator', instructions: 'Translate text between languages.', }); const translatorTool = translatorAgent.asTool({ toolName: 'translate', toolDescription: 'Translate text from one language to another', }); const orchestratorAgent = Agent.make({ name: 'Orchestrator', instructions: 'Coordinate tasks using available tools.', tools: [translatorTool], }); ``` 资料来源：[examples/agent-patterns/README.md:10-15]() ## Tool Execution Flow The following diagram illustrates the tool execution flow within the SDK: ```mermaid sequenceDiagram participant Agent as Agent participant Runner as Runner participant ToolSearch as Tool Search participant Executor as Tool Executor participant External as External System Agent->>Runner: Request tool call Runner->>ToolSearch: Check tool availability alt Tool is deferred ToolSearch->>ToolSearch: Load tool on demand end Runner->>Executor: Execute tool call Executor->>External: Invoke tool External-->>Executor: Return result Executor-->>Runner: Process result Runner-->>Agent: 
Feed back to agent ``` 资料来源：[packages/agents-core/src/runner/toolExecution.ts:1-50]() ## Deferred Tool Loading The SDK supports deferred loading for tools that may not always be needed. This is particularly useful for: - Expensive tools that should only be loaded when actually called - Tools that depend on runtime configuration - Large tool sets where loading everything upfront is inefficient ### Top-Level Deferred Tools Tools can be marked with `deferLoading: true` at the top level: ```typescript const expensiveTool = FunctionTool.create({ name: 'expensive_operation', description: 'Performs expensive computation', deferLoading: true, // Will not be loaded until called parameters: z.object({ input: z.string() }), execute: async (context, args) => { return await runExpensiveOperation(args.input); }, }); ``` 资料来源：[packages/agents-core/src/runner/toolSearch.ts:40-80]() ### Deferred Namespace Loading Namespaces can also be configured for deferred loading: ```typescript const tools = [ FunctionTool.create({ name: 'db_query', description: 'Query the database', namespace: 'database', deferLoading: true, parameters: z.object({ sql: z.string() }), }), FunctionTool.create({ name: 'db_insert', description: 'Insert into database', namespace: 'database', deferLoading: true, parameters: z.object({ table: z.string(), data: z.record(z.any()) }), }), ]; const agent = Agent.make({ name: 'Data Agent', instructions: 'Manage data operations', tools: tools, toolNamespaces: { database: { description: 'Database operations', deferLoading: true, }, }, }); ``` The SDK provides a built-in client tool search executor for handling deferred loading: ```typescript export type ClientToolSearchExecutor = ( args: ClientToolSearchExecutorArgs, ) => Tool[] | Tool | null | undefined | Promise[] | Tool | null | undefined>; ``` 资料来源：[packages/agents-core/src/tool.ts:160-175]() ## Tool Input Validation Tools support JSON Schema-based input validation through the `parameters` option. 
The SDK uses Zod for schema definition but converts these to JSON Schema for API communication. ### Schema Definition ```typescript const tool = FunctionTool.create({ name: 'calculate', description: 'Perform calculations', parameters: z.object({ expression: z.string().describe('Mathematical expression'), precision: z.number().int().min(0).max(10).optional(), }), strict: true, // Enforce strict schema following }); ``` When `strict` is enabled, the model must try to strictly follow the schema, though this may result in slower response times. 资料来源：[packages/agents-core/src/tool.ts:72-75]() ## Human-in-the-Loop Approval Tools can require human approval before execution. This is controlled via the `needsApproval` option: ```typescript const sensitiveTool = FunctionTool.create({ name: 'delete_data', description: 'Delete data from the system', needsApproval: true, // Always require approval parameters: z.object({ id: z.string(), confirm: z.boolean(), }), execute: async (context, args) => { if (!args.confirm) { return 'Deletion cancelled'; } return await deleteRecord(args.id); }, }); ``` For dynamic approval logic, pass a function: ```typescript const conditionalTool = FunctionTool.create({ name: 'process_payment', description: 'Process payment transactions', needsApproval: (args) => args.amount > 1000, // Approve only large amounts parameters: paymentSchema, }); ``` 资料来源：[packages/agents-core/src/tool.ts:80-85]() ## Tool Serialization The SDK supports serializing tools for transmission and storage. The `serializeTool` function handles different tool types: ```typescript export function serializeTool(tool: Tool): SerializedTool { if (tool.type === 'function') { return { type: 'function', name: tool.name, description: tool.description, parameters: tool.parameters as JsonObjectSchema, strict: tool.strict, deferLoading: tool.deferLoading, ...(namespace ? { namespace } : {}), }; } // Handle other types... 
} ``` 资料来源：[packages/agents-core/src/utils/serialize.ts:30-60]() ### Computer Tool Serialization Computer tools have special serialization requirements since they require initialization: ```typescript if (tool.type === 'computer') { if (!isComputerInstance(tool.computer)) { throw new UserError( 'Computer tool is not initialized for serialization. Call resolveComputer({ tool, runContext }) first.', ); } return { type: 'computer', name: tool.name, environment: tool.computer.environment, dimensions: tool.computer.dimensions, }; } ``` 资料来源：[packages/agents-core/src/utils/serialize.ts:45-65]() ## MCP Tool Integration The SDK supports MCP (Model Context Protocol) tools through the `HostedMCPTool` type. MCP tools are hosted tools with special provider data: ```typescript export type HostedMCPTool = HostedTool & { providerData: { type: 'mcp'; serverName: string; toolName: string; defer_loading?: boolean; }; }; ``` MCP tools can also be deferred using the `defer_loading` flag in provider data: ```typescript function isDeferredHostedMcpTool(tool: Tool): tool is HostedMCPTool { return ( tool.type === 'hosted_tool' && tool.providerData?.type === 'mcp' && tool.providerData.defer_loading === true ); } ``` 资料来源：[packages/agents-core/src/mcp.ts:1-50]() ## Best Practices ### Tool Naming - Use clear, descriptive names that indicate the tool's purpose - Follow a consistent naming convention (e.g., `verb_noun` or `noun_verb`) - Avoid ambiguous abbreviations ### Tool Descriptions - Write descriptions from the agent's perspective - Include examples of when to use the tool - Specify input requirements and constraints - Mention any limitations or side effects ### Error Handling Tools should handle errors gracefully and return meaningful error messages: ```typescript const robustTool = FunctionTool.create({ name: 'fetch_data', description: 'Fetch data from the API', parameters: z.object({ id: z.string() }), execute: async (context, args) => { try { const result = await fetchData(args.id); 
return JSON.stringify(result); } catch (error) { return JSON.stringify({ error: 'fetch_failed', message: error instanceof Error ? error.message : 'Unknown error', }); } }, }); ``` ### Tool Granularity - Prefer single-responsibility tools over monolithic tools - Compose complex operations from simpler tools - Avoid tools that do too many unrelated things ## Summary The OpenAI Agents SDK provides a flexible and extensible tool system that enables agents to interact with external systems and perform actions. Key concepts include: - **Multiple tool types**: Function tools, hosted tools, computer tools, shell tools, and MCP tools - **Rich configuration**: Namespaces, deferred loading, strict validation, and approval requirements - **Integration patterns**: Tools can be created directly, agents can be exposed as tools, and external services can be integrated via MCP - **Tool execution**: The SDK handles tool invocation, result processing, and agent feedback - **Serialization**: Tools can be serialized for transmission and storage By following the patterns and practices outlined in this guide, you can effectively extend your agents' capabilities with custom tools tailored to your specific use cases. --- ## Guardrails and Input/Output Validation ### 相关页面相关主题：[Creating and Running Agents](#agents), [Tools and Tool Use](#tools)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-core/src/guardrail.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/guardrail.ts)
- [packages/agents-core/src/toolGuardrail.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/toolGuardrail.ts)
- [packages/agents-core/src/runner/guardrails.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runner/guardrails.ts)
- [docs/src/content/docs/guides/guardrails.mdx](https://github.com/openai/openai-agents-js/blob/main/docs/src/content/docs/guides/guardrails.mdx)

# Guardrails and Input/Output Validation

Guardrails provide a security and validation layer in the Agents SDK, allowing developers to intercept and validate inputs before agent processing and outputs after generation. This mechanism keeps agents within defined safety boundaries, rejects malformed or harmful requests, and maintains controlled behavior throughout the execution lifecycle.

## Architecture Overview

The guardrail system is organized into three distinct types, each serving a specific validation phase:

| Guardrail Type | Trigger Point | Purpose |
|----------------|---------------|---------|
| **Input Guardrails** | Before agent processes user input | Reject or transform incoming requests |
| **Output Guardrails** | After agent generates response | Validate and filter agent outputs |
| **Tool Output Guardrails** | After tool execution | Validate tool results before agent sees them |

Source: [packages/agents-core/src/guardrail.ts:1-50]()

```mermaid
graph TD
    A[User Input] --> B{Input Guardrails}
    B -->|Pass| C[Agent Processing]
    B -->|Fail| D[Reject Request]
    C --> E[Tool Calls]
    E --> F{Tool Output Guardrails}
    F -->|Pass| G[Agent Receives Result]
    F -->|Fail| H[Block Result]
    G --> I[Agent Generates Output]
    I --> J{Output Guardrails}
    J -->|Pass| K[Final Response]
    J -->|Fail| L[Block/Modify Output]

    style B fill:#e1f5fe
    style F fill:#fff3e0
    style J fill:#e8f5e9
```

## Input Guardrails

Input guardrails intercept user requests before they reach the agent's processing logic. They are well suited to rate limiting, content filtering, authentication checks, and request validation.
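
To make the request-validation use case concrete, here is a minimal sketch of a check that rejects empty or oversized input. The `GuardrailDecision` type, the `MAX_INPUT_CHARS` limit, and the `validateRequest` helper are assumptions defined locally for illustration; they mirror the allow/block result shape described on this page rather than importing anything from the SDK.

```typescript
// Local stand-in for a guardrail result; an assumption, not an SDK import.
type GuardrailDecision = { decision: 'allow' | 'block'; message?: string };

// Hypothetical limit chosen for this example only.
const MAX_INPUT_CHARS = 4000;

// Pure validation logic: block empty or oversized requests, allow the rest.
function validateRequest(input: string): GuardrailDecision {
  const trimmed = input.trim();
  if (trimmed.length === 0) {
    return { decision: 'block', message: 'Input is empty' };
  }
  if (trimmed.length > MAX_INPUT_CHARS) {
    return {
      decision: 'block',
      message: `Input exceeds ${MAX_INPUT_CHARS} characters`,
    };
  }
  return { decision: 'allow' };
}

// Wrapped in the name/execute shape that guardrails on this page use.
const requestValidationGuardrail = {
  name: 'request-validation',
  execute: async ({ input }: { input: string }): Promise<GuardrailDecision> =>
    validateRequest(input),
};
```

Keeping the validation logic in a plain synchronous function makes it easy to unit test without running an agent.
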
### Interface Definition

```typescript
export interface InputGuardrailFunctionArgs {
  agent: Agent;
  input: string;
  context: RunContext;
}

export type InputGuardrailFunction = (
  args: InputGuardrailFunctionArgs
) => Promise<GuardrailFunctionOutput>;
```

Source: [packages/agents-core/src/guardrail.ts:1-30]()

### Guardrail Function Output

Both input and output guardrails return a standardized result structure:

```typescript
export interface GuardrailFunctionOutput {
  decision: 'allow' | 'block';
  message?: string;
  // Type arguments were lost in extraction; <string, unknown> is assumed.
  metadata?: Record<string, unknown>;
}
```

When `decision` is `'block'`, the run terminates and the optional `message` is returned to the user.

### Input Guardrail Interface

```typescript
export interface InputGuardrail {
  name: string;
  execute: InputGuardrailFunction;
}
```

Source: [packages/agents-core/src/guardrail.ts:35-50]()

### Configuration in Agent

Input guardrails are configured at the agent level:

```typescript
const agent = new Agent({
  name: 'my-agent',
  instructions: 'You are a helpful assistant.',
  inputGuardrails: [
    {
      name: 'content-filter',
      execute: async ({ input }) => {
        // containsProfanity is a placeholder for your own check.
        if (containsProfanity(input)) {
          return { decision: 'block', message: 'Inappropriate content detected' };
        }
        return { decision: 'allow' };
      },
    },
  ],
});
```

## Output Guardrails

Output guardrails validate the agent's generated response before it is returned to the user. They enable content safety checks, PII detection, quality filtering, and output format enforcement.
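
As one possible illustration of PII detection, the sketch below blocks any output that appears to contain an email address. The regular expression is deliberately simple and will miss many real-world cases. `GuardrailDecision`, `EMAIL_PATTERN`, and `checkForEmail` are hypothetical names defined locally, and the `execute` wrapper only approximates the guardrail shape described on this page.

```typescript
// Local stand-in for a guardrail result; an assumption, not an SDK import.
type GuardrailDecision = { decision: 'allow' | 'block'; message?: string };

// Simplified email matcher for illustration; no global flag, so test() is stateless.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// Block the output when it appears to contain an email address.
function checkForEmail(agentOutput: string): GuardrailDecision {
  if (EMAIL_PATTERN.test(agentOutput)) {
    return { decision: 'block', message: 'Output appears to contain an email address' };
  }
  return { decision: 'allow' };
}

// Wrapped in the name/execute shape that guardrails on this page use.
const emailOutputGuardrail = {
  name: 'email-pii-check',
  execute: async ({ agentOutput }: { agentOutput: string }): Promise<GuardrailDecision> =>
    checkForEmail(agentOutput),
};
```

A production check would more likely call a dedicated PII detection service; the pattern match here only shows where such a call would sit.
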
### Interface Definition

```typescript
export interface OutputGuardrailFunctionArgs<
  TOutput extends AgentOutputType = TextOutput,
  TContext = UnknownContext,
> {
  agent: Agent;
  agentOutput: string;
  context: RunContext<TContext>;
  details: {
    modelResponse: ModelResponse | undefined;
    output: TurnInput;
  };
}

export type OutputGuardrailFunction<
  TOutput extends AgentOutputType = TextOutput,
  TContext = UnknownContext,
> = (
  args: OutputGuardrailFunctionArgs<TOutput, TContext>,
) => Promise<GuardrailFunctionOutput>;
```

Source: [packages/agents-core/src/guardrail.ts:50-80]()

### Output Guardrail Interface

```typescript
export interface OutputGuardrail<
  TOutput extends AgentOutputType = TextOutput,
  TContext = UnknownContext,
> {
  name: string;
  execute: OutputGuardrailFunction<TOutput, TContext>;
}
```

Source: [packages/agents-core/src/guardrail.ts:80-100]()

### Output Guardrail Definition

The SDK provides a builder function to create output guardrail definitions:

```typescript
export interface DefineOutputGuardrailArgs<
  TOutput extends AgentOutputType = TextOutput,
  TContext = UnknownContext,
> {
  name: string;
  execute: OutputGuardrailFunction<TOutput, TContext>;
}

export function defineOutputGuardrail<
  TOutput extends AgentOutputType = TextOutput,
  TContext = UnknownContext,
>({ name, execute }: DefineOutputGuardrailArgs<TOutput, TContext>): OutputGuardrailDefinition
```

Source: [packages/agents-core/src/guardrail.ts:100-130]()

## Tool Output Guardrails

Tool output guardrails validate the results returned by tool executions before the agent processes them. This is particularly useful for sanitizing tool responses, preventing sensitive data leakage, and enforcing output constraints.
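
One way to enforce output constraints is to require that a tool result is valid JSON and stays under a size limit, as in the sketch below. `GuardrailDecision`, `checkToolOutput`, and the default `maxChars` value are hypothetical and defined locally for illustration; the allow/block result shape follows the convention used on this page and is not taken from the SDK.

```typescript
// Local stand-in for a guardrail result; an assumption, not an SDK import.
type GuardrailDecision = { decision: 'allow' | 'block'; message?: string };

// Block tool output that is too large or that is not parseable JSON.
function checkToolOutput(
  toolName: string,
  toolOutput: string,
  maxChars: number = 10000, // hypothetical default limit
): GuardrailDecision {
  if (toolOutput.length > maxChars) {
    return {
      decision: 'block',
      message: `Tool ${toolName} returned more than ${maxChars} characters`,
    };
  }
  try {
    JSON.parse(toolOutput);
  } catch {
    return { decision: 'block', message: `Tool ${toolName} returned non-JSON output` };
  }
  return { decision: 'allow' };
}
```

The size check runs first so that very large payloads are rejected without being parsed.
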
### Interface Definition

```typescript
export interface ToolOutputGuardrailFunctionArgs {
  toolCallId: string;
  toolName: string;
  toolOutput: string;
  context: RunContext;
}

export type ToolOutputGuardrailFunction = (
  args: ToolOutputGuardrailFunctionArgs
) => Promise;
```

Source: [packages/agents-core/src/toolGuardrail.ts:1-30]()

### Definition Creation

Tool output guardrails can be created with or without explicit typing:

```typescript
export function defineToolOutputGuardrail(
  config: { name: string; run: ToolOutputGuardrailFunction }
): ToolOutputGuardrailDefinition
```

When guardrails are passed to the runner, they are normalized to the definition format:

```typescript
export function toToolOutputGuardrailDefinitions(
  guardrails?: ToolOutputGuardrailInit[],
): ToolOutputGuardrailDefinition[] {
  if (!guardrails) {
    return [];
  }
  // Pass existing definitions through; wrap plain { name, run } objects.
  return guardrails.map((gr) =>
    'type' in gr && gr.type === 'tool_output'
      ? (gr as ToolOutputGuardrailDefinition)
      : defineToolOutputGuardrail(gr as { name: string; run: any }),
  );
}
```

Source: [packages/agents-core/src/toolGuardrail.ts:30-60]()

## Guardrail Execution in the Runner

The runner orchestrates guardrail execution at specific points in the agent lifecycle: input guardrails run before the agent processes a request, and output guardrails run after it produces a response.
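
The ordering can be sketched as a small synchronous pipeline: run the input checks, produce an output, then run the output checks, stopping at the first block. Everything here (`Decision`, `Check`, `GuardrailBlocked`, `runChecks`, `guardedTurn`, and the two sample checks) is hypothetical and defined locally. The real runner is asynchronous and uses its own error types, so treat this only as a simplified model of the control flow.

```typescript
// Local stand-ins; assumptions for illustration, not SDK types.
type Decision = { decision: 'allow' | 'block'; message?: string };
type Check = { name: string; check: (text: string) => Decision };

// Error raised when any check blocks; records which guardrail fired.
class GuardrailBlocked extends Error {
  guardrailName: string;
  constructor(guardrailName: string, message?: string) {
    super(message ?? `Blocked by ${guardrailName}`);
    this.name = 'GuardrailBlocked';
    this.guardrailName = guardrailName;
  }
}

// Run checks in order and stop at the first one that blocks.
function runChecks(checks: Check[], text: string): void {
  for (const c of checks) {
    const result = c.check(text);
    if (result.decision === 'block') {
      throw new GuardrailBlocked(c.name, result.message);
    }
  }
}

// One guarded turn: input checks, then generation, then output checks.
function guardedTurn(
  input: string,
  inputChecks: Check[],
  outputChecks: Check[],
  generate: (input: string) => string, // stand-in for the model call
): string {
  runChecks(inputChecks, input);
  const output = generate(input);
  runChecks(outputChecks, output);
  return output;
}

// Sample checks used to exercise the pipeline.
const nonEmpty: Check = {
  name: 'non-empty',
  check: (t) => (t.trim().length > 0 ? { decision: 'allow' } : { decision: 'block' }),
};
const noSecret: Check = {
  name: 'no-secret',
  check: (t) => (t.includes('SECRET') ? { decision: 'block' } : { decision: 'allow' }),
};
```

Because the input checks run before `generate` is called, a blocked request never reaches the model stand-in.
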
### Input Guardrail Execution

```typescript
export async function runInputGuardrails(
  state: RunState<...>,
  input: string,
  inputGuardrailDefs: InputGuardrailDefinition[],
) {
  if (inputGuardrailDefs.length === 0) {
    return;
  }
  for (const guardrail of inputGuardrailDefs) {
    const result = await guardrail.run({
      agent: state._currentAgent,
      input,
      context: state._context,
    });
    if (result.tripwireTriggered) {
      throw new InputGuardrailTripwireTriggered(guardrail.name);
    }
  }
}
```

Source: [packages/agents-core/src/runner/guardrails.ts:1-40]()

If a guardrail throws instead of returning a result, the runner decrements the turn counter so the turn can be rerun:

```typescript
onError: (error) => {
  state._currentTurn--;
  throw new GuardrailExecutionError(
    `Input guardrail failed to complete: ${error}`,
    error as Error,
    state,
  );
}
```

### Output Guardrail Execution

```typescript
export async function runOutputGuardrails<
  TContext,
  TOutput extends AgentOutputType,
  TAgent extends Agent,
>(
  state: RunState,
  runnerOutputGuardrails: OutputGuardrailDefinition<...>[],
  output: string,
) {
  const runnerGuardrails =
    runnerOutputGuardrails as OutputGuardrailDefinition<...>[];
  // Runner-level guardrails run together with the current agent's own guardrails.
  const guardrails = runnerGuardrails.concat(
    state._currentAgent.outputGuardrails.map(defineOutputGuardrail),
  );
  if (guardrails.length === 0) {
    return;
  }
  const agentOutput = state._currentAgent.processFinalOutput(output);
  const runOutput = getTurnInput([], state._generatedItems, ...);
  const guardrailArgs: OutputGuardrailFunctionArgs = {
    agent: state._currentAgent,
    agentOutput,
    context: state._context,
    details: { modelResponse: state._lastTurnResponse, output: runOutput },
  };
  // ... guardrail execution is elided in this excerpt
}
```

Source: [packages/agents-core/src/runner/guardrails.ts:50-100]()

## Guardrail Types Comparison

| Property | Input Guardrail | Output Guardrail | Tool Output Guardrail |
|----------|-----------------|------------------|----------------------|
| **Trigger** | Before agent processes input | After agent generates output | After tool execution |
| **Input Access** | Raw user input | Processed agent output | Tool result string |
| **Common Use Cases** | Content filtering, auth | Safety checks, PII removal | Data sanitization |
| **Failure Action** | Block request | Block/modify output | Block tool result |
| **Interface** | `InputGuardrailFunction` | `OutputGuardrailFunction` | `ToolOutputGuardrailFunction` |

## Common Patterns

### Content Safety Pattern

```typescript
const contentSafetyGuardrail: OutputGuardrail = {
  name: 'content-safety',
  execute: async ({ agentOutput }) => {
    // checkContentSafety is an application-defined helper.
    const isSafe = await checkContentSafety(agentOutput);
    // The runner checks `tripwireTriggered` on the result (see the execution code above).
    return {
      tripwireTriggered: !isSafe,
      outputInfo: isSafe ? undefined : { message: 'Output failed safety review' },
    };
  },
};
```

### Input Transformation Pattern

```typescript
const sanitizationGuardrail: InputGuardrail = {
  name: 'input-sanitizer',
  execute: async ({ input }) => {
    // removeSensitiveData is an application-defined helper.
    const sanitized = removeSensitiveData(input);
    // A guardrail result does not replace the input; the sanitized value is
    // only reported back through outputInfo.
    return { tripwireTriggered: false, outputInfo: { sanitized } };
  },
};
```

### Tool Output Filtering Pattern

```typescript
const toolOutputGuardrail = defineToolOutputGuardrail({
  name: 'filter-database-results',
  run: async ({ toolOutput }) => {
    // filterPersonalInformation is an application-defined helper.
    // Note: `filtered` is not returned here; the guardrail only signals a decision.
    const filtered = filterPersonalInformation(toolOutput);
    return { decision: 'allow' };
  },
});
```

## Integration with Agent Configuration

Guardrails can be configured at multiple levels:

| Level | Configuration Location | Scope |
|-------|----------------------|-------|
| **Agent** | `inputGuardrails`, `outputGuardrails` | Applies to all runs of this agent |
| **Runner** | `outputGuardrails` parameter | Applies to all agents in the run |

```typescript
import { Agent, Runner } from '@openai/agents';

const agent = new Agent({
  name: 'secure-agent',
  instructions: '...',
  inputGuardrails: [/* agent-level input guards */],
  outputGuardrails: [/* agent-level output guards */],
});

// Runner-level guards apply to all agents in the run
const runner = new Runner({
  outputGuardrails: [/* runner-level output guards */],
});
const result = await runner.run(agent, input);
```

## Error Handling

Guardrail failures generate specific error types:

| Error Type | Cause | Recovery |
|------------|-------|----------|
| `InputGuardrailTripwireTriggered` | Input guardrail result had `tripwireTriggered` set | Request rejected |
| `GuardrailExecutionError` | Guardrail threw exception | Rollback and retry possible |

The runner implements error handling that preserves state for potential reruns:

```typescript
tryOrThrow({
  fn: async () => runInputGuardrails(...),
  errorName: 'input guardrail',
  onError: (error) => {
    state._currentTurn--; // Rollback for rerun
    throw new GuardrailExecutionError(...);
  }
});
```

## Best Practices

1. **Failure defaults**: Decide explicitly what a guardrail does when it cannot complete. Returning a non-triggered result lets the request through (fail-open), while throwing surfaces a `GuardrailExecutionError`
2. **Provide feedback**: Include a meaningful explanation (for example in `outputInfo`) when a guardrail triggers
3. **Order matters**: Place the most restrictive guardrails first for early rejection
4. **Avoid side effects**: Guardrails should be idempotent and not modify state
5. **Performance**: Keep guardrail execution lightweight to avoid timeouts
6. **Testing**: Test both the allow path and the block path for each guardrail

## Source Files Summary

| File | Role |
|------|------|
| `packages/agents-core/src/guardrail.ts` | Core interfaces and type definitions |
| `packages/agents-core/src/toolGuardrail.ts` | Tool output guardrail utilities |
| `packages/agents-core/src/runner/guardrails.ts` | Runner integration and execution logic |

---

## Handoffs and Multi-Agent Systems

### Related Pages

Related topics: [Creating and Running Agents](#agents), [Tools and Tool Use](#tools)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-core/src/handoff.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/handoff.ts)
- [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)
- [packages/agents-core/src/runner/modelOutputs.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runner/modelOutputs.ts)
- [packages/agents-core/src/lifecycle.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/lifecycle.ts)
- [packages/agents-core/src/result.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/result.ts)
- [packages/agents-core/src/runState.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/runState.ts)
- [packages/agents-realtime/src/realtimeSessionEvents.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/realtimeSessionEvents.ts)

# Handoffs and Multi-Agent Systems

## Overview

Handoffs let one agent transfer control of a conversation to another agent. They are the basic mechanism for building multi-agent systems in which specialized agents collaborate on complex tasks. A tool is called and returns its result to the calling agent; a handoff instead transfers conversational control to the new agent, which then continues the interaction with the user.

Source: [packages/agents-core/src/handoff.ts:1-50]()

### Key Characteristics of Handoffs

| Characteristic | Description |
|----------------|-------------|
| **Conversation Transfer** | The new agent receives the full conversation history and takes over the dialogue |
| **Control Transfer** | The original agent stops execution; the new agent continues from where the previous one left off |
| **Bidirectional** | Agents can hand off to each other in any direction, enabling complex workflows |
| **Typed Outputs** | Handoffs can carry structured output types from different agents |
| **Feature Gating** | Handoffs can be conditionally enabled or disabled based on runtime context |

Source: [packages/agents-core/src/agent.ts:1-30]()

## Architecture

### Handoff Flow

```mermaid
graph TD
    A[User Input] --> B[Primary Agent]
    B -->|Determines handoff needed| C[Handoff Tool Call]
    C --> D{Handoff Enabled?}
    D -->|Yes| E[Transfer to Target Agent]
    D -->|No| F[Continue with Primary Agent]
    E --> G[Target Agent Receives Full History]
    G --> H[Target Agent Processes Request]
    H --> I[Response to User]
    I --> J[Optionally Handoff Back]
    J -->|Yes| B
```

Source: [packages/agents-core/src/runner/modelOutputs.ts:1-50]()

### Multi-Agent System Components

| Component | Role | Location |
|-----------|------|----------|
| `Agent` | Base agent class with handoff capabilities | `packages/agents-core/src/agent.ts` |
| `Handoff` | Wrapper class for agent-to-agent transfers | `packages/agents-core/src/handoff.ts` |
| `HandoffConfig` | Configuration for handoff behavior | `packages/agents-core/src/handoff.ts` |
| `RunResultData` | Results containing handoff information | `packages/agents-core/src/result.ts` |

## Core Classes and Types

### The Handoff Class

The `Handoff` class wraps an agent and provides the mechanism for transferring control:

```typescript
export class Handoff<
  TContext = UnknownContext,
  TOutput extends AgentOutputType = TextOutput,
> {
  public agentName: string;
  public toolName: string;
  public toolDescription: string;
  public inputParameters: JsonSchema;
  public isEnabled: HandoffEnabledFunction = async () => true;

  constructor(
    agent: Agent,
    onInvokeHandoff: (
      context: RunContext,
      args: string,
    ) => Promise<Agent<...>> | Agent<...>,
  ) { ... }
}
```

Source: [packages/agents-core/src/handoff.ts:50-80]()

### Handoff Configuration

```typescript
export type HandoffConfig<
  TInputType extends ToolInputParameters,
  TContext = UnknownContext,
> = {
  toolNameOverride?: string;
  toolDescriptionOverride?: string;
  onHandoff?: OnHandoffCallback;
  inputType?: TInputType;
  inputFilter?: HandoffInputFilterFunction;
};
```

Source: [packages/agents-core/src/handoff.ts:100-130]()

### Handoff Configuration Options

| Option | Type | Description |
|--------|------|-------------|
| `toolNameOverride` | `string` | Custom name for the handoff tool |
| `toolDescriptionOverride` | `string` | Custom description shown to the model |
| `onHandoff` | `OnHandoffCallback` | Callback function executed when handoff occurs |
| `inputType` | `ToolInputParameters` | Zod schema for validating handoff input |
| `inputFilter` | `HandoffInputFilterFunction` | Function to filter/modify conversation history |

## Handoff Types and Output Union

### Automatic Output Type Inference

The system infers the union of output types from all handoff agents:

```typescript
// Generic arguments were lost in extraction; `infer O` is reconstructed from context.
type ExtractAgentOutput<T> = T extends Agent<any, infer O> ? O : never;
type ExtractHandoffOutput<T> = T extends Handoff<any, infer O> ? O : never;

export type HandoffsOutputUnion<
  Handoffs extends readonly (Agent | Handoff)[],
> =
  | ExtractAgentOutput<Handoffs[number]>
  | ExtractHandoffOutput<Handoffs[number]>;
```

Source: [packages/agents-core/src/agent.ts:180-200]()

### Creating Agents with Handoffs

```typescript
export type AgentConfigWithHandoffs<
  TOutput extends AgentOutputType,
  Handoffs extends readonly (Agent | Handoff)[],
> = {
  name: string;
  handoffs?: Handoffs;
  outputType?: TOutput;
} & Partial<Omit<AgentConfiguration<...>, 'name' | 'handoffs' | 'outputType'>>;
```

Source: [packages/agents-core/src/agent.ts:15-30]()

## Lifecycle Events

### Agent Lifecycle Hooks

Handoffs trigger specific lifecycle events that can be subscribed to:

```typescript
agent_handoff: [
  context: RunContext,
  nextAgent: Agent
];
```

Source: [packages/agents-core/src/lifecycle.ts:1-30]()

### Real-time Session Events

For real-time agents, the handoff event also identifies the agent that handed off:

```typescript
agent_handoff: [
  context: RunContext<...>,
  fromAgent: AgentWithOrWithoutHistory,
  toAgent: AgentWithOrWithoutHistory,
];
```

Source: [packages/agents-realtime/src/realtimeSessionEvents.ts:1-50]()

### Available Lifecycle Events

| Event | Parameters | Trigger |
|-------|------------|---------|
| `agent_start` | `context`, `agent`, `turnInput` | Agent begins processing |
| `agent_end` | `context`, `agent`, `output` | Agent completes processing |
| `agent_handoff` | `context`, `nextAgent` (core); `context`, `fromAgent`, `toAgent` (real-time) | Control transfers to another agent |
| `agent_tool_start` | `context`, `agent`, `tool`, `details` | Tool execution begins |
| `agent_tool_end` | `context`, `agent`, `tool`, `result`, `details` | Tool execution completes |

## Handoff Resolution

### Tool Call Resolution

When the model requests a handoff, the system resolves the tool call:

```typescript
function resolveHandoffOrTool(
  toolCall: HandoffCallItem | FunctionCallItem,
  handoffMap: Map<string, Handoff<...>>,
  functionMap: Map<string, FunctionTool<...>>,
  agent: Agent,
):
  | { type: 'handoff'; handoff: Handoff }
  | { type: 'function'; tool: FunctionTool }
```
Source: [packages/agents-core/src/runner/modelOutputs.ts:1-50]()

### Resolution Process

1. The tool call name is resolved to handle namespaced tools
2. Both function tools and handoffs are checked for matches
3. Ambiguity is detected when a name matches both a function tool and a handoff
4. The resolved tool or handoff is returned with its type

### Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| `Tool not found` | Handoff name not registered | Register handoff in agent configuration |
| `Ambiguous dotted tool call` | Name matches both function tool and handoff | Rename one or use explicit namespace |
| `Handoff not enabled` | `isEnabled` returns false | Check feature flags or context conditions |

## Using Agents as Tools

Agents can be invoked as tools using the `asTool()` method:

```typescript
// Generic arguments were lost in extraction and are shown as <...>.
asTool<TAgent extends Agent<...> = Agent<...>>(
  this: TAgent,
  options: AgentToolOptions,
): AgentTool
```

Source: [packages/agents-core/src/agent.ts:200-250]()

### Agent as Tool Options

| Option | Type | Description |
|--------|------|-------------|
| `toolName` | `string` | Name of the tool (defaults to agent name) |
| `toolDescription` | `string` | Description for model guidance |
| `customOutputExtractor` | `function` | Extract output from agent result |
| `needsApproval` | `boolean \| ToolApprovalFunction` | Require human approval |
| `parameters` | `TParameters` | JSON schema for tool input |
| `inputBuilder` | `AgentToolInputBuilder` | Transform structured input to agent input |

### Key Differences: Handoffs vs Agent-as-Tool

| Aspect | Handoffs | Agent-as-Tool |
|--------|----------|---------------|
| **Conversation History** | Full history transferred | New agent receives generated input |
| **Control Flow** | Original agent stops, new agent takes over | Original agent continues after tool returns |
| **Use Case** | Complete task delegation | Subtask execution with return |

Source: [packages/agents-core/src/agent.ts:150-180]()

## Handoff Results

### RunResultData Structure

When a run includes handoffs, the result contains handoff information:

```typescript
export interface RunResultData<
  TAgent extends Agent,
  THandoffs extends (Agent | Handoff)[] = any[],
> {
  input: string | AgentInputItem[];
  newItems: RunItem[];
  rawResponses: ModelResponse[];
  lastResponseId: string | undefined;
  lastAgent: TAgent | undefined;
  inputGuardrailResults: InputGuardrailResult[];
  outputGuardrailResults: OutputGuardrailResult[];
  // ... additional properties
}
```

Source: [packages/agents-core/src/result.ts:1-50]()

### Tracking Handoffs in State

```typescript
handoffs: new Map(
  currentAgent.handoffs.map((entry) => {
    // Plain agents are wrapped in a Handoff and keyed by agent name.
    if (entry instanceof Agent) {
      return [entry.name, handoff(entry)];
    }
    // Existing Handoff objects are keyed by their tool name.
    return [entry.toolName, entry];
  }),
);
```

Source: [packages/agents-core/src/runState.ts:1-50]()

## Multi-Agent Patterns

### Triage Pattern

A primary agent analyzes user input and delegates to specialized agents:

```mermaid
graph LR
    A[User Request] --> B[Triage Agent]
    B -->|Spanish Request| C[Spanish Agent]
    B -->|Technical Issue| D[Support Agent]
    B -->|Billing Question| E[Billing Agent]
    C --> F[Response]
    D --> F
    E --> F
```

### Sequential Delegation

Agents hand off in a chain for complex workflows:

```mermaid
graph TD
    A[Input] --> B[Data Collection Agent]
    B --> C[Analysis Agent]
    C --> D[Report Generation Agent]
    D --> E[Review Agent]
    E -->|Approved| F[Final Output]
    E -->|Needs Revision| C
```

### Parallel Handoff with Selection

Multiple agents work on the same problem, and an evaluator selects the best response:

```mermaid
graph TD
    A[Query] --> B[Agent 1]
    A --> C[Agent 2]
    A --> D[Agent 3]
    B --> E[Evaluator]
    C --> E
    D --> E
    E --> F[Best Response]
```

## Best Practices

### Designing Agent Responsibilities

| Guideline | Description |
|-----------|-------------|
| **Single Responsibility** | Each agent should have a clear, focused purpose |
| **Clear Descriptions** | Use `handoffDescription` to help the triage agent decide |
| **Typed Outputs** | Define output types for type-safe result handling |
| **Feature Gates** | Use `isEnabled` for conditional handoff availability |

### Error Handling

- Always wrap handoff execution in try-catch blocks
- Implement fallback handoffs for critical paths
- Log handoff failures for debugging

### Performance Considerations

- Minimize the number of handoffs in a single conversation
- Use `inputFilter` to reduce conversation history transfer size
- Cache agent instances when possible

## Real-time Agent Handoffs

The real-time agent system extends handoffs for voice applications:

```typescript
export type RealtimeAgentConfiguration = Partial<
  Omit<
    AgentConfiguration<..., TextOutput>,
    | 'model'
    | 'handoffs'
    | 'modelSettings'
    | 'outputType'
    | 'toolUseBehavior'
    | 'resetToolChoice'
    | 'outputGuardrails'
    | 'inputGuardrails'
  >
> & {
  name: string;
  handoffs?: (RealtimeAgent | Handoff<..., TextOutput>)[];
  voice?: string;
};
```

Source: [packages/agents-realtime/src/realtimeAgent.ts:1-40]()

### Voice Handoff Constraints

When using handoffs in real-time sessions:

- If another agent spoke during the session, changing the voice during a handoff will fail
- All RealtimeAgents within a session share the same model
- `modelSettings` is not supported for RealtimeAgents

---

## Sandbox Agents Architecture

### Related Pages

Related topics: [Sandbox Providers and Extensions](#sandbox-providers), [Creating and Running Agents](#agents)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-core/src/sandbox/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/agent.ts)
- [packages/agents-core/src/sandbox/client.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/client.ts)
- [packages/agents-core/src/sandbox/capabilities/index.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/capabilities/index.ts)
- [packages/agents-core/src/sandbox/runtime/manager.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/runtime/manager.ts)
- [packages/agents-core/src/sandbox/manifest.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/manifest.ts)
- [docs/src/content/docs/guides/sandbox-agents.mdx](https://github.com/openai/openai-agents-js/blob/main/docs/src/content/docs/guides/sandbox-agents.mdx)

# Sandbox Agents Architecture

## Overview

The Sandbox Agents Architecture provides an isolated execution environment in which agents can run code, execute shell commands, and interact with file systems. It extends the core `Agent` class with sandboxed capabilities that isolate agent operations from the host system while maintaining a controlled bridge for communication and data exchange.

Sandbox agents are designed to:

- **Isolate Execution**: Run untrusted or potentially harmful code in a contained environment
- **Secure File Operations**: Provide controlled access to file systems with manifest-based permission management
- **Runtime Isolation**: Enforce capability-based security policies that limit what operations an agent can perform
- **Preserve Sessions**: Maintain state across agent interactions while respecting security boundaries

Source: [packages/agents-core/src/sandbox/agent.ts:1-50]()

## Core Components

### SandboxAgent Class

The `SandboxAgent` extends the core `Agent` class and adds sandbox-specific configuration and lifecycle management.

```mermaid
classDiagram
    class Agent {
        +name: string
        +instructions: string
        +tools: Tool[]
        +run(context, input)*
    }
    class SandboxAgent {
        +defaultManifest?: Manifest
        +baseInstructions?: SandboxBaseInstructions
        +capabilities: Capability[]
        +runAs?: string | SandboxUser
        +runtimeManifest: Manifest
        +clone(config): SandboxAgent
    }
    Agent <|-- SandboxAgent
```

**Configuration Options**

| Parameter | Type | Description |
|-----------|------|-------------|
| `name` | `string` | Agent identifier |
| `instructions` | `string \| function` | System prompt or dynamic instruction generator |
| `baseInstructions` | `SandboxBaseInstructions` | Base instructions appended to agent prompts |
| `capabilities` | `Capability[]` | Runtime capabilities (file system, network, etc.) |
| `runAs` | `string \| SandboxUser` | User identity for sandbox execution |
| `defaultManifest` | `Manifest` | Default file system permissions |
| `tools` | `Tool[]` | Additional tools available to the agent |

Source: [packages/agents-core/src/sandbox/agent.ts:35-80]()

**Type Safety**

The constructor validates `baseInstructions` at runtime and rejects anything that is not a string or a function:

```typescript
if (
  config.baseInstructions !== undefined &&
  typeof config.baseInstructions !== 'string' &&
  typeof config.baseInstructions !== 'function'
) {
  throw new TypeError(
    'SandboxAgent baseInstructions must be a string or function.',
  );
}
```

Source: [packages/agents-core/src/sandbox/agent.ts:55-62]()

### Manifest System

The `Manifest` class defines the file system permissions and access controls for a sandbox environment. Each manifest specifies which paths an agent can read, write, or execute.

**Key Manifest Operations**

| Operation | Purpose |
|-----------|---------|
| `clone()` | Creates a deep copy of the manifest |
| `addPermission()` | Adds a new path permission |
| `removePermission()` | Removes an existing permission |
| `checkAccess()` | Verifies if a path is permitted |

The manifest supports multiple manifest root levels, allowing fine-grained control over path translation and access boundaries.

Source: [packages/agents-core/src/sandbox/manifest.ts:1-100]()

### Capabilities

Capabilities define what operations a sandbox agent can perform at runtime. The capability system follows a deny-by-default model.
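
A deny-by-default check can be sketched in a few lines. This is a minimal illustration, not the SDK's implementation: the string names mirror the capability types listed in this section, and `isPermitted` and `requireCapability` are hypothetical helpers introduced only for the example.

```typescript
// Hypothetical capability names, mirroring the capability types in this section.
type CapabilityName =
  | 'FileSystemRead'
  | 'FileSystemWrite'
  | 'FileSystemExecute'
  | 'NetworkHttp'
  | 'NetworkWebSocket'
  | 'ProcessSpawn';

// Deny by default: an operation is permitted only if it was explicitly granted.
function isPermitted(
  granted: ReadonlySet<CapabilityName>,
  requested: CapabilityName,
): boolean {
  return granted.has(requested);
}

// Hypothetical guard: throws when the requested capability was not granted.
function requireCapability(
  granted: ReadonlySet<CapabilityName>,
  requested: CapabilityName,
): void {
  if (!isPermitted(granted, requested)) {
    throw new Error(`capability denied: ${requested}`);
  }
}

// Only reading files and HTTP requests are granted; everything else is denied.
const granted = new Set<CapabilityName>(['FileSystemRead', 'NetworkHttp']);
```

With this shape there is no "deny list" to maintain: a capability that is never mentioned, such as `ProcessSpawn` above, is rejected automatically.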
```mermaid
graph TD
    A[SandboxAgent] --> B[Capability Manager]
    B --> C[FileSystem Capability]
    B --> D[Network Capability]
    B --> E[Process Capability]
    B --> F[Environment Capability]
    C --> G{Read}
    C --> H{Write}
    C --> I{Execute}
    D --> J[HTTP Allowed]
    D --> K[WebSocket Allowed]
```

**Available Capability Types**

| Capability | Description |
|------------|-------------|
| `FileSystemRead` | Read files from specified paths |
| `FileSystemWrite` | Write or modify files |
| `FileSystemExecute` | Execute binary files |
| `NetworkHttp` | Make HTTP/HTTPS requests |
| `NetworkWebSocket` | Establish WebSocket connections |
| `ProcessSpawn` | Spawn child processes |

Source: [packages/agents-core/src/sandbox/capabilities/index.ts:1-50]()

## Sandbox Client

The `SandboxClient` manages the connection to the underlying sandbox runtime. It handles session lifecycle, authentication, and command routing.

```mermaid
sequenceDiagram
    participant Host as Host Application
    participant Client as SandboxClient
    participant Runtime as Sandbox Runtime
    participant Sandbox as Sandbox Environment
    Host->>Client: createSandbox(config)
    Client->>Runtime: Initialize Session
    Runtime->>Sandbox: Create Isolated Environment
    Sandbox-->>Runtime: Session ID
    Runtime-->>Client: Connected
    Client-->>Host: SandboxHandle
    Host->>Client: executeCommand(cmd)
    Client->>Sandbox: Run Command
    Sandbox-->>Client: Output
    Client-->>Host: Result
```

**Client Configuration**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `backendId` | `string` | Yes | Sandbox backend identifier |
| `capabilities` | `Capability[]` | No | Override agent capabilities |
| `manifest` | `Manifest` | No | File system permissions |
| `environment` | `Record` | No | Environment variables |

Source: [packages/agents-core/src/sandbox/client.ts:1-100]()

## Runtime Manager

The `RuntimeManager` orchestrates sandbox sessions, handles session preservation, and manages resource allocation across multiple concurrent sandbox environments.

```mermaid
graph TB
    subgraph RuntimeManager
        A[Session Pool] --> B[Active Sessions]
        A --> C[Preserved Sessions]
        B --> D[Live Sessions]
        C --> E[Serialized Sessions]
    end
    F[Agent Request] --> G[Load Balancer]
    G --> H{Route}
    H -->|New| I[Create Session]
    H -->|Preserved| J[Restore Session]
    I --> D
    J --> D
```

**Session Management Features**

- **Live Sessions**: Active sandbox instances ready for immediate use
- **Preserved Sessions**: Sessions that maintain state across turns
- **Session Reuse**: Reuse live sessions within the same run state when appropriate
- **Garbage Collection**: Clean up orphaned sessions when run state is destroyed

Source: [packages/agents-core/src/sandbox/runtime/manager.ts:1-100]()

## Path Translation

Sandbox environments use path translation to maintain security boundaries between the host file system and the sandbox workspace.

```mermaid
graph LR
    subgraph Host
        A["/workspace/project"]
    end
    subgraph Sandbox
        B["/"]
        C["/workspace/project"]
    end
    A -.->|Manifest Root| C
    B -.->|Workspace Root| C
```

**Translation Functions**

| Function | Purpose |
|----------|---------|
| `translateRootManifestCommandInput()` | Translates absolute paths in commands to workspace-relative paths |
| `translateManifestRootCommandOutput()` | Translates output paths back to host paths |
| `translateWorkspaceRootCommandOutput()` | Handles path prefix replacement in command output |

The translation system uses regex patterns to replace paths while preserving command structure:

```typescript
function translateRootManifestCommandInput(
  command: string,
  workspaceRootPath: string,
): string {
  // Match an absolute path that starts the command or follows whitespace,
  // a quote, `=`, `<`, or `>`, and prefix it with the workspace root.
  return command.replace(
    /(^|[\s"'=<>])\/([^\s"'|&;<>(){}]*)/g,
    (_match, prefix: string, pathSuffix: string) =>
      `${prefix}${workspaceRootPath}/${pathSuffix}`,
  );
}
```

Source: [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts:20-35]()

## Lifecycle

### Session Initialization

1. Agent receives task with sandbox configuration
2. RuntimeManager checks for preserved sessions
3. If no preserved session, create new sandbox instance
4. Apply manifest and capabilities to sandbox
5. Initialize runtime manifest with workspace bindings

### Command Execution Flow

```mermaid
graph TD
    A[Agent Tool Call] --> B[SandboxClient]
    B --> C{Path Translation}
    C --> D[Translate Input Paths]
    D --> E[Execute in Sandbox]
    E --> F[Capture Output]
    F --> G[Translate Output Paths]
    G --> H[Return to Agent]
```

### Session Preservation

Sessions can be preserved across agent turns when:

- `reuseLiveSession` is not explicitly set to `false`
- The session's `backendId` matches the active client
- The run state is still active

```typescript
export function livePreservedOwnedSession(args: {
  runState: RunState<...> | undefined;
  client: SandboxClient;
  agentKey: string;
  serializedEntry: SerializedSandboxSessionEntry | undefined;
}): LivePreservedOwnedSessionEntry | undefined {
  if (!args.serializedEntry?.preservedOwnedSession || !args.runState) {
    return undefined;
  }
  // ... validation and return
}
```

Source: [packages/agents-core/src/sandbox/runtime/livePreservedSessions.ts:50-75]()

## Security Model

### Defense Layers

```mermaid
graph TD
    A[Host System] --> B[Manifest Permissions]
    B --> C[Capability Checks]
    C --> D[Path Translation]
    D --> E[Sandbox Isolation]
    E --> F[User Permission]
    F --> G{Allowed?}
    G -->|No| H[Access Denied]
    G -->|Yes| I[Execute]
```

### Permission Hierarchy

| Layer | Check | Failure Action |
|-------|-------|----------------|
| Manifest | Path in allowed list | Reject before execution |
| Capability | Operation permitted | Reject tool invocation |
| Path Translation | Valid path transformation | Sanitize or reject |
| User Permission | `runAs` identity | Fall back to default user |

## Configuration Example

```typescript
import { SandboxAgent, Manifest, Capability } from '@openai/agents-core';

const manifest = new Manifest({
  roots: ['/workspace'],
  permissions: [
    { path: '/workspace/**', access: 'rw' },
    { path: '/tmp/**', access: 'rw' },
  ],
});

const agent = new SandboxAgent({
  name: 'code-executor',
  instructions: 'You can execute code and manage files in /workspace',
  capabilities: [
    Capability.fileSystem(),
    Capability.network({ http: true, webSocket: false }),
    Capability.process(),
  ],
  // The configuration table above names this option `defaultManifest`.
  defaultManifest: manifest,
  baseInstructions: 'Always verify paths are within allowed directories.',
});
```

Source: [docs/src/content/docs/guides/sandbox-agents.mdx:1-100]()

## Error Handling

Sandbox operations can fail for several reasons:

| Error Type | Cause | Recovery |
|------------|-------|----------|
| `ManifestNotFoundError` | Manifest file missing | Provide default manifest |
| `CapabilityDeniedError` | Operation not permitted | Add required capability |
| `PathViolationError` | Access outside manifest | Translate or reject path |
| `SessionExpiredError` | Preserved session invalid | Create new session |
| `RuntimeConnectionError` | Cannot reach sandbox | Retry with backoff |

Error handlers should check `RunErrorKind` and provide appropriate fallbacks or user-facing error messages.

Source: [packages/agents-core/src/runner/errorHandlers.ts:1-50]()

## Best Practices

### Session Management

- **Preserve selectively**: Only preserve sessions when state needs to persist
- **Clean up on completion**: Ensure sessions are released when no longer needed
- **Monitor resource usage**: Track session creation to prevent leaks

### Path Handling

- **Always validate**: Check paths against manifest before operations
- **Use translation utilities**: Never manually construct sandbox paths
- **Log path operations**: Track path translations for debugging

### Capability Assignment

- **Principle of least privilege**: Grant only required capabilities
- **Separate by trust level**: Use different agents with different capabilities
- **Audit capability usage**: Log when sensitive capabilities are invoked

---

## Sandbox Providers and Extensions

### Related Pages

Related topics: [Sandbox Agents Architecture](#sandbox-architecture)

Relevant Source Files

The following source files were used to generate this page:

- [packages/agents-extensions/src/sandbox/cloudflare/sandbox.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/src/sandbox/cloudflare/sandbox.ts)
- [packages/agents-extensions/src/sandbox/e2b/sandbox.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/src/sandbox/e2b/sandbox.ts)
- [packages/agents-extensions/src/sandbox/daytona/sandbox.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-extensions/src/sandbox/daytona/sandbox.ts)
- [packages/agents-core/src/sandbox/sandboxes/docker.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/sandboxes/docker.ts)
- [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/sandbox/sandboxes/unixLocal.ts)

# Sandbox Providers and Extensions

The OpenAI Agents SDK provides a sandbox architecture that lets agents execute code and commands in isolated environments. Sandboxes make it possible to run untrusted or potentially harmful code generated by agents without exposing the host system.

## Overview

The sandbox system consists of two primary layers:

1. **Core Sandbox Infrastructure** (`agents-core`): Built-in sandbox implementations including Docker and Unix local execution
2. **Extension Providers** (`agents-extensions`): Third-party sandbox provider integrations including Cloudflare, E2B, and Daytona

Sandboxes hide the details of secure code execution and give agents a virtual workspace where file operations, command execution, and environment management occur in isolation from the host system.

## Architecture

```mermaid
graph TD
    A[Agent] --> B[SandboxAgent]
    B --> C[Sandbox Provider Interface]
    C --> D[Core Providers]
    C --> E[Extension Providers]
    D --> D1[Docker Sandbox]
    D --> D2[UnixLocal Sandbox]
    E --> E1[Cloudflare Sandbox]
    E --> E2[E2B Sandbox]
    E --> E3[Daytona Sandbox]
    F[Manifest] --> B
    G[Capabilities] --> B
    H[RunAs User] --> B
```

## Core Components

### SandboxAgent Class

The `SandboxAgent` class extends the base `Agent` class and provides sandbox-specific configuration:

| Property | Type | Description |
|----------|------|-------------|
| `defaultManifest` | `Manifest` | Default manifest for workspace structure |
| `baseInstructions` | `SandboxBaseInstructions` | Base instructions for sandbox initialization |
| `capabilities` | `Capability[]` | List of capabilities enabled for this sandbox |
| `runAs` | `string \| SandboxUser` | User or role to run sandbox commands as |
| `runtimeManifest` | `Manifest` | Runtime manifest tracking workspace state |

Source: [packages/agents-core/src/sandbox/agent.ts:1-50]()

**Constructor Validation:**

```typescript
if (
  config.baseInstructions !== undefined &&
  typeof config.baseInstructions !== 'string' &&
  typeof config.baseInstructions !== 'function'
) {
  throw new TypeError(
    'SandboxAgent baseInstructions must be a string or function.',
  );
}
```

Source: [packages/agents-core/src/sandbox/agent.ts:20-28]()

### Sandbox Agent Options

| Option | Type | Required | Description |
|--------|------|----------|-------------|
| `name` | `string` | Yes | Name identifier for the sandbox agent |
| `instructions` | `string` | No | System prompt/instructions for the agent |
| `defaultManifest` | `Manifest` | No | Initial workspace manifest |
| `baseInstructions` | `string \| function` | No | Base instructions for sandbox setup |
| `capabilities` | `Capability[]` | No | Enabled capabilities (defaults to `Capabilities.default()`) |
| `runAs` | `string \| SandboxUser` | No | User identity for sandbox execution |

## Path Translation System

Sandbox environments maintain an isolated filesystem. The SDK uses a **path translation system** to map between the host machine's paths and the sandbox's internal workspace paths.
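
The idea of mapping host paths into a workspace can be sketched with plain prefix replacement. This is a simplified illustration and not the SDK's regex-based implementation: `toSandboxPath` and its parameters are hypothetical names, and the sketch assumes POSIX-style paths that are already normalized (no `..` segments).

```typescript
// Hypothetical helper: map a host path under `hostRoot` to the matching path
// under `sandboxRoot`. Returns undefined for paths outside the mapped root.
// Assumes normalized POSIX paths; `..` segments are not resolved here.
function toSandboxPath(
  hostPath: string,
  hostRoot: string,
  sandboxRoot: string,
): string | undefined {
  const root = hostRoot.replace(/\/+$/, ''); // drop trailing slashes
  const target = sandboxRoot.replace(/\/+$/, '');
  if (hostPath === root) {
    return sandboxRoot;
  }
  // Require a path separator after the root so that "/a/project" does not
  // match "/a/projectile".
  if (!hostPath.startsWith(root + '/')) {
    return undefined;
  }
  return target + hostPath.slice(root.length);
}

const mapped = toSandboxPath(
  '/home/me/project/src/a.ts',
  '/home/me/project',
  '/workspace',
);
const outside = toSandboxPath('/home/me/projectile/x', '/home/me/project', '/workspace');
```

Here `mapped` is `/workspace/src/a.ts` and `outside` is `undefined`. Returning `undefined` rather than passing the path through keeps the caller responsible for rejecting anything outside the mapped root.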
### Translation Strategy

```mermaid
graph LR
    A[Host Path] -->|translateRootManifestCommandInput| B[Sandbox Path]
    B -->|execute| C[Command in Sandbox]
    C -->|translateManifestRootCommandOutput| D[Host-Compatible Output]
```

### Translation Functions

The `unixLocal.ts` module provides path translation utilities:

**Root Manifest Command Input Translation:**

```typescript
function translateRootManifestCommandInput(
  command: string,
  workspaceRootPath: string,
): string {
  // Prefix every absolute path in the command with the workspace root.
  return command.replace(
    /(^|[\s"'=<>])\/([^\s"'|&;<>(){}]*)/g,
    (_match, prefix: string, pathSuffix: string) =>
      `${prefix}${workspaceRootPath}/${pathSuffix}`,
  );
}
```

Source: [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts:30-40]()

**Manifest Root Command Input Translation:**

```typescript
function translateManifestRootCommandInput(
  command: string,
  manifestRoot: string,
  workspaceRootPath: string,
): string {
  const escapedManifestRoot = escapeRegExp(manifestRoot);
  const pathPrefix = String.raw`(^|[\s"'=<>])`;
  // The lookahead ensures the manifest root is matched as a whole path segment.
  const pathSuffix = String.raw`(?=$|[\/\s"'|&;<>(){}])`;
  return command.replace(
    new RegExp(`${pathPrefix}${escapedManifestRoot}${pathSuffix}`, 'g'),
    (_match, prefix: string) => `${prefix}${workspaceRootPath}`,
  );
}
```

Source: [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts:42-55]()

### Output Path Translation

Both root and manifest-level output translations use a shared helper:

```typescript
function translateWorkspaceRootCommandOutput(
  output: string,
  manifestRoot: string,
  workspaceRootPath: string,
): string {
  // Core translation logic shared between input and output
}
```

Source: [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts:67-75]()

## Extension Providers

The `agents-extensions` package provides integrations with popular sandbox providers. Each provider implements a consistent interface while leveraging provider-specific features.
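
What "a consistent interface" buys can be shown with a small sketch. Everything here is hypothetical: `SandboxProvider`, `ExecResult`, and `InMemoryProvider` are illustrative names, not types exported by the SDK, and the in-memory provider understands only a toy `cat <path>` command.

```typescript
// Hypothetical shapes for illustration; NOT the SDK's actual provider types.
interface ExecResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

interface SandboxProvider {
  readonly id: string;
  exec(command: string): Promise<ExecResult>;
  writeFile(path: string, content: string): Promise<void>;
}

// In-memory stand-in that exists only to demonstrate the contract.
class InMemoryProvider implements SandboxProvider {
  readonly id = 'in-memory';
  private files = new Map<string, string>();

  async writeFile(path: string, content: string): Promise<void> {
    this.files.set(path, content);
  }

  async exec(command: string): Promise<ExecResult> {
    // Only `cat <path>` is understood by this toy implementation.
    if (command.indexOf('cat ') !== 0) {
      return { exitCode: 127, stdout: '', stderr: `unsupported: ${command}` };
    }
    const path = command.slice('cat '.length);
    const content = this.files.get(path);
    if (content === undefined) {
      return { exitCode: 1, stdout: '', stderr: `no such file: ${path}` };
    }
    return { exitCode: 0, stdout: content, stderr: '' };
  }
}

// Code written against the interface does not care which provider it receives.
async function roundTrip(provider: SandboxProvider): Promise<string> {
  await provider.writeFile('/workspace/hello.txt', 'hi');
  const result = await provider.exec('cat /workspace/hello.txt');
  return result.stdout;
}
```

Because `roundTrip` depends only on the interface, swapping the in-memory stand-in for a remote provider would not change the calling code, which is the property the provider layer is meant to preserve.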
### Provider Comparison

| Provider | Package | Use Case | Key Features |
|----------|---------|----------|--------------|
| **Cloudflare** | `@openai/agents-extensions` | Edge computing, global distribution | Low latency execution, Workers integration |
| **E2B** | `@openai/agents-extensions` | Code interpretation, AI debugging | Secure VM isolation, filesystem access |
| **Daytona** | `@openai/agents-extensions` | Development environments | Full IDE capabilities, workspace management |

### Daytona Sandbox Implementation

Daytona provides a sandbox with advanced file operation capabilities.

#### File Write Operations

The Daytona sandbox writes files through a shell script that refuses to write outside the workspace:

```bash
resolved_root=$(realpath -m -- "$root")
parent=$(dirname -- "$path")
base=$(basename -- "$path")
resolved_parent=$(realpath -m -- "$parent")
case "$resolved_parent" in
  "$resolved_root"|"$resolved_root"/*) ;;
  *) printf "workspace escape: %s\\n" "$resolved_parent" >&2; exit 111 ;;
esac
```

Source: [packages/agents-extensions/src/sandbox/daytona/sandbox.ts:50-60]()

#### Security Measures

| Check | Exit Code | Description |
|-------|-----------|-------------|
| Workspace Escape | 111 | Prevents writes outside the allowed workspace |
| Directory Target | 112 | Prevents overwriting a directory with a file |

#### Write Operation Flow

```mermaid
graph TD
    A[Write Request] --> B{Validate Path}
    B -->|Escape Detected| C[Exit 111]
    B -->|Is Directory| D[Exit 112]
    B -->|Valid| E[Create Temp File]
    E --> F[base64 Decode Content]
    F --> G[chmod 644]
    G --> H[Atomic Move to Target]
    H --> I[Cleanup Trap]
```

#### Temporary File Management

```bash
tmp=$(mktemp "$resolved_parent/.openai-agents-write.XXXXXX")
cleanup() { rm -f -- "$tmp"; }
trap cleanup EXIT HUP INT TERM
base64 -d > "$tmp" <<'OPENAI_AGENTS_CONTENT'
${encoded}
OPENAI_AGENTS_CONTENT
chmod 644 "$tmp"
mv -f -- "$tmp" "$target"
trap - EXIT
```

Source: [packages/agents-extensions/src/sandbox/daytona/sandbox.ts:55-70]()

### Environment Variables

Sandbox environments receive a standardized set of environment variables:

| Variable | Description | Example |
|----------|-------------|---------|
| `HOME` | User home directory | `/home/user` |
| `SHELL` | Default shell path | `/bin/bash` |
| `TMPDIR` | Temporary directory | `/tmp` |
| `PWD` | Current working directory | `/workspace` |

Source: [packages/agents-core/src/sandbox/sandboxes/unixLocal.ts:10-20]()

## Docker Sandbox

The Docker sandbox provider runs commands inside a container, isolated from the host:

```mermaid
graph TD
    A[Sandbox Request] --> B[Create Container]
    B --> C[Mount Workspace Volume]
    C --> D[Execute Commands]
    D --> E[Translate Paths]
    E --> F[Return Results]
    F --> G[Cleanup Container]
```

### Docker Configuration

| Option | Description |
|--------|-------------|
| `image` | Docker image to use |
| `volumes` | Volume mounts for persistent storage |
| `network` | Network configuration |
| `memory` | Memory limits |
| `cpus` | CPU allocation |

## Manifest System

The Manifest system tracks workspace structure and enables consistent file operations across sandbox providers.

### Manifest Structure

```typescript
interface Manifest {
  root: string;
  files: FileEntry[];
  directories: DirectoryEntry[];
}
```

### Clone Behavior

When a `SandboxAgent` is constructed, the supplied manifest is cloned so that later changes to the caller's object do not mutate the agent's copy:

```typescript
this.defaultManifest = config.defaultManifest
  ? cloneManifest(config.defaultManifest)
  : undefined;
```

Source: [packages/agents-core/src/sandbox/agent.ts:29-32]()

## Usage Patterns

### Basic Sandbox Agent Creation

```typescript
import { SandboxAgent } from '@openai/agents-core';
// NOTE: `Capability` must also be imported; its export location is not
// shown in the source.

const sandboxAgent = SandboxAgent.create({
  name: 'code-executor',
  instructions: 'Execute code and return results',
  baseInstructions: 'Initialize workspace at /workspace',
  capabilities: [Capability.fileWrite, Capability.commandExecution],
  runAs: 'sandbox-user',
});
```

### Extension Provider Usage

```typescript
import { createCloudflareSandbox } from '@openai/agents-extensions';
import { SandboxAgent } from '@openai/agents-core';
// NOTE: `Capability` must also be imported; its export location is not
// shown in the source.

const cloudflareSandbox = createCloudflareSandbox({
  name: 'edge-executor',
  instructions: 'Execute at edge locations',
});

const agent = SandboxAgent.create({
  ...cloudflareSandbox,
  capabilities: [Capability.fileWrite],
});
```

## Best Practices

### Security Considerations

1. **Path Validation**: Always validate paths before execution to prevent workspace escapes
2. **Resource Limits**: Configure memory and CPU limits for sandboxed operations
3. **Cleanup**: Ensure proper trap handlers clean up temporary resources
4. **User Isolation**: Run sandbox operations with minimal privileges

### Performance Optimization

1. **Connection Pooling**: Reuse sandbox instances when possible
2. **Manifest Caching**: Cache manifest states to reduce initialization overhead
3. **Path Translation Caching**: Memoize frequently used path translations

## Error Handling

Sandbox operations can fail in provider-specific ways:

| Error Type | Provider | Common Causes |
|------------|----------|---------------|
| Workspace Escape | Daytona | Path traversal attempt |
| Resource Exhaustion | Docker/E2B | Memory/CPU limits exceeded |
| Permission Denied | All | Insufficient sandbox user privileges |
| Container Failure | Docker | Image not found, network issues |

## Further Reading

- [Agent Lifecycle Events](packages/agents-core/src/lifecycle.ts)
- [Tool Configuration](packages/agents-core/src/tool.ts)
- [Extension Packages](packages/agents-extensions/README.md)
- [Agent Patterns Examples](examples/agent-patterns/README.md)

---

## Voice Agents and Realtime Communication

### Related Pages

Related topics: [Creating and Running Agents](#agents)

Related Source Files

The following source files were used to generate this page:

- [packages/agents-realtime/src/realtimeAgent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/realtimeAgent.ts)
- [packages/agents-realtime/src/realtimeSession.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/realtimeSession.ts)
- [packages/agents-realtime/src/openaiRealtimeBase.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/openaiRealtimeBase.ts)
- [packages/agents-realtime/src/openaiRealtimeWebsocket.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/openaiRealtimeWebsocket.ts)
- [packages/agents-realtime/src/realtimeSessionEvents.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/realtimeSessionEvents.ts)
- [packages/agents-realtime/src/clientMessages.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-realtime/src/clientMessages.ts)
- [packages/agents-core/src/agent.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/agent.ts)
- [packages/agents-core/src/handoff.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/handoff.ts)
- [packages/agents-core/src/lifecycle.ts](https://github.com/openai/openai-agents-js/blob/main/packages/agents-core/src/lifecycle.ts)

# Voice Agents and Realtime Communication

## Overview

The OpenAI Agents SDK provides a framework for building voice agents with realtime communication capabilities. Developers can use it to create interactive voice applications on top of OpenAI's Realtime API, which offers low-latency, bidirectional audio communication.

The voice agent system is built on two primary abstractions:

1. **RealtimeAgent** - A specialized agent class designed for voice interactions
2. **RealtimeSession** - A session manager that handles the connection lifecycle, event routing, and audio processing

Source: [packages/agents-realtime/src/realtimeAgent.ts:50-60]()

## Architecture

### High-Level Component Architecture

```mermaid
graph TD
    A[Client Application] --> B[RealtimeSession]
    B --> C[Transport Layer]
    C --> D[OpenAI Realtime API]
    B --> E[RealtimeAgent]
    E --> F[System Prompt]
    E --> G[Handoffs]
    B --> H[Event System]
    H --> I[audio_start]
    H --> J[audio_stopped]
    H --> K[audio_interrupted]
    H --> L[guardrail_tripped]
    C --> M[WebSocket Transport]
    C --> N[WebRTC Transport]
```

### Session Configuration Flow

```mermaid
sequenceDiagram
    participant App as Client App
    participant Session as RealtimeSession
    participant Transport as Transport Layer
    participant API as OpenAI Realtime API
    App->>Session: create session with config
    Session->>Transport: initialize transport
    Transport->>API: establish connection
    API-->>Transport: connection established
    Transport-->>Session: emit ready
    Session->>Session: updateSessionConfig()
    Session->>App: emit session_started
```

Source: [packages/agents-realtime/src/realtimeSession.ts:100-150]()

## Core Components

### RealtimeAgent

The `RealtimeAgent` class extends the base `Agent` class with voice-specific configuration:

```typescript
export class RealtimeAgent extends Agent<
  RealtimeContextData,
  TextOutput
> {
  readonly voice?: string;
}
```

**Configuration Options:**

| Option | Type | Description |
|--------|------|-------------|
| `name` | `string` | The name of your realtime agent |
| `instructions` | `string` | System prompt for the agent |
| `handoffs` | `RealtimeAgent[] \| Handoff[]` | Agents available for handoff |
| `voice` | `string` | Voice identifier for audio output |

**Unsupported Configuration Options:**

Due to the nature of realtime sessions, the following `Agent` configuration options are not supported:

- `model` - All RealtimeAgents use the same model within a session
- `modelSettings` - Managed at the session level
- `outputType` - Structured outputs are not supported
- `toolUseBehavior` - Managed at the session level

Source: [packages/agents-realtime/src/realtimeAgent.ts:1-55]()

### RealtimeSession

The `RealtimeSession` manages the complete lifecycle of a voice agent session:

```typescript
// Type arguments are elided here.
export class RealtimeSession extends EventEmitter<...>
```

**Transport Selection:**

The session automatically selects a transport based on configuration:

| Transport Type | Condition | Use Case |
|----------------|-----------|----------|
| `OpenAIRealtimeWebRTC` | `transport === 'webrtc'`, or `transport` is unset and WebRTC is supported | Browser-based applications |
| `OpenAIRealtimeWebSocket` | `transport === 'websocket'`, or `transport` is unset and WebRTC is unavailable | Server-side, non-browser |

Source: [packages/agents-realtime/src/realtimeSession.ts:200-230]()

## Transport Layer

### WebSocket Transport

The WebSocket transport provides reliable, server-to-server connectivity:

```typescript
this.#transport = new OpenAIRealtimeWebSocket();
```

Key methods:

- `sendEvent(event)` - Send raw events to the API
- `requestResponse(response?)` - Request a model response
- `updateSessionConfig(config)` - Update session configuration

Source: [packages/agents-realtime/src/openaiRealtimeWebsocket.ts:50-80]()

### Turn Detection Configuration

Turn detection can be configured with multiple parameters:

```typescript
interface RealtimeTurnDetectionConfig {
  type?: string;
  createResponse?: boolean;
  eagerness?: 'low' | 'medium' | 'high';
  interruptResponse?: boolean;
  prefixPaddingMs?: number;
  silenceDurationMs?: number;
  threshold?: number;
  idleTimeoutMs?: number;
  modelVersion?: string;
}
```

**Config Normalization:**

The SDK automatically converts camelCase to snake_case for API compatibility:

| camelCase | snake_case |
|-----------|------------|
| `createResponse` | `create_response` |
| `interruptResponse` | `interrupt_response` |
| `prefixPaddingMs` | `prefix_padding_ms` |
| `silenceDurationMs` | `silence_duration_ms` |
| `idleTimeoutMs` | `idle_timeout_ms` |
| `modelVersion` | `model_version` |

Source: [packages/agents-realtime/src/openaiRealtimeBase.ts:50-100]()

## Session Configuration

### Configuration Merging Strategy

The session configuration is built from multiple sources with the following priority:

1. Session method parameters (highest priority)
2. RealtimeSession constructor options
3. Default values (lowest priority)

```typescript
async #getSessionConfig(
  additionalConfig: Partial<RealtimeSessionConfig> = {},
): Promise<Partial<RealtimeSessionConfig>> {
  const overridesConfig = additionalConfig ?? {};
  const optionsConfig = this.options.config ?? {};
  // Merge logic applies priority
}
```

### Tracing Configuration

Tracing can be explicitly controlled:

```typescript
const tracingConfig: RealtimeTracingConfig | null =
  this.options.tracingDisabled
    ? null
    : this.options.workflowName
      ? { workflow_name: this.options.workflowName }
      : 'auto';
```

**Tracing Options:**

| Option | Behavior |
|--------|----------|
| `tracingDisabled: true` | Explicitly disable tracing |
| `workflowName: string` | Enable tracing with workflow name |
| `'auto'` | Automatic tracing configuration |
| `groupId` | Group related traces (requires workflowName) |
| `traceMetadata` | Custom metadata for traces (requires workflowName) |

Source: [packages/agents-realtime/src/realtimeSession.ts:300-350]()

## Events System

### Session Events

The `RealtimeSession` emits the following events:

| Event | Payload | Description |
|-------|---------|-------------|
| `audio_start` | `context, agent` | Agent starts generating audio |
| `audio_stopped` | `context, agent` | Agent stops generating audio |
| `audio` | `TransportLayerAudio` | New audio data available |
| `audio_interrupted` | `context, agent` | Audio generation was interrupted |
| `guardrail_tripped` | `context, agent, error, details` | Output guardrail triggered |
| `mcp_tool_call_completed` | `context, agent, toolCall` | MCP tool finished execution |
| `tool_approval_requested` | `context, agent, approvalItem` | Human-in-the-loop approval needed |
| `transport_event` | `event` | Raw transport event |
| `error` | `error` | Error occurred |
| `usage_update` | `usage` | Token usage update |
| `mcp_tools_changed` | - | Available MCP tools updated |

### Lifecycle Events

```mermaid
stateDiagram-v2
    [*] --> Initializing
    Initializing --> Connecting
    Connecting --> Connected
    Connected --> AudioActive
    AudioActive --> AudioStopped
    AudioActive --> Interrupted
    Interrupted --> Connecting
    AudioStopped --> Connected
    Connected --> [*]
```

Source: [packages/agents-realtime/src/realtimeSessionEvents.ts:1-60]()

## Handoffs in Voice Agents

Voice agents support handoffs for transitions between agents:

```typescript
export type HandoffConfig<
  TInputType extends ToolInputParameters,
  TContext = UnknownContext,
> = {
  toolNameOverride?: string;
  toolDescriptionOverride?: string;
  onHandoff?: OnHandoffCallback;
};
```

**Voice-Specific Handoff Behavior:**

- Voice changes during handoff will fail if another agent already spoke during the session
- The `voice` property can be set per agent but cannot be changed after the first agent speaks

Source: [packages/agents-core/src/handoff.ts:50-80]()

## Guardrails

### Output Guardrails

Output guardrails can be configured at the session level:

```typescript
this.#outputGuardrails = (options.outputGuardrails ?? []).map(
  defineRealtimeOutputGuardrail,
);
this.#outputGuardrailSettings = getRealtimeGuardrailSettings(
  options.outputGuardrailSettings ?? {},
);
```

When a guardrail is triggered, the `guardrail_tripped` event is emitted:

```typescript
this.emit('guardrail_tripped', context, agent, error, { itemId });
```

## MCP Tool Integration

### Tool Call Handling

MCP (Model Context Protocol) tools are automatically integrated:

```typescript
this.#transport.on('mcp_tool_call_completed', (toolCall) => {
  this.emit('mcp_tool_call_completed', context, currentAgent, toolCall);
  if (this.#automaticallyTriggerResponseForMcpToolCalls) {
    if (this.#transport.requestResponse) {
      this.#transport.requestResponse();
    } else {
      this.#transport.sendEvent({ type: 'response.create' });
    }
  }
});
```

**Configuration Option:**

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `automaticallyTriggerResponseForMcpToolCalls` | `boolean` | `true` | Auto-trigger response after tool completion |

## Usage Examples

### Basic Voice Agent Setup

```typescript
import { RealtimeAgent, RealtimeSession } from '@openai/agents-realtime';

const agent = new RealtimeAgent({
  name: 'my-agent',
  instructions:
    'You are a helpful assistant that can answer questions and help with tasks.',
});

const session = new RealtimeSession(agent);

session.on('audio', (event) => {
  // Handle audio data
  console.log('Audio received:', event);
});

session.on('audio_interrupted', (context, agent) => {
  // Stop audio playback
  console.log('Audio interrupted');
});

// Open the connection; replace the placeholder with a client API key.
await session.connect({ apiKey: '<client-api-key>' });
```

### Next.js Integration

The SDK provides examples for Next.js integration at `/examples/realtime-next`:

```bash
pnpm examples:realtime-next
```

Available endpoints:

- `/` - WebRTC voice demo
- `/websocket` - WebSocket voice demo
- `/raw-client` - Low-level WebRTC example using `OpenAIRealtimeWebRTC`

Source: [examples/realtime-next/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/realtime-next/README.md)

### Vite Demo Application

A standalone demo is available at `/examples/realtime-demo`:

```bash
pnpm -F realtime-demo generate-token
pnpm examples:realtime-demo
```

Source: [examples/realtime-demo/README.md](https://github.com/openai/openai-agents-js/blob/main/examples/realtime-demo/README.md)

## Type Definitions

### Context Data Structure

```typescript
interface RealtimeContextData {
  history: Message[];
  usage: UsageData;
  // Additional user-defined context
}
```

### Session Configuration

```typescript
interface RealtimeSessionConfig {
  model?: string;
  modalities?: ('text' | 'audio')[];
  instructions?: string;
  voice?: string;
  audio?: {
    input?: {
      format?: string;
      noiseReduction?: boolean;
      transcription?: TranscriptionConfig;
      turnDetection?: RealtimeTurnDetectionConfig;
    };
    output?: {
      format?: string;
      voice?: string;
    };
  };
  turnDetection?: RealtimeTurnDetectionConfig;
}
```

## Best Practices

### Connection Management

1. **Always handle errors**: Subscribe to the `error` event to catch and handle connection issues
2. **Implement reconnection logic**: The session should be recreated if the connection is lost
3. **Monitor audio state**: Track `audio_start` and `audio_stopped` events for proper playback management

### Voice Configuration

1. **Set voice early**: Configure the voice in the agent or session configuration before the first agent speaks
2. **Handle handoff failures**: Implement fallback logic when voice changes fail during handoffs
3. **Test with different voices**: Use the `voice` property to find the best voice for your use case

### Performance Considerations

1. **Audio buffer management**: Handle `audio` events efficiently to prevent latency
2. **Event cleanup**: Remove event listeners when the session is terminated
3. **Use appropriate transport**: Choose WebRTC for browser applications, WebSocket for server-side

Source: [packages/agents-realtime/src/realtimeSession.ts:150-200]()

## See Also

- [Basic Examples](../examples/basic) - Simple script demonstrations
- [AI SDK UI Integration](../examples/ai-sdk-ui) - Streaming text responses
- [Agent Lifecycle](../packages/agents-core/src/lifecycle.ts) - Understanding agent lifecycle events

---

## Doramagic Pitfall Log

Project: openai/openai-agents-js

Summary: 7 potential pitfalls found, 0 of them high/blocking. Highest priority: identity pitfall, where the repository name and the install name differ.

## 1. Identity pitfall · Repository name and install name differ

- Severity: medium
- Evidence strength: runtime_trace
- Finding: The repository name `openai-agents-js` does not exactly match the install entry point `@openai/agents`.
- Impact on users: Users who search for the package by repository name, or for the repository by package name, can easily end up at the wrong entry point.
- Suggested check: Confirm the package-name mapping and the official README on npm/PyPI/GitHub.
- Reproduction command: `npm install @openai/agents`
- Mitigation: The page must show both the repo name and the actual install entry point so that users do not search for the wrong package.
- Evidence: identity.distribution | github_repo:993521808 | https://github.com/openai/openai-agents-js | repo=openai-agents-js; install=@openai/agents

## 2. Capability pitfall · Capability assessment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- Impact on users: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Turn the assumptions into a downstream verification checklist.
- Mitigation: Assumptions must be converted into verification items; until verified, they must not be stated as fact.
- Evidence: capability.assumptions | github_repo:993521808 | https://github.com/openai/openai-agents-js | README/documentation is current enough for a first validation pass.

## 3. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: `last_activity_observed` was not recorded.
- Impact on users: New, abandoned, and active projects get lumped together, which lowers trust in the recommendation.
- Suggested check: Add signals from recent GitHub commits, releases, and issue/PR responses.
- Mitigation: When maintenance activity is unknown, the recommendation must not be marked as high-trust.
- Evidence: evidence.maintainer_signals | github_repo:993521808 | https://github.com/openai/openai-agents-js | last_activity_observed missing

## 4. Security/permissions pitfall · Downstream validation found risk items

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: Downstream has already requested a review, so this must not be downplayed on the page.
- Suggested check: Send to the security/permissions governance review queue.
- Mitigation: While a downstream risk exists, the review/recommendation must stay downgraded.
- Evidence: downstream_validation.risk_items | github_repo:993521808 | https://github.com/openai/openai-agents-js | no_demo; severity=medium

## 5. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Record the risk in the boundary card and confirm whether manual review is needed.
- Mitigation: Scoring risks must appear in the boundary card, not only as an internal score.
- Evidence: risks.scoring_risks | github_repo:993521808 | https://github.com/openai/openai-agents-js | no_demo; severity=medium

## 6. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- Impact on users: Users cannot tell whether anyone will maintain the project when they run into problems.
- Suggested check: Sample recent issues/PRs to see whether they go unhandled for long periods.
- Mitigation: When issue/PR responsiveness is unknown, a maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:993521808 | https://github.com/openai/openai-agents-js | issue_or_pr_quality=unknown

## 7. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- Impact on users: Install commands and documentation may lag behind the code, which raises the chance that users hit problems.
- Suggested check: Confirm that the latest release/tag matches the install command in the README.
- Mitigation: When the release cadence is unknown or stale, the install instructions must note possible drift.
- Evidence: evidence.maintainer_signals | github_repo:993521808 | https://github.com/openai/openai-agents-js | release_recency=unknown