Doramagic Project Pack · Human Manual
phoenix
Related topics: System Architecture
Project Overview
Phoenix is an open-source observability platform for AI applications developed by Arize AI. It provides comprehensive tracing, evaluation, and debugging capabilities for LLM-powered applications, agents, and vector retrieval systems.
What is Phoenix?
Phoenix is a self-hostable observability platform designed to help developers:
- Trace LLM applications: Capture and analyze spans, prompts, completions, and tool executions
- Evaluate AI outputs: Run automated evaluations for hallucinations, relevance, toxicity, and custom metrics
- Debug retrievals: Inspect vector search operations and document retrieval pipelines
- Monitor performance: Track latency, token usage, and cost metrics across AI workflows
Sources: js/packages/phoenix-config/README.md
Architecture Overview
Phoenix follows a multi-layer architecture with Python backend services and TypeScript/JavaScript client libraries.
graph TD
A[AI Application] -->|OTel Traces| B[Phoenix OTEL]
A -->|Direct API| C[Phoenix Client SDK]
B -->|Traces| D[Phoenix Server]
C -->|Data| D
D -->|UI| E[Phoenix Web UI]
F[Phoenix MCP Server] -->|Tools| A
G[Phoenix Evals] -->|Evaluations| D
Core Components
| Component | Technology | Purpose |
|---|---|---|
| Phoenix Server | Python | Backend API and data storage |
| Phoenix OTEL | Python | OpenTelemetry instrumentation |
| Phoenix Client | TypeScript | Type-safe API access |
| Phoenix Config | TypeScript | Environment variable parsing |
| Phoenix MCP | TypeScript | Model Context Protocol server |
| Phoenix Evals | TypeScript | LLM-based evaluation library |
Sources: js/pnpm-workspace.yaml
JavaScript/TypeScript Packages
Phoenix provides a comprehensive TypeScript ecosystem organized as a pnpm workspace.
Sources: js/pnpm-workspace.yaml
Package Structure
js/
├── packages/
│   ├── phoenix-client/   # Core API client
│   ├── phoenix-config/   # Environment variable utilities
│   ├── phoenix-evals/    # Evaluation library
│   └── phoenix-mcp/      # Model Context Protocol server
└── examples/
    └── apps/
        └── cli-agent-starter-kit/   # Example CLI agent
phoenix-config
Shared configuration parsing utilities used across other Phoenix packages. Provides typed helpers for reading Phoenix environment variables.
Sources: js/packages/phoenix-config/README.md
phoenix-evals
A vendor-agnostic TypeScript evaluation library for assessing AI output quality. Supports custom classifiers for hallucination detection, relevance scoring, and binary/multi-class classification tasks.
import { createClassifier } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";
const model = openai("gpt-4o-mini");
const classifier = createClassifier({ model, promptTemplate });
Sources: js/packages/phoenix-evals/README.md
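The classifier pattern above can be illustrated without any Phoenix dependency. The following is a minimal Python sketch, assuming the model is any callable that maps a prompt string to text. The names `make_classifier`, `choices`, `stub_model`, and `TEMPLATE` are hypothetical and are not part of `@arizeai/phoenix-evals`.

```python
from typing import Callable, Dict


def make_classifier(
    model: Callable[[str], str],
    prompt_template: str,
    choices: Dict[str, float],
) -> Callable[..., dict]:
    """Build a classifier that maps a model's text answer to a label and score."""

    def classify(**variables: str) -> dict:
        prompt = prompt_template.format(**variables)
        answer = model(prompt).strip().lower()
        # Answers outside the allowed choices get an explicit "unknown" label.
        if answer not in choices:
            return {"label": "unknown", "score": None}
        return {"label": answer, "score": choices[answer]}

    return classify


# Hypothetical stub standing in for a real LLM call.
def stub_model(prompt: str) -> str:
    return "hallucinated" if "moon" in prompt else "factual"


TEMPLATE = "Context: {context}\nAnswer: {answer}\nIs the answer factual or hallucinated?"
classifier = make_classifier(stub_model, TEMPLATE, {"factual": 1.0, "hallucinated": 0.0})
result = classifier(context="Paris is in France.", answer="Paris is on the moon.")
```

Keeping the model behind a plain callable is what makes the sketch vendor-agnostic: any provider SDK can be wrapped in a function with the same shape.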
phoenix-mcp
A Model Context Protocol server that exposes Phoenix tools for AI agents. Enables agentic workflows to interact with Phoenix data.
Supported Tools:
- Prompts: list-prompts, get-prompt, get-latest-prompt, upsert-prompt
- Projects: Project management operations
- Traces: Query and analyze traces
Sources: js/packages/phoenix-mcp/README.md
Python SDK
Phoenix provides Python instrumentation through the arize-phoenix-otel package.
Sources: app/src/components/project/PythonProjectGuide.tsx
Installation
pip install arize-phoenix-otel
Quick Start
from phoenix.otel import register
# Configure Phoenix tracing
tracer_provider = register(project_name="my-app")
The arize-phoenix-otel package automatically picks up configuration from environment variables, simplifying the setup process for developers.
Sources: app/src/components/project/PythonProjectGuide.tsx
Environment Variables
Phoenix uses standardized environment variables for configuration across all SDKs.
Sources: js/packages/phoenix-config/README.md
| Variable | Constant | Description |
|---|---|---|
| PHOENIX_HOST | ENV_PHOENIX_HOST | Phoenix server URL (e.g., http://localhost:6006) |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | API key for authentication |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint |
| PHOENIX_PORT | ENV_PHOENIX_PORT | HTTP port (integer) |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | gRPC port for OpenTelemetry |
| PHOENIX_PROJECT | ENV_PHOENIX_PROJECT | Default project name |
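As an illustration of how these variables might be consumed, here is a minimal Python sketch that reads them from an environment mapping and converts the two port values to integers. The helper `read_phoenix_env` and the keys of the returned dictionary are hypothetical; the Phoenix SDKs ship their own parsing helpers.

```python
import os
from typing import Mapping, Optional


def read_phoenix_env(env: Mapping[str, str] = os.environ) -> dict:
    """Collect Phoenix settings from an environment mapping."""

    def as_int(name: str) -> Optional[int]:
        raw = env.get(name)
        # Unset or empty variables are reported as None rather than 0.
        return int(raw) if raw not in (None, "") else None

    return {
        "host": env.get("PHOENIX_HOST"),
        "api_key": env.get("PHOENIX_API_KEY"),
        "collector_endpoint": env.get("PHOENIX_COLLECTOR_ENDPOINT"),
        "port": as_int("PHOENIX_PORT"),
        "grpc_port": as_int("PHOENIX_GRPC_PORT"),
        "project": env.get("PHOENIX_PROJECT"),
    }


settings = read_phoenix_env(
    {"PHOENIX_HOST": "http://localhost:6006", "PHOENIX_PORT": "6006"}
)
```

Passing the mapping as a parameter keeps the helper easy to test, since no process-wide environment has to be modified.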
Project Onboarding Flow
Phoenix provides an interactive onboarding system that guides users through setup.
Sources: app/src/pages/project/OnboardingSteps.tsx
Onboarding Steps Component
The OnboardingSteps component accepts the following parameters:
interface OnboardingStepsProps {
  language: ProgrammingLanguage;
  packages: readonly string[];
  implementationCode: string;
  docsHref?: string;
  githubHref?: string;
  generatedApiKey: string | null;
  onApiKeyGenerated: (key: string) => void;
  extraEnvVars?: readonly EnvVar[];
}
Workflow
graph LR
A[Select Language] --> B[Install Packages]
B --> C[Configure Environment]
C --> D{Auth Enabled?}
D -->|Yes| E[Generate API Key]
D -->|No| F[Add Environment Variables]
E --> G[Copy Setup Code]
F --> G
G --> H[Run Application]
H --> I[View Traces in Phoenix]
The onboarding system automatically detects authentication requirements and adjusts the setup flow accordingly. Users can generate API keys directly from the UI when authentication is enabled.
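The branching in this workflow can be sketched in a few lines of Python. This is only an illustration of the step ordering shown in the diagram; `onboarding_steps` is a hypothetical helper and not part of the Phoenix UI code.

```python
from typing import List


def onboarding_steps(auth_enabled: bool) -> List[str]:
    """Return the onboarding steps in order, following the workflow diagram."""
    steps = ["Select Language", "Install Packages", "Configure Environment"]
    # The flow branches on whether the server has authentication enabled.
    steps.append("Generate API Key" if auth_enabled else "Add Environment Variables")
    steps += ["Copy Setup Code", "Run Application", "View Traces in Phoenix"]
    return steps


with_auth = onboarding_steps(auth_enabled=True)
without_auth = onboarding_steps(auth_enabled=False)
```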
Sources: app/src/components/project/PythonProjectGuide.tsx
CLI Agent Starter Kit
Phoenix includes a complete CLI agent example demonstrating production-ready patterns.
Sources: js/examples/apps/cli-agent-starter-kit/README.md
Project Structure
src/
├── cli.ts             # Entry point
├── agent/             # Agent factory
├── tools/             # Tool definitions
│   ├── index.ts       # Tool exports
│   ├── datetime.ts    # Utility tool
│   └── mcp.ts         # Phoenix docs MCP
├── prompts/           # System instructions
└── ui/                # CLI interface
Requirements
- Node.js 22+
- pnpm
- Docker Desktop
- Anthropic API key
Quick Start
pnpm install
cp .env.example .env
# Add ANTHROPIC_API_KEY to .env
pnpm dev
Phoenix UI will be available at http://localhost:6006.
Sources: js/examples/apps/cli-agent-starter-kit/README.md
Integration Ecosystem
Phoenix supports a wide range of LLM providers and frameworks through its integration ecosystem.
Supported Providers
| Provider | Icon | Category |
|---|---|---|
| OpenAI | SVG | LLM |
| Anthropic | SVG | LLM |
| LiteLLM | SVG | Proxy |
| OpenRouter | SVG | Proxy |
| LangGraph | SVG | Framework |
| Moonshot | SVG | LLM |
| xAI | SVG | LLM |
| Ollama | SVG | Local |
Sources: app/src/components/project/IntegrationIcons.tsx
Development Workflow
Setting Up the JavaScript Workspace
# From the /js/ directory
pnpm install
pnpm build
Development Mode
pnpm dev
Building
pnpm build
Debugging MCP Server
pnpm inspect
Sources: js/packages/phoenix-mcp/README.md
Summary
Phoenix is an observability platform that covers both development-time debugging and production monitoring of AI applications. It combines Python and TypeScript SDKs, an OpenTelemetry-native tracing pipeline, and an extensible evaluation framework.
The platform is modular, so developers can adopt only the components they need: basic tracing through OTEL, custom evaluations with the evals library, or agent access to Phoenix data through the MCP server.
Sources: js/packages/phoenix-config/README.md
System Architecture
Related topics: Project Overview, Server API and GraphQL, Frontend Application
Overview
Phoenix is an LLM observability platform designed to help developers trace, evaluate, and debug AI applications. The system architecture follows a client-server model with support for multiple programming languages and observability standards.
Core Components
The Phoenix platform consists of three primary layers:
| Component Layer | Description | Key Technologies |
|---|---|---|
| Frontend Application | React-based web UI for visualization and interaction | React, TypeScript, @arizeai/ui |
| Configuration Library | Shared utilities for environment parsing | TypeScript, npm package (@arizeai/phoenix-config) |
| Backend Server | Phoenix server for data ingestion and serving | Python, FastAPI, OpenTelemetry |
Configuration System
Phoenix uses environment variables for configuration across all client SDKs and the server itself.
Environment Variables Table
| Variable | Constant | Type | Purpose |
|---|---|---|---|
| PHOENIX_HOST | ENV_PHOENIX_HOST | string | Phoenix server host URL (e.g., http://localhost:6006) |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | string | API key for authentication |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON | Custom headers for client requests |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | string | OpenTelemetry collector endpoint URL |
| PHOENIX_PORT | ENV_PHOENIX_PORT | integer | Phoenix HTTP port |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | integer | Phoenix gRPC port for OpenTelemetry |
| PHOENIX_PROJECT | ENV_PHOENIX_PROJECT | string | Default project name for project-scoped operations |
Sources: js/packages/phoenix-config/README.md:1-25
Client Architecture
Multi-Language SDK Support
Phoenix provides language-specific client libraries that interface with the Phoenix server.
graph TD
A[Application Code] --> B[Phoenix OTEL / SDK]
B --> C[Phoenix Server]
C --> D[(Data Storage)]
subgraph Python Ecosystem
B1[arize-phoenix-otel]
B1 --> B
end
subgraph TypeScript Ecosystem
B2[phoenix-otel]
B3[phoenix-client]
B2 --> B
B3 --> B
end
Python Client
The Python integration uses OpenTelemetry for automatic instrumentation.
Installation:
pip install arize-phoenix-otel
The arize-phoenix-otel package automatically picks up configuration from environment variables, enabling seamless integration without explicit setup code in most cases.
Sources: app/src/components/project/PythonProjectGuide.tsx:1-35
TypeScript/Node.js Client
The TypeScript ecosystem provides two main packages:
| Package | Purpose |
|---|---|
| @arizeai/phoenix-otel | OpenTelemetry instrumentation for tracing |
| @arizeai/phoenix-client | Client library for API interactions |
Quick Start Pattern:
import { register } from '@arizeai/phoenix-otel';
// Project setup with automatic OTEL initialization
register({ projectName: 'my-project' });
Sources: app/src/components/project/TypeScriptProjectGuide.tsx:1-25
Authentication Architecture
Phoenix implements API key-based authentication with role-based access control.
Authentication Flow
sequenceDiagram
participant Client
participant PhoenixServer
participant Config
Client->>PhoenixServer: Request with API Key
PhoenixServer->>Config: Check authentication settings
Config-->>PhoenixServer: authenticationEnabled: boolean
alt Authentication Enabled
PhoenixServer->>PhoenixServer: Validate API Key
alt Valid Key
PhoenixServer-->>Client: 200 OK + Data
else Invalid Key
PhoenixServer-->>Client: 401 Unauthorized
end
else Authentication Disabled
PhoenixServer-->>Client: 200 OK + Data
end
API Key Management
The UI provides different API key management interfaces based on user roles:
| Role | API Key Management Location |
|---|---|
| Admin | Settings → General |
| Regular User | Profile Page |
Sources: app/src/components/project/OnboardingSteps.tsx:30-55
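The request handling described in the authentication flow can be sketched as a small decision function. This is a simplified Python illustration, not the server's implementation: `check_request` and `valid_keys` are hypothetical names, and treating a missing key as 401 is an assumption that the diagram does not spell out.

```python
import hmac
from typing import Iterable, Optional


def check_request(
    auth_enabled: bool, api_key: Optional[str], valid_keys: Iterable[str]
) -> int:
    """Return the HTTP status code the authentication flow would produce."""
    if not auth_enabled:
        return 200  # Authentication disabled: every request is served.
    if api_key is None:
        return 401  # Assumption: a missing key is handled like an invalid one.
    # compare_digest avoids leaking key contents through timing differences.
    presented = api_key.encode()
    is_valid = any(hmac.compare_digest(presented, key.encode()) for key in valid_keys)
    return 200 if is_valid else 401


status_open = check_request(False, None, [])
status_valid = check_request(True, "key-1", ["key-1", "key-2"])
status_invalid = check_request(True, "wrong", ["key-1", "key-2"])
```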
Dataset Management
Phoenix clients can interact with datasets through the Python and TypeScript client libraries.
Dataset Operations
| Operation | Description |
|---|---|
| Create | Initialize a new dataset in the project |
| Get | Retrieve dataset by name or version |
| Update | Modify existing dataset metadata |
| List | Enumerate all available datasets |
Python Client Pattern:
from phoenix.client import Client

# Uses environment-based configuration
client = Client()
dataset = client.datasets.get_dataset(
    dataset="my-dataset",
    version_id="optional-version-id",  # optional
)
Sources: app/src/components/experiment/RunExperimentCodeDialog.tsx:1-45
Generative AI Provider Integration
Phoenix integrates with multiple generative AI providers for observability and tracing.
Supported Providers
| Provider | SVG Icon | Integration Type |
|---|---|---|
| OpenAI | ✓ | API Key + OTEL |
| Moonshot | ✓ | API Key |
| LiteLLM | ✓ | Unified API |
| Agno | ✓ | Agent Framework |
| OpenRouter | ✓ | Gateway |
| Anthropic | ✓ | Direct |
The GenerativeProviderIcon.tsx component renders provider-specific SVG icons throughout the UI, while the IntegrationIcons.tsx file contains SVG definitions for all supported integrations.
Sources: app/src/components/generative/GenerativeProviderIcon.tsx:1-60
UI Component Architecture
Component Hierarchy
graph TD
subgraph Project Guides
PG1[PythonProjectGuide]
PG2[TypeScriptProjectGuide]
end
subgraph Core Components
OC[OnboardingSteps]
MD[Markdown Components]
IC[Icons]
end
subgraph Experiment Features
RX[RunExperimentCodeDialog]
end
OC --> PG1
OC --> PG2
IC --> PG1
IC --> PG2
MD --> RX
Markdown Rendering
The streamdownComponents.tsx module provides custom React components for rendering markdown content:
| Component | Purpose |
|---|---|
| li | Task list items with checkbox support |
| blockquote | Styled quote blocks |
| inlineCode | Inline code styling |
| table, thead, tbody, tr, th, td | Table elements |
| img | Styled images |
| hr | Horizontal rules |
Sources: app/src/components/markdown/streamdownComponents.tsx:1-55
Icon System
Phoenix uses a centralized icon system defined in Icons.tsx. Icons follow a consistent design pattern:
export const IconName = () => (
  <svg
    width="24"
    height="24"
    viewBox="0 0 24 24"
    fill="none"
    xmlns="http://www.w3.org/2000/svg"
  >
    {/* SVG path definitions */}
  </svg>
);
Key icon categories include:
- Navigation: ArrowCompareOutline, MoonOutline
- Actions: TemplateOutline
- Status: FireOutline
Sources: app/src/components/core/icon/Icons.tsx:1-120
Documentation Architecture
Phoenix uses Sphinx for API documentation generation with autodoc support.
Documentation Build Process
graph LR
A[Source Modules] -->|sphinx-apidoc| B[RST Files]
B -->|autodoc| C[Docstrings]
C --> D[HTML/Markdown Output]
subgraph Build Tools
B1[Sphinx]
B2[ReadTheDocs]
end
Sphinx-Apidoc Command
sphinx-apidoc -o ./source/output ../path/to/module --separate -M
Key options:
- --separate: Creates a separate .rst file for each module
- -M (--module-first): Puts module documentation before submodule documentation
Sources: api_reference/README.md:1-80
OpenTelemetry Integration
Phoenix uses OpenTelemetry as its standard observability framework.
Collector Endpoint Configuration
graph LR
A[Application] -->|OTLP| B[Phoenix Collector]
B --> C[Phoenix Server]
subgraph Transport Protocols
G[gRPC]
H[HTTP/protobuf]
end
B --- G
B --- H
| Port Type | Purpose |
|---|---|
| HTTP (configurable) | OTLP HTTP receiver |
| gRPC (configurable) | OTLP gRPC receiver |
Sources: js/packages/phoenix-config/README.md:15-16
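To make the two transports concrete, the sketch below derives one endpoint per protocol from a host URL and the two configurable ports. It is a Python illustration under stated assumptions: `otlp_endpoints` is a hypothetical helper, `/v1/traces` is the default OTLP/HTTP traces path from the OpenTelemetry specification, and the port numbers are the example values used elsewhere in this manual.

```python
from urllib.parse import urlsplit


def otlp_endpoints(host: str, http_port: int, grpc_port: int) -> dict:
    """Derive OTLP endpoints for both transports from one host URL."""
    parts = urlsplit(host)
    hostname = parts.hostname or "localhost"
    scheme = parts.scheme or "http"
    return {
        # OTLP over HTTP posts protobuf payloads to the /v1/traces path.
        "http": f"{scheme}://{hostname}:{http_port}/v1/traces",
        # OTLP over gRPC is addressed by host and port only.
        "grpc": f"{hostname}:{grpc_port}",
    }


endpoints = otlp_endpoints("http://localhost:6006", http_port=6006, grpc_port=4317)
```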
Security Considerations
Environment Variable Security
| Variable | Sensitivity | Recommendation |
|---|---|---|
| PHOENIX_API_KEY | High | Store in secure secret manager |
| PHOENIX_CLIENT_HEADERS | Medium | Validate JSON structure |
| PHOENIX_HOST | Low | Ensure HTTPS for production |
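Two of these recommendations lend themselves to simple startup checks. The following Python sketch validates the JSON structure of the headers variable and checks the host scheme; `parse_client_headers` and `is_https` are hypothetical helpers, and the rule that every header value must be a string is an assumption made for the example.

```python
import json
from urllib.parse import urlsplit


def parse_client_headers(raw: str) -> dict:
    """Parse PHOENIX_CLIENT_HEADERS and reject anything but a flat string map."""
    headers = json.loads(raw)
    if not isinstance(headers, dict):
        raise ValueError("PHOENIX_CLIENT_HEADERS must be a JSON object")
    for key, value in headers.items():
        if not isinstance(value, str):
            raise ValueError(f"header {key!r} must have a string value")
    return headers


def is_https(host: str) -> bool:
    """True when the host URL uses HTTPS, as recommended for production."""
    return urlsplit(host).scheme == "https"


headers = parse_client_headers('{"X-Custom": "value"}')
local_is_secure = is_https("http://localhost:6006")
```

Failing fast on a malformed value at startup is usually easier to diagnose than a rejected request later on.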
API Key Scopes
- Personal API Keys: Created per-user, managed on Profile page
- System API Keys: Admin-managed, configured in Settings
Summary
The Phoenix system architecture provides a robust, multi-language observability platform with:
- Flexible Configuration: Environment-based configuration works across all SDKs
- Multi-language Support: Native Python and TypeScript clients
- Standard Observability: OpenTelemetry integration for vendor-neutral tracing
- Role-based Access: Enterprise-ready authentication and authorization
- Extensible Providers: Support for multiple LLM providers through standardized interfaces
Tracing System
Related topics: OpenTelemetry Integration, Database Models and Migrations
Phoenix's Tracing System provides comprehensive observability for LLM applications by capturing, storing, and analyzing execution traces. It leverages OpenTelemetry standards to instrument applications and provides both Python and TypeScript clients for interacting with trace data.
Overview
The tracing system enables developers to:
- Capture detailed execution traces from LLM applications
- Store and query traces with filtering by time, session, and project
- Add annotations for evaluation and human feedback
- Analyze spans within traces for debugging and performance optimization
Sources: js/packages/phoenix-client/README.md:1-30
Architecture
graph TD
A[Application Code] --> B[Phoenix OTEL]
B --> C[Span Exporter]
C --> D[Phoenix Server]
D --> E[(SQLite/PostgreSQL)]
F[Python Client] --> D
G[TypeScript Client] --> D
H[Spans] --> E
I[Trace Annotations] --> E
J[Span Annotations] --> E
K[Session Annotations] --> E
The system consists of three main layers:
- Instrumentation Layer: OpenTelemetry-based tracing via arize-phoenix-otel
- Storage Layer: Database insertion handlers for traces, spans, and annotations
- API Layer: REST endpoints for querying and managing trace data
Sources: js/packages/phoenix-otel/README.md:1-50
Core Components
Spans
Spans represent individual units of work within a trace. Each span captures:
- Timing information: start time, end time, duration
- Input/output data: prompts, responses, tool parameters
- Attributes: metadata like model name, token counts, latency
Sources: src/phoenix/db/insertion/span.py
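The three groups of span data listed above can be pictured as a small record type. This Python sketch is illustrative only: `SpanRecord` and the attribute keys are hypothetical and do not reflect the actual database model in `src/phoenix/db`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


@dataclass
class SpanRecord:
    name: str
    start_time: datetime
    end_time: datetime
    input: str = ""
    output: str = ""
    attributes: Dict[str, Any] = field(default_factory=dict)

    @property
    def latency_ms(self) -> float:
        """Duration derived from the two timestamps, in milliseconds."""
        return (self.end_time - self.start_time).total_seconds() * 1000


start = datetime(2026, 3, 1, tzinfo=timezone.utc)
span = SpanRecord(
    name="llm-call",
    start_time=start,
    end_time=start + timedelta(milliseconds=250),
    input="What is Phoenix?",
    output="An observability platform.",
    attributes={"model_name": "gpt-4o-mini", "token_count": 42},
)
```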
Trace Annotations
Annotations attached at the trace level for evaluation purposes. Used to store:
- Correctness scores
- Custom evaluation results
- Human feedback
Sources: src/phoenix/db/insertion/trace_annotation.py
Span Annotations
Annotations attached to individual spans for fine-grained evaluation:
from phoenix.client import Client
client = Client()
# Add annotation to a span
client.spans.add_span_annotation(
    span_id="span-123",
    annotation_name="correctness",
    annotator_kind="LLM",
    score=0.95,
)
Sources: src/phoenix/db/insertion/span_annotation.py
Session Annotations
Annotations attached at the session level, which groups related traces and spans.
Sources: src/phoenix/db/insertion/session_annotation.py
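Trace, span, and session annotations share the same basic shape and differ mainly in what they attach to. The Python sketch below models that idea with a single record type and a small aggregation helper. `Annotation`, `target_kind`, and `mean_score` are hypothetical names, and the `"HUMAN"` annotator kind is an assumption; only `"LLM"` appears in the example above.

```python
from dataclasses import dataclass
from typing import Iterable, Literal, Optional


@dataclass(frozen=True)
class Annotation:
    target_kind: Literal["trace", "span", "session"]
    target_id: str
    name: str
    annotator_kind: str
    score: Optional[float] = None
    label: Optional[str] = None


def mean_score(annotations: Iterable[Annotation], name: str) -> Optional[float]:
    """Average the scores of all annotations with the given name."""
    scores = [a.score for a in annotations if a.name == name and a.score is not None]
    return sum(scores) / len(scores) if scores else None


annotations = [
    Annotation("span", "span-123", "correctness", "LLM", score=1.0),
    Annotation("span", "span-456", "correctness", "LLM", score=0.5),
    Annotation("session", "session-1", "satisfaction", "HUMAN", label="thumbs-up"),
]
average = mean_score(annotations, "correctness")
```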
Trace Retention Policy
A data loader resolves the trace retention policy for each project.
Sources: src/phoenix/server/api/dataloaders/trace_retention_policy_id_by_project_id.py
Configuration
Environment Variables
| Variable | Constant | Description | Example |
|---|---|---|---|
| PHOENIX_HOST | ENV_PHOENIX_HOST | Phoenix server host URL | http://localhost:6006 |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | API key for authentication | your-api-key |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers | {"X-Custom":"value"} |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint URL | https://app.phoenix.arize.com/s/space |
| PHOENIX_PROJECT_NAME | ENV_PHOENIX_PROJECT | Default project name | my-llm-app |
| PHOENIX_PORT | ENV_PHOENIX_PORT | HTTP port (integer) | 6006 |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | gRPC port for OTEL | 4317 |
Sources: js/packages/phoenix-config/README.md:1-50
Python Client Configuration
from phoenix.client import Client
# Environment-based configuration
client = Client()
# Explicit configuration
client = Client(
    base_url="http://localhost:6006",
    api_key="your-api-key",
)
Sources: js/packages/phoenix-client/README.md:50-100
OTEL Registration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| projectName | string | "default" | Project name for organizing traces |
| url | string | "http://localhost:6006" | Phoenix instance URL |
| apiKey | string | undefined | API key for authentication |
| headers | Record<string, string> | {} | Custom headers for OTLP requests |
| batch | boolean | true | Use batch span processing |
| instrumentations | Instrumentation[] | undefined | OpenTelemetry instrumentations |
Sources: js/packages/phoenix-otel/README.md:80-120
Querying Traces
Python Client API
from phoenix.client import Client
from datetime import datetime, timedelta
client = Client()
# Get latest traces
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    limit=10
)
# Filter by time range with span details
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    start_time=datetime.now() - timedelta(hours=24),
    end_time=datetime.now(),
    include_spans=True,
    sort="latency_ms",
    order="desc"
)
# Filter by session
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    session_id="my-session-id"
)
Sources: js/packages/phoenix-client/README.md:100-150
TypeScript Client API
import { getTraces } from "@arizeai/phoenix-client/traces";
const result = await getTraces({
  project: { projectName: "my-project" },
  limit: 10,
});
const detailed = await getTraces({
  project: { projectName: "my-project" },
  startTime: "2026-03-01T00:00:00Z",
  endTime: new Date(),
  includeSpans: true,
  sort: "latency_ms",
  order: "desc"
});
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_identifier | string | — | Project name or ID (required) |
| start_time | `datetime \| None` | None | Inclusive lower bound on trace start time |
| end_time | `datetime \| None` | None | Exclusive upper bound on trace start time |
| sort | `"start_time" \| "latency_ms" \| None` | None | Sort field |
| order | `"asc" \| "desc" \| None` | None | Sort direction |
| include_spans | bool | False | Include full span details |
| session_id | `str \| Sequence[str] \| None` | None | Filter by session ID(s) |
| limit | int | 100 | Maximum traces to return |
| timeout | `int \| None` | 60 | Request timeout in seconds |
Note: Requires Phoenix server >= 13.15.0.
Sources: js/packages/phoenix-client/README.md:150-200
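The bound and sort semantics in the parameter table can be shown on an in-memory list. This Python sketch is not the server's query logic: `select_traces` is a hypothetical helper, and the trace dictionaries are made-up sample data. It only demonstrates that the lower bound is inclusive, the upper bound is exclusive, and `limit` applies after sorting.

```python
from datetime import datetime, timezone
from typing import List, Optional


def select_traces(
    traces: List[dict],
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
    sort: Optional[str] = None,
    order: Optional[str] = None,
    limit: int = 100,
) -> List[dict]:
    selected = [
        t
        for t in traces
        if (start_time is None or t["start_time"] >= start_time)  # inclusive
        and (end_time is None or t["start_time"] < end_time)  # exclusive
    ]
    if sort is not None:
        selected.sort(key=lambda t: t[sort], reverse=(order == "desc"))
    return selected[:limit]


def at(hour: int) -> datetime:
    return datetime(2026, 3, 1, hour, tzinfo=timezone.utc)


traces = [
    {"id": "a", "start_time": at(0), "latency_ms": 120},
    {"id": "b", "start_time": at(1), "latency_ms": 480},
    {"id": "c", "start_time": at(2), "latency_ms": 300},
]
window = select_traces(
    traces, start_time=at(0), end_time=at(2), sort="latency_ms", order="desc"
)
```

Here trace `a` starts exactly at the lower bound and is kept, while trace `c` starts exactly at the upper bound and is dropped.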
Instrumentation
Basic Setup (Python)
from phoenix.otel import register
tracer_provider = register(
    project_name="my-llm-app",
    auto_instrument=True,    # Auto-trace AI/ML libraries
    batch=True,              # Background batching
    api_key="your-api-key",  # Authentication
    endpoint="https://app.phoenix.arize.com/s/your-space"
)
Using Decorators
from phoenix.otel import register
tracer_provider = register()
# Get a tracer for manual instrumentation
tracer = tracer_provider.get_tracer(__name__)
@tracer.chain
def process_data(data):
    return data + " processed"

@tracer.tool
def weather(location):
    return "sunny"
Sources: js/packages/phoenix-otel/README.md:50-80
LangChain Integration
import { register, traceChain } from "@arizeai/phoenix-otel";
const provider = register({
  projectName: "my-app",
});
const answerQuestion = traceChain(
  async (question: string) => `Handled: ${question}`,
  { name: "answer-question" }
);
await answerQuestion("What is Phoenix?");
await provider.shutdown();
Sources: js/examples/apps/langchain-quickstart/README.md:50-80
Data Flow
graph LR
A[Application] -->|OpenTelemetry| B[Phoenix OTEL]
B --> C[Span Processor]
C --> D[Batch Span Processor]
D --> E[HTTPSpanExporter]
E --> F[Phoenix Server API]
F --> G[DB Insertion Handlers]
G --> H[(Database)]
I[Query Client] --> F
J[Annotations] --> G
Trace Visualization
When viewing traces in Phoenix UI, each trace displays:
- LangGraph (agent) span with input messages and final output
- Tool calls with their parameters and results
- Token usage and latency metrics
- Prompts and responses for each span
After running evaluations, span annotations appear on the relevant spans (e.g., correctness / custom_correctness).
Sources: js/examples/apps/langchain-quickstart/README.md:20-45
Integration with Phoenix Client
Dataset Operations with Traces
from phoenix.client import Client
client = Client()
# Get dataset for evaluation
dataset = client.datasets.get_dataset(
    dataset="my-dataset",
    version_id="v1"
)
# Run experiment with traces; my_task and correctness_evaluator are defined elsewhere
experiment = client.experiments.run_experiment(
    dataset=dataset,
    task=my_task,
    evaluators=[correctness_evaluator],
)
Hint: Tasks and evaluators are instrumented using OpenTelemetry. You can view detailed traces of experiment runs and evaluations directly in the Phoenix UI for debugging and performance analysis.
Sources: js/packages/phoenix-client/README.md:30-60
Related Documentation
- Phoenix OTEL Package - OpenTelemetry instrumentation
- Phoenix Client - Python/TypeScript client libraries
- Phoenix Config - Environment variable configuration
OpenTelemetry Integration
Related topics: Tracing System, Python SDK (arize-phoenix-client)
Overview
OpenTelemetry Integration in Phoenix provides a standardized approach to instrumenting AI applications for observability. Phoenix offers OpenTelemetry (OTel) wrappers for both Python and TypeScript/JavaScript environments, enabling developers to capture traces, spans, and telemetry data from their AI-powered applications.
The integration serves as a bridge between AI frameworks (LangChain, LlamaIndex, OpenAI, etc.) and Phoenix's observability platform, automatically collecting and forwarding telemetry data with minimal configuration.
Sources: packages/phoenix-otel/README.md
Architecture
High-Level Architecture
graph TD
A[AI Application] --> B[OpenInference Instrumentations]
B --> C[Phoenix OTEL Wrapper]
C --> D[OTLP Exporter]
D --> E[Phoenix Collector]
E --> F[Phoenix Server]
G[LangChain] --> B
H[LlamaIndex] --> B
I[OpenAI] --> B
J[Haystack] --> B
Component Stack
| Layer | Python Package | TypeScript Package |
|---|---|---|
| Core Wrapper | arize-phoenix-otel | @arizeai/phoenix-otel |
| Instrumentation | openinference-instrumentation-* | @arizeai/openinference-instrumentation-* |
| Configuration | phoenix.otel.register() | register() |
| Exporter | OTLP | OTLP |
Sources: js/packages/phoenix-otel/README.md
Python Integration
Installation
pip install arize-phoenix-otel
For specific framework instrumentation:
pip install openinference-instrumentation-openai
pip install openinference-instrumentation-langchain
pip install openinference-instrumentation-llama-index
Sources: packages/phoenix-otel/README.md
Core Module: `phoenix.otel`
The Python package exposes the register() function as the primary entry point for configuring OpenTelemetry tracing.
#### Registration Function
from phoenix.otel import register
# Basic setup
tracer_provider = register()
# Production configuration
tracer_provider = register(
    project_name="my-production-app",
    auto_instrument=True,
    batch=True,
    api_key="your-api-key",
    endpoint="https://app.phoenix.arize.com/s/your-space"
)
Environment Variables
| Variable | Description |
|---|---|
| PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint URL |
| PHOENIX_PROJECT_NAME | Default project name for traces |
| PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers |
| PHOENIX_API_KEY | Authentication API key |
| PHOENIX_HOST | Phoenix server host URL |
| PHOENIX_PORT | Phoenix HTTP port |
| PHOENIX_GRPC_PORT | Phoenix gRPC port |
Sources: js/packages/phoenix-config/README.md
Legacy Module Migration
The legacy phoenix.trace.* instrumentor modules have been removed. The migration path is:
# Old (removed)
from phoenix.trace.openai import OpenAIInstrumentor
# New approach
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
Sources: src/phoenix/__init__.py
TypeScript/JavaScript Integration
Installation
npm install @arizeai/phoenix-otel
# or
pnpm add @arizeai/phoenix-otel
Sources: js/packages/phoenix-otel/README.md
Registration Function
import { register } from "@arizeai/phoenix-otel";
// Basic setup
const provider = register({
  projectName: "my-app",
});
// Production setup with Phoenix Cloud (use instead of the basic setup above)
const cloudProvider = register({
  projectName: "my-app",
  url: "https://app.phoenix.arize.com",
  apiKey: process.env.PHOENIX_API_KEY,
});
Sources: js/packages/phoenix-otel/src/register.ts
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| projectName | string | "default" | Project name for organizing traces |
| url | string | "http://localhost:6006" | Phoenix instance URL |
| apiKey | string | undefined | API key for authentication |
| headers | Record<string, string> | {} | Custom headers for OTLP requests |
| batch | boolean | true | Use batch span processing |
| instrumentations | Instrumentation[] | undefined | OpenTelemetry instrumentations to register |
| global | boolean | true | Register as global tracer provider |
| diagLogLevel | DiagLogLevel | depends on NODE_ENV | Diagnostic logging level |
Sources: js/packages/phoenix-otel/src/register.ts
Non-Global Provider Usage
import { register } from "@arizeai/phoenix-otel";
const provider = register({
  projectName: "my-app",
  global: false,
});
// Use the provider explicitly
const tracer = provider.getTracer("my-tracer");
Supported Integrations
Framework Integrations
Phoenix supports tracing for the following AI frameworks through OpenInference instrumentation:
| Framework | Python Package | TypeScript Package |
|---|---|---|
| OpenAI | openinference-instrumentation-openai | @arizeai/openinference-instrumentation-openai |
| LangChain | openinference-instrumentation-langchain | @arizeai/openinference-instrumentation-langchain |
| LlamaIndex | openinference-instrumentation-llama-index | @arizeai/openinference-instrumentation-llama-index |
| Haystack | openinference-instrumentation-haystack | N/A |
| OpenLLMetry | openinference-instrumentation-openllmetry | N/A |
Sources: app/src/components/project/Integrations.tsx
Python Setup Example
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
TypeScript Setup Example
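import { register } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
// Pass instrumentations to register() so they attach to the Phoenix provider
const provider = register({
  projectName: "my-app",
  instrumentations: [new OpenAIInstrumentation()],
});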
Tracing Helpers
TypeScript Tracing Utilities
The @arizeai/phoenix-otel package re-exports OpenInference helpers for GenAI patterns:
import {
  observe,
  traceAgent,
  traceChain,
  traceTool,
  withSpan,
  setAttributes,
  setMetadata,
} from "@arizeai/phoenix-otel";
#### Example: traceChain
import { register, traceChain } from "@arizeai/phoenix-otel";
const provider = register({
  projectName: "my-app",
});
const answerQuestion = traceChain(
  async (question: string) => `Handled: ${question}`,
  { name: "answer-question" }
);
await answerQuestion("What is Phoenix?");
await provider.shutdown();
#### Example: traceTool
import { traceTool } from "@arizeai/phoenix-otel";
const searchTool = traceTool(
  async (query: string) => {
    // Tool implementation goes here; this placeholder echoes the query
    return [`result for: ${query}`];
  },
  {
    name: "web-search",
    description: "Search the web for information",
  }
);
Sources: js/packages/phoenix-otel/README.md
Configuration Flow
Setup Flow
sequenceDiagram
participant Dev as Developer
participant App as Application
participant Phoenix as Phoenix OTEL
participant OTel as OpenTelemetry
participant Collector as Phoenix Collector
Dev->>App: Configure environment variables
App->>Phoenix: Call register()
Phoenix->>OTel: Initialize TracerProvider
OTel->>OTel: Configure BatchSpanProcessor
OTel->>Collector: Setup OTLP Exporter
Collector-->>App: Ready to receive traces
App->>Collector: Send spans via OTLP
Environment-Based Configuration
- Local Development: No configuration needed, defaults to http://localhost:6006
# Optional: Explicit local endpoint
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
- Phoenix Cloud: Configure cloud endpoint and authentication
export PHOENIX_COLLECTOR_ENDPOINT="https://app.phoenix.arize.com/s/your-space"
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_PROJECT_NAME="my-project"
- Self-Hosted: Point to custom deployment
export PHOENIX_COLLECTOR_ENDPOINT="https://your-phoenix.example.com"
export PHOENIX_API_KEY="your-api-key"
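The three environments above differ only in which variables are set. A minimal Python sketch of that resolution order, assuming the variable names and the local default shown above; `CollectorConfig` and `resolve_collector_config` are hypothetical names used for illustration and are not part of the Phoenix API:

```python
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_ENDPOINT = "http://localhost:6006"  # local default stated above


@dataclass(frozen=True)
class CollectorConfig:
    endpoint: str
    api_key: Optional[str]
    project_name: Optional[str]


def resolve_collector_config(env: Mapping[str, str]) -> CollectorConfig:
    """Hypothetical helper: derive collector settings from an environment mapping."""
    endpoint = env.get("PHOENIX_COLLECTOR_ENDPOINT") or DEFAULT_ENDPOINT
    return CollectorConfig(
        endpoint=endpoint.rstrip("/"),
        api_key=env.get("PHOENIX_API_KEY") or None,
        project_name=env.get("PHOENIX_PROJECT_NAME") or None,
    )


# Local development: nothing is set, so the default endpoint applies.
local = resolve_collector_config({})

# Self-hosted: endpoint and API key set, project name left unset.
self_hosted = resolve_collector_config(
    {
        "PHOENIX_COLLECTOR_ENDPOINT": "https://your-phoenix.example.com/",
        "PHOENIX_API_KEY": "your-api-key",
    }
)
```

In a real process the mapping would typically be `os.environ`; passing a plain dict keeps the sketch easy to test.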
Best Practices
Production Recommendations
| Setting | Recommendation | Reason |
|---|---|---|
| `batch` | `true` | Reduces network overhead with batch processing |
| `global` | `true` | Ensures all instrumented libraries use same provider |
| API Key | Required | Secure authentication with Phoenix Cloud |
| Endpoint | HTTPS | Secure data transmission |
Zero Code Changes Approach
For automatic instrumentation:
from phoenix.otel import register
# Enable auto_instrument to automatically trace AI/ML libraries
tracer_provider = register(
    auto_instrument=True,
    project_name="my-production-app",
)
CLI Integration
The Phoenix CLI provides commands for working with traces:
# List recent traces
px trace list --limit 10
# Save traces to directory
px trace list ./my-traces --limit 50
# Filter by time
px trace list --last-n-minutes 60 --limit 20
px trace list --since 2024-01-13T10:00:00Z
| Option | Description |
|---|---|
| `-n, --limit <number>` | Number of traces (newest first) |
| `--last-n-minutes <number>` | Only traces from the last N minutes |
| `--since <timestamp>` | Traces since ISO timestamp |
| `--format raw` | Pipe-friendly compact JSON |
Sources: js/packages/phoenix-cli/README.md
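When the CLI is driven from a script, the options in the table can be assembled programmatically. The sketch below only builds the argument list and uses just the flags documented above; `build_trace_list_args` is a hypothetical helper, not something shipped with the CLI:

```python
from typing import List, Optional


def build_trace_list_args(
    limit: Optional[int] = None,
    last_n_minutes: Optional[int] = None,
    since: Optional[str] = None,
    raw: bool = False,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: assemble `px trace list` arguments from the options above."""
    args = ["px", "trace", "list"]
    if output_dir is not None:
        args.append(output_dir)
    if limit is not None:
        args += ["--limit", str(limit)]
    if last_n_minutes is not None:
        args += ["--last-n-minutes", str(last_n_minutes)]
    if since is not None:
        args += ["--since", since]
    if raw:
        args += ["--format", "raw"]
    return args


args = build_trace_list_args(limit=20, last_n_minutes=60, raw=True)
```

The resulting list can be handed to `subprocess.run(args, capture_output=True, text=True, check=True)` on a machine where `px` is installed.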
Quick Reference
Python
from phoenix.otel import register
# Simple setup
provider = register()
# With auto-instrumentation
provider = register(auto_instrument=True)
# Production
provider = register(
project_name="production",
auto_instrument=True,
batch=True,
api_key="your-key",
endpoint="https://app.phoenix.arize.com/s/your-space"
)
TypeScript
import { register } from "@arizeai/phoenix-otel";
// Simple setup
const provider = register();
// Production
register({
projectName: "production",
url: "https://app.phoenix.arize.com",
apiKey: process.env.PHOENIX_API_KEY,
});
Sources: packages/phoenix-otel/README.md
Evaluation System (Phoenix Evals)
Related topics: Datasets and Experiments, Python SDK (arize-phoenix-client)
Overview
Phoenix Evals is an evaluation framework that provides lightweight, composable building blocks for writing and running evaluations on LLM applications. Its automated evaluators assess application quality by detecting hallucinations and by scoring relevance, toxicity, correctness, and custom classification tasks.
Sources: packages/phoenix-evals/README.md
Key Capabilities
| Feature | Description |
|---|---|
| Multi-SDK Support | Works with OpenAI, LiteLLM, LangChain, Anthropic via adapters |
| Input Mapping | Powerful binding for complex data structures |
| Pre-built Metrics | Hallucination detection, relevance, toxicity, correctness |
| OpenTelemetry Integration | Evaluators are natively instrumented for observability |
| High Performance | Up to 20x speedup with built-in concurrency and batching |
| Cross-platform | Available in both Python (arize-phoenix-evals) and TypeScript (@arizeai/phoenix-evals) |
Sources: packages/phoenix-evals/README.md
Architecture
System Components
graph TD
A[User Application] --> B[Phoenix Evals Core]
B --> C[LLM Adapters]
B --> D[Evaluator Templates]
B --> E[Prompt Templates]
C --> F[OpenAI Adapter]
C --> G[Anthropic Adapter]
C --> H[LiteLLM Adapter]
C --> I[LangChain Adapter]
D --> J[Correctness Evaluator]
D --> K[Faithfulness Evaluator]
D --> L[Relevance Evaluator]
D --> M[Toxicity Evaluator]
D --> N[Custom Classifier]
E --> O[Hallucination Prompts]
E --> P[Classification Prompts]
B --> Q[OpenTelemetry Traces]
Evaluator Types
Phoenix Evals provides two primary evaluator categories:
- Correctness Evaluator - Assesses whether an answer correctly addresses a query based on reference context
- Faithfulness Evaluator - Determines if an answer is faithful to the provided reference text (hallucination detection)
Sources: js/packages/phoenix-evals/src/llm/createCorrectnessEvaluator.ts
Sources: js/packages/phoenix-evals/src/llm/createFaithfulnessEvaluator.ts
Installation
Python Package
pip install arize-phoenix-evals
Sources: packages/phoenix-evals/README.md
TypeScript Package
# npm
npm install @arizeai/phoenix-evals
# or yarn, pnpm, bun
yarn add @arizeai/phoenix-evals
pnpm add @arizeai/phoenix-evals
bun add @arizeai/phoenix-evals
Sources: js/packages/phoenix-evals/README.md
Creating Evaluators
TypeScript: Correctness Evaluator
import { createClassifier } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";
const model = openai("gpt-4o-mini");
const promptTemplate = `
In this task, you will be presented with a query, a reference text and an answer. The answer is
generated to the question based on the reference text. The answer may contain false information. You
must use the reference text to determine if the answer to the question contains false information.
`;
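const classifier = createClassifier({
  model,
  // Map each label the model may return to a numeric score
  choices: { factual: 1, hallucinated: 0 },
  promptTemplate,
});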
Sources: js/packages/phoenix-evals/README.md
TypeScript: Faithfulness Evaluator
The faithfulness evaluator detects hallucinations by comparing an answer against reference context.
import { createFaithfulnessEvaluator } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";
const evaluator = createFaithfulnessEvaluator({
model: openai("gpt-4o-mini"),
});
Sources: js/packages/phoenix-evals/src/llm/createFaithfulnessEvaluator.ts
Python: Using the Client API
The Phoenix server provides evaluator helpers for running evaluations programmatically:
from phoenix.server.api.helpers.evaluators import (
build_hallucination_evaluator,
build_correctness_evaluator,
)
Sources: src/phoenix/server/api/helpers/evaluators.py
Prompt Templates
Classification Evaluator Configurations
Phoenix Evals uses structured prompt templates for classification tasks. The system supports multiple evaluator configurations stored in the prompts/classification_evaluator_configs directory.
Sources: prompts/classification_evaluator_configs
Available Evaluators on Server
| Evaluator | Purpose | Input Fields |
|---|---|---|
| `hallucination_evaluator` | Detects false information in answers | query, reference, response |
| `correctness_evaluator` | Assesses answer correctness | query, reference, response |
| `answer_relevance_evaluator` | Measures relevance of response to query | query, response |
| `context_relevance_evaluator` | Measures relevance of context to query | query, context |
Sources: src/phoenix/server/api/helpers/evaluators.py:1-100
Evaluator Configuration Schema
Project Dependencies
The Python package defines its dependencies in pyproject.toml:
[project]
name = "arize-phoenix-evals"
version = "5.0.0"
Core Dependencies:
- `anthropic` - Anthropic API client
- `httpx` - HTTP client
- `joblib` - Parallel execution
- `litellm` - Unified LLM interface
- `openai` - OpenAI API client
- `tqdm` - Progress bars
Optional Dependencies:
- `langchain` / `langchain-core` - LangChain integration
- `langsmith` - Tracing support
Sources: packages/phoenix-evals/pyproject.toml
Evaluation Workflow
graph LR
A[Input Data] --> B[Evaluator Selection]
B --> C[Prompt Template]
C --> D[LLM Adapter]
D --> E[Model Inference]
E --> F[Response Parsing]
F --> G[Evaluation Result]
H[Reference Context] --> C
I[Query] --> C
Running Evaluations
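Before scaling up, the workflow stages (prompt template, model call, response parsing, result) can be sketched for a single record. This is an illustrative outline rather than the Phoenix Evals implementation: the template text, the `factual`/`hallucinated` labels, `evaluate_record`, and the offline `stub_llm` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical template; the three fields mirror the query, reference, and
# response inputs listed for the server-side evaluators.
TEMPLATE = (
    "Query: {query}\n"
    "Reference: {reference}\n"
    "Answer: {response}\n"
    "Reply with exactly one word: factual or hallucinated."
)
LABEL_SCORES: Dict[str, float] = {"factual": 1.0, "hallucinated": 0.0}


@dataclass(frozen=True)
class EvalResult:
    label: str
    score: float


def evaluate_record(record: Dict[str, str], llm: Callable[[str], str]) -> EvalResult:
    prompt = TEMPLATE.format(**record)        # prompt template
    raw = llm(prompt)                         # adapter + model inference
    label = raw.strip().lower().rstrip(".")   # response parsing
    if label not in LABEL_SCORES:
        raise ValueError(f"Unrecognized label: {raw!r}")
    return EvalResult(label=label, score=LABEL_SCORES[label])


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return " Factual.\n"


result = evaluate_record(
    {
        "query": "Is Phoenix open source?",
        "reference": "Phoenix is an open-source observability platform.",
        "response": "Yes, Phoenix is open source.",
    },
    stub_llm,
)
```

Raising on an unrecognized label, instead of silently defaulting to a score, keeps parsing failures visible when prompts are being iterated on.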
Batch Evaluation with Concurrency
Phoenix Evals supports concurrent evaluation for high performance:
from phoenix.evals import run_evaluation
results = run_evaluation(
model=my_model,
evaluators=[correctness_evaluator, faithfulness_evaluator],
data=evaluation_dataset,
concurrency=10, # Up to 20x speedup
)
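The speedup from concurrency comes from keeping several model calls in flight at once while capping how many run simultaneously. A minimal sketch of that pattern using only the standard library; `evaluate_all` and `exact_match` are hypothetical names, and the sleep stands in for a real model call:

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Row = Dict[str, str]


async def evaluate_all(
    rows: List[Row],
    evaluator: Callable[[Row], Awaitable[float]],
    concurrency: int = 10,
) -> List[float]:
    """Run `evaluator` over all rows with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def run_one(row: Row) -> float:
        async with semaphore:
            return await evaluator(row)

    # gather() returns results in the same order as the input rows.
    return list(await asyncio.gather(*(run_one(row) for row in rows)))


async def exact_match(row: Row) -> float:
    await asyncio.sleep(0.01)  # stands in for the latency of a model call
    return 1.0 if row["response"] == row["reference"] else 0.0


rows = [
    {"response": "Paris", "reference": "Paris"},
    {"response": "Lyon", "reference": "Paris"},
    {"response": "4", "reference": "4"},
]
scores = asyncio.run(evaluate_all(rows, exact_match, concurrency=2))
```

The actual gain depends on provider rate limits and per-call latency, so the concurrency value is best tuned empirically.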
Exporting Results
Evaluation results can be:
- Logged to Phoenix for visualization
- Exported as JSON/CSV
- Stored in datasets for future reference
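For the JSON/CSV option, a short sketch using only the Python standard library. The result fields (`example_id`, `label`, `score`) are assumed for illustration and do not reflect a documented Phoenix result schema:

```python
import csv
import io
import json
from typing import Any, Dict, List

Row = Dict[str, Any]


def to_json(rows: List[Row]) -> str:
    """Serialize evaluation rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_csv(rows: List[Row]) -> str:
    """Serialize evaluation rows as CSV, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


results = [
    {"example_id": "ex-1", "label": "factual", "score": 1.0},
    {"example_id": "ex-2", "label": "hallucinated", "score": 0.0},
]
json_text = to_json(results)
csv_text = to_csv(results)
```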
Integration with Phoenix Observability
Evaluators are natively instrumented via OpenTelemetry tracing, enabling:
- Trace-level annotations for evaluation results
- Correlation between evaluations and application spans
- Dataset curation from production traces
Sources: packages/phoenix-evals/README.md
Configuration Reference
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `PHOENIX_HOST` | Phoenix server URL | `http://localhost:6006` |
| `PHOENIX_API_KEY` | Authentication key | (none) |
| `PHOENIX_PROJECT` | Default project name | (none) |
Evaluator Options
interface EvaluatorConfig {
model: any; // LLM model instance
promptTemplate?: string; // Custom prompt template
temperature?: number; // Model temperature (default: 0.0)
maxTokens?: number; // Max tokens for response
batchSize?: number; // Batch size for evaluation
}
Advanced Usage
Custom Classifier
Create custom binary or multi-class classification evaluators:
import { createClassifier } from "@arizeai/phoenix-evals/llm";
const customClassifier = createClassifier({
model: myModel,
promptTemplate: myCustomTemplate,
  // Label-to-score map; the scores here are illustrative
  choices: { positive: 1, neutral: 0.5, negative: 0 },
});
LangChain Integration
For LangChain applications, use the pre-built evaluations:
npm run pre_built_evals # Uses built-in correctness evaluator
npm run custom_evals # Uses custom rubric with specific LLM
Sources: js/examples/apps/langchain-quickstart/README.md
Performance Considerations
| Optimization | Expected Improvement |
|---|---|
| Concurrent execution | Up to 20x speedup |
| Batch processing | Reduced API overhead |
| Streaming responses | Lower latency perception |
Best Practices
- Use reference contexts - Always provide ground truth or reference data for accurate evaluation
- Configure appropriate models - Use capable models (GPT-4 class) for accurate classification
- Monitor evaluation traces - Review flagged evaluations in Phoenix UI
- Iterate on prompts - Fine-tune prompt templates for domain-specific accuracy
- Set low temperature - Use temperature=0 for deterministic evaluation results
Sources: packages/phoenix-evals/README.md
Datasets and Experiments
Related topics: Evaluation System (Phoenix Evals), Database Models and Migrations
Phoenix provides a comprehensive Datasets and Experiments system that enables users to create structured collections of examples, run evaluation tasks against them, and track results over time. This feature is designed for benchmarking models, evaluating LLM outputs, and building evaluation pipelines.
Overview
Datasets in Phoenix are structured containers that hold examples used for experimentation and evaluation. Each example consists of an input, an output, and optional metadata. Experiments allow you to define tasks that process these examples and evaluators that score the results.
graph TD
A[Dataset] --> B[Example 1]
A --> C[Example 2]
A --> D[Example N]
B --> E[input]
B --> F[output]
B --> G[metadata]
E --> H[Task Function]
F --> H
H --> I[Evaluators]
I --> J[Experiment Results]
J --> K[Annotations on Traces]
Core Concepts
Dataset Structure
A Dataset is a named collection of examples with the following properties:
| Property | Type | Description |
|---|---|---|
| `name` | `str` | Unique identifier for the dataset |
| `description` | `str` | Human-readable description |
| `examples` | `List[Example]` | Collection of examples |
| `version` | `int` | Dataset version for tracking changes |
| `created_at` | `datetime` | Creation timestamp |
Example Schema
Each Example within a dataset follows this structure:
| Field | Type | Required | Description |
|---|---|---|---|
| `input` | `dict` | Yes | Input data for the task (e.g., prompts, questions) |
| `output` | `dict` | Yes | Expected or reference output |
| `metadata` | `dict` | No | Additional context, tags, or auxiliary data |
# Example structure
example = {
"input": {"question": "What is the capital of France?"},
"output": {"answer": "Paris"},
"metadata": {"category": "geography", "difficulty": "easy"}
}
Sources: js/packages/phoenix-client/README.md
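The required and optional fields in the schema above can be enforced before anything is uploaded. A minimal sketch, assuming a local `Example` dataclass defined only for this illustration (it is not the Phoenix client's own type):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping


@dataclass
class Example:
    input: Dict[str, Any]
    output: Dict[str, Any]
    metadata: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "Example":
        """Validate that the required keys are present and default the metadata."""
        missing = [key for key in ("input", "output") if key not in raw]
        if missing:
            raise ValueError(f"Example is missing required field(s): {missing}")
        return cls(
            input=dict(raw["input"]),
            output=dict(raw["output"]),
            metadata=dict(raw.get("metadata") or {}),
        )


example = Example.from_dict(
    {
        "input": {"question": "What is the capital of France?"},
        "output": {"answer": "Paris"},
    }
)
```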
Experiment Workflow
The typical experiment workflow involves:
- Creating a Dataset - Define and populate a dataset with examples
- Defining a Task - Create a function that processes each example
- Configuring Evaluators - Set up scoring/evaluation functions
- Running the Experiment - Execute tasks across all examples
- Analyzing Results - Review scores and export results
graph LR
A1[Create Dataset] --> B[Define Task Function]
B --> C[Configure Evaluators]
C --> D[Run Experiment]
D --> E[Results + Annotations]
E --> F[Analyze & Iterate]
Sources: js/packages/phoenix-client/README.md
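The workflow can be illustrated end to end without a server. The sketch below is an in-memory stand-in, assuming plain dicts for examples and plain functions for the task and evaluators; `run_experiment_locally` is a hypothetical name, not a Phoenix client function:

```python
from typing import Any, Callable, Dict, List

Example = Dict[str, Dict[str, Any]]
Task = Callable[[Example], Any]
Evaluator = Callable[[Any, Dict[str, Any]], float]


def run_experiment_locally(
    examples: List[Example],
    task: Task,
    evaluators: Dict[str, Evaluator],
) -> List[Dict[str, Any]]:
    """Run the task on every example, then score each output with every evaluator."""
    runs = []
    for index, example in enumerate(examples):
        output = task(example)
        scores = {name: fn(output, example["output"]) for name, fn in evaluators.items()}
        runs.append({"example_index": index, "output": output, "scores": scores})
    return runs


# 1. Create a dataset
dataset = [
    {"input": {"question": "What is the capital of France?"}, "output": {"answer": "Paris"}},
    {"input": {"question": "What is the capital of the USA?"}, "output": {"answer": "Washington D.C."}},
]


# 2. Define a task (a lookup table stands in for a model call)
def answer_question(example: Example) -> str:
    known = {"What is the capital of France?": "Paris"}
    return known.get(example["input"]["question"], "unknown")


# 3. Configure an evaluator
def matches(output: Any, expected: Dict[str, Any]) -> float:
    return 1.0 if output == expected["answer"] else 0.0


# 4. Run the experiment, then 5. analyze the results
runs = run_experiment_locally(dataset, answer_question, {"matches": matches})
mean_score = sum(run["scores"]["matches"] for run in runs) / len(runs)
```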
Python Client API
Dataset Resource
The Python client provides a Datasets resource class for managing datasets programmatically.
from phoenix.client import Client
client = Client()
# Get an existing dataset
dataset = client.datasets.get_dataset(
dataset="my-dataset-name",
version_id="optional-version-id" # Optional: specific version
)
#### Key Methods
| Method | Description |
|---|---|
| `get_dataset()` | Retrieve a dataset by name or ID |
| `create_dataset()` | Create a new dataset with examples |
| `add_examples()` | Add examples to an existing dataset |
| `upsert_dataset()` | Create or update a dataset |
| `get_dataset_columns()` | Get column information for the dataset |
Sources: packages/phoenix-client/src/phoenix/client/resources/datasets/__init__.py
Creating Datasets
from phoenix.client import Client
client = Client()
# Create a dataset with examples
dataset = client.datasets.create_dataset(
name="qa-dataset",
description="Questions and answers for evaluation",
input_keys=["question"],
output_keys=["answer"],
metadata_keys=["category", "difficulty"],
inputs=[
{"question": "What is the capital of France?"},
{"question": "What is the capital of the USA?"}
],
outputs=[
{"answer": "Paris"},
{"answer": "Washington D.C."}
],
metadata=[
{"category": "geography", "difficulty": "easy"},
{"category": "geography", "difficulty": "easy"}
]
)
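The call above takes three parallel lists, while examples are often stored as one record per example. A small sketch of converting between the two shapes; `split_examples` is a hypothetical helper written for this illustration:

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Dict[str, Any]]
Columns = List[Dict[str, Any]]


def split_examples(examples: List[Record]) -> Tuple[Columns, Columns, Columns]:
    """Turn per-example records into parallel inputs, outputs, and metadata lists."""
    inputs = [example["input"] for example in examples]
    outputs = [example["output"] for example in examples]
    metadata = [example.get("metadata", {}) for example in examples]
    return inputs, outputs, metadata


records = [
    {
        "input": {"question": "What is the capital of France?"},
        "output": {"answer": "Paris"},
        "metadata": {"category": "geography", "difficulty": "easy"},
    },
    {
        "input": {"question": "What is 2 + 2?"},
        "output": {"answer": "4"},
    },
]
inputs, outputs, metadata = split_examples(records)
```

The three lists keep the same order as the records, so the item at each index in `inputs`, `outputs`, and `metadata` still describes the same example.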
Adding Examples to Datasets
# Add more examples to an existing dataset
client.datasets.add_examples(
dataset_id="dataset-uuid",
examples=[
{
"input": {"question": "What is 2 + 2?"},
"output": {"answer": "4"},
"metadata": {"category": "math", "difficulty": "easy"}
}
]
)
#### Parameters for add_examples
| Parameter | Type | Required | Description |
|---|---|---|---|
| `dataset_id` | `str` | Yes | Dataset identifier |
| `examples` | `List[dict]` | Yes | List of example objects |
| `split` | `str` | No | Assign all examples to a split |
| `timeout` | `int` | No | Request timeout in seconds |
Sources: packages/phoenix-client/src/phoenix/client/resources/datasets/__init__.py
TypeScript/JavaScript Client API
Creating Datasets
import { createDataset } from "@arizeai/phoenix-client/datasets";
const { datasetId } = await createDataset({
name: "questions",
description: "a simple dataset of questions",
examples: [
{
input: { question: "What is the capital of France" },
output: { answer: "Paris" },
metadata: {},
},
{
input: { question: "What is the capital of the USA" },
output: { answer: "Washington D.C." },
metadata: {},
},
],
});
Sources: js/packages/phoenix-client/README.md
Running Experiments
import {
asExperimentEvaluator,
runExperiment
} from "@arizeai/phoenix-client/experiments";
// Define a task to run on each example
const task = async (example) => `hello ${example.input.name}`;
// Define evaluators
const evaluators = [
asExperimentEvaluator({
name: "matches",
kind: "CODE",
evaluate: async ({ output, expected }) => {
return output === expected;
},
}),
];
// Run the experiment
const results = await runExperiment({
datasetId,
task,
evaluators,
});
Sources: js/packages/phoenix-client/README.md
Experiment Evaluators
Evaluators are functions that assess the quality of task outputs against expected results.
Evaluator Types
| Kind | Description | Use Case |
|---|---|---|
| `CODE` | Custom code-based evaluation | Exact match, regex patterns |
| `LLM` | LLM-as-judge evaluation | Semantic similarity, relevance |
| `BUILT_IN` | Pre-built Phoenix evaluators | Common metrics like accuracy |
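The two CODE use cases named in the table, exact match and regex patterns, are plain deterministic functions. A sketch with hypothetical function names, returning 1.0 for a pass and 0.0 for a fail:

```python
import re


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output equals the expected text, ignoring outer whitespace."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def matches_pattern(output: str, pattern: str) -> float:
    """Score 1.0 when the regular expression is found anywhere in the output."""
    return 1.0 if re.search(pattern, output) else 0.0


iso_date = r"\d{4}-\d{2}-\d{2}"
exact = exact_match(" Paris ", "Paris")
dated = matches_pattern("Released on 2024-01-13", iso_date)
undated = matches_pattern("Released last winter", iso_date)
```

Code evaluators like these are cheap and repeatable, which makes them a reasonable first check before spending model calls on LLM-as-judge scoring.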
Built-in Evaluators
Phoenix provides pre-built evaluators for common evaluation tasks:
import {
asExperimentEvaluator,
runExperiment,
createBuiltInEvaluator
} from "@arizeai/phoenix-client/experiments";
// Use a built-in correctness evaluator
const correctnessEval = createBuiltInEvaluator({
name: "correctness",
rubric: "Evaluate if the response correctly answers the question",
});
const evaluators = [correctnessEval];
const results = await runExperiment({
datasetId,
task,
evaluators,
});
Sources: js/examples/apps/langchain-quickstart/README.md
Example Apps
Phoenix Experiment Runner
A complete example application demonstrating dataset and experiment workflows:
# Setup requirements
pnpm install
pnpm -r build
# Run the app
pnpm dev
Features:
- Loads datasets from CSV files
- Configures custom task functions
- Runs experiments with multiple evaluators
- Stores results in Phoenix
Sources: js/examples/apps/phoenix-experiment-runner/README.md
LangChain Integration
The LangChain quickstart demonstrates how to:
- Instrument LangChain agents with Phoenix
- Run correctness evaluations on agent outputs
- Log annotations back to Phoenix traces
// Fetch spans and run evaluation
const spans = await client.spans.get_spans_dataframe({
project_identifier: "my-llm-app",
limit: 100,
});
// Run built-in correctness evaluator
const evalResults = await runBuiltInEvaluator({
name: "correctness",
spans,
rubric: "travel_rubric",
});
// Log annotations back to Phoenix
await client.spans.add_span_annotations(evalResults);
Sources: js/examples/apps/langchain-quickstart/README.md
API Input Types
CreateDatasetInput
The server-side input type for creating datasets:
class CreateDatasetInput:
name: str # Required: unique dataset name
description: Optional[str] # Optional: dataset description
metadata: Optional[dict] # Optional: dataset-level metadata
AddExamplesToDatasetInput
class AddExamplesToDatasetInput:
dataset_id: GlobalID # Required: target dataset
examples: List[ExampleInput] # Required: examples to add
split: Optional[str] # Optional: assign to split
Sources: src/phoenix/server/api/input_types/CreateDatasetInput.py
Sources: src/phoenix/server/api/input_types/AddExamplesToDatasetInput.py
Database Operations
Dataset insertion is handled by the database layer:
# In src/phoenix/db/insertion/dataset.py (simplified signature)
from typing import List

from sqlalchemy.ext.asyncio import AsyncSession


async def insert_dataset(
    session: AsyncSession,
    dataset: Dataset,
    examples: List[Example],
) -> None:
    # Handles dataset creation and example insertion
    # Supports batch operations for performance
    ...
Sources: src/phoenix/db/insertion/dataset.py
UI Integration
Experiment Code Dialog
The Phoenix UI provides a code generation dialog for running experiments:
<RunExperimentCodeDialog
datasetName="my-dataset"
version={{ id: "v1", version: 1 }}
isAuthEnabled={true}
/>
This component generates:
- Installation commands for arize-phoenix-client
- Base URL configuration
- Dataset retrieval code
- Experiment execution examples
Sources: app/src/components/experiment/RunExperimentCodeDialog.tsx
Python Project Guide
The UI also provides setup instructions for Python-based experiments:
<PythonProjectGuide
packages={["arize-phoenix-otel"]}
isAuthEnabled={true}
/>
Sources: app/src/components/project/PythonProjectGuide.tsx
Authentication
When using datasets and experiments with authentication enabled:
| Variable or Header | Description |
|---|---|
| `PHOENIX_API_KEY` | Environment variable holding the API key for authentication |
| `Authorization` | HTTP header carrying a bearer token for the REST/GraphQL APIs |
| `OTEL_EXPORTER_OTLP_HEADERS` | Environment variable supplying headers to OpenTelemetry SDKs |
# Set API key
export PHOENIX_API_KEY="your-api-key"
# Or for OTEL
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer your-token"
Sources: app/src/components/auth/OneTimeAPIKeyDialog.tsx
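The two mechanisms carry the same bearer token in different shapes: a header dictionary for REST/GraphQL requests and a comma-separated `key=value` string for OpenTelemetry SDKs. A sketch of both, with hypothetical helper names; it also accepts URL-encoded values, since a space inside a header value is commonly written as `%20`:

```python
from typing import Dict
from urllib.parse import unquote


def bearer_header(api_key: str) -> Dict[str, str]:
    """Authorization header for REST/GraphQL requests."""
    return {"Authorization": f"Bearer {api_key}"}


def parse_otlp_headers(raw: str) -> Dict[str, str]:
    """Parse a comma-separated key=value list as used by OTEL_EXPORTER_OTLP_HEADERS."""
    headers: Dict[str, str] = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split("=", 1)
        headers[key.strip()] = unquote(value.strip())
    return headers


rest_headers = bearer_header("your-api-key")
otlp_headers = parse_otlp_headers("Authorization=Bearer%20your-token,x-team=ml")
```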
Summary
The Datasets and Experiments system in Phoenix provides:
- Structured Data Management: Create, version, and manage datasets with examples
- Flexible Evaluation: Support for code-based, LLM-based, and built-in evaluators
- Multi-Language Support: Python and TypeScript/JavaScript clients
- Trace Integration: Results and annotations can be linked to spans
- Workflow Automation: Script-based and programmatic experiment execution
This system enables teams to systematically evaluate LLM applications, track performance over time, and build automated quality assurance pipelines.
Sources: js/packages/phoenix-client/README.md
Database Models and Migrations
Related topics: Server API and GraphQL, Tracing System
Phoenix uses SQLAlchemy ORM for database abstraction and Alembic for managing schema migrations. This document covers the database architecture, available models, migration system, engine configuration, and bulk data insertion utilities.
Source: https://github.com/Arize-ai/phoenix / Human Manual
Frontend Application
Related topics: System Architecture, Server API and GraphQL
Overview
The Phoenix Frontend Application is a React-based web interface that provides observability, tracing, and evaluation capabilities for AI applications. It serves as the primary user interface for interacting with Phoenix's backend services, enabling users to manage projects, visualize traces, configure annotations, and evaluate LLM outputs.
The frontend is built with modern React patterns including hooks, context providers, and component composition. It integrates with @arizeai/phoenix-otel for telemetry data collection and @arizeai/phoenix-evals for evaluation functionality.
Sources: app/src/pages/project/ProjectTracesPage.tsx:1-50
Architecture Overview
Component Hierarchy
graph TD
A[App] --> B[Routes]
B --> C[ProjectTracesPage]
B --> D[Playground]
B --> E[Settings Pages]
C --> F[TracesTable]
C --> G[SpanFiltersProvider]
D --> H[PromptMenu]
E --> I[AnnotationConfigList]
F --> J[AnnotationLabel]
J --> K[AnnotationTooltip]
G --> L[TracePaginationProvider]
L --> M[TracingRoot]
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Framework | React 19+ | UI rendering |
| Routing | React Router v6 | SPA navigation |
| State | React Context + Hooks | State management |
| Styling | CSS-in-JS (Fela) | Component styling |
| Icons | Custom SVG Components | UI iconography |
| Tables | @tanstack/react-table | Data tables |
| Forms | Adobe React Spectrum | Form components |
Sources: app/src/Routes.tsx:1-100
Routing Structure
The application uses React Router for navigation with a nested route structure supporting breadcrumbs and lazy loading.
Main Routes
| Route | Component | Purpose |
|---|---|---|
| `/projects/:projectId/traces` | `ProjectTracesPage` | Trace visualization |
| `/projects/:projectId/traces/:traceId` | `TraceTree` | Single trace view |
| `/settings/*` | `SettingsPage` | Application settings |
| `/playground` | `Playground` | LLM prompt playground |
// Routes structure excerpt
<Route path="/settings" element={<SettingsPage />}>
<Route path="general" element={<SettingsGeneralPage />} />
<Route path="secrets" element={<SettingsSecretsPage />} />
<Route path="providers" element={<SettingsAIProvidersPage />} />
<Route path="models" element={<SettingsModelsPage />} />
<Route path="datasets" element={<SettingsDatasetsPage />} />
<Route path="annotations" element={<SettingsAnnotationsPage />} />
<Route path="data" element={<SettingsDataPage />} />
<Route path="prompts" element={<SettingsPromptsPage />} />
</Route>
Sources: app/src/Routes.tsx:30-80
Core Components
Tracing Components
#### ProjectTracesPage
The main container for trace visualization, wrapping content in context providers for tracing state management.
export const ProjectTracesPage = () => {
const { tracesQueryReference } = useProjectPageQueryReferenceContext();
return (
<TracingRoot>
<TracePaginationProvider>
<SpanFiltersProvider>
<Suspense fallback={<Loading />}>
<TracesTabContent tracesQueryReference={tracesQueryReference} />
</Suspense>
</SpanFiltersProvider>
<Suspense>
<Outlet />
</Suspense>
</TracePaginationProvider>
</TracingRoot>
);
};
Sources: app/src/pages/project/ProjectTracesPage.tsx:40-65
#### TracesTable
Displays experiment runs and traces in a paginated, sortable table format. Integrates with useReactTable from TanStack Table for efficient rendering.
| Feature | Implementation |
|---|---|
| Pagination | Custom pagination with fetch on scroll |
| Sorting | Column-based sorting |
| Selection | Row selection for batch operations |
| Navigation | Click to view trace details |
Sources: app/src/pages/example/ExampleExperimentRunsTable.tsx:1-100
Project Setup Components
#### OnboardingSteps
Provides step-by-step guidance for integrating applications with Phoenix. Supports multiple programming languages and package managers.
export function OnboardingSteps({
language,
packages,
implementationCode,
docsHref,
githubHref,
generatedApiKey,
onApiKeyGenerated,
extraEnvVars,
}: {
language: ProgrammingLanguage;
packages: readonly string[];
implementationCode: string;
// ... additional props
}) {
const isHosted = IS_HOSTED_DEPLOYMENT;
const isAuthEnabled = window.Config.authenticationEnabled;
// ...
}
Sources: app/src/pages/project/OnboardingSteps.tsx:50-85
#### TypeScriptProjectGuide
Language-specific setup guide for TypeScript/JavaScript projects with OTEL initialization code generation.
<TypeScriptBlockWithCopy
value={getOtelInitCodeTypescript({ projectName })}
/>
#### PythonProjectGuide
Similar component for Python projects using arize-phoenix-otel package.
| Package | Purpose |
|---|---|
| `arize-phoenix-otel` | OpenTelemetry instrumentation for Python |
| `opentelemetry-*` | Standard OTEL packages |
Sources: app/src/components/project/PythonProjectGuide.tsx:1-60
Document and Annotation Components
#### DocumentItem
Renders document metadata and annotations within trace views. Supports JSON display and annotation overlay.
<ReadonlyJSONBlock basicSetup={{ lineNumbers: false }}>
{JSON.stringify(metadata)}
</ReadonlyJSONBlock>
<DocumentAnnotationsSection
spanNodeId={spanNodeId ?? ""}
documentPosition={documentPosition ?? 0}
documentAnnotations={documentAnnotations ?? []}
/>
Sources: app/src/pages/trace/DocumentItem.tsx:1-80
#### AnnotationConfigList
Dropdown component for selecting annotation configurations. Displays annotation names with color swatches and type tokens.
<MenuItem
id={id}
textValue={name ?? undefined}
leadingContent={
<AnnotationColorSwatch annotationName={name || ""} />
}
trailingContent={
<Token size="S">{annotationType?.toLocaleLowerCase()}</Token>
}
>
<Text>{name}</Text>
</MenuItem>
Sources: app/src/components/trace/AnnotationConfigList.tsx:1-50
Markdown Rendering
#### streamdownComponents
Custom React components for rendering markdown content with Phoenix-specific styling.
| Component | Element | Styling |
|---|---|---|
| `blockquote` | `<blockquote>` | Blockquote CSS |
| `inlineCode` | `<code>` | Inline code CSS |
| `table` | `<table>` | Table wrapper |
| `th` | `<th>` | Header cell CSS |
| `td` | `<td>` | Data cell CSS |
| `img` | `<img>` | Image CSS |
| `hr` | `<hr>` | Horizontal rule CSS |
blockquote: ({ children, className }) => (
<blockquote css={blockquoteCSS} className={className}>
{children}
</blockquote>
),
inlineCode: ({ children, className }) => (
<code css={inlineCodeCSS} className={className}>
{children}
</code>
),
Sources: app/src/components/markdown/streamdownComponents.tsx:1-60
Playground Components
#### PromptMenu
Tabbed interface for managing prompt versions and tags. Uses lazy-loaded tabs for performance.
<Autocomplete filter={contains}>
<MenuHeader>
<SearchField aria-label="Search tags" variant="quiet" autoFocus>
<SearchIcon />
<Input placeholder="Search tags" />
</SearchField>
</MenuHeader>
</Autocomplete>
| Tab | Content |
|---|---|
| Versions | Prompt version history |
| Tags | Tag management |
Sources: app/src/pages/playground/PromptMenu.tsx:1-80
Tool Components
#### DocsToolDetails
Renders documentation tool outputs with state-based display logic.
function getOutputText(part: ToolInvocationPart): string {
if (part.state !== "output-available" || part.output == null) {
return "";
}
return stringifyToolValue(part.output);
}
// Preview truncation for long outputs
export function truncateDocsOutput(text: string): string {
if (text.length <= OUTPUT_PREVIEW_LENGTH) {
return text;
}
return text.slice(0, OUTPUT_PREVIEW_LENGTH) + "…";
}
Sources: app/src/components/agent/DocsToolDetails.tsx:1-100
UI Components
#### Icons
SVG icon components using currentColor for theme compatibility.
export const ArrowCompareOutline = () => (
<svg width="24" height="24" viewBox="0 0 24 24" fill="none">
<path d="..." fill="currentColor" />
</svg>
);
Available icons include: ArrowCompareOutline, TemplateOutline, FileOutline, CloseOutline, and more.
Sources: app/src/components/core/icon/Icons.tsx:1-50
#### FileListItem
Reusable file upload item component with progress tracking and status indicators.
| Status | Description |
|---|---|
| `uploading` | File transfer in progress |
| `parsing` | File content being processed |
| `complete` | Upload and parse successful |
| `error` | Upload or parse failed |
// Show the progress bar only while the file is still being processed
const showProgress = status !== "complete" && progress !== undefined;
return (
  <li className="file-list__item" data-status={status}>
    {showProgress && <ProgressBar value={progress} width="100%" height="4px" />}
  </li>
);
Sources: app/src/components/core/dropzone/FileListItem.tsx:1-70
Context Providers
Provider Hierarchy
graph TD
A[TracingRoot] --> B[TracePaginationProvider]
B --> C[SpanFiltersProvider]
C --> D[TracesTabContent]
E[IsAuthenticated] --> F[IsAdmin]
F --> G[GenerateAPIKeyButton]
| Provider | Purpose |
|---|---|
| `TracingRoot` | Global tracing state |
| `TracePaginationProvider` | Pagination state for traces |
| `SpanFiltersProvider` | Filter state for span queries |
| `IsAuthenticated` | Authentication state check |
| `IsAdmin` | Authorization check |
Environment Configuration
Phoenix Environment Variables
The frontend reads configuration from environment variables via the @arizeai/phoenix-config package.
| Variable | Constant | Description |
|---|---|---|
| `PHOENIX_HOST` | `ENV_PHOENIX_HOST` | Phoenix server host URL |
| `PHOENIX_API_KEY` | `ENV_PHOENIX_API_KEY` | API key for authentication |
| `PHOENIX_CLIENT_HEADERS` | `ENV_PHOENIX_CLIENT_HEADERS` | JSON-encoded custom headers |
| `PHOENIX_COLLECTOR_ENDPOINT` | `ENV_PHOENIX_COLLECTOR_ENDPOINT` | OTel collector endpoint |
| `PHOENIX_PORT` | `ENV_PHOENIX_PORT` | Phoenix HTTP port |
| `PHOENIX_GRPC_PORT` | `ENV_PHOENIX_GRPC_PORT` | Phoenix gRPC port |
| `PHOENIX_PROJECT` | `ENV_PHOENIX_PROJECT` | Default project name |
Sources: js/packages/phoenix-config/README.md:1-50
Runtime Configuration
const isHosted = IS_HOSTED_DEPLOYMENT;
const isAuthEnabled = window.Config.authenticationEnabled;
Data Flow
Trace Navigation Flow
sequenceDiagram
participant User
participant TracesTable
participant ProjectTracesPage
participant API
User->>TracesTable: Click trace row
TracesTable->>TracesTable: Get row trace data
TracesTable->>User: Navigate to trace detail
User->>ProjectTracesPage: Load trace by ID
ProjectTracesPage->>API: Fetch trace data
API-->>ProjectTracesPage: Return trace
ProjectTracesPage-->>User: Render TraceTree
Annotation Flow
graph LR
A[TraceView] --> B[AnnotationLabel]
B --> C[AnnotationTooltip]
C --> D[External Link]
D --> E[Trace Detail]
A --> F[AnnotationConfigList]
F --> G[Create Annotation]
Package Dependencies
Key Dependencies
| Package | Version Constraint | Purpose |
|---|---|---|
| react | >=18.0.0 | Core framework |
| react-router-dom | >=6.0.0 | Routing |
| @tanstack/react-table | Latest | Data tables |
| @adobe/react-spectrum | Latest | UI components |
| @arizeai/phoenix-evals | Latest | Evaluation functions |
| @arizeai/phoenix-config | Latest | Config utilities |
Sources: app/package.json:1-50
Best Practices
Component Patterns
- Lazy Loading: Use Suspense with lazy-loaded components for route-based code splitting
- Context Composition: Wrap related state in dedicated providers
- Type Safety: Use TypeScript interfaces for all component props
- CSS Organization: Use CSS-in-JS with semantic class naming
Performance Considerations
- Implement virtual scrolling for large trace lists
- Use React.memo to avoid expensive component re-renders
- Lazy load tab panels in tabbed interfaces
- Truncate long outputs in preview contexts
export function truncateDocsOutput(text: string): string {
if (text.length <= OUTPUT_PREVIEW_LENGTH) {
return text;
}
return text.slice(0, OUTPUT_PREVIEW_LENGTH) + "…";
}
Server API and GraphQL
Related topics: System Architecture, Database Models and Migrations
Phoenix provides a comprehensive server API layer built on GraphQL (via Strawberry) and REST endpoints for managing traces, experiments, datasets, and evaluations. The API architecture is designed for observability, authentication, and efficient data loading.
Architecture Overview
Phoenix's server API follows a layered architecture that separates concerns between request handling, authentication, data access, and response rendering.
graph TD
A[Client Request] --> B[API Router Layer]
B --> C[Authentication Middleware]
C --> D[Context Builder]
D --> E[GraphQL Executor / REST Handler]
E --> F[Data Loaders]
F --> G[Database / Storage]
G --> H[Response]
Core Components
Server Initialization
The main server module (src/phoenix/server/__init__.py) orchestrates the application lifecycle, including database initialization, API router setup, and background task management.
| Component | Purpose | Key Responsibilities |
|---|---|---|
| Server | Main application class | Initialize app, setup routes, manage lifecycle |
| App | ASGI application | Handle HTTP/WebSocket connections |
| Database | Data persistence | SQLite/PostgreSQL connection management |
Sources: src/phoenix/server/__init__.py:1-50
Request Context
The context module (src/phoenix/server/api/context.py) provides request-scoped data access throughout the API layer. It manages database sessions, user authentication, and configuration access.
class Context:
    def __init__(
        self,
        db: Session,
        user: User | None,
        data_loaders: dict[str, DataLoader],
    ) -> None:
        self.db = db
        self.user = user
        self.data_loaders = data_loaders
Key Context Properties:
| Property | Type | Description |
|---|---|---|
| db | Session | SQLAlchemy database session |
| user | User \| None | Authenticated user, or None for anonymous requests |
| data_loaders | dict[str, DataLoader] | Batch data loading utilities |
| has_auth | bool | Whether authentication is enabled |
Sources: src/phoenix/server/api/context.py:1-100
Authentication System
The authentication module (src/phoenix/server/api/auth.py) implements token-based authentication for API access.
Authentication Flow
sequenceDiagram
participant Client
participant API
participant Auth
participant DB
Client->>API: Request + API Key
API->>Auth: Validate Token
Auth->>DB: Lookup User
DB-->>Auth: User Record
Auth-->>API: Authenticated Context
API-->>Client: Response with Context
Authentication Methods
| Method | Header | Description |
|---|---|---|
| API Key | Authorization: Bearer <key> | Token-based authentication |
| Session | Cookie-based | Web UI authentication |
| Anonymous | None | Read-only access when auth disabled |
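To make the API-key row concrete, here is a minimal sketch of pulling a bearer token out of request headers. The helper name `extract_bearer_token` and the plain-dict header representation are assumptions for illustration; this is not taken from Phoenix's auth module.

```python
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an `Authorization: Bearer <key>` header, or None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("authorization")
    if value is None:
        return None
    # Split "<scheme> <token>" on the first space only.
    scheme, _, token = value.strip().partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

A request with no `Authorization` header yields `None`, which a server could treat as the anonymous case from the table.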
Permission Classes
Phoenix enforces permission checks on mutations and subscriptions:
- IsNotReadOnly - Prevents read-only users from modifying data
- IsNotViewer - Prevents viewer roles from write operations
Sources: src/phoenix/server/api/auth.py:1-80
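The two permission checks above could be sketched as follows. This is a plain-Python stand-in: in a Strawberry schema such classes would typically subclass `strawberry.permission.BasePermission`, and the `RequestInfo` type, the role name `"VIEWER"`, and the messages are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestInfo:
    """Hypothetical stand-in for the per-request info a resolver receives."""
    read_only: bool = False
    user_role: Optional[str] = None  # role names here are assumed, not Phoenix's


class IsNotReadOnly:
    message = "Operation not allowed in read-only mode"

    def has_permission(self, info: RequestInfo) -> bool:
        return not info.read_only


class IsNotViewer:
    message = "Viewers cannot perform write operations"

    def has_permission(self, info: RequestInfo) -> bool:
        return info.user_role != "VIEWER"


def check_permissions(info: RequestInfo, permissions: List) -> Optional[str]:
    """Return the first failing permission's message, or None if all pass."""
    for permission in permissions:
        if not permission.has_permission(info):
            return permission.message
    return None
```

Checking permissions in order and stopping at the first failure keeps the error message specific to the rule that blocked the mutation.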
GraphQL Schema
Phoenix uses Strawberry GraphQL for its primary API surface, enabling type-safe queries and mutations with automatic documentation.
Schema Structure
Query
├── datasets
│ ├── get_dataset
│ └── get_dataset_by_id
├── experiments
│ ├── experiments
│ └── experiment_by_id
├── projects
│ ├── projects
│ └── project_by_id
├── spans
│ └── get_spans
└── traces
└── get_traces
Mutation
├── datasets
│ ├── create_dataset
│ ├── upload_dataset
│ └── delete_dataset
├── experiments
│ ├── create_experiment
│ └── run_experiment
├── spans
│ └── create_span_annotation
└── traces
└── create_trace_annotation
Data Loaders
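Data loaders (src/phoenix/server/api/dataloaders/) avoid the N+1 query problem by batching lookups into a single query and caching results within the request scope. A batch function must return one result per requested key, in the same order as the keys.
class DatasetLoader(DataLoader[int, Dataset]):
    def batch_load(self, dataset_ids: list[int]) -> list[Dataset | None]:
        # Single database query for all IDs
        rows = self.db.query(Dataset).filter(Dataset.id.in_(dataset_ids)).all()
        # Return results in the same order as the requested IDs (None if missing)
        by_id = {dataset.id: dataset for dataset in rows}
        return [by_id.get(dataset_id) for dataset_id in dataset_ids]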
| DataLoader | Purpose |
|---|---|
| DatasetLoader | Batch load datasets by ID |
| ProjectLoader | Batch load projects by ID |
| SpanLoader | Batch load spans by ID |
| AnnotationLoader | Batch load annotations |
| UserLoader | Batch load user records |
Sources: src/phoenix/server/api/dataloaders/__init__.py:1-50
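The batching idea can be shown without a database. The sketch below uses a hypothetical in-memory table and a counter to show that several keys, including a duplicate and a missing one, are resolved with one simulated query while preserving key order. All names here are illustrative.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory "table" standing in for the database.
DATASETS: Dict[int, str] = {1: "qa-set", 2: "rag-set", 3: "toxicity-set"}
query_count = 0


def fetch_many(ids: List[int]) -> Dict[int, str]:
    """Simulate one `WHERE id IN (...)` query; rows come back keyed by ID."""
    global query_count
    query_count += 1
    return {i: DATASETS[i] for i in ids if i in DATASETS}


def batch_load(keys: List[int]) -> List[Optional[str]]:
    """Resolve all keys with a single query, preserving the order of `keys`."""
    unique_keys = list(dict.fromkeys(keys))  # de-duplicate, keep first-seen order
    rows = fetch_many(unique_keys)
    return [rows.get(key) for key in keys]


result = batch_load([3, 1, 99, 3])
print(result, query_count)
```

A naive resolver would issue one query per key (four here); the batched version issues one, and missing keys surface as `None` rather than shifting later results out of position.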
REST API Endpoints
API Router Structure
# src/phoenix/server/api/routers/__init__.py
routers = [
datasets_router,
experiments_router,
spans_router,
traces_router,
evaluations_router,
]
Dataset Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/datasets | List all datasets |
| POST | /v1/datasets | Create new dataset |
| GET | /v1/datasets/{id} | Get dataset by ID |
| PUT | /v1/datasets/{id} | Update dataset |
| DELETE | /v1/datasets/{id} | Delete dataset |
| POST | /v1/datasets/{id}/upload | Upload data to dataset |
Experiment Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/experiments | List experiments |
| POST | /v1/experiments | Create experiment |
| GET | /v1/experiments/{id} | Get experiment details |
| POST | /v1/experiments/{id}/run | Run experiment evaluation |
Span and Trace Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/spans | Query spans with filtering |
| POST | /v1/spans/{id}/annotations | Add span annotation |
| GET | /v1/traces | Query traces |
| POST | /v1/traces/{id}/annotations | Add trace annotation |
Sources: src/phoenix/server/api/routers/__init__.py:1-30
OpenAPI Schema
Phoenix exposes its REST API through an OpenAPI schema that can be compiled for documentation and client generation:
python scripts/ci/compile_openapi_schema.py
This generates the schema at openapi-schema.json for validation and client SDK generation.
Sources: scripts/README.md:1-50
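Once a schema file exists, a quick way to inspect it is to list its operations. The sketch below assumes only the standard OpenAPI layout (a top-level `paths` object keyed by path, then by HTTP method); the inline sample stands in for the generated `openapi-schema.json` and is not its real content.

```python
import json
from typing import Dict, List

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema: Dict) -> List[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI document."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append(f"{method.upper()} {path}")
    return sorted(operations)


# A tiny inline document standing in for the generated schema file.
sample = json.loads("""
{
  "openapi": "3.1.0",
  "paths": {
    "/v1/datasets": {"get": {}, "post": {}},
    "/v1/datasets/{id}": {"get": {}, "parameters": []}
  }
}
""")
operations = list_operations(sample)
print(operations)
```

To run it against the real file, replace the inline sample with `json.load(open("openapi-schema.json"))`.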
Environment Variables
The API server recognizes the following configuration variables:
| Variable | Description | Default |
|---|---|---|
| PHOENIX_HOST | Server host URL | http://localhost:6006 |
| PHOENIX_PORT | HTTP port | 6006 |
| PHOENIX_GRPC_PORT | gRPC port | 4317 |
| PHOENIX_API_KEY | Authentication key | None |
| PHOENIX_COLLECTOR_ENDPOINT | OTEL collector endpoint | Internal |
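As a sketch of how such variables might be consumed, the following reads them from a mapping and applies the defaults listed in the table. `ServerSettings` and `load_settings` are hypothetical names for this example, not part of Phoenix.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    grpc_port: int
    api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the variables from the table, falling back to the listed defaults."""
    return ServerSettings(
        host=env.get("PHOENIX_HOST", "http://localhost:6006"),
        port=int(env.get("PHOENIX_PORT", "6006")),
        grpc_port=int(env.get("PHOENIX_GRPC_PORT", "4317")),
        api_key=env.get("PHOENIX_API_KEY"),  # None when unset
    )
```

Accepting the environment as a parameter makes the function easy to exercise with a plain dict, for example `load_settings({"PHOENIX_PORT": "7007"})`.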
Client Integration
Python Client
from phoenix.client import Client
client = Client(
    base_url="http://localhost:6006",
    api_key="your-api-key",
)
# Query datasets
datasets = client.datasets.list()
# Get spans as DataFrame
spans = client.spans.get_spans_dataframe(
    project_identifier="my-project",
    limit=1000,
)
TypeScript Client
import { createClient } from "@arizeai/phoenix-client";
const client = createClient({
  options: {
    baseUrl: "http://localhost:6006",
    headers: { Authorization: "Bearer your-api-key" },
  },
});
const datasets = await client.GET("/v1/datasets");
Request/Response Patterns
GraphQL Query Example
query GetDataset($id: GlobalID!) {
dataset(id: $id) {
id
name
version
createdAt
spanCount
traceCount
}
}
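A GraphQL query like this is usually sent as a JSON body with `query` and `variables` fields. The sketch below builds that body for a trimmed version of the query; the ID value is a made-up placeholder, and the function name is an assumption for illustration.

```python
import json

QUERY = """
query GetDataset($id: GlobalID!) {
  dataset(id: $id) {
    id
    name
  }
}
"""


def build_graphql_payload(query: str, variables: dict) -> bytes:
    """Serialize a GraphQL operation into the JSON body of a POST request."""
    return json.dumps({"query": query, "variables": variables}).encode("utf-8")


# "RGF0YXNldDox" is a placeholder ID used only for this example.
body = build_graphql_payload(QUERY, {"id": "RGF0YXNldDox"})
print(body.decode("utf-8"))
```

The resulting bytes would be posted with a `Content-Type: application/json` header to the server's GraphQL endpoint.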
REST Request with Filtering
curl -X GET "http://localhost:6006/v1/spans?project_id=abc123&limit=100" \
-H "Authorization: Bearer $PHOENIX_API_KEY"
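The same request can be assembled in Python with the standard library. This sketch only builds the request object and does not send it; `build_spans_request` is a hypothetical helper, and the query parameters mirror the curl example rather than a verified parameter list.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_spans_request(base_url: str, api_key: str, **params) -> Request:
    """Build (but do not send) a GET request for /v1/spans with query filters."""
    url = f"{base_url.rstrip('/')}/v1/spans?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


request = build_spans_request(
    "http://localhost:6006", "your-api-key", project_id="abc123", limit=100
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call against a running server.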
Summary
Phoenix's Server API and GraphQL layer provides a unified interface for observability operations:
- GraphQL via Strawberry for type-safe, self-documenting queries
- REST endpoints for compatibility and specialized operations
- Authentication via API keys with role-based permissions
- Data Loaders for efficient batch data loading
- OpenAPI schema for client SDK generation
Sources: src/phoenix/server/__init__.py:1-50
Python SDK (arize-phoenix-client)
Related topics: Evaluation System (Phoenix Evals), OpenTelemetry Integration
The arize-phoenix-client is the official Python SDK for Arize Phoenix, providing a comprehensive interface for interacting with the Phoenix platform via its REST API. This SDK enables developers to programmatically manage datasets, run experiments, analyze traces, and collect feedback for LLM observability and evaluation workflows.
Overview
Phoenix is an open-source AI observability platform designed for debugging, evaluating, and refining LLM applications. The Python SDK serves as the primary programmatic interface for:
- REST API Integration - Full access to Phoenix's OpenAPI REST interface
- Prompt Management - Create, version, and invoke prompt templates
- Dataset Operations - Create and append datasets from DataFrames, CSV files, or dictionaries
- Experiment Tracking - Run evaluations and track experiment results
- Trace Analysis - Query and analyze traces with powerful filtering capabilities
- Annotation Workflows - Add human feedback and automated evaluations to spans
Sources: packages/phoenix-client/README.md
Installation
Install the SDK using pip:
pip install arize-phoenix-client
Sources: packages/phoenix-client/README.md
Client Initialization
The SDK provides both synchronous and asynchronous client implementations.
Synchronous Client
from phoenix.client import Client
# Automatic configuration from environment variables
client = Client()
# Explicit configuration
client = Client(base_url="http://localhost:6006")
# Cloud instance with API key
client = Client(
base_url="https://app.phoenix.arize.com/s/your-space",
api_key="your-api-key"
)
Asynchronous Client
from phoenix.client import AsyncClient
# Create async client with same configuration options
async_client = AsyncClient()
async_client = AsyncClient(base_url="http://localhost:6006")
async_client = AsyncClient(
base_url="https://app.phoenix.arize.com/s/your-space",
api_key="your-api-key"
)
Sources: packages/phoenix-client/README.md
Configuration
Environment Variables
The client automatically reads configuration from environment variables:
| Variable | Description | Example |
|---|---|---|
| PHOENIX_BASE_URL | Base URL of the Phoenix server | http://localhost:6006 |
| PHOENIX_API_KEY | API key for authentication | sk-xxxxx |
| PHOENIX_CLIENT_HEADERS | Custom headers as comma-separated key=value pairs | Authorization=Bearer xxx |
Configuration Examples
# Local Phoenix server (default)
export PHOENIX_BASE_URL="http://localhost:6006"
# Cloud Instance
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_BASE_URL="https://app.phoenix.arize.com/s/your-space"
# Custom Headers
export PHOENIX_CLIENT_HEADERS="Authorization=Bearer your-api-key,custom-header=value"
Sources: packages/phoenix-client/README.md
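The Custom Headers export line above uses a comma-separated `key=value` format. A minimal parser for that format might look like the sketch below; `parse_client_headers` is an illustrative name, and the SDK's own parsing may differ (for example in how it treats commas or escapes inside values).

```python
from typing import Dict


def parse_client_headers(raw: str) -> Dict[str, str]:
    """Parse 'k1=v1,k2=v2' into a dict, splitting each pair on the first '='."""
    headers: Dict[str, str] = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate empty input and trailing commas
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"Malformed header entry: {pair!r}")
        headers[key.strip()] = value.strip()
    return headers
```

Splitting on the first `=` only means values that themselves contain `=` (such as base64 padding) are preserved.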
Custom Authentication Headers
For custom authentication scenarios:
from phoenix.client import Client
client = Client(
base_url="https://your-phoenix-instance.com",
headers={"Authorization": "Bearer your-api-key"}
)
Architecture
Client Structure
The SDK follows a resource-based architecture where the main Client provides access to specialized resource objects:
graph TD
A[Client] --> B[prompts]
A --> C[datasets]
A --> D[experiments]
A --> E[spans]
A --> F[annotations]
B --> B1[create, get, list, format]
C --> C1[create, append, get, list]
D --> D1[run, track, evaluate]
E --> E1[query, filter, export]
F --> F1[add, update, evaluate]
SDK Package Structure
phoenix/
└── client/
├── client.py # Main Client and AsyncClient classes
├── resources/ # Resource implementations
│ ├── prompts.py
│ ├── datasets.py
│ ├── experiments.py
│ ├── spans.py
│ └── annotations.py
└── helpers/
└── sdk/ # SDK utilities and helpers
Sources: packages/phoenix-client/src/phoenix/client/client.py
Sources: packages/phoenix-client/src/phoenix/client/resources
Core Resources
Prompts Resource
Create, version, and retrieve prompt templates:
from phoenix.client import Client
from phoenix.client.types import PromptVersion
client = Client()
content = """
You're an expert educator in {{ topic }}. Summarize the following article
in a few concise bullet points that are easy for beginners to understand.
{{ article }}
"""
prompt = client.prompts.create(
name="article-bullet-summarizer",
version=PromptVersion(
messages=[{"role": "user", "content": content}],
model_name="gpt-4o-mini",
),
prompt_description="Summarize an article in a few bullet points"
)
# Retrieve and use prompts
prompt = client.prompts.get(prompt_identifier="article-bullet-summarizer")
# Format the prompt with variables
prompt_vars = {
"topic": "Sports",
"article": "Moises Henriques has signed to play for Surrey..."
}
formatted_prompt = prompt.format(variables=prompt_vars)
Prompt Version Type:
| Parameter | Type | Description |
|---|---|---|
| messages | List[dict] | Message array with role and content |
| model_name | str | LLM model to use for the prompt |
| temperature | float | Sampling temperature (optional) |
Sources: packages/phoenix-client/README.md
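To illustrate what formatting a template with `{{ variable }}` placeholders involves, here is a small stand-alone sketch using regular expressions. It is not the SDK's `format` implementation; `render_template` and its strict missing-variable behavior are assumptions for this example.

```python
import re
from typing import Mapping

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: Mapping[str, str]) -> str:
    """Replace each {{ name }} placeholder with its value; fail on missing names."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Missing template variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


rendered = render_template(
    "You're an expert educator in {{ topic }}. Summarize: {{article}}",
    {"topic": "Sports", "article": "Moises Henriques has signed..."},
)
print(rendered)
```

Raising on a missing variable, rather than leaving the placeholder in place, surfaces template and data mismatches before the prompt reaches a model.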
Datasets Resource
Create and manage datasets from various data sources:
from phoenix.client import Client
import pandas as pd
client = Client()
# Example data; replace with your own DataFrames
df = pd.DataFrame({"text": ["Great product"], "label": ["positive"]})
additional_df = pd.DataFrame({"text": ["Arrived broken"], "label": ["negative"]})
# Create from DataFrame
dataset = client.datasets.create(
    name="my-dataset",
    dataframe=df,
    description="Training data for sentiment analysis",
)
# Append additional data
client.datasets.append(
    dataset_id=dataset.id,
    dataframe=additional_df,
)
# Get dataset
dataset = client.datasets.get(
    dataset_name="my-dataset",
    version_id="optional-version-id",
)
Supported Input Formats:
| Format | Method | Example |
|---|---|---|
| DataFrame | dataframe parameter | dataframe=pd.DataFrame(...) |
| CSV | csv_path parameter | csv_path="/path/to/data.csv" |
| Dictionary | dictionary parameter | dictionary=[{"col": "value"}] |
Sources: packages/phoenix-client/README.md
Experiments Resource
Run evaluations and track experiment results:
from phoenix.client import Client
client = Client()
# Run an experiment
experiment = client.experiments.run(
name="sentiment-analysis-v1",
dataset_id="dataset-uuid",
evaluator_config={
"model": "gpt-4",
"metrics": ["accuracy", "f1"]
}
)
# Track results
client.experiments.track(
experiment_id=experiment.id,
results={"accuracy": 0.95, "f1": 0.93}
)
Sources: packages/phoenix-client/README.md
Spans Resource
Query spans by project, trace ID, and time range:
from phoenix.client import Client
client = Client()
# Query spans with filters
spans = client.spans.query(
project_name="my-project",
filter_conditions={
"trace_id": "optional-trace-id",
"start_time": "2024-01-01T00:00:00Z",
"end_time": "2024-01-02T00:00:00Z"
},
limit=100
)
# Get span details
span = client.spans.get(span_id="span-uuid")
Query Parameters:
| Parameter | Type | Description |
|---|---|---|
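| project_name | str | Phoenix project name |
| filter_conditions | dict | Filtering conditions (trace_id, start_time, end_time) |
| limit | int | Maximum results to return |
| start_time | str (ISO 8601) | Start of time range, passed inside filter_conditions |
| end_time | str (ISO 8601) | End of time range, passed inside filter_conditions |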
Sources: packages/phoenix-client/src/phoenix/client/resources
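The time-range and limit parameters can be understood through a local filter over span records. The sketch below is a client-side illustration only: the span dictionaries, the half-open interval (start inclusive, end exclusive), and the function names are assumptions, not the server's documented semantics.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_spans(
    spans: List[Dict],
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
    limit: Optional[int] = None,
) -> List[Dict]:
    """Keep spans with start_time <= span start < end_time, up to `limit`."""
    selected = []
    for span in spans:
        started = parse_ts(span["start_time"])
        if start_time is not None and started < start_time:
            continue
        if end_time is not None and started >= end_time:
            continue
        selected.append(span)
    return selected if limit is None else selected[:limit]


spans = [
    {"span_id": "a", "start_time": "2023-12-31T23:59:59Z"},
    {"span_id": "b", "start_time": "2024-01-01T12:00:00Z"},
    {"span_id": "c", "start_time": "2024-01-02T00:00:00Z"},
]
window = filter_spans(
    spans,
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end_time=datetime(2024, 1, 2, tzinfo=timezone.utc),
)
print([span["span_id"] for span in window])
```

Using timezone-aware datetimes throughout avoids comparison errors between naive and aware values.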
Annotations Resource
Add human feedback and automated evaluations:
from phoenix.client import Client
client = Client()
# Add annotation to span
annotation = client.annotations.add(
span_id="span-uuid",
label="correct",
score=1.0,
metadata={"reviewer": "human"}
)
# Bulk add annotations
client.annotations.bulk_add(
annotations=[
{"span_id": "span-1", "label": "correct", "score": 1.0},
{"span_id": "span-2", "label": "incorrect", "score": 0.0}
]
)
Sources: packages/phoenix-client/README.md
Migration from Legacy Client
The legacy phoenix.session.client.Client has been removed. Users must migrate to the new SDK:
# Old (removed)
from phoenix.session.client import Client
# New
from phoenix.client import Client
Sources: src/phoenix/__init__.py
Data Flow Diagram
graph LR
A[Python Application] -->|arize-phoenix-client| B[Phoenix REST API]
B --> C[Phoenix Server]
C --> D[(Database)]
A --> E[Prompts]
A --> F[Datasets]
A --> G[Experiments]
A --> H[Traces/Spans]
A --> I[Annotations]
E --> B
F --> B
G --> B
H --> B
I --> B
Use Cases
RAG Evaluation Workflow
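from phoenix.client import Client
import pandas as pd
client = Client()
# Example evaluation data; replace with your own rows
evaluation_df = pd.DataFrame({
    "question": ["What is Phoenix?"],
    "answer": ["An open-source AI observability platform."],
})
# 1. Create dataset
dataset = client.datasets.create(
    name="rag-evaluation-set",
    dataframe=evaluation_df,
)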
# 2. Run evaluation experiment
experiment = client.experiments.run(
name="rag-faithfulness-eval",
dataset_id=dataset.id,
evaluator_config={
"model": "gpt-4",
"metrics": ["faithfulness", "answer_relevance"]
}
)
# 3. Query results
results = client.experiments.get_results(experiment_id=experiment.id)
# 4. Add annotations to spans
for span_id, score in results.items():
    client.annotations.add(
        span_id=span_id,
        label="faithful" if score > 0.8 else "unfaithful",
        score=score,
    )
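The labeling rule in step 4 can be isolated and checked without a server. This sketch assumes results arrive as a mapping from span ID to score, as the loop above implies; `label_scores` is a hypothetical helper.

```python
from typing import Dict


def label_scores(scores: Dict[str, float], threshold: float = 0.8) -> Dict[str, str]:
    """Map each span ID to 'faithful' if its score exceeds the threshold."""
    return {
        span_id: "faithful" if score > threshold else "unfaithful"
        for span_id, score in scores.items()
    }


labels = label_scores({"span-1": 0.92, "span-2": 0.8, "span-3": 0.35})
print(labels)
```

Because the comparison is strict, a score exactly at the threshold is labeled "unfaithful"; switch to `>=` if boundary scores should pass.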
Prompt Versioning and Management
from phoenix.client import Client
from phoenix.client.types import PromptVersion
client = Client()
# Create initial prompt version
prompt_v1 = client.prompts.create(
name="customer-support-assistant",
version=PromptVersion(
messages=[{"role": "system", "content": "You are a helpful assistant."}],
model_name="gpt-4"
)
)
# Create new version
prompt_v2 = client.prompts.create(
name="customer-support-assistant",
version=PromptVersion(
messages=[{"role": "system", "content": "You are a helpful support agent trained to be concise."}],
model_name="gpt-4"
)
)
# Retrieve the latest version
current = client.prompts.get(prompt_identifier="customer-support-assistant")
Summary
The arize-phoenix-client Python SDK provides a comprehensive, Pythonic interface for interacting with the Phoenix observability platform. Key capabilities include:
- Unified Client Interface - Single entry point for all Phoenix operations
- Resource-Based Design - Organized access to prompts, datasets, experiments, spans, and annotations
- Environment-Based Configuration - Zero-config setup via environment variables
- Async Support - Built-in async client for asynchronous applications
- Type Safety - Full type hints and Pydantic models for validation
Sources: packages/phoenix-client/src/phoenix/client/client.py
Sources: packages/phoenix-client/README.md
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: Docs proposal: RAG failure mode checklist for observability and eval workflows
- Severity: high
- Finding: Installation risk is backed by a source signal: Docs proposal: RAG failure mode checklist for observability and eval workflows. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/11472
2. Installation risk: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with App…
- Severity: high
- Finding: Installation risk is backed by a source signal: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with App…. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/12941
3. Installation risk: [agents] investigate clientside tracing for external tools
- Severity: high
- Finding: Installation risk is backed by a source signal: [agents] investigate clientside tracing for external tools. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/13173
4. Installation risk: Developers should check this installation risk before relying on the project: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
- User impact: Developers may fail before the first successful local run: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM. Context: Observed when using python, docker, linux
- Evidence: failure_mode_cluster:github_issue | fmev_b3db5f930ac5e7b7f85e47ff9693c190 | https://github.com/Arize-ai/phoenix/issues/12941 | [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
5. Configuration risk: Developers should check this configuration risk before relying on the project: [sandboxes] per-execute timeout enforcement is incomplete across backends
- Severity: medium
- Finding: Developers should check this configuration risk before relying on the project: [sandboxes] per-execute timeout enforcement is incomplete across backends
- User impact: Developers may misconfigure credentials, environment, or host setup: [sandboxes] per-execute timeout enforcement is incomplete across backends
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [sandboxes] per-execute timeout enforcement is incomplete across backends. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | fmev_5978fbc9db2aea6762b2ab2f9e8d0205 | https://github.com/Arize-ai/phoenix/issues/13313 | [sandboxes] per-execute timeout enforcement is incomplete across backends
6. Configuration risk: [sandboxes] per-execute timeout enforcement is incomplete across backends
- Severity: medium
- Finding: Configuration risk is backed by a source signal: [sandboxes] per-execute timeout enforcement is incomplete across backends. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/13313
7. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:564072810 | https://github.com/Arize-ai/phoenix | README/documentation is current enough for a first validation pass.
8. Project risk: Developers should check this runtime risk before relying on the project: arize-phoenix: v15.5.0
- Severity: medium
- Finding: Developers should check this runtime risk before relying on the project: arize-phoenix: v15.5.0
- User impact: Upgrade or migration may change expected behavior: arize-phoenix: v15.5.0
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: arize-phoenix: v15.5.0. Context: Observed when using docker
- Evidence: failure_mode_cluster:github_release | fmev_77bc9f7097156e71a699d05abced2916 | https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.0 | arize-phoenix: v15.5.0
9. Project risk: arize-phoenix: v15.3.0
- Severity: medium
- Finding: Project risk is backed by a source signal: arize-phoenix: v15.3.0. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.3.0
10. Project risk: arize-phoenix: v15.5.0
- Severity: medium
- Finding: Project risk is backed by a source signal: arize-phoenix: v15.5.0. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.0
11. Project risk: arize-phoenix: v15.5.1
- Severity: medium
- Finding: Project risk is backed by a source signal: arize-phoenix: v15.5.1. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.1
12. Project risk: arize-phoenix: v15.6.0
- Severity: medium
- Finding: Project risk is backed by a source signal: arize-phoenix: v15.6.0. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.6.0
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using phoenix with real data or production workflows.
- [[sandboxes] per-execute timeout enforcement is incomplete across backend](https://github.com/Arize-ai/phoenix/issues/13313) - github / github_issue
- [[BUG]: playground experiment UI rendering issue](https://github.com/Arize-ai/phoenix/issues/13308) - github / github_issue
- Docs proposal: RAG failure mode checklist for observability and eval wor - github / github_issue
- [[BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with pod](https://github.com/Arize-ai/phoenix/issues/12941) - github / github_issue
- [[agents] investigate clientside tracing for external tools](https://github.com/Arize-ai/phoenix/issues/13173) - github / github_issue
- GenerativeModelStore: _last_fetch_id can regress and defeat incremental - github / github_issue
- [[security] setup deepsec](https://github.com/Arize-ai/phoenix/issues/13275) - github / github_issue
- arize-phoenix: v15.10.1 - github / github_release
- arize-phoenix: v15.10.0 - github / github_release
- arize-phoenix: v15.9.0 - github / github_release
- arize-phoenix: v15.8.0 - github / github_release
- arize-phoenix: v15.7.0 - github / github_release
Source: Project Pack community evidence and pitfall evidence