Doramagic Project Pack · Human Manual

phoenix

Related topics: System Architecture

Project Overview


Phoenix is an open-source observability platform for AI applications developed by Arize AI. It provides comprehensive tracing, evaluation, and debugging capabilities for LLM-powered applications, agents, and vector retrieval systems.

What is Phoenix?

Phoenix is a self-hostable observability platform designed to help developers:

  • Trace LLM applications: Capture and analyze spans, prompts, completions, and tool executions
  • Evaluate AI outputs: Run automated evaluations for hallucinations, relevance, toxicity, and custom metrics
  • Debug retrievals: Inspect vector search operations and document retrieval pipelines
  • Monitor performance: Track latency, token usage, and cost metrics across AI workflows

Sources: js/packages/phoenix-config/README.md

Architecture Overview

Phoenix follows a multi-layer architecture with Python backend services and TypeScript/JavaScript client libraries.

graph TD
    A[AI Application] -->|OTel Traces| B[Phoenix OTEL]
    A -->|Direct API| C[Phoenix Client SDK]
    B -->|Traces| D[Phoenix Server]
    C -->|Data| D
    D -->|UI| E[Phoenix Web UI]
    F[Phoenix MCP Server] -->|Tools| A
    G[Phoenix Evals] -->|Evaluations| D

Core Components

| Component | Technology | Purpose |
| --- | --- | --- |
| Phoenix Server | Python | Backend API and data storage |
| Phoenix OTEL | Python | OpenTelemetry instrumentation |
| Phoenix Client | TypeScript | Type-safe API access |
| Phoenix Config | TypeScript | Environment variable parsing |
| Phoenix MCP | TypeScript | Model Context Protocol server |
| Phoenix Evals | TypeScript | LLM-based evaluation library |

Sources: js/pnpm-workspace.yaml

JavaScript/TypeScript Packages

Phoenix provides a comprehensive TypeScript ecosystem organized as a pnpm workspace.

Sources: js/pnpm-workspace.yaml

Package Structure

js/
├── packages/
│   ├── phoenix-client/      # Core API client
│   ├── phoenix-config/      # Environment variable utilities
│   ├── phoenix-evals/       # Evaluation library
│   └── phoenix-mcp/         # Model Context Protocol server
└── examples/
    └── apps/
        └── cli-agent-starter-kit/  # Example CLI agent

phoenix-config

Shared configuration parsing utilities used across other Phoenix packages. Provides typed helpers for reading Phoenix environment variables.

Sources: js/packages/phoenix-config/README.md

phoenix-evals

A vendor-agnostic TypeScript evaluation library for assessing AI output quality. Supports custom classifiers for hallucination detection, relevance scoring, and binary/multi-class classification tasks.

import { createClassifier } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";

const model = openai("gpt-4o-mini");
const classifier = createClassifier({ model, promptTemplate });

Sources: js/packages/phoenix-evals/README.md

phoenix-mcp

A Model Context Protocol server that exposes Phoenix tools for AI agents. Enables agentic workflows to interact with Phoenix data.

Supported Tools:

  • Prompts: list-prompts, get-prompt, get-latest-prompt, upsert-prompt
  • Projects: Project management operations
  • Traces: Query and analyze traces

Sources: js/packages/phoenix-mcp/README.md

Python SDK

Phoenix provides Python instrumentation through the arize-phoenix-otel package.

Sources: app/src/components/project/PythonProjectGuide.tsx

Installation

pip install arize-phoenix-otel

Quick Start

from phoenix.otel import register

# Configure Phoenix tracing
tracer_provider = register(project_name="my-app")

The arize-phoenix-otel package automatically picks up configuration from environment variables, simplifying the setup process for developers.

Sources: app/src/components/project/PythonProjectGuide.tsx

Environment Variables

Phoenix uses standardized environment variables for configuration across all SDKs.

Sources: js/packages/phoenix-config/README.md

| Variable | Constant | Description |
| --- | --- | --- |
| PHOENIX_HOST | ENV_PHOENIX_HOST | Phoenix server URL (e.g., http://localhost:6006) |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | API key for authentication |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint |
| PHOENIX_PORT | ENV_PHOENIX_PORT | HTTP port (integer) |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | gRPC port for OpenTelemetry |
| PHOENIX_PROJECT | ENV_PHOENIX_PROJECT | Default project name |

Project Onboarding Flow

Phoenix provides an interactive onboarding system that guides users through setup.

Sources: app/src/pages/project/OnboardingSteps.tsx

Onboarding Steps Component

The OnboardingSteps component accepts the following parameters:

interface OnboardingStepsProps {
  language: ProgrammingLanguage;
  packages: readonly string[];
  implementationCode: string;
  docsHref?: string;
  githubHref?: string;
  generatedApiKey: string | null;
  onApiKeyGenerated: (key: string) => void;
  extraEnvVars?: readonly EnvVar[];
}

Workflow

graph LR
    A[Select Language] --> B[Install Packages]
    B --> C[Configure Environment]
    C --> D{Auth Enabled?}
    D -->|Yes| E[Generate API Key]
    D -->|No| F[Add Environment Variables]
    E --> G[Copy Setup Code]
    F --> G
    G --> H[Run Application]
    H --> I[View Traces in Phoenix]

The onboarding system automatically detects authentication requirements and adjusts the setup flow accordingly. Users can generate API keys directly from the UI when authentication is enabled.

Sources: app/src/components/project/PythonProjectGuide.tsx
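The branching in the workflow diagram above can be sketched as a small function. This is only an illustration of the flow, not the OnboardingSteps implementation: the function name onboarding_steps is hypothetical, and the step labels are copied from the diagram.

```python
from typing import List


def onboarding_steps(auth_enabled: bool) -> List[str]:
    """Return the ordered onboarding steps (hypothetical helper)."""
    steps = ["Select Language", "Install Packages", "Configure Environment"]
    # The only branch in the flow: an API key is generated when auth is on,
    # otherwise the user just adds the environment variables.
    if auth_enabled:
        steps.append("Generate API Key")
    else:
        steps.append("Add Environment Variables")
    steps += ["Copy Setup Code", "Run Application", "View Traces in Phoenix"]
    return steps
```

Both branches rejoin at "Copy Setup Code", so the two paths differ in exactly one step.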

CLI Agent Starter Kit

Phoenix includes a complete CLI agent example demonstrating production-ready patterns.

Sources: js/examples/apps/cli-agent-starter-kit/README.md

Project Structure

src/
├── cli.ts              # Entry point
├── agent/              # Agent factory
├── tools/              # Tool definitions
│   ├── index.ts        # Tool exports
│   ├── datetime.ts     # Utility tool
│   └── mcp.ts          # Phoenix docs MCP
├── prompts/            # System instructions
└── ui/                 # CLI interface

Requirements

  • Node.js 22+
  • pnpm
  • Docker Desktop
  • Anthropic API key

Quick Start

pnpm install
cp .env.example .env
# Add ANTHROPIC_API_KEY to .env
pnpm dev

Phoenix UI will be available at http://localhost:6006.

Sources: js/examples/apps/cli-agent-starter-kit/README.md

Integration Ecosystem

Phoenix supports a wide range of LLM providers and frameworks through its integration ecosystem.

Supported Providers

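| Provider | Icon | Category |
| --- | --- | --- |
| OpenAI | SVG | LLM |
| Anthropic | SVG | LLM |
| LiteLLM | SVG | Proxy |
| OpenRouter | SVG | Proxy |
| LangGraph | SVG | Framework |
| Moonshot | SVG | LLM |
| xAI | SVG | LLM |
| Ollama | SVG | Local |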

Sources: app/src/components/project/IntegrationIcons.tsx

Development Workflow

Setting Up the JavaScript Workspace

# From the /js/ directory
pnpm install
pnpm build

Development Mode

pnpm dev

Building

pnpm build

Debugging MCP Server

pnpm inspect

Sources: js/packages/phoenix-mcp/README.md

Summary

Phoenix is a comprehensive observability platform that bridges the gap between development and production monitoring for AI applications. Its multi-language support (Python and TypeScript), OpenTelemetry-native architecture, and extensible evaluation framework make it suitable for teams of all sizes building LLM-powered products.

The platform's modular design allows developers to adopt only the components they need: basic tracing through OTEL, custom evaluations with the evals library, or agent access to Phoenix data through the MCP server.

Sources: js/packages/phoenix-config/README.md

System Architecture

Related topics: Project Overview, Server API and GraphQL, Frontend Application


Overview

Phoenix is an LLM observability platform designed to help developers trace, evaluate, and debug AI applications. The system architecture follows a client-server model with support for multiple programming languages and observability standards.

Core Components

The Phoenix platform consists of three primary layers:

| Component Layer | Description | Key Technologies |
| --- | --- | --- |
| Frontend Application | React-based web UI for visualization and interaction | React, TypeScript, @arizeai/ui |
| Configuration Library | Shared utilities for environment parsing | TypeScript, npm package (@arizeai/phoenix-config) |
| Backend Server | Phoenix server for data ingestion and serving | Python, FastAPI, OpenTelemetry |

Configuration System

Phoenix uses environment variables for configuration across all client SDKs and the server itself.

Environment Variables Table

| Variable | Constant | Type | Purpose |
| --- | --- | --- | --- |
| PHOENIX_HOST | ENV_PHOENIX_HOST | string | Phoenix server host URL (e.g., http://localhost:6006) |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | string | API key for authentication |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON | Custom headers for client requests |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | string | OpenTelemetry collector endpoint URL |
| PHOENIX_PORT | ENV_PHOENIX_PORT | integer | Phoenix HTTP port |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | integer | Phoenix gRPC port for OpenTelemetry |
| PHOENIX_PROJECT | ENV_PHOENIX_PROJECT | string | Default project name for project-scoped operations |

Sources: js/packages/phoenix-config/README.md:1-25
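To make the Type column concrete, here is a minimal Python sketch of typed environment parsing. It is not the @arizeai/phoenix-config implementation: the PhoenixEnv dataclass and the load_phoenix_env, _parse_int and _parse_headers helpers are hypothetical names, and the lenient fallback on malformed values is an assumption. Only the variable names and their types come from the table.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional


@dataclass
class PhoenixEnv:
    """Hypothetical container for the variables listed in the table."""
    host: Optional[str] = None
    api_key: Optional[str] = None
    client_headers: Dict[str, str] = field(default_factory=dict)
    collector_endpoint: Optional[str] = None
    port: Optional[int] = None
    grpc_port: Optional[int] = None
    project: Optional[str] = None


def _parse_int(value: Optional[str]) -> Optional[int]:
    # Unset, empty, or non-numeric values are treated as "not configured".
    if value is None or not value.strip():
        return None
    try:
        return int(value)
    except ValueError:
        return None


def _parse_headers(value: Optional[str]) -> Dict[str, str]:
    # PHOENIX_CLIENT_HEADERS holds a JSON object; fall back to no headers
    # when the value is missing, malformed, or not an object.
    if not value:
        return {}
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def load_phoenix_env(env: Mapping[str, str] = os.environ) -> PhoenixEnv:
    """Read the Phoenix variables from a mapping (defaults to os.environ)."""
    return PhoenixEnv(
        host=env.get("PHOENIX_HOST"),
        api_key=env.get("PHOENIX_API_KEY"),
        client_headers=_parse_headers(env.get("PHOENIX_CLIENT_HEADERS")),
        collector_endpoint=env.get("PHOENIX_COLLECTOR_ENDPOINT"),
        port=_parse_int(env.get("PHOENIX_PORT")),
        grpc_port=_parse_int(env.get("PHOENIX_GRPC_PORT")),
        project=env.get("PHOENIX_PROJECT"),
    )
```

Passing the mapping as a parameter keeps the helper easy to test without mutating the real process environment.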

Client Architecture

Multi-Language SDK Support

Phoenix provides language-specific client libraries that interface with the Phoenix server.

graph TD
    A[Application Code] --> B[Phoenix OTEL / SDK]
    B --> C[Phoenix Server]
    C --> D[(Data Storage)]
    
    subgraph Python Ecosystem
        B1[arize-phoenix-otel]
        B1 --> B
    end
    
    subgraph TypeScript Ecosystem
        B2[phoenix-otel]
        B3[phoenix-client]
        B2 --> B
        B3 --> B
    end

Python Client

The Python integration uses OpenTelemetry for automatic instrumentation.

Installation:

pip install arize-phoenix-otel

The arize-phoenix-otel package automatically picks up configuration from environment variables, enabling seamless integration without explicit setup code in most cases.

Sources: app/src/components/project/PythonProjectGuide.tsx:1-35

TypeScript/Node.js Client

The TypeScript ecosystem provides two main packages:

| Package | Purpose |
| --- | --- |
| @arizeai/phoenix-otel | OpenTelemetry instrumentation for tracing |
| @arizeai/phoenix-client | Client library for API interactions |

Quick Start Pattern:

import { register } from '@arizeai/phoenix-otel';

// Project setup with automatic OTEL initialization
register({ projectName: 'my-project' });

Sources: app/src/components/project/TypeScriptProjectGuide.tsx:1-25

Authentication Architecture

Phoenix implements API key-based authentication with role-based access control.

Authentication Flow

sequenceDiagram
    participant Client
    participant PhoenixServer
    participant Config
    
    Client->>PhoenixServer: Request with API Key
    PhoenixServer->>Config: Check authentication settings
    Config-->>PhoenixServer: authenticationEnabled: boolean
    alt Authentication Enabled
        PhoenixServer->>PhoenixServer: Validate API Key
        alt Valid Key
            PhoenixServer-->>Client: 200 OK + Data
        else Invalid Key
            PhoenixServer-->>Client: 401 Unauthorized
        end
    else Authentication Disabled
        PhoenixServer-->>Client: 200 OK + Data
    end
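The decision logic in the sequence diagram above can be written out as a short function. This is a sketch of the flow only, not Phoenix server code: the authorize function and the in-memory valid_keys set are hypothetical stand-ins for the server's real key lookup.

```python
from http import HTTPStatus
from typing import Optional, Set


def authorize(
    auth_enabled: bool,
    api_key: Optional[str],
    valid_keys: Set[str],
) -> HTTPStatus:
    """Mirror the diagram: skip checks when auth is off, else validate the key."""
    if not auth_enabled:
        # Authentication disabled: every request is served.
        return HTTPStatus.OK
    if api_key is not None and api_key in valid_keys:
        return HTTPStatus.OK
    # Missing or unknown key while authentication is enabled.
    return HTTPStatus.UNAUTHORIZED
```

A missing key is treated the same as an invalid one here, which is an assumption; the diagram only shows the invalid-key case.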

API Key Management

The UI provides different API key management interfaces based on user roles:

| Role | API Key Management Location |
| --- | --- |
| Admin | Settings → General |
| Regular User | Profile Page |

Sources: app/src/components/project/OnboardingSteps.tsx:30-55

Dataset Management

Phoenix clients can interact with datasets through the Python and TypeScript client libraries.

Dataset Operations

| Operation | Description |
| --- | --- |
| Create | Initialize a new dataset in the project |
| Get | Retrieve dataset by name or version |
| Update | Modify existing dataset metadata |
| List | Enumerate all available datasets |

Python Client Pattern:

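from phoenix.client import Client

# Host and API key are read from the environment when not passed explicitly
client = Client()
dataset = client.datasets.get_dataset(
    dataset="my-dataset",
    version_id="optional-version-id"
)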

Sources: app/src/components/experiment/RunExperimentCodeDialog.tsx:1-45

Generative AI Provider Integration

Phoenix integrates with multiple generative AI providers for observability and tracing.

Supported Providers

| Provider | Integration Type |
| --- | --- |
| OpenAI | API Key + OTEL |
| Moonshot | API Key |
| LiteLLM | Unified API |
| Agno | Agent Framework |
| OpenRouter | Gateway |
| Anthropic | Direct |

The GenerativeProviderIcon.tsx component renders provider-specific SVG icons throughout the UI, while the IntegrationIcons.tsx file contains SVG definitions for all supported integrations.

Sources: app/src/components/generative/GenerativeProviderIcon.tsx:1-60

UI Component Architecture

Component Hierarchy

graph TD
    subgraph Project Guides
        PG1[PythonProjectGuide]
        PG2[TypeScriptProjectGuide]
    end
    
    subgraph Core Components
        OC[OnboardingSteps]
        MD[Markdown Components]
        IC[Icons]
    end
    
    subgraph Experiment Features
        RX[RunExperimentCodeDialog]
    end
    
    OC --> PG1
    OC --> PG2
    IC --> PG1
    IC --> PG2
    MD --> RX

Markdown Rendering

The streamdownComponents.tsx module provides custom React components for rendering markdown content:

| Component | Purpose |
| --- | --- |
| li | Task list items with checkbox support |
| blockquote | Styled quote blocks |
| inlineCode | Inline code styling |
| table, thead, tbody, tr, th, td | Table elements |
| img | Styled images |
| hr | Horizontal rules |

Sources: app/src/components/markdown/streamdownComponents.tsx:1-55

Icon System

Phoenix uses a centralized icon system defined in Icons.tsx. Icons follow a consistent design pattern:

export const IconName = () => (
  <svg
    width="24"
    height="24"
    viewBox="0 0 24 24"
    fill="none"
    xmlns="http://www.w3.org/2000/svg"
  >
    {/* SVG path definitions */}
  </svg>
);

Key icon categories include:

  • Navigation: ArrowCompareOutline, MoonOutline
  • Actions: TemplateOutline
  • Status: FireOutline

Sources: app/src/components/core/icon/Icons.tsx:1-120

Documentation Architecture

Phoenix uses Sphinx for API documentation generation with autodoc support.

Documentation Build Process

graph LR
    A[Source Modules] -->|sphinx-apidoc| B[RST Files]
    B -->|autodoc| C[Docstrings]
    C --> D[HTML/Markdown Output]
    
    subgraph Build Tools
        B1[Sphinx]
        B2[ReadTheDocs]
    end

Sphinx-Apidoc Command

sphinx-apidoc -o ./source/output ../path/to/module --separate -M

Key options:

  • --separate: Puts the documentation for each module on its own page (one .rst file per module)
  • -M (--module-first): Places module documentation before submodule documentation

Sources: api_reference/README.md:1-80

OpenTelemetry Integration

Phoenix leverages OpenTelemetry as the standard observability framework:

Collector Endpoint Configuration

graph LR
    A[Application] -->|OTLP| B[Phoenix Collector]
    B --> C[Phoenix Server]
    
    subgraph Transport Protocols
        G[gRPC]
        H[HTTP/protobuf]
    end
    
    B --- G
    B --- H

| Port Type | Purpose |
| --- | --- |
| HTTP (configurable) | OTLP HTTP receiver |
| gRPC (configurable) | OTLP gRPC receiver |

Sources: js/packages/phoenix-config/README.md:15-16

Security Considerations

Environment Variable Security

| Variable | Sensitivity | Recommendation |
| --- | --- | --- |
| PHOENIX_API_KEY | High | Store in secure secret manager |
| PHOENIX_CLIENT_HEADERS | Medium | Validate JSON structure |
| PHOENIX_HOST | Low | Ensure HTTPS for production |

API Key Scopes

  • Personal API Keys: Created per-user, managed on Profile page
  • System API Keys: Admin-managed, configured in Settings

Summary

The Phoenix system architecture provides a robust, multi-language observability platform with:

  1. Flexible Configuration: Environment-based configuration works across all SDKs
  2. Multi-language Support: Native Python and TypeScript clients
  3. Standard Observability: OpenTelemetry integration for vendor-neutral tracing
  4. Role-based Access: Enterprise-ready authentication and authorization
  5. Extensible Providers: Support for multiple LLM providers through standardized interfaces

Sources: js/packages/phoenix-config/README.md:1-25

Tracing System

Related topics: OpenTelemetry Integration, Database Models and Migrations


Phoenix's Tracing System provides comprehensive observability for LLM applications by capturing, storing, and analyzing execution traces. It leverages OpenTelemetry standards to instrument applications and provides both Python and TypeScript clients for interacting with trace data.

Overview

The tracing system enables developers to:

  • Capture detailed execution traces from LLM applications
  • Store and query traces with filtering by time, session, and project
  • Add annotations for evaluation and human feedback
  • Analyze spans within traces for debugging and performance optimization

Sources: js/packages/phoenix-client/README.md:1-30

Architecture

graph TD
    A[Application Code] --> B[Phoenix OTEL]
    B --> C[Span Exporter]
    C --> D[Phoenix Server]
    D --> E[(SQLite/PostgreSQL)]
    
    F[Python Client] --> D
    G[TypeScript Client] --> D
    
    H[Spans] --> E
    I[Trace Annotations] --> E
    J[Span Annotations] --> E
    K[Session Annotations] --> E

The system consists of three main layers:

  1. Instrumentation Layer: OpenTelemetry-based tracing via arize-phoenix-otel
  2. Storage Layer: Database insertion handlers for traces, spans, and annotations
  3. API Layer: REST endpoints for querying and managing trace data

Sources: js/packages/phoenix-otel/README.md:1-50

Core Components

Spans

Spans represent individual units of work within a trace. Each span captures:

  • Timing information: start time, end time, duration
  • Input/output data: prompts, responses, tool parameters
  • Attributes: metadata like model name, token counts, latency

Sources: src/phoenix/db/insertion/span.py
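As a rough mental model of the three groups of span data listed above, the following Python sketch defines a small record type. It is not the Phoenix database model: SpanRecord and its field names are hypothetical, and the attribute keys in the usage lines are only examples of the kind of metadata a span can carry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


@dataclass
class SpanRecord:
    """Hypothetical in-memory view of one span."""
    name: str
    start_time: datetime          # timing information
    end_time: datetime
    input: Any = None             # input/output data
    output: Any = None
    attributes: Dict[str, Any] = field(default_factory=dict)  # metadata

    @property
    def latency_ms(self) -> float:
        # Duration is derived from the two timestamps rather than stored.
        return (self.end_time - self.start_time).total_seconds() * 1000.0


# Usage: a span for one LLM call that took 250 ms.
started = datetime(2026, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
span = SpanRecord(
    name="llm-call",
    start_time=started,
    end_time=started + timedelta(milliseconds=250),
    input="What is Phoenix?",
    output="An observability platform.",
    attributes={"model_name": "gpt-4o-mini", "token_count_total": 42},
)
```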

Trace Annotations

Annotations attached at the trace level for evaluation purposes. Used to store:

  • Correctness scores
  • Custom evaluation results
  • Human feedback

Sources: src/phoenix/db/insertion/trace_annotation.py

Span Annotations

Annotations attached to individual spans for fine-grained evaluation:

from phoenix.client import Client

client = Client()

# Add annotation to a span
client.spans.add_span_annotation(
    span_id="span-123",
    project_identifier="my-llm-app",
    name="correctness",
    value=0.95,
    annotator_kind="LLM"
)

Sources: src/phoenix/db/insertion/span_annotation.py

Session Annotations

Annotations attached at the session level, where a session groups related traces and their spans.

Sources: src/phoenix/db/insertion/session_annotation.py

Trace Retention Policy

A data loader looks up the trace retention policy that applies to each project.

Sources: src/phoenix/server/api/dataloaders/trace_retention_policy_id_by_project_id.py

Configuration

Environment Variables

| Variable | Constant | Description | Example |
| --- | --- | --- | --- |
| PHOENIX_HOST | ENV_PHOENIX_HOST | Phoenix server host URL | http://localhost:6006 |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | API key for authentication | your-api-key |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers | {"X-Custom":"value"} |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint URL | https://app.phoenix.arize.com/s/space |
| PHOENIX_PROJECT_NAME | ENV_PHOENIX_PROJECT | Default project name | my-llm-app |
| PHOENIX_PORT | ENV_PHOENIX_PORT | HTTP port (integer) | 6006 |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | gRPC port for OTEL | 4317 |

Sources: js/packages/phoenix-config/README.md:1-50

Python Client Configuration

from phoenix.client import Client

# Environment-based configuration
client = Client()

# Explicit configuration
client = Client(
    host="http://localhost:6006",
    api_key="your-api-key"
)

Sources: js/packages/phoenix-client/README.md:50-100

OTEL Registration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| projectName | string | "default" | Project name for organizing traces |
| url | string | "http://localhost:6006" | Phoenix instance URL |
| apiKey | string | undefined | API key for authentication |
| headers | Record<string, string> | {} | Custom headers for OTLP requests |
| batch | boolean | true | Use batch span processing |
| instrumentations | Instrumentation[] | undefined | OpenTelemetry instrumentations |

Sources: js/packages/phoenix-otel/README.md:80-120
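The effect of the batch option can be illustrated with a toy batcher. This is a simplified stand-in, not the OpenTelemetry BatchSpanProcessor: the TinyBatcher class and its size-only trigger are assumptions made for the example, whereas a real batch processor typically also flushes on a timer and on shutdown.

```python
from typing import Callable, List


class TinyBatcher:
    """Toy batcher: buffers finished spans and exports them in groups."""

    def __init__(self, export: Callable[[List[str]], None], max_batch_size: int = 3):
        self._export = export
        self._max_batch_size = max_batch_size
        self._buffer: List[str] = []

    def on_end(self, span_name: str) -> None:
        # Called once per finished span; export only when the buffer is full.
        self._buffer.append(span_name)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as one export call, then reset.
        if self._buffer:
            self._export(list(self._buffer))
            self._buffer.clear()


# Usage: seven spans produce three export calls instead of seven.
export_calls: List[List[str]] = []
batcher = TinyBatcher(export_calls.append, max_batch_size=3)
for i in range(7):
    batcher.on_end(f"span-{i}")
batcher.flush()  # drain the remainder, as a shutdown would
```

The final flush is why short-lived scripts should shut the provider down before exiting: spans still sitting in the buffer are otherwise never exported.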

Querying Traces

Python Client API

from phoenix.client import Client
from datetime import datetime, timedelta

client = Client()

# Get latest traces
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    limit=10
)

# Filter by time range with span details
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    start_time=datetime.now() - timedelta(hours=24),
    end_time=datetime.now(),
    include_spans=True,
    sort="latency_ms",
    order="desc"
)

# Filter by session
traces = client.traces.get_traces(
    project_identifier="my-llm-app",
    session_id="my-session-id"
)

Sources: js/packages/phoenix-client/README.md:100-150

TypeScript Client API

import { getTraces } from "@arizeai/phoenix-client/traces";

const result = await getTraces({
  project: { projectName: "my-project" },
  limit: 10,
});

const detailed = await getTraces({
  project: { projectName: "my-project" },
  startTime: "2026-03-01T00:00:00Z",
  endTime: new Date(),
  includeSpans: true,
  sort: "latency_ms",
  order: "desc"
});

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_identifier | string | | Project name or ID (required) |
| start_time | `datetime \| None` | None | Inclusive lower bound on trace start time |
| end_time | `datetime \| None` | None | Exclusive upper bound on trace start time |
| sort | `"start_time" \| "latency_ms" \| None` | None | Sort field |
| order | `"asc" \| "desc" \| None` | None | Sort direction |
| include_spans | bool | False | Include full span details |
| session_id | `str \| Sequence[str] \| None` | None | Filter by session ID(s) |
| limit | int | 100 | Maximum traces to return |
| timeout | `int \| None` | 60 | Request timeout in seconds |

Note: Requires Phoenix server >= 13.15.0.

Sources: js/packages/phoenix-client/README.md:150-200
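The time-window semantics in the table (inclusive start_time, exclusive end_time, both compared against the trace start time) are easy to get wrong, so here is a small in-memory sketch of them. It does not call Phoenix: filter_traces and the row shape are hypothetical, and the sorting behavior when sort is None is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def filter_traces(
    rows: List[Dict[str, Any]],
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
    sort: Optional[str] = None,
    order: Optional[str] = None,
    limit: int = 100,
) -> List[Dict[str, Any]]:
    """Apply the documented window, sort, and limit to in-memory rows."""
    selected = [
        row
        for row in rows
        # start_time is inclusive, end_time is exclusive.
        if (start_time is None or row["start_time"] >= start_time)
        and (end_time is None or row["start_time"] < end_time)
    ]
    if sort is not None:
        selected.sort(key=lambda row: row[sort], reverse=(order == "desc"))
    return selected[:limit]


# Usage: three traces one hour apart, queried with a two-hour window.
t0 = datetime(2026, 3, 1, tzinfo=timezone.utc)
rows = [
    {"trace_id": "a", "start_time": t0, "latency_ms": 120.0},
    {"trace_id": "b", "start_time": t0 + timedelta(hours=1), "latency_ms": 950.0},
    {"trace_id": "c", "start_time": t0 + timedelta(hours=2), "latency_ms": 40.0},
]
recent = filter_traces(
    rows,
    start_time=t0,
    end_time=t0 + timedelta(hours=2),
    sort="latency_ms",
    order="desc",
)
```

Trace "a" starts exactly at the lower bound and is kept, while trace "c" starts exactly at the upper bound and is dropped.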

Instrumentation

Basic Setup (Python)

from phoenix.otel import register

tracer_provider = register(
    project_name="my-llm-app",
    auto_instrument=True,      # Auto-trace AI/ML libraries
    batch=True,               # Background batching
    api_key="your-api-key",   # Authentication
    endpoint="https://app.phoenix.arize.com/s/your-space"
)

Using Decorators

from phoenix.otel import register

tracer_provider = register()

# Get a tracer for manual instrumentation
tracer = tracer_provider.get_tracer(__name__)

@tracer.chain
def process_data(data):
    return data + " processed"

@tracer.tool
def weather(location):
    return "sunny"

Sources: js/packages/phoenix-otel/README.md:50-80

LangChain Integration

import { register, traceChain } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
});

const answerQuestion = traceChain(
  async (question: string) => `Handled: ${question}`,
  { name: "answer-question" }
);

await answerQuestion("What is Phoenix?");
await provider.shutdown();

Sources: js/examples/apps/langchain-quickstart/README.md:50-80

Data Flow

graph LR
    A[Application] -->|OpenTelemetry| B[Phoenix OTEL]
    B --> C[Span Processor]
    C --> D[Batch Span Processor]
    D --> E[HTTPSpanExporter]
    E --> F[Phoenix Server API]
    F --> G[DB Insertion Handlers]
    G --> H[(Database)]
    
    I[Query Client] --> F
    J[Annotations] --> G

Trace Visualization

When viewing traces in Phoenix UI, each trace displays:

  • LangGraph (agent) span with input messages and final output
  • Tool calls with their parameters and results
  • Token usage and latency metrics
  • Prompts and responses for each span

After running evaluations, span annotations appear on the relevant spans (e.g., correctness / custom_correctness).

Sources: js/examples/apps/langchain-quickstart/README.md:20-45

Integration with Phoenix Client

Dataset Operations with Traces

from phoenix.client import Client

client = Client()

# Get dataset for evaluation
dataset = client.datasets.get_dataset(
    dataset="my-dataset",
    version_id="v1"
)

# Run experiment with traces
# my_task and correctness_evaluator are user-defined callables (not shown here)
experiment = client.experiments.run_experiment(
    dataset=dataset,
    task=my_task,
    evaluators=[correctness_evaluator],
)

Hint: Tasks and evaluators are instrumented using OpenTelemetry. You can view detailed traces of experiment runs and evaluations directly in the Phoenix UI for debugging and performance analysis.

Sources: js/packages/phoenix-client/README.md:30-60

Sources: js/packages/phoenix-client/README.md:1-30

OpenTelemetry Integration

Related topics: Tracing System, Python SDK (arize-phoenix-client)


Overview

OpenTelemetry Integration in Phoenix provides a standardized approach to instrumenting AI applications for observability. Phoenix offers OpenTelemetry (OTel) wrappers for both Python and TypeScript/JavaScript environments, enabling developers to capture traces, spans, and telemetry data from their AI-powered applications.

The integration serves as a bridge between AI frameworks (LangChain, LlamaIndex, OpenAI, etc.) and Phoenix's observability platform, automatically collecting and forwarding telemetry data with minimal configuration.

Sources: packages/phoenix-otel/README.md

Architecture

High-Level Architecture

graph TD
    A[AI Application] --> B[OpenInference Instrumentations]
    B --> C[Phoenix OTEL Wrapper]
    C --> D[OTLP Exporter]
    D --> E[Phoenix Collector]
    E --> F[Phoenix Server]
    
    G[LangChain] --> B
    H[LlamaIndex] --> B
    I[OpenAI] --> B
    J[Haystack] --> B

Component Stack

| Layer | Python Package | TypeScript Package |
| --- | --- | --- |
| Core Wrapper | arize-phoenix-otel | @arizeai/phoenix-otel |
| Instrumentation | openinference-instrumentation-* | @arizeai/openinference-instrumentation-* |
| Configuration | phoenix.otel.register() | register() |
| Exporter | OTLP | OTLP |

Sources: js/packages/phoenix-otel/README.md

Python Integration

Installation

pip install arize-phoenix-otel

For specific framework instrumentation:

pip install openinference-instrumentation-openai
pip install openinference-instrumentation-langchain
pip install openinference-instrumentation-llama-index

Sources: packages/phoenix-otel/README.md

Core Module: phoenix.otel

The Python package exposes the register() function as the primary entry point for configuring OpenTelemetry tracing.

Registration Function

from phoenix.otel import register

# Basic setup
tracer_provider = register()

# Production configuration
tracer_provider = register(
    project_name="my-production-app",
    auto_instrument=True,
    batch=True,
    api_key="your-api-key",
    endpoint="https://app.phoenix.arize.com/s/your-space"
)

Environment Variables

| Variable | Description |
| --- | --- |
| PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint URL |
| PHOENIX_PROJECT_NAME | Default project name for traces |
| PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers |
| PHOENIX_API_KEY | Authentication API key |
| PHOENIX_HOST | Phoenix server host URL |
| PHOENIX_PORT | Phoenix HTTP port |
| PHOENIX_GRPC_PORT | Phoenix gRPC port |

Sources: js/packages/phoenix-config/README.md

Legacy Module Migration

The legacy phoenix.trace.* instrumentor modules have been removed. The migration path is:

# Old (removed)
from phoenix.trace.openai import OpenAIInstrumentor

# New approach
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

Sources: src/phoenix/__init__.py

TypeScript/JavaScript Integration

Installation

npm install @arizeai/phoenix-otel
# or
pnpm add @arizeai/phoenix-otel

Sources: js/packages/phoenix-otel/README.md

Registration Function

import { register } from "@arizeai/phoenix-otel";

// Basic setup
const provider = register({
  projectName: "my-app",
});

// Production setup with Phoenix Cloud (an alternative to the basic setup above)
const cloudProvider = register({
  projectName: "my-app",
  url: "https://app.phoenix.arize.com",
  apiKey: process.env.PHOENIX_API_KEY,
});

Sources: js/packages/phoenix-otel/src/register.ts

Configuration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| projectName | string | "default" | Project name for organizing traces |
| url | string | "http://localhost:6006" | Phoenix instance URL |
| apiKey | string | undefined | API key for authentication |
| headers | Record<string, string> | {} | Custom headers for OTLP requests |
| batch | boolean | true | Use batch span processing |
| instrumentations | Instrumentation[] | undefined | OpenTelemetry instrumentations to register |
| global | boolean | true | Register as global tracer provider |
| diagLogLevel | DiagLogLevel | depends on NODE_ENV | Diagnostic logging level |

Sources: js/packages/phoenix-otel/src/register.ts

Non-Global Provider Usage

import { register } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
  global: false,
});

// Use the provider explicitly
const tracer = provider.getTracer("my-tracer");

Supported Integrations

Framework Integrations

Phoenix supports tracing for the following AI frameworks through OpenInference instrumentation:

| Framework | Python Package | TypeScript Package |
| --- | --- | --- |
| OpenAI | openinference-instrumentation-openai | @arizeai/openinference-instrumentation-openai |
| LangChain | openinference-instrumentation-langchain | @arizeai/openinference-instrumentation-langchain |
| LlamaIndex | openinference-instrumentation-llama-index | @arizeai/openinference-instrumentation-llama-index |
| Haystack | openinference-instrumentation-haystack | N/A |
| OpenLLMetry | openinference-instrumentation-openllmetry | N/A |

Sources: app/src/components/project/Integrations.tsx

Python Setup Example

from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

TypeScript Setup Example

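import { register } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

// Pass instrumentation instances through the instrumentations option
const provider = register({
  projectName: "my-app",
  instrumentations: [new OpenAIInstrumentation()],
});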

Tracing Helpers

TypeScript Tracing Utilities

The @arizeai/phoenix-otel package re-exports OpenInference helpers for GenAI patterns:

import {
  observe,
  traceAgent,
  traceChain,
  traceTool,
  withSpan,
  setAttributes,
  setMetadata,
} from "@arizeai/phoenix-otel";

Example: traceChain

import { register, traceChain } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
});

const answerQuestion = traceChain(
  async (question: string) => `Handled: ${question}`,
  { name: "answer-question" }
);

await answerQuestion("What is Phoenix?");
await provider.shutdown();

#### Example: traceTool

import { traceTool } from "@arizeai/phoenix-otel";

const searchTool = traceTool(
  async (query: string) => {
    // Placeholder implementation; replace with a real search call
    return [`Result for: ${query}`];
  },
  {
    name: "web-search",
    description: "Search the web for information",
  }
);

Sources: js/packages/phoenix-otel/README.md

Configuration Flow

Setup Flow

sequenceDiagram
    participant Dev as Developer
    participant App as Application
    participant Phoenix as Phoenix OTEL
    participant OTel as OpenTelemetry
    participant Collector as Phoenix Collector

    Dev->>App: Configure environment variables
    App->>Phoenix: Call register()
    Phoenix->>OTel: Initialize TracerProvider
    OTel->>OTel: Configure BatchSpanProcessor
    OTel->>Collector: Setup OTLP Exporter
    Collector-->>App: Ready to receive traces
    App->>Collector: Send spans via OTLP

Environment-Based Configuration

  1. Local Development: No configuration needed, defaults to http://localhost:6006
# Optional: Explicit local endpoint
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
  2. Phoenix Cloud: Configure cloud endpoint and authentication
export PHOENIX_COLLECTOR_ENDPOINT="https://app.phoenix.arize.com/s/your-space"
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_PROJECT_NAME="my-project"
  3. Self-Hosted: Point to custom deployment
export PHOENIX_COLLECTOR_ENDPOINT="https://your-phoenix.example.com"
export PHOENIX_API_KEY="your-api-key"
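
The three environments differ only in which variables are set. The Python sketch below shows one way that resolution could work, assuming the variable names and the local default listed above. The helper `resolve_collector_settings`, the `"default"` project fallback, and the Bearer header format are illustrative assumptions, not part of any Phoenix package.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENDPOINT = "http://localhost:6006"  # local default stated above


def resolve_collector_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: read Phoenix settings from an environment mapping."""
    env = os.environ if env is None else env
    settings = {
        "endpoint": env.get("PHOENIX_COLLECTOR_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/"),
        # Assumption: fall back to a project called "default" when none is set
        "project_name": env.get("PHOENIX_PROJECT_NAME", "default"),
        "headers": {},
    }
    api_key = env.get("PHOENIX_API_KEY")
    if api_key:
        # Assumption: the key is sent as a Bearer token
        settings["headers"]["Authorization"] = f"Bearer {api_key}"
    return settings


local = resolve_collector_settings({})
cloud = resolve_collector_settings(
    {
        "PHOENIX_COLLECTOR_ENDPOINT": "https://app.phoenix.arize.com/s/your-space",
        "PHOENIX_API_KEY": "your-api-key",
        "PHOENIX_PROJECT_NAME": "my-project",
    }
)
print(local["endpoint"])  # http://localhost:6006
print(cloud["headers"])
```

Passing the environment as a mapping keeps the helper easy to test without mutating `os.environ`.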

Best Practices

Production Recommendations

| Setting | Recommendation | Reason |
| --- | --- | --- |
| batch | true | Reduces network overhead with batch processing |
| global | true | Ensures all instrumented libraries use same provider |
| API Key | Required | Secure authentication with Phoenix Cloud |
| Endpoint | HTTPS | Secure data transmission |

Zero Code Changes Approach

For automatic instrumentation in Python:

from phoenix.otel import register

# auto_instrument=True automatically traces installed AI/ML libraries
# that have OpenInference instrumentation packages
tracer_provider = register(
    auto_instrument=True,
    project_name="my-production-app",
)

CLI Integration

The Phoenix CLI provides commands for working with traces:

# List recent traces
px trace list --limit 10

# Save traces to directory
px trace list ./my-traces --limit 50

# Filter by time
px trace list --last-n-minutes 60 --limit 20
px trace list --since 2024-01-13T10:00:00Z

| Option | Description |
| --- | --- |
| -n, --limit <number> | Number of traces (newest first) |
| --last-n-minutes <number> | Only traces from the last N minutes |
| --since <timestamp> | Traces since ISO timestamp |
| --format raw | Pipe-friendly compact JSON |

Sources: js/packages/phoenix-cli/README.md
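
The two time filters express the same idea in different units. As a rough illustration, the Python sketch below computes the ISO timestamp that corresponds to a "last N minutes" window, assuming the window is measured back from the current time in UTC. The helper `since_timestamp` is hypothetical and is not part of the Phoenix CLI.

```python
from datetime import datetime, timedelta, timezone


def since_timestamp(last_n_minutes: int, now: datetime) -> str:
    """Return the ISO 8601 UTC timestamp `last_n_minutes` before `now`."""
    start = now.astimezone(timezone.utc) - timedelta(minutes=last_n_minutes)
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")


now = datetime(2024, 1, 13, 11, 0, 0, tzinfo=timezone.utc)
print(since_timestamp(60, now))  # 2024-01-13T10:00:00Z
```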

Quick Reference

Python

from phoenix.otel import register

# Simple setup
provider = register()

# With auto-instrumentation
provider = register(auto_instrument=True)

# Production
provider = register(
    project_name="production",
    auto_instrument=True,
    batch=True,
    api_key="your-key",
    endpoint="https://app.phoenix.arize.com/s/your-space"
)

TypeScript

import { register } from "@arizeai/phoenix-otel";

// Simple setup
const provider = register();

// Production
register({
  projectName: "production",
  url: "https://app.phoenix.arize.com",
  apiKey: process.env.PHOENIX_API_KEY,
});

Sources: packages/phoenix-otel/README.md

Evaluation System (Phoenix Evals)

Related topics: Datasets and Experiments, Python SDK (arize-phoenix-client)

Overview

Phoenix Evals is an evaluation framework that provides lightweight, composable building blocks for writing and running evaluations on LLM applications. Its automated evaluators assess application quality along dimensions such as hallucination, relevance, toxicity, and correctness, and custom classifiers cover other classification tasks.

Sources: packages/phoenix-evals/README.md

Key Capabilities

| Feature | Description |
| --- | --- |
| Multi-SDK Support | Works with OpenAI, LiteLLM, LangChain, Anthropic via adapters |
| Input Mapping | Powerful binding for complex data structures |
| Pre-built Metrics | Hallucination detection, relevance, toxicity, correctness |
| OpenTelemetry Integration | Evaluators are natively instrumented for observability |
| High Performance | Up to 20x speedup with built-in concurrency and batching |
| Cross-platform | Available in both Python (arize-phoenix-evals) and TypeScript (@arizeai/phoenix-evals) |

Sources: packages/phoenix-evals/README.md

Architecture

System Components

graph TD
    A[User Application] --> B[Phoenix Evals Core]
    B --> C[LLM Adapters]
    B --> D[Evaluator Templates]
    B --> E[Prompt Templates]
    
    C --> F[OpenAI Adapter]
    C --> G[Anthropic Adapter]
    C --> H[LiteLLM Adapter]
    C --> I[LangChain Adapter]
    
    D --> J[Correctness Evaluator]
    D --> K[Faithfulness Evaluator]
    D --> L[Relevance Evaluator]
    D --> M[Toxicity Evaluator]
    D --> N[Custom Classifier]
    
    E --> O[Hallucination Prompts]
    E --> P[Classification Prompts]
    
    B --> Q[OpenTelemetry Traces]

Evaluator Types

Phoenix Evals provides two primary evaluator categories:

  1. Correctness Evaluator - Assesses whether an answer correctly addresses a query based on reference context
  2. Faithfulness Evaluator - Determines if an answer is faithful to the provided reference text (hallucination detection)

Sources: js/packages/phoenix-evals/src/llm/createCorrectnessEvaluator.ts, js/packages/phoenix-evals/src/llm/createFaithfulnessEvaluator.ts

Installation

Python Package

pip install arize-phoenix-evals

Sources: packages/phoenix-evals/README.md

TypeScript Package

# npm
npm install @arizeai/phoenix-evals

# or yarn, pnpm, bun
yarn add @arizeai/phoenix-evals
pnpm add @arizeai/phoenix-evals
bun add @arizeai/phoenix-evals

Sources: js/packages/phoenix-evals/README.md

Creating Evaluators

TypeScript: Correctness Evaluator

import { createClassifier } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";

const model = openai("gpt-4o-mini");

const promptTemplate = `
In this task, you will be presented with a query, a reference text and an answer. The answer is
generated to the question based on the reference text. The answer may contain false information. You
must use the reference text to determine if the answer to the question contains false information.
`;

const classifier = createClassifier({
  model,
  promptTemplate,
});

Sources: js/packages/phoenix-evals/README.md

TypeScript: Faithfulness Evaluator

The faithfulness evaluator detects hallucinations by comparing an answer against reference context.

import { createFaithfulnessEvaluator } from "@arizeai/phoenix-evals/llm";
import { openai } from "@ai-sdk/openai";

const evaluator = createFaithfulnessEvaluator({
  model: openai("gpt-4o-mini"),
});

Sources: js/packages/phoenix-evals/src/llm/createFaithfulnessEvaluator.ts

Python: Using the Client API

The Phoenix server provides evaluator helpers for running evaluations programmatically:

from phoenix.server.api.helpers.evaluators import (
    build_hallucination_evaluator,
    build_correctness_evaluator,
)

Sources: src/phoenix/server/api/helpers/evaluators.py

Prompt Templates

Classification Evaluator Configurations

Phoenix Evals uses structured prompt templates for classification tasks. The system supports multiple evaluator configurations stored in the prompts/classification_evaluator_configs directory.

Sources: prompts/classification_evaluator_configs

Available Evaluators on Server

| Evaluator | Purpose | Input Fields |
| --- | --- | --- |
| hallucination_evaluator | Detects false information in answers | query, reference, response |
| correctness_evaluator | Assesses answer correctness | query, reference, response |
| answer_relevance_evaluator | Measures relevance of response to query | query, response |
| context_relevance_evaluator | Measures relevance of context to query | query, context |

Sources: src/phoenix/server/api/helpers/evaluators.py:1-100

Project Dependencies

The Python package defines its dependencies in pyproject.toml:

[project]
name = "arize-phoenix-evals"
version = "5.0.0"

Core Dependencies:

  • anthropic - Anthropic API client
  • httpx - HTTP client
  • joblib - Parallel execution
  • litellm - Unified LLM interface
  • openai - OpenAI API client
  • tqdm - Progress bars

Optional Dependencies:

  • langchain / langchain-core - LangChain integration
  • langsmith - Tracing support

Sources: packages/phoenix-evals/pyproject.toml

Evaluation Workflow

graph LR
    A[Input Data] --> B[Evaluator Selection]
    B --> C[Prompt Template]
    C --> D[LLM Adapter]
    D --> E[Model Inference]
    E --> F[Response Parsing]
    F --> G[Evaluation Result]
    
    H[Reference Context] --> C
    I[Query] --> C
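
The stages in the diagram can be sketched end to end in a few lines of Python. This is a minimal illustration, not the Phoenix Evals implementation: the template text, the label set, and the names `evaluate`, `parse_label`, and `stub_llm` are all hypothetical, and a stub stands in for the model so the sketch runs offline.

```python
from typing import Callable, Optional

# Hypothetical template and labels, for illustration only.
TEMPLATE = (
    "Query: {query}\n"
    "Reference: {reference}\n"
    "Answer: {response}\n"
    "Reply with one word: factual or hallucinated."
)
LABELS = ("factual", "hallucinated")


def parse_label(raw: str) -> Optional[str]:
    """Response parsing: normalize the model reply to a known label, else None."""
    text = raw.strip().lower().rstrip(".")
    return text if text in LABELS else None


def evaluate(record: dict, llm: Callable[[str], str]) -> dict:
    prompt = TEMPLATE.format(**record)  # query and reference feed the template
    raw = llm(prompt)                   # adapter call and model inference
    return {"label": parse_label(raw), "raw": raw}


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return "Factual." if "Answer: Paris" in prompt else "Hallucinated."


result = evaluate(
    {
        "query": "What is the capital of France?",
        "reference": "Paris is the capital of France.",
        "response": "Paris",
    },
    stub_llm,
)
print(result["label"])  # factual
```

Returning `None` for an unparseable reply keeps parsing failures visible instead of silently mapping them to a label.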

Running Evaluations

Batch Evaluation with Concurrency

Phoenix Evals supports concurrent evaluation for high performance:

from phoenix.evals import run_evaluation

results = run_evaluation(
    model=my_model,
    evaluators=[correctness_evaluator, faithfulness_evaluator],
    data=evaluation_dataset,
    concurrency=10,  # Up to 20x speedup
)
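
To see why a concurrency limit helps, the sketch below bounds the number of in-flight evaluator calls with an `asyncio.Semaphore`. It is a generic illustration of the technique rather than how Phoenix Evals is implemented; `fake_evaluator` and `evaluate_all` are hypothetical names, and the sleep stands in for LLM latency.

```python
import asyncio
import time


async def fake_evaluator(record: dict) -> dict:
    """Stand-in for an LLM call; sleeps to simulate network latency."""
    await asyncio.sleep(0.05)
    return {"id": record["id"], "label": "factual"}


async def evaluate_all(records: list, concurrency: int) -> list:
    semaphore = asyncio.Semaphore(concurrency)  # caps in-flight requests

    async def bounded(record: dict) -> dict:
        async with semaphore:
            return await fake_evaluator(record)

    # gather() returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(r) for r in records))


records = [{"id": i} for i in range(20)]
start = time.perf_counter()
results = asyncio.run(evaluate_all(records, concurrency=10))
elapsed = time.perf_counter() - start
print(f"{len(results)} results in {elapsed:.2f}s")
```

With 20 records and a limit of 10, the calls complete in roughly two waves instead of twenty sequential waits. The actual speedup in practice depends on provider rate limits and latency.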

Exporting Results

Evaluation results can be:

  1. Logged to Phoenix for visualization
  2. Exported as JSON/CSV
  3. Stored in datasets for future reference
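
For the JSON/CSV option, a standard-library sketch is often enough. The record fields below (`example_id`, `label`, `score`) are assumed for illustration; the actual result shape depends on the evaluators used.

```python
import csv
import io
import json

# Hypothetical evaluation results
results = [
    {"example_id": "ex-1", "label": "factual", "score": 1.0},
    {"example_id": "ex-2", "label": "hallucinated", "score": 0.0},
]


def to_json(rows: list) -> str:
    """Serialize results as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_csv(rows: list) -> str:
    """Serialize results as CSV, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(results)
csv_text = to_csv(results)
print(csv_text)
```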

Integration with Phoenix Observability

Evaluators are natively instrumented via OpenTelemetry tracing, enabling:

  • Trace-level annotations for evaluation results
  • Correlation between evaluations and application spans
  • Dataset curation from production traces

Sources: packages/phoenix-evals/README.md

Configuration Reference

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| PHOENIX_HOST | Phoenix server URL | http://localhost:6006 |
| PHOENIX_API_KEY | Authentication key | |
| PHOENIX_PROJECT | Default project name | |

Evaluator Options

interface EvaluatorConfig {
  model: any;                    // LLM model instance
  promptTemplate?: string;       // Custom prompt template
  temperature?: number;          // Model temperature (default: 0.0)
  maxTokens?: number;            // Max tokens for response
  batchSize?: number;            // Batch size for evaluation
}

Advanced Usage

Custom Classifier

Create custom binary or multi-class classification evaluators:

import { createClassifier } from "@arizeai/phoenix-evals/llm";

const customClassifier = createClassifier({
  model: myModel,
  promptTemplate: myCustomTemplate,
  labels: ["positive", "negative", "neutral"],
});

LangChain Integration

For LangChain applications, use the pre-built evaluations:

npm run pre_built_evals   # Uses built-in correctness evaluator
npm run custom_evals      # Uses custom rubric with specific LLM

Sources: js/examples/apps/langchain-quickstart/README.md

Performance Considerations

| Optimization | Expected Improvement |
| --- | --- |
| Concurrent execution | Up to 20x speedup |
| Batch processing | Reduced API overhead |
| Streaming responses | Lower latency perception |

Best Practices

  1. Use reference contexts - Always provide ground truth or reference data for accurate evaluation
  2. Configure appropriate models - Use capable models (GPT-4 class) for accurate classification
  3. Monitor evaluation traces - Review flagged evaluations in Phoenix UI
  4. Iterate on prompts - Fine-tune prompt templates for domain-specific accuracy
  5. Set low temperature - Use temperature=0 for deterministic evaluation results

Sources: packages/phoenix-evals/README.md

Datasets and Experiments

Related topics: Evaluation System (Phoenix Evals), Database Models and Migrations

Phoenix provides a Datasets and Experiments system for creating structured collections of examples, running evaluation tasks against them, and tracking results over time. It is designed for benchmarking models, evaluating LLM outputs, and building evaluation pipelines.

Overview

Datasets in Phoenix are structured containers that hold examples used for experimentation and evaluation. Each example consists of an input, an output, and optional metadata. Experiments allow you to define tasks that process these examples and evaluators that score the results.

graph TD
    A[Dataset] --> B[Example 1]
    A --> C[Example 2]
    A --> D[Example N]
    B --> E[input]
    B --> F[output]
    B --> G[metadata]
    E --> H[Task Function]
    F --> H
    H --> I[Evaluators]
    I --> J[Experiment Results]
    J --> K[Annotations on Traces]

Core Concepts

Dataset Structure

A Dataset is a named collection of examples with the following properties:

| Property | Type | Description |
| --- | --- | --- |
| name | str | Unique identifier for the dataset |
| description | str | Human-readable description |
| examples | List[Example] | Collection of examples |
| version | int | Dataset version for tracking changes |
| created_at | datetime | Creation timestamp |

Example Schema

Each Example within a dataset follows this structure:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input | dict | Yes | Input data for the task (e.g., prompts, questions) |
| output | dict | Yes | Expected or reference output |
| metadata | dict | No | Additional context, tags, or auxiliary data |

# Example structure
example = {
    "input": {"question": "What is the capital of France?"},
    "output": {"answer": "Paris"},
    "metadata": {"category": "geography", "difficulty": "easy"}
}

Sources: js/packages/phoenix-client/README.md
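
The required and optional fields in the schema can be enforced with a small validation step before upload. The sketch below is illustrative only: the `Example` dataclass and `parse_example` helper are hypothetical and are not the types used by the Phoenix client.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Example:
    """Hypothetical container mirroring the schema table above."""
    input: Dict[str, Any]
    output: Dict[str, Any]
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_example(raw: Dict[str, Any]) -> Example:
    """Validate that required fields are present, defaulting metadata to {}."""
    missing = [key for key in ("input", "output") if key not in raw]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return Example(raw["input"], raw["output"], raw.get("metadata", {}))


example = parse_example(
    {
        "input": {"question": "What is the capital of France?"},
        "output": {"answer": "Paris"},
    }
)
print(example.metadata)  # {}
```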

Experiment Workflow

The typical experiment workflow involves:

  1. Creating a Dataset - Define and populate a dataset with examples
  2. Defining a Task - Create a function that processes each example
  3. Configuring Evaluators - Set up scoring/evaluation functions
  4. Running the Experiment - Execute tasks across all examples
  5. Analyzing Results - Review scores and export results

graph LR
    A1[Create Dataset] --> B[Define Task Function]
    B --> C[Configure Evaluators]
    C --> D[Run Experiment]
    D --> E[Results + Annotations]
    E --> F[Analyze & Iterate]

Sources: js/packages/phoenix-client/README.md
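
The workflow steps can be illustrated with an in-memory loop that needs no Phoenix server. This is a conceptual sketch under stated assumptions: `run_local_experiment`, the toy `task`, and the `exact_match` evaluator are hypothetical names and do not correspond to the client API.

```python
from typing import Any, Callable, Dict, List

ExampleDict = Dict[str, Any]


def run_local_experiment(
    examples: List[ExampleDict],
    task: Callable[[ExampleDict], Any],
    evaluators: Dict[str, Callable[[Any, Any], float]],
) -> List[Dict[str, Any]]:
    """Run the task on every example, then score each output with every evaluator."""
    runs = []
    for example in examples:
        output = task(example)
        scores = {
            name: evaluate(output, example["output"])
            for name, evaluate in evaluators.items()
        }
        runs.append({"input": example["input"], "output": output, "scores": scores})
    return runs


# Step 1: a dataset of examples
examples = [
    {"input": {"question": "What is the capital of France?"}, "output": {"answer": "Paris"}},
    {"input": {"question": "What is 2 + 2?"}, "output": {"answer": "4"}},
]


# Step 2: a toy task; a hard-coded lookup stands in for a model call
def task(example: ExampleDict) -> Dict[str, str]:
    answers = {"What is the capital of France?": "Paris"}
    return {"answer": answers.get(example["input"]["question"], "unknown")}


# Step 3: a code-based evaluator
def exact_match(output: Any, expected: Any) -> float:
    return 1.0 if output == expected else 0.0


# Steps 4 and 5: run and analyze
runs = run_local_experiment(examples, task, {"exact_match": exact_match})
accuracy = sum(run["scores"]["exact_match"] for run in runs) / len(runs)
print(accuracy)  # 0.5
```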

Python Client API

Dataset Resource

The Python client provides a Datasets resource class for managing datasets programmatically.

from phoenix.client import Client

client = Client()

# Get an existing dataset
dataset = client.datasets.get_dataset(
    dataset="my-dataset-name",
    version_id="optional-version-id"  # Optional: specific version
)

#### Key Methods

| Method | Description |
| --- | --- |
| get_dataset() | Retrieve a dataset by name or ID |
| create_dataset() | Create a new dataset with examples |
| add_examples() | Add examples to an existing dataset |
| upsert_dataset() | Create or update a dataset |
| get_dataset_columns() | Get column information for the dataset |

Sources: packages/phoenix-client/src/phoenix/client/resources/datasets/__init__.py

Creating Datasets

from phoenix.client import Client

client = Client()

# Create a dataset with examples
dataset = client.datasets.create_dataset(
    name="qa-dataset",
    dataset_description="Questions and answers for evaluation",
    input_keys=["question"],
    output_keys=["answer"],
    metadata_keys=["category", "difficulty"],
    inputs=[
        {"question": "What is the capital of France?"},
        {"question": "What is the capital of the USA?"}
    ],
    outputs=[
        {"answer": "Paris"},
        {"answer": "Washington D.C."}
    ],
    metadata=[
        {"category": "geography", "difficulty": "easy"},
        {"category": "geography", "difficulty": "easy"}
    ]
)
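
The call above passes inputs, outputs, and metadata as parallel lists. Assuming the lists are aligned by position (the i-th input belongs with the i-th output and metadata entry), the correspondence to per-example objects can be sketched as follows. The helper `zip_examples` is hypothetical and is not part of the Phoenix client.

```python
from typing import Any, Dict, List, Optional


def zip_examples(
    inputs: List[Dict[str, Any]],
    outputs: List[Dict[str, Any]],
    metadata: Optional[List[Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Combine position-aligned lists into a list of example objects."""
    if metadata is None:
        metadata = [{} for _ in inputs]
    if not (len(inputs) == len(outputs) == len(metadata)):
        raise ValueError("inputs, outputs, and metadata must have the same length")
    return [
        {"input": i, "output": o, "metadata": m}
        for i, o, m in zip(inputs, outputs, metadata)
    ]


examples = zip_examples(
    inputs=[
        {"question": "What is the capital of France?"},
        {"question": "What is the capital of the USA?"},
    ],
    outputs=[{"answer": "Paris"}, {"answer": "Washington D.C."}],
)
print(examples[1]["output"])  # {'answer': 'Washington D.C.'}
```

Checking lengths up front surfaces misaligned lists early, before any request is sent.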

Adding Examples to Datasets

# Add more examples to an existing dataset
# (`client` is the synchronous Client created above, so the call is not awaited)
client.datasets.add_examples(
    dataset_id="dataset-uuid",
    examples=[
        {
            "input": {"question": "What is 2 + 2?"},
            "output": {"answer": "4"},
            "metadata": {"category": "math", "difficulty": "easy"}
        }
    ]
)

#### Parameters for add_examples

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| dataset_id | str | Yes | Dataset identifier |
| examples | List[dict] | Yes | List of example objects |
| split | str | No | Assign all examples to a split |
| timeout | int | No | Request timeout in seconds |

Sources: packages/phoenix-client/src/phoenix/client/resources/datasets/__init__.py

TypeScript/JavaScript Client API

Creating Datasets

import { createDataset } from "@arizeai/phoenix-client/datasets";

const { datasetId } = await createDataset({
  name: "questions",
  description: "a simple dataset of questions",
  examples: [
    {
      input: { question: "What is the capital of France" },
      output: { answer: "Paris" },
      metadata: {},
    },
    {
      input: { question: "What is the capital of the USA" },
      output: { answer: "Washington D.C." },
      metadata: {},
    },
  ],
});

Sources: js/packages/phoenix-client/README.md

Running Experiments

import { 
  asExperimentEvaluator, 
  runExperiment 
} from "@arizeai/phoenix-client/experiments";

// Define a task to run on each example
const task = async (example) => `hello ${example.input.name}`;

// Define evaluators
const evaluators = [
  asExperimentEvaluator({
    name: "matches",
    kind: "CODE",
    evaluate: async ({ output, expected }) => {
      return output === expected;
    },
  }),
];

// Run the experiment
const results = await runExperiment({
  datasetId,
  task,
  evaluators,
});

Sources: js/packages/phoenix-client/README.md

Experiment Evaluators

Evaluators are functions that assess the quality of task outputs against expected results.

Evaluator Types

| Kind | Description | Use Case |
| --- | --- | --- |
| CODE | Custom code-based evaluation | Exact match, regex patterns |
| LLM | LLM-as-judge evaluation | Semantic similarity, relevance |
| BUILT_IN | Pre-built Phoenix evaluators | Common metrics like accuracy |
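
The CODE kind covers deterministic checks such as the exact-match and regex use cases listed in the table. The Python sketch below shows what such checks might look like as plain functions; `exact_match` and `matches_pattern` are hypothetical names, and the case-insensitive comparison is a design choice made for this example.

```python
import re


def exact_match(output: str, expected: str) -> bool:
    """Compare after trimming whitespace and ignoring case."""
    return output.strip().lower() == expected.strip().lower()


def matches_pattern(output: str, pattern: str) -> bool:
    """Check that the whole trimmed output matches a regular expression."""
    return re.fullmatch(pattern, output.strip()) is not None


print(exact_match(" Paris ", "paris"))                      # True
print(matches_pattern("2024-01-13", r"\d{4}-\d{2}-\d{2}"))  # True
print(matches_pattern("Jan 13", r"\d{4}-\d{2}-\d{2}"))      # False
```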

Built-in Evaluators

Phoenix provides pre-built evaluators for common evaluation tasks:

import { 
  asExperimentEvaluator,
  runExperiment,
  createBuiltInEvaluator 
} from "@arizeai/phoenix-client/experiments";

// Use a built-in correctness evaluator
const correctnessEval = createBuiltInEvaluator({
  name: "correctness",
  rubric: "Evaluate if the response correctly answers the question",
});

const evaluators = [correctnessEval];

const results = await runExperiment({
  datasetId,
  task,
  evaluators,
});

Sources: js/examples/apps/langchain-quickstart/README.md

Example Apps

Phoenix Experiment Runner

A complete example application demonstrating dataset and experiment workflows:

# Setup requirements
pnpm install
pnpm -r build

# Run the app
pnpm dev

Features:

  • Loads datasets from CSV files
  • Configures custom task functions
  • Runs experiments with multiple evaluators
  • Stores results in Phoenix

Sources: js/examples/apps/phoenix-experiment-runner/README.md

LangChain Integration

The LangChain quickstart demonstrates how to:

  1. Instrument LangChain agents with Phoenix
  2. Run correctness evaluations on agent outputs
  3. Log annotations back to Phoenix traces

// Fetch spans and run evaluation
const spans = await client.spans.get_spans_dataframe({
  project_identifier: "my-llm-app",
  limit: 100,
});

// Run built-in correctness evaluator
const evalResults = await runBuiltInEvaluator({
  name: "correctness",
  spans,
  rubric: "travel_rubric",
});

// Log annotations back to Phoenix
await client.spans.add_span_annotations(evalResults);

Sources: js/examples/apps/langchain-quickstart/README.md

API Input Types

CreateDatasetInput

The server-side input type for creating datasets:

class CreateDatasetInput:
    name: str                    # Required: unique dataset name
    description: Optional[str]   # Optional: dataset description
    metadata: Optional[dict]     # Optional: dataset-level metadata

AddExamplesToDatasetInput

class AddExamplesToDatasetInput:
    dataset_id: GlobalID          # Required: target dataset
    examples: List[ExampleInput] # Required: examples to add
    split: Optional[str]         # Optional: assign to split

Sources: src/phoenix/server/api/input_types/CreateDatasetInput.py, src/phoenix/server/api/input_types/AddExamplesToDatasetInput.py

Database Operations

Dataset insertion is handled by the database layer:

# In src/phoenix/db/insertion/dataset.py (signature sketch; body elided)
async def insert_dataset(
    session: AsyncSession,
    dataset: Dataset,
    examples: List[Example],
) -> None:
    # Handles dataset creation and example insertion
    # Supports batch operations for performance
    ...

Sources: src/phoenix/db/insertion/dataset.py

UI Integration

Experiment Code Dialog

The Phoenix UI provides a code generation dialog for running experiments:

<RunExperimentCodeDialog
  datasetName="my-dataset"
  version={{ id: "v1", version: 1 }}
  isAuthEnabled={true}
/>

This component generates:

  1. Installation commands for arize-phoenix-client
  2. Base URL configuration
  3. Dataset retrieval code
  4. Experiment execution examples

Sources: app/src/components/experiment/RunExperimentCodeDialog.tsx

Python Project Guide

The UI also provides setup instructions for Python-based experiments:

<PythonProjectGuide
  packages={["arize-phoenix-otel"]}
  isAuthEnabled={true}
/>

Sources: app/src/components/project/PythonProjectGuide.tsx

Authentication

When using datasets and experiments with authentication enabled:

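| Variable or Header | Description |
| --- | --- |
| PHOENIX_API_KEY | Environment variable holding the API key for authentication |
| Authorization | HTTP header carrying a Bearer token for REST/GraphQL APIs |
| OTEL_EXPORTER_OTLP_HEADERS | Environment variable holding headers for OpenTelemetry SDKs |

# Set API key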
export PHOENIX_API_KEY="your-api-key"

# Or for OTEL
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer your-token"

Sources: app/src/components/auth/OneTimeAPIKeyDialog.tsx
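
The `OTEL_EXPORTER_OTLP_HEADERS` value is a comma-separated list of `key=value` pairs. The sketch below shows how such a string maps to a header dictionary; OpenTelemetry SDKs do this parsing themselves, so `parse_otlp_headers` is a hypothetical helper for illustration only, and URL-decoding of values is an assumption based on the OTLP exporter conventions.

```python
from urllib.parse import unquote


def parse_otlp_headers(raw: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict, URL-decoding each value."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, separator, value = pair.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got {pair!r}")
        headers[key.strip()] = unquote(value.strip())
    return headers


print(parse_otlp_headers("Authorization=Bearer your-token"))
# {'Authorization': 'Bearer your-token'}
```

Some SDKs expect the space in the value to be percent-encoded (`Bearer%20your-token`); the sketch accepts both forms.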

Summary

The Datasets and Experiments system in Phoenix provides:

  • Structured Data Management: Create, version, and manage datasets with examples
  • Flexible Evaluation: Support for code-based, LLM-based, and built-in evaluators
  • Multi-Language Support: Python and TypeScript/JavaScript clients
  • Trace Integration: Results and annotations can be linked to spans
  • Workflow Automation: Script-based and programmatic experiment execution

This system enables teams to systematically evaluate LLM applications, track performance over time, and build automated quality assurance pipelines.

Sources: js/packages/phoenix-client/README.md

Database Models and Migrations

Related topics: Server API and GraphQL, Tracing System

Phoenix uses SQLAlchemy ORM for database abstraction and Alembic for managing schema migrations. This document covers the database architecture, available models, migration system, engine configuration, and bulk data insertion utilities.

Source: https://github.com/Arize-ai/phoenix / Human Manual

Frontend Application

Related topics: System Architecture, Server API and GraphQL

Overview

The Phoenix Frontend Application is a React-based web interface that provides observability, tracing, and evaluation capabilities for AI applications. It serves as the primary user interface for interacting with Phoenix's backend services, enabling users to manage projects, visualize traces, configure annotations, and evaluate LLM outputs.

The frontend is built with modern React patterns including hooks, context providers, and component composition. It integrates with @arizeai/phoenix-otel for telemetry data collection and @arizeai/phoenix-evals for evaluation functionality.

Sources: app/src/pages/project/ProjectTracesPage.tsx:1-50

Architecture Overview

Component Hierarchy

graph TD
    A[App] --> B[Routes]
    B --> C[ProjectTracesPage]
    B --> D[Playground]
    B --> E[Settings Pages]
    C --> F[TracesTable]
    C --> G[SpanFiltersProvider]
    D --> H[PromptMenu]
    E --> I[AnnotationConfigList]
    
    F --> J[AnnotationLabel]
    J --> K[AnnotationTooltip]
    
    G --> L[TracePaginationProvider]
    L --> M[TracingRoot]

Technology Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Framework | React 19+ | UI rendering |
| Routing | React Router v6 | SPA navigation |
| State | React Context + Hooks | State management |
| Styling | CSS-in-JS (Emotion, via the `css` prop) | Component styling |
| Icons | Custom SVG Components | UI iconography |
| Tables | @tanstack/react-table | Data tables |
| Forms | Adobe React Spectrum | Form components |

Sources: app/src/Routes.tsx:1-100

Routing Structure

The application uses React Router for navigation with a nested route structure supporting breadcrumbs and lazy loading.

Main Routes

| Route | Component | Purpose |
| --- | --- | --- |
| /projects/:projectId/traces | ProjectTracesPage | Trace visualization |
| /projects/:projectId/traces/:traceId | TraceTree | Single trace view |
| /settings/* | SettingsPage | Application settings |
| /playground | Playground | LLM prompt playground |

// Routes structure excerpt
<Route path="/settings" element={<SettingsPage />}>
  <Route path="general" element={<SettingsGeneralPage />} />
  <Route path="secrets" element={<SettingsSecretsPage />} />
  <Route path="providers" element={<SettingsAIProvidersPage />} />
  <Route path="models" element={<SettingsModelsPage />} />
  <Route path="datasets" element={<SettingsDatasetsPage />} />
  <Route path="annotations" element={<SettingsAnnotationsPage />} />
  <Route path="data" element={<SettingsDataPage />} />
  <Route path="prompts" element={<SettingsPromptsPage />} />
</Route>

Sources: app/src/Routes.tsx:30-80

Core Components

Tracing Components

#### ProjectTracesPage

The main container for trace visualization, wrapping content in context providers for tracing state management.

export const ProjectTracesPage = () => {
  const { tracesQueryReference } = useProjectPageQueryReferenceContext();

  return (
    <TracingRoot>
      <TracePaginationProvider>
        <SpanFiltersProvider>
          <Suspense fallback={<Loading />}>
            <TracesTabContent tracesQueryReference={tracesQueryReference} />
          </Suspense>
        </SpanFiltersProvider>
        <Suspense>
          <Outlet />
        </Suspense>
      </TracePaginationProvider>
    </TracingRoot>
  );
};

Sources: app/src/pages/project/ProjectTracesPage.tsx:40-65

#### TracesTable

Displays experiment runs and traces in a paginated, sortable table format. Integrates with useReactTable from TanStack Table for efficient rendering.

| Feature | Implementation |
| --- | --- |
| Pagination | Custom pagination with fetch on scroll |
| Sorting | Column-based sorting |
| Selection | Row selection for batch operations |
| Navigation | Click to view trace details |

Sources: app/src/pages/example/ExampleExperimentRunsTable.tsx:1-100

Project Setup Components

#### OnboardingSteps

Provides step-by-step guidance for integrating applications with Phoenix. Supports multiple programming languages and package managers.

export function OnboardingSteps({
  language,
  packages,
  implementationCode,
  docsHref,
  githubHref,
  generatedApiKey,
  onApiKeyGenerated,
  extraEnvVars,
}: {
  language: ProgrammingLanguage;
  packages: readonly string[];
  implementationCode: string;
  // ... additional props
}) {
  const isHosted = IS_HOSTED_DEPLOYMENT;
  const isAuthEnabled = window.Config.authenticationEnabled;
  // ...
}

Sources: app/src/pages/project/OnboardingSteps.tsx:50-85

#### TypeScriptProjectGuide

Language-specific setup guide for TypeScript/JavaScript projects with OTEL initialization code generation.

<TypeScriptBlockWithCopy
  value={getOtelInitCodeTypescript({ projectName })}
/>

#### PythonProjectGuide

Similar component for Python projects using arize-phoenix-otel package.

| Package | Purpose |
| --- | --- |
| arize-phoenix-otel | OpenTelemetry instrumentation for Python |
| opentelemetry-* | Standard OTEL packages |

Sources: app/src/components/project/PythonProjectGuide.tsx:1-60

Document and Annotation Components

#### DocumentItem

Renders document metadata and annotations within trace views. Supports JSON display and annotation overlay.

<ReadonlyJSONBlock basicSetup={{ lineNumbers: false }}>
  {JSON.stringify(metadata)}
</ReadonlyJSONBlock>
<DocumentAnnotationsSection
  spanNodeId={spanNodeId ?? ""}
  documentPosition={documentPosition ?? 0}
  documentAnnotations={documentAnnotations ?? []}
/>

Sources: app/src/pages/trace/DocumentItem.tsx:1-80

#### AnnotationConfigList

Dropdown component for selecting annotation configurations. Displays annotation names with color swatches and type tokens.

<MenuItem
  id={id}
  textValue={name ?? undefined}
  leadingContent={
    <AnnotationColorSwatch annotationName={name || ""} />
  }
  trailingContent={
    <Token size="S">{annotationType?.toLocaleLowerCase()}</Token>
  }
>
  <Text>{name}</Text>
</MenuItem>

Sources: app/src/components/trace/AnnotationConfigList.tsx:1-50

Markdown Rendering

#### streamdownComponents

Custom React components for rendering markdown content with Phoenix-specific styling.

| Component | Element | Styling |
| --- | --- | --- |
| blockquote | `<blockquote>` | Blockquote CSS |
| inlineCode | `<code>` | Inline code CSS |
| table | `<table>` | Table wrapper |
| th | `<th>` | Header cell CSS |
| td | `<td>` | Data cell CSS |
| img | `<img>` | Image CSS |
| hr | `<hr>` | Horizontal rule CSS |

blockquote: ({ children, className }) => (
  <blockquote css={blockquoteCSS} className={className}>
    {children}
  </blockquote>
),
inlineCode: ({ children, className }) => (
  <code css={inlineCodeCSS} className={className}>
    {children}
  </code>
),

Sources: app/src/components/markdown/streamdownComponents.tsx:1-60

Playground Components

#### PromptMenu

Tabbed interface for managing prompt versions and tags. Uses lazy-loaded tabs for performance.

<Autocomplete filter={contains}>
  <MenuHeader>
    <SearchField aria-label="Search tags" variant="quiet" autoFocus>
      <SearchIcon />
      <Input placeholder="Search tags" />
    </SearchField>
  </MenuHeader>
</Autocomplete>

| Tab | Content |
| --- | --- |
| Versions | Prompt version history |
| Tags | Tag management |

Sources: app/src/pages/playground/PromptMenu.tsx:1-80

Tool Components

#### DocsToolDetails

Renders documentation tool outputs with state-based display logic.

function getOutputText(part: ToolInvocationPart): string {
  if (part.state !== "output-available" || part.output == null) {
    return "";
  }
  return stringifyToolValue(part.output);
}

// Preview truncation for long outputs
export function truncateDocsOutput(text: string): string {
  if (text.length <= OUTPUT_PREVIEW_LENGTH) {
    return text;
  }
  return text.slice(0, OUTPUT_PREVIEW_LENGTH) + "…";
}

Sources: app/src/components/agent/DocsToolDetails.tsx:1-100

UI Components

#### Icons

SVG icon components using currentColor for theme compatibility.

export const ArrowCompareOutline = () => (
  <svg width="24" height="24" viewBox="0 0 24 24" fill="none">
    <path d="..." fill="currentColor" />
  </svg>
);

Available icons include: ArrowCompareOutline, TemplateOutline, FileOutline, CloseOutline, and more.

Sources: app/src/components/core/icon/Icons.tsx:1-50

#### FileListItem

Reusable file upload item component with progress tracking and status indicators.

| Status | Description |
| --- | --- |
| uploading | File transfer in progress |
| parsing | File content being processed |
| complete | Upload and parse successful |
| error | Upload or parse failed |

const showProgress = status !== "complete" && progress !== undefined;

return (
  <li className="file-list__item" data-status={status}>
    <ProgressBar value={progress} width="100%" height="4px" />
  </li>
);

Sources: app/src/components/core/dropzone/FileListItem.tsx:1-70

Context Providers

Provider Hierarchy

graph TD
    A[TracingRoot] --> B[TracePaginationProvider]
    B --> C[SpanFiltersProvider]
    C --> D[TracesTabContent]
    
    E[IsAuthenticated] --> F[IsAdmin]
    F --> G[GenerateAPIKeyButton]

| Provider | Purpose |
| --- | --- |
| TracingRoot | Global tracing state |
| TracePaginationProvider | Pagination state for traces |
| SpanFiltersProvider | Filter state for span queries |
| IsAuthenticated | Authentication state check |
| IsAdmin | Authorization check |

Environment Configuration

Phoenix Environment Variables

The frontend reads configuration from environment variables via the @arizeai/phoenix-config package.

| Variable | Constant | Description |
| --- | --- | --- |
| PHOENIX_HOST | ENV_PHOENIX_HOST | Phoenix server host URL |
| PHOENIX_API_KEY | ENV_PHOENIX_API_KEY | API key for authentication |
| PHOENIX_CLIENT_HEADERS | ENV_PHOENIX_CLIENT_HEADERS | JSON-encoded custom headers |
| PHOENIX_COLLECTOR_ENDPOINT | ENV_PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint |
| PHOENIX_PORT | ENV_PHOENIX_PORT | Phoenix HTTP port |
| PHOENIX_GRPC_PORT | ENV_PHOENIX_GRPC_PORT | Phoenix gRPC port |
| PHOENIX_PROJECT | ENV_PHOENIX_PROJECT | Default project name |

Sources: js/packages/phoenix-config/README.md:1-50

Runtime Configuration

const isHosted = IS_HOSTED_DEPLOYMENT;
const isAuthEnabled = window.Config.authenticationEnabled;

Data Flow

Trace Navigation Flow

sequenceDiagram
    participant User
    participant TracesTable
    participant ProjectTracesPage
    participant API
    
    User->>TracesTable: Click trace row
    TracesTable->>TracesTable: Get row trace data
    TracesTable->>User: Navigate to trace detail
    User->>ProjectTracesPage: Load trace by ID
    ProjectTracesPage->>API: Fetch trace data
    API-->>ProjectTracesPage: Return trace
    ProjectTracesPage-->>User: Render TraceTree

Annotation Flow

graph LR
    A[TraceView] --> B[AnnotationLabel]
    B --> C[AnnotationTooltip]
    C --> D[External Link]
    D --> E[Trace Detail]
    
    A --> F[AnnotationConfigList]
    F --> G[Create Annotation]

Package Dependencies

Key Dependencies

| Package | Version Constraint | Purpose |
| --- | --- | --- |
| react | >=18.0.0 | Core framework |
| react-router-dom | >=6.0.0 | Routing |
| @tanstack/react-table | Latest | Data tables |
| @adobe/react-spectrum | Latest | UI components |
| @arizeai/phoenix-evals | Latest | Evaluation functions |
| @arizeai/phoenix-config | Latest | Config utilities |

Sources: app/package.json:1-50

Best Practices

Component Patterns

  1. Lazy Loading: Use Suspense with lazy-loaded components for route-based code splitting
  2. Context Composition: Wrap related state in dedicated providers
  3. Type Safety: Use TypeScript interfaces for all component props
  4. CSS Organization: Use CSS-in-JS with semantic class naming

Performance Considerations

  • Implement virtual scrolling for large trace lists
  • Use React.memo for expensive component re-renders
  • Lazy load tab panels in tabbed interfaces
  • Truncate long outputs in preview contexts
export function truncateDocsOutput(text: string): string {
  if (text.length <= OUTPUT_PREVIEW_LENGTH) {
    return text;
  }
  return text.slice(0, OUTPUT_PREVIEW_LENGTH) + "…";
}

Sources: app/src/pages/project/ProjectTracesPage.tsx:1-50

Server API and GraphQL

Related topics: System Architecture, Database Models and Migrations

Phoenix provides a comprehensive server API layer built on GraphQL (via Strawberry) and REST endpoints for managing traces, experiments, datasets, and evaluations. The API architecture is designed for observability, authentication, and efficient data loading.

Architecture Overview

Phoenix's server API follows a layered architecture that separates concerns between request handling, authentication, data access, and response rendering.

graph TD
    A[Client Request] --> B[API Router Layer]
    B --> C[Authentication Middleware]
    C --> D[Context Builder]
    D --> E[GraphQL Executor / REST Handler]
    E --> F[Data Loaders]
    F --> G[Database / Storage]
    G --> H[Response]

Core Components

Server Initialization

The main server module (src/phoenix/server/__init__.py) orchestrates the application lifecycle, including database initialization, API router setup, and background task management.

| Component | Purpose | Key Responsibilities |
| --- | --- | --- |
| Server | Main application class | Initialize app, set up routes, manage lifecycle |
| App | ASGI application | Handle HTTP/WebSocket connections |
| Database | Data persistence | SQLite/PostgreSQL connection management |

Sources: src/phoenix/server/__init__.py:1-50

Request Context

The context module (src/phoenix/server/api/context.py) provides request-scoped data access throughout the API layer. It manages database sessions, user authentication, and configuration access.

class Context:
    def __init__(
        self,
        db: Session,
        user: User | None,
        data_loaders: dict[str, DataLoader],
    ) -> None:
        self.db = db
        self.user = user
        self.data_loaders = data_loaders

Key Context Properties:

| Property | Type | Description |
| --- | --- | --- |
| db | Session | SQLAlchemy database session |
| user | `User \| None` | Authenticated user, or None for anonymous requests |
| data_loaders | dict[str, DataLoader] | Batch data loading utilities |
| has_auth | bool | Whether authentication is enabled |

Sources: src/phoenix/server/api/context.py:1-100

Authentication System

The authentication module (src/phoenix/server/api/auth.py) implements token-based authentication for API access.

Authentication Flow

sequenceDiagram
    participant Client
    participant API
    participant Auth
    participant DB
    
    Client->>API: Request + API Key
    API->>Auth: Validate Token
    Auth->>DB: Lookup User
    DB-->>Auth: User Record
    Auth-->>API: Authenticated Context
    API-->>Client: Response with Context

Authentication Methods

| Method | Header | Description |
| --- | --- | --- |
| API Key | Authorization: Bearer <key> | Token-based authentication |
| Session | Cookie-based | Web UI authentication |
| Anonymous | None | Read-only access when auth disabled |
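
The API-key row above implies that the server has to pull a token out of the `Authorization` header before it can look up a user. The following is a minimal sketch of that step only, assuming the standard `Bearer` scheme; `extract_bearer_token` is a hypothetical helper and not a function from the Phoenix codebase.

```python
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an `Authorization: Bearer <key>` header, if any.

    Hypothetical helper for illustration; not part of Phoenix.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("authorization", "").strip()
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None  # missing header, wrong scheme, or empty token
    return token.strip()


if __name__ == "__main__":
    print(extract_bearer_token({"Authorization": "Bearer abc123"}))
    print(extract_bearer_token({"Cookie": "session=xyz"}))
```

Returning `None` rather than raising keeps the anonymous and session-cookie paths open to the caller, which can then decide how to treat a request without an API key.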

Permission Classes

Phoenix enforces permission checks on mutations and subscriptions:

  • IsNotReadOnly - Prevents read-only users from modifying data
  • IsNotViewer - Prevents viewer roles from write operations

Sources: src/phoenix/server/api/auth.py:1-80
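
To make the two permission classes concrete, here is a dependency-free sketch of how such checks could be composed. It assumes a request object that exposes a read-only flag and a role; `Role`, `RequestInfo`, `check_permissions`, and the message strings are hypothetical, and the real classes plug into the GraphQL layer rather than being called like this.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Role(Enum):  # hypothetical roles, for illustration only
    ADMIN = "admin"
    MEMBER = "member"
    VIEWER = "viewer"


@dataclass
class RequestInfo:  # hypothetical stand-in for the request context
    read_only: bool
    role: Role


class IsNotReadOnly:
    message = "Write operations are not allowed in read-only mode"

    def has_permission(self, info: RequestInfo) -> bool:
        return not info.read_only


class IsNotViewer:
    message = "Viewers cannot perform write operations"

    def has_permission(self, info: RequestInfo) -> bool:
        return info.role is not Role.VIEWER


def check_permissions(info: RequestInfo, permissions: Iterable) -> Optional[str]:
    """Return the message of the first failing permission, or None if all pass."""
    for permission in permissions:
        if not permission.has_permission(info):
            return permission.message
    return None


if __name__ == "__main__":
    guards = [IsNotReadOnly(), IsNotViewer()]
    print(check_permissions(RequestInfo(read_only=False, role=Role.VIEWER), guards))
```

Evaluating the guards in order and stopping at the first failure gives the caller a single, specific reason for the rejection.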

GraphQL Schema

Phoenix uses Strawberry GraphQL for its primary API surface, enabling type-safe queries and mutations with automatic documentation.

Schema Structure

Query
├── datasets
│   ├── get_dataset
│   └── get_dataset_by_id
├── experiments
│   ├── experiments
│   └── experiment_by_id
├── projects
│   ├── projects
│   └── project_by_id
├── spans
│   └── get_spans
└── traces
    └── get_traces

Mutation
├── datasets
│   ├── create_dataset
│   ├── upload_dataset
│   └── delete_dataset
├── experiments
│   ├── create_experiment
│   └── run_experiment
├── spans
│   └── create_span_annotation
└── traces
    └── create_trace_annotation

Data Loaders

Data loaders (src/phoenix/server/api/dataloaders/) address the N+1 query problem by batching lookups for many keys into a single query and caching the results within the request scope.

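class DatasetLoader(DataLoader[int, Dataset]):
    def batch_load(self, dataset_ids: list[int]) -> list[Dataset | None]:
        # Single database query for all IDs
        rows = self.db.query(Dataset).filter(Dataset.id.in_(dataset_ids)).all()
        # A batch function must return one result per key, in key order;
        # the database does not guarantee that order, so re-map by ID.
        by_id = {row.id: row for row in rows}
        return [by_id.get(dataset_id) for dataset_id in dataset_ids]

| DataLoader | Purpose |
| --- | --- |
| DatasetLoader | Batch load datasets by ID |
| ProjectLoader | Batch load projects by ID |
| SpanLoader | Batch load spans by ID |
| AnnotationLoader | Batch load annotations |
| UserLoader | Batch load user records |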

Sources: src/phoenix/server/api/dataloaders/__init__.py:1-50
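
Batching and request-scoped caching can be shown without a database. The sketch below is a simplified, dependency-free illustration of the pattern; `TinyBatchLoader` and `fetch_datasets` are hypothetical names, and this is not the loader implementation Phoenix or Strawberry uses. It coalesces all `load()` calls made before the event loop next runs into one batch and shares a single future per key.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Generic, Hashable, List, Optional, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class TinyBatchLoader(Generic[K, V]):
    """Hypothetical loader that coalesces load() calls into one batch call."""

    def __init__(self, batch_fn: Callable[[List[K]], Awaitable[List[Optional[V]]]]) -> None:
        self._batch_fn = batch_fn
        self._cache: Dict[K, "asyncio.Future[Optional[V]]"] = {}
        self._pending: List[K] = []
        self._task: Optional["asyncio.Task[None]"] = None

    def load(self, key: K) -> "asyncio.Future[Optional[V]]":
        if key in self._cache:  # request-scoped cache: repeated keys share a future
            return self._cache[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._cache[key] = future
        if not self._pending:  # the first key of a new batch schedules the dispatch
            self._task = loop.create_task(self._dispatch())
        self._pending.append(key)
        return future

    async def _dispatch(self) -> None:
        keys, self._pending = self._pending, []
        try:
            values = await self._batch_fn(keys)
            if len(values) != len(keys):
                raise ValueError("batch function must return one value per key")
            for key, value in zip(keys, values):
                self._cache[key].set_result(value)
        except Exception as exc:  # propagate the failure to every waiting caller
            for key in keys:
                if not self._cache[key].done():
                    self._cache[key].set_exception(exc)


async def main():
    table = {1: "train", 2: "validation", 3: "test"}  # stand-in for a database table
    batches = []

    async def fetch_datasets(ids):
        batches.append(list(ids))  # record each simulated round trip
        return [table.get(i) for i in ids]  # one result per key, in key order

    loader = TinyBatchLoader(fetch_datasets)
    results = await asyncio.gather(
        loader.load(3), loader.load(1), loader.load(3), loader.load(99)
    )
    return results, batches


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this run the four `load()` calls result in a single call to `fetch_datasets` with three distinct keys, which is the effect that removes the N+1 pattern.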

REST API Endpoints

API Router Structure

# src/phoenix/server/api/routers/__init__.py
routers = [
    datasets_router,
    experiments_router,
    spans_router,
    traces_router,
    evaluations_router,
]

Dataset Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/datasets | List all datasets |
| POST | /v1/datasets | Create new dataset |
| GET | /v1/datasets/{id} | Get dataset by ID |
| PUT | /v1/datasets/{id} | Update dataset |
| DELETE | /v1/datasets/{id} | Delete dataset |
| POST | /v1/datasets/{id}/upload | Upload data to dataset |

Experiment Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/experiments | List experiments |
| POST | /v1/experiments | Create experiment |
| GET | /v1/experiments/{id} | Get experiment details |
| POST | /v1/experiments/{id}/run | Run experiment evaluation |

Span and Trace Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/spans | Query spans with filtering |
| POST | /v1/spans/{id}/annotations | Add span annotation |
| GET | /v1/traces | Query traces |
| POST | /v1/traces/{id}/annotations | Add trace annotation |

Sources: src/phoenix/server/api/routers/__init__.py:1-30

OpenAPI Schema

Phoenix exposes its REST API through an OpenAPI schema that can be compiled for documentation and client generation:

python scripts/ci/compile_openapi_schema.py

This generates the schema at openapi-schema.json for validation and client SDK generation.

Sources: scripts/README.md:1-50
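
Once the schema file exists, a quick way to see what it covers is to list its operations. The sketch below assumes only the standard OpenAPI layout (a top-level `paths` object keyed by path, then by HTTP method); `list_operations` and `load_schema` are hypothetical helpers, and the default file name is taken from the text above.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema: Dict) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs found under the schema's `paths` object."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def load_schema(path: str = "openapi-schema.json") -> Dict:
    """Read a compiled schema from disk (path is an assumption)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    sample = {
        "paths": {
            "/v1/datasets": {"get": {}, "post": {}},
            "/v1/spans": {"get": {}, "parameters": []},
        }
    }
    for method, path in list_operations(sample):
        print(method, path)
```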

Environment Variables

The API server recognizes the following configuration variables:

| Variable | Description | Default |
| --- | --- | --- |
| PHOENIX_HOST | Server host URL | http://localhost:6006 |
| PHOENIX_PORT | HTTP port | 6006 |
| PHOENIX_GRPC_PORT | gRPC port | 4317 |
| PHOENIX_API_KEY | Authentication key | None |
| PHOENIX_COLLECTOR_ENDPOINT | OTel collector endpoint | Internal |
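
As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. `ServerSettings` and `load_settings` are hypothetical names; the defaults are copied from the table above and this is not how the Phoenix server itself loads configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:  # hypothetical container for the values in the table
    host: str
    port: int
    grpc_port: int
    api_key: Optional[str]


def load_settings(environ: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the documented variables, falling back to the documented defaults."""
    return ServerSettings(
        host=environ.get("PHOENIX_HOST", "http://localhost:6006"),
        port=int(environ.get("PHOENIX_PORT", "6006")),
        grpc_port=int(environ.get("PHOENIX_GRPC_PORT", "4317")),
        # Treat an unset or empty key as "no API key configured".
        api_key=environ.get("PHOENIX_API_KEY") or None,
    )


if __name__ == "__main__":
    print(load_settings({"PHOENIX_PORT": "7007"}))
```

Passing the environment in as a mapping keeps the function easy to exercise without mutating `os.environ`.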

Client Integration

Python Client

from phoenix.client import Client

client = Client(
    base_url="http://localhost:6006",
    api_key="your-api-key"
)

# Query datasets
datasets = client.datasets.list()

# Get spans as DataFrame
spans = client.spans.get_spans_dataframe(
    project_identifier="my-project",
    limit=1000
)

TypeScript Client

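import { createClient } from "@arizeai/phoenix-client";

const client = createClient({
  options: {
    baseUrl: "http://localhost:6006",
    headers: { Authorization: "Bearer your-api-key" },
  },
});

// List datasets through the typed REST path
const datasets = await client.GET("/v1/datasets");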

Request/Response Patterns

GraphQL Query Example

query GetDataset($id: GlobalID!) {
  dataset(id: $id) {
    id
    name
    version
    createdAt
    spanCount
    traceCount
  }
}
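
A GraphQL query like the one above is normally sent as a JSON body with `query` and `variables` keys. The sketch below only builds that body; `build_graphql_body` is a hypothetical helper, the ID value is a made-up opaque string, and the endpoint path to post it to is deliberately left out because the source does not state it.

```python
import json
from typing import Any, Dict

GET_DATASET = """
query GetDataset($id: GlobalID!) {
  dataset(id: $id) {
    id
    name
  }
}
"""


def build_graphql_body(query: str, variables: Dict[str, Any]) -> bytes:
    """Serialise a query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables}).encode("utf-8")


if __name__ == "__main__":
    # "example-dataset-id" is a placeholder, not a real Phoenix global ID.
    body = build_graphql_body(GET_DATASET, {"id": "example-dataset-id"})
    print(body.decode("utf-8"))
```

Keeping the ID in `variables` instead of interpolating it into the query string avoids quoting problems and lets the server validate it against the declared `GlobalID!` type.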

REST Request with Filtering

curl -X GET "http://localhost:6006/v1/spans?project_id=abc123&limit=100" \
  -H "Authorization: Bearer $PHOENIX_API_KEY"
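
The same request can be prepared from Python with only the standard library. This sketch builds the request object without sending it; `build_spans_request` is a hypothetical helper, and the query parameter names simply mirror the curl example above.

```python
from typing import Optional, Union
from urllib.parse import urlencode
from urllib.request import Request


def build_spans_request(
    base_url: str, api_key: str, **params: Optional[Union[str, int]]
) -> Request:
    """Build a GET /v1/spans request with query filters and a Bearer header."""
    # Drop filters that were not provided so they do not appear as "None".
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/v1/spans" + (f"?{query}" if query else "")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


if __name__ == "__main__":
    request = build_spans_request(
        "http://localhost:6006", "your-api-key", project_id="abc123", limit=100
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request) against a running server.
```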

Summary

Phoenix's Server API and GraphQL layer provides a unified interface for observability operations:

  • GraphQL via Strawberry for type-safe, self-documenting queries
  • REST endpoints for compatibility and specialized operations
  • Authentication via API keys with role-based permissions
  • Data Loaders for efficient batch data loading
  • OpenAPI schema for client SDK generation

Sources: src/phoenix/server/__init__.py:1-50

Python SDK (arize-phoenix-client)

Related topics: Evaluation System (Phoenix Evals), OpenTelemetry Integration

The arize-phoenix-client is the official Python SDK for Arize Phoenix, providing a comprehensive interface for interacting with the Phoenix platform via its REST API. This SDK enables developers to programmatically manage datasets, run experiments, analyze traces, and collect feedback for LLM observability and evaluation workflows.

Overview

Phoenix is an open-source AI observability platform designed for debugging, evaluating, and refining LLM applications. The Python SDK serves as the primary programmatic interface for:

  • REST API Integration - Full access to Phoenix's OpenAPI REST interface
  • Prompt Management - Create, version, and invoke prompt templates
  • Dataset Operations - Create and append datasets from DataFrames, CSV files, or dictionaries
  • Experiment Tracking - Run evaluations and track experiment results
  • Trace Analysis - Query and analyze traces with powerful filtering capabilities
  • Annotation Workflows - Add human feedback and automated evaluations to spans

Sources: packages/phoenix-client/README.md

Installation

Install the SDK using pip:

pip install arize-phoenix-client

Sources: packages/phoenix-client/README.md

Client Initialization

The SDK provides both synchronous and asynchronous client implementations.

Synchronous Client

from phoenix.client import Client

# Automatic configuration from environment variables
client = Client()

# Explicit configuration
client = Client(base_url="http://localhost:6006")

# Cloud instance with API key
client = Client(
    base_url="https://app.phoenix.arize.com/s/your-space",
    api_key="your-api-key"
)

Asynchronous Client

from phoenix.client import AsyncClient

# Create async client with same configuration options
async_client = AsyncClient()
async_client = AsyncClient(base_url="http://localhost:6006")
async_client = AsyncClient(
    base_url="https://app.phoenix.arize.com/s/your-space",
    api_key="your-api-key"
)

Sources: packages/phoenix-client/README.md

Configuration

Environment Variables

The client automatically reads configuration from environment variables:

| Variable | Description | Example |
| --- | --- | --- |
| PHOENIX_BASE_URL | Base URL of the Phoenix server | http://localhost:6006 |
| PHOENIX_API_KEY | API key for authentication | sk-xxxxx |
| PHOENIX_CLIENT_HEADERS | Custom headers as comma-separated key=value pairs | Authorization=Bearer xxx |

Configuration Examples

# Local Phoenix server (default)
export PHOENIX_BASE_URL="http://localhost:6006"

# Cloud Instance
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_BASE_URL="https://app.phoenix.arize.com/s/your-space"

# Custom Headers
export PHOENIX_CLIENT_HEADERS="Authorization=Bearer your-api-key,custom-header=value"

Sources: packages/phoenix-client/README.md
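
The `PHOENIX_CLIENT_HEADERS` export above uses comma-separated `key=value` pairs. The sketch below shows one way to turn a string of that shape into a header dictionary. `parse_header_pairs` is a hypothetical helper, not the SDK's parser, and the URL-decoding of values is an assumption made so that values containing commas can be expressed.

```python
from typing import Dict
from urllib.parse import unquote


def parse_header_pairs(raw: str) -> Dict[str, str]:
    """Parse "Name=value,Other=value" into a dict, skipping malformed items."""
    headers: Dict[str, str] = {}
    for item in raw.split(","):
        # Split on the first "=" only, so values may contain "=" themselves.
        name, separator, value = item.partition("=")
        if not separator or not name.strip():
            continue  # no "=" or empty name: ignore this entry
        headers[name.strip()] = unquote(value.strip())  # assumption: values may be URL-encoded
    return headers


if __name__ == "__main__":
    print(parse_header_pairs("Authorization=Bearer your-api-key,custom-header=value"))
```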

Custom Authentication Headers

For custom authentication scenarios:

from phoenix.client import Client

client = Client(
    base_url="https://your-phoenix-instance.com",
    headers={"Authorization": "Bearer your-api-key"}
)

Architecture

Client Structure

The SDK follows a resource-based architecture where the main Client provides access to specialized resource objects:

graph TD
    A[Client] --> B[prompts]
    A --> C[datasets]
    A --> D[experiments]
    A --> E[spans]
    A --> F[annotations]
    
    B --> B1[create, get, list, format]
    C --> C1[create, append, get, list]
    D --> D1[run, track, evaluate]
    E --> E1[query, filter, export]
    F --> F1[add, update, evaluate]

SDK Package Structure

phoenix/
└── client/
    ├── client.py          # Main Client and AsyncClient classes
    ├── resources/         # Resource implementations
    │   ├── prompts.py
    │   ├── datasets.py
    │   ├── experiments.py
    │   ├── spans.py
    │   └── annotations.py
    └── helpers/
        └── sdk/           # SDK utilities and helpers

Sources: packages/phoenix-client/src/phoenix/client/client.py, packages/phoenix-client/src/phoenix/client/resources

Core Resources

Prompts Resource

Manage prompt templates and versions with versioning support:

from phoenix.client import Client
from phoenix.client.types import PromptVersion

client = Client()

content = """
You're an expert educator in {{ topic }}. Summarize the following article
in a few concise bullet points that are easy for beginners to understand.

{{ article }}
"""

prompt = client.prompts.create(
    name="article-bullet-summarizer",
    version=PromptVersion(
        messages=[{"role": "user", "content": content}],
        model_name="gpt-4o-mini",
    ),
    prompt_description="Summarize an article in a few bullet points"
)

# Retrieve and use prompts
prompt = client.prompts.get(prompt_identifier="article-bullet-summarizer")

# Format the prompt with variables
prompt_vars = {
    "topic": "Sports",
    "article": "Moises Henriques has signed to play for Surrey..."
}
formatted_prompt = prompt.format(variables=prompt_vars)

Prompt Version Type:

| Parameter | Type | Description |
| --- | --- | --- |
| messages | List[dict] | Message array with role and content |
| model_name | str | LLM model to use for the prompt |
| temperature | float | Sampling temperature (optional) |

Sources: packages/phoenix-client/README.md
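
The `{{ topic }}` and `{{ article }}` placeholders in the prompt above are filled in from the variables dictionary. To show the substitution idea in isolation, here is a small sketch using a regular expression; `render_template` is a hypothetical function and is not how `prompt.format` is implemented.

```python
import re
from typing import Mapping

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: Mapping[str, str]) -> str:
    """Replace each {{ name }} placeholder with the matching variable."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            # Fail loudly so a missing variable never reaches the model.
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    text = "You're an expert educator in {{ topic }}. Summarize: {{article}}"
    print(render_template(text, {"topic": "Sports", "article": "..."}))
```

Raising on a missing variable is a design choice for this sketch; a real formatter might instead leave the placeholder untouched or apply a default.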

Datasets Resource

Create and manage datasets from various data sources:

from phoenix.client import Client
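import pandas as pd

client = Client()

# Example frames; the column names and values are illustrative
df = pd.DataFrame({"text": ["great product", "too slow"], "label": ["positive", "negative"]})
additional_df = pd.DataFrame({"text": ["works as expected"], "label": ["positive"]})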

# Create from DataFrame
dataset = client.datasets.create(
    name="my-dataset",
    dataframe=df,
    description="Training data for sentiment analysis"
)

# Append additional data
client.datasets.append(
    dataset_id=dataset.id,
    dataframe=additional_df
)

# Get dataset
dataset = client.datasets.get(
    dataset_name="my-dataset",
    version_id="optional-version-id"
)

Supported Input Formats:

| Format | Method | Example |
| --- | --- | --- |
| DataFrame | dataframe parameter | dataframe=pd.DataFrame(...) |
| CSV | csv_path parameter | csv_path="/path/to/data.csv" |
| Dictionary | dictionary parameter | dictionary=[{"col": "value"}] |

Sources: packages/phoenix-client/README.md

Experiments Resource

Run evaluations and track experiment results:

from phoenix.client import Client

client = Client()

# Run an experiment
experiment = client.experiments.run(
    name="sentiment-analysis-v1",
    dataset_id="dataset-uuid",
    evaluator_config={
        "model": "gpt-4",
        "metrics": ["accuracy", "f1"]
    }
)

# Track results
client.experiments.track(
    experiment_id=experiment.id,
    results={"accuracy": 0.95, "f1": 0.93}
)

Sources: packages/phoenix-client/README.md

Spans Resource

Query and analyze traces with powerful filtering:

from phoenix.client import Client

client = Client()

# Query spans with filters
spans = client.spans.query(
    project_name="my-project",
    filter_conditions={
        "trace_id": "optional-trace-id",
        "start_time": "2024-01-01T00:00:00Z",
        "end_time": "2024-01-02T00:00:00Z"
    },
    limit=100
)

# Get span details
span = client.spans.get(span_id="span-uuid")

Query Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| project_name | str | Phoenix project name |
| filter_conditions | dict | Filtering conditions |
| limit | int | Maximum results to return |
| start_time | datetime | Start of time range |
| end_time | datetime | End of time range |

Sources: packages/phoenix-client/src/phoenix/client/resources
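
The time bounds in the query above are ISO 8601 strings in UTC. When the bounds start out as `datetime` objects, a small helper can produce a dictionary of that shape. This is a sketch under that assumption; `to_utc_iso` and `build_filter_conditions` are hypothetical names, and the dictionary keys are copied from the example rather than from a verified API contract.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def to_utc_iso(moment: datetime) -> str:
    """Format a timezone-aware datetime as e.g. 2024-01-01T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_filter_conditions(
    start: datetime, end: datetime, trace_id: Optional[str] = None
) -> Dict[str, str]:
    """Build a filter dict with a validated, UTC-normalised time window."""
    conditions = {"start_time": to_utc_iso(start), "end_time": to_utc_iso(end)}
    if end <= start:
        raise ValueError("end must be later than start")
    if trace_id is not None:
        conditions["trace_id"] = trace_id
    return conditions


if __name__ == "__main__":
    window = build_filter_conditions(
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        datetime(2024, 1, 2, tzinfo=timezone.utc),
    )
    print(window)
```

Rejecting naive datetimes up front avoids silently querying the wrong window when the local timezone differs from UTC.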

Annotations Resource

Add human feedback and automated evaluations:

from phoenix.client import Client

client = Client()

# Add annotation to span
annotation = client.annotations.add(
    span_id="span-uuid",
    label="correct",
    score=1.0,
    metadata={"reviewer": "human"}
)

# Bulk add annotations
client.annotations.bulk_add(
    annotations=[
        {"span_id": "span-1", "label": "correct", "score": 1.0},
        {"span_id": "span-2", "label": "incorrect", "score": 0.0}
    ]
)

Sources: packages/phoenix-client/README.md

Migration from Legacy Client

The legacy phoenix.session.client.Client has been removed. Users must migrate to the new SDK:

# Old (removed)
from phoenix.session.client import Client

# New
from phoenix.client import Client

Sources: src/phoenix/__init__.py

Data Flow Diagram

graph LR
    A[Python Application] -->|arize-phoenix-client| B[Phoenix REST API]
    B --> C[Phoenix Server]
    C --> D[(Database)]
    
    A --> E[Prompts]
    A --> F[Datasets]
    A --> G[Experiments]
    A --> H[Traces/Spans]
    A --> I[Annotations]
    
    E --> B
    F --> B
    G --> B
    H --> B
    I --> B

Use Cases

RAG Evaluation Workflow

from phoenix.client import Client

client = Client()

# evaluation_df is assumed to be a pandas DataFrame of evaluation examples prepared beforehand

# 1. Create dataset
dataset = client.datasets.create(
    name="rag-evaluation-set",
    dataframe=evaluation_df
)

# 2. Run evaluation experiment
experiment = client.experiments.run(
    name="rag-faithfulness-eval",
    dataset_id=dataset.id,
    evaluator_config={
        "model": "gpt-4",
        "metrics": ["faithfulness", "answer_relevance"]
    }
)

# 3. Query results
results = client.experiments.get_results(experiment_id=experiment.id)

# 4. Add annotations to spans
for span_id, score in results.items():
    client.annotations.add(
        span_id=span_id,
        label="faithful" if score > 0.8 else "unfaithful",
        score=score
    )
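
Step 4 applies a fixed threshold to each score. The labelling rule can be separated from the client calls so that it is easy to check on its own. The sketch below assumes the results are a mapping from span ID to a numeric score, as the loop above implies; `annotations_from_scores` is a hypothetical helper that produces dictionaries in the shape used by the bulk annotation example earlier in this section.

```python
from typing import Dict, List, Mapping, Union


def annotations_from_scores(
    scores: Mapping[str, float], threshold: float = 0.8
) -> List[Dict[str, Union[str, float]]]:
    """Turn {span_id: score} into annotation dicts with a thresholded label."""
    return [
        {
            "span_id": span_id,
            # Strictly greater than the threshold counts as faithful,
            # matching the comparison used in the workflow above.
            "label": "faithful" if score > threshold else "unfaithful",
            "score": float(score),
        }
        for span_id, score in scores.items()
    ]


if __name__ == "__main__":
    for annotation in annotations_from_scores({"span-1": 0.95, "span-2": 0.8}):
        print(annotation)
```

Building the full list first also makes it possible to submit the annotations in one bulk call instead of one request per span.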

Prompt Versioning and Management

from phoenix.client import Client
from phoenix.client.types import PromptVersion

client = Client()

# Create initial prompt version
prompt_v1 = client.prompts.create(
    name="customer-support-assistant",
    version=PromptVersion(
        messages=[{"role": "system", "content": "You are a helpful assistant."}],
        model_name="gpt-4"
    )
)

# Create new version
prompt_v2 = client.prompts.create(
    name="customer-support-assistant",
    version=PromptVersion(
        messages=[{"role": "system", "content": "You are a helpful support agent trained to be concise."}],
        model_name="gpt-4"
    )
)

# Compare versions
current = client.prompts.get(prompt_identifier="customer-support-assistant")

Summary

The arize-phoenix-client Python SDK provides a comprehensive, Pythonic interface for interacting with the Phoenix observability platform. Key capabilities include:

  • Unified Client Interface - Single entry point for all Phoenix operations
  • Resource-Based Design - Organized access to prompts, datasets, experiments, spans, and annotations
  • Environment-Based Configuration - Zero-config setup via environment variables
  • Async Support - Built-in async client for asynchronous applications
  • Type Safety - Full type hints and Pydantic models for validation

Sources: packages/phoenix-client/src/phoenix/client/client.py, packages/phoenix-client/README.md

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: Docs proposal: RAG failure mode checklist for observability and eval workflows

  • Severity: high
  • Finding: Installation risk is backed by a source signal: Docs proposal: RAG failure mode checklist for observability and eval workflows. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/11472

2. Installation risk: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with App…

  • Severity: high
  • Finding: Installation risk is backed by a source signal: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with App…. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/12941

3. Installation risk: [agents] investigate clientside tracing for external tools

  • Severity: high
  • Finding: Installation risk is backed by a source signal: [agents] investigate clientside tracing for external tools. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/13173

4. Installation risk: Developers should check this installation risk before relying on the project: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
  • User impact: Developers may fail before the first successful local run: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM. Context: Observed when using python, docker, linux
  • Evidence: failure_mode_cluster:github_issue | fmev_b3db5f930ac5e7b7f85e47ff9693c190 | https://github.com/Arize-ai/phoenix/issues/12941 | [BUG]: Docker image exits immediately (SIGILL) on Apple Silicon with podman — cryptography 47.0.0 incompatible with Apple Hypervisor VM

5. Configuration risk: Developers should check this configuration risk before relying on the project: [sandboxes] per-execute timeout enforcement is incomplete across backends

  • Severity: medium
  • Finding: Developers should check this configuration risk before relying on the project: [sandboxes] per-execute timeout enforcement is incomplete across backends
  • User impact: Developers may misconfigure credentials, environment, or host setup: [sandboxes] per-execute timeout enforcement is incomplete across backends
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [sandboxes] per-execute timeout enforcement is incomplete across backends. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_issue | fmev_5978fbc9db2aea6762b2ab2f9e8d0205 | https://github.com/Arize-ai/phoenix/issues/13313 | [sandboxes] per-execute timeout enforcement is incomplete across backends

6. Configuration risk: [sandboxes] per-execute timeout enforcement is incomplete across backends

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: [sandboxes] per-execute timeout enforcement is incomplete across backends. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/issues/13313

7. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:564072810 | https://github.com/Arize-ai/phoenix | README/documentation is current enough for a first validation pass.

8. Project risk: Developers should check this runtime risk before relying on the project: arize-phoenix: v15.5.0

  • Severity: medium
  • Finding: Developers should check this runtime risk before relying on the project: arize-phoenix: v15.5.0
  • User impact: Upgrade or migration may change expected behavior: arize-phoenix: v15.5.0
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: arize-phoenix: v15.5.0. Context: Observed when using docker
  • Evidence: failure_mode_cluster:github_release | fmev_77bc9f7097156e71a699d05abced2916 | https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.0 | arize-phoenix: v15.5.0

9. Project risk: arize-phoenix: v15.3.0

  • Severity: medium
  • Finding: Project risk is backed by a source signal: arize-phoenix: v15.3.0. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.3.0

10. Project risk: arize-phoenix: v15.5.0

  • Severity: medium
  • Finding: Project risk is backed by a source signal: arize-phoenix: v15.5.0. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.0

11. Project risk: arize-phoenix: v15.5.1

  • Severity: medium
  • Finding: Project risk is backed by a source signal: arize-phoenix: v15.5.1. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.5.1

12. Project risk: arize-phoenix: v15.6.0

  • Severity: medium
  • Finding: Project risk is backed by a source signal: arize-phoenix: v15.6.0. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Arize-ai/phoenix/releases/tag/arize-phoenix-v15.6.0

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12. Count of project-level external discussion links exposed on this manual page.

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using phoenix with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence