agent-lightning Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

agent-lightning

Introduction to Agent Lightning

Related topics: System Architecture, Installation Guide

Agent Lightning is a reinforcement learning framework designed to train any AI agent with RL algorithms. The project provides a unified execution stack, instrumentation capabilities, and training infrastructure that enables researchers and developers to improve agent behavior through reward-based learning. Source: README.md:1

What is Agent Lightning?

Agent Lightning bridges the gap between raw agent execution and RL-based training by providing:

Instrumentation Layer: Transparent tracing and logging of agent interactions
Training Infrastructure: Built-in support for RL algorithms like GRPO
Distributed Execution: Multi-worker rollout management with state synchronization
Integration Points: Adapters for popular agent frameworks and execution environments

The framework treats agent training as a continuous feedback loop where traces collected from agent execution are consumed by training algorithms to improve policy behavior over time. Source: CLAUDE.md:3

Architecture Overview

The Agent Lightning architecture follows a producer-consumer pattern centered around trace collection and consumption.

Core Loop

graph TD
    A[Runner] -->|emits spans| B[Tracers]
    B -->|writes traces| C[LightningStore]
    C -->|serves traces| D[Algorithms]
    D -->|updates policy| A
    C -->|serves traces| E[Dashboard]

The continuous execution loop works as follows:

Runners execute agents and emit execution spans
Tracers capture and format these spans with semantic conventions
LightningStore maintains synchronized state across all components
Algorithms consume traces to compute rewards and update agent policies
Dashboard provides real-time visualization for debugging

Source: CLAUDE.md:3
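
The loop above can be sketched with plain Python objects. This is a minimal illustration of the producer-consumer idea only: `SketchSpan`, `SketchStore`, `run_agent`, and `mean_reward` are hypothetical names invented for this example and are not part of the agentlightning API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchSpan:
    """Hypothetical stand-in for a span; not the library's Span type."""
    rollout_id: str
    name: str
    reward: Optional[float] = None


@dataclass
class SketchStore:
    """Hypothetical in-memory stand-in for LightningStore."""
    spans: List[SketchSpan] = field(default_factory=list)

    def add_span(self, span: SketchSpan) -> None:
        self.spans.append(span)

    def query_spans(self, rollout_id: str) -> List[SketchSpan]:
        return [s for s in self.spans if s.rollout_id == rollout_id]


def run_agent(store: SketchStore, rollout_id: str, prompt: str) -> None:
    """Producer side: a runner executes the agent and records spans."""
    store.add_span(SketchSpan(rollout_id, "llm_call"))
    # Toy reward rule (an assumption): 1.0 for a non-empty prompt, else 0.0.
    reward = 1.0 if prompt else 0.0
    store.add_span(SketchSpan(rollout_id, "reward", reward=reward))


def mean_reward(store: SketchStore, rollout_id: str) -> float:
    """Consumer side: an algorithm reads spans back and aggregates rewards."""
    rewards = [s.reward for s in store.query_spans(rollout_id) if s.reward is not None]
    return sum(rewards) / len(rewards) if rewards else 0.0
```

The point of the design is that producers and consumers never call each other directly; both sides talk only to the store, so either side can be scaled or replaced independently.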

Component Hierarchy

Layer	Components	Responsibility
Execution	`Runner`, `LitAgent`	Execute agent logic and manage lifecycle
Instrumentation	`Tracer`, `OtelTracer`, `AgentOpsTracer`	Capture execution traces
Storage	`LightningStore`, `LightningStoreClient`	Synchronized state management
Training	Algorithms in `agentlightning/algorithm/`	Process traces, compute rewards
CLI	`agl` command	User-facing interface

Source: agentlightning/cli/__init__.py:13-16

Core Data Models

The framework defines several fundamental data structures in agentlightning/types/core.py.

Task and Rollout

classDiagram
    class Task {
        +str task_id
        +Any input
        +Optional~str~ instance_id
        +Optional~str~ dataset
    }
    class Rollout {
        +str rollout_id
        +str status
        +Optional~str~ worker_id
        +List~Attempt~ attempts
    }
    class Attempt {
        +str attempt_id
        +str status
        +List~Span~ spans
        +Optional~float~ reward
    }
    class Triplet {
        +Any prompt
        +Any response
        +Optional reward
    }
    
    Task "1" --> "*" Rollout
    Rollout "1" --> "*" Attempt
    Attempt --> Triplet

Core Type Exports

Type	Purpose
`Task`	Represents a unit of work to be executed by an agent
`Rollout`	Collection of attempts for a single task execution
`Attempt`	Single execution attempt with spans and reward
`Triplet`	Prompt-response-reward tuple for RL training
`LightningStore`	Synchronized state store for distributed execution

Source: agentlightning/types/core.py:1-60
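
To make the relationships in the class diagram concrete, the sketch below mirrors its fields with standard-library dataclasses. It is an illustration derived from the diagram, not a copy of the definitions in agentlightning/types/core.py; the helper `best_reward` is a hypothetical addition.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Task:
    task_id: str
    input: Any
    instance_id: Optional[str] = None
    dataset: Optional[str] = None


@dataclass
class Attempt:
    attempt_id: str
    status: str
    spans: List[Any] = field(default_factory=list)
    reward: Optional[float] = None


@dataclass
class Rollout:
    rollout_id: str
    status: str
    worker_id: Optional[str] = None
    attempts: List[Attempt] = field(default_factory=list)


def best_reward(rollout: Rollout) -> Optional[float]:
    """Hypothetical helper: highest reward among attempts that have one."""
    rewards = [a.reward for a in rollout.attempts if a.reward is not None]
    return max(rewards) if rewards else None
```

One task can produce many rollouts, and each rollout can hold several attempts, so any per-rollout summary has to decide how to combine attempt-level rewards; taking the maximum is only one possible choice.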

Runner System

The runner system provides the execution context for agents with integrated lifecycle management.

Runner Lifecycle

sequenceDiagram
    participant User
    participant Runner
    participant Store
    participant Agent
    
    User->>Runner: async with Runner(agent, store)
    Runner->>Runner: init(agent)
    Runner->>Runner: init_worker(store)
    Runner->>Store: Register worker
    loop Until event
        Runner->>Agent: Execute task
        Agent-->>Runner: Result
        Runner->>Store: Update state
    end
    Runner->>Runner: teardown_worker()
    Runner->>Runner: teardown()

Runner Base Class

The Runner class provides context manager support for safe initialization and cleanup:

async with runner:
    runner.init(agent=agent, hooks=hooks)
    runner.init_worker(worker_id=0, store=store)
    # Execute tasks...

Key runner responsibilities:

Initialization: Set up agent and worker state
Execution: Poll store for tasks and execute them
Cleanup: Graceful teardown of worker and agent resources

Source: agentlightning/runner/base.py:1-80
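
The ordering guarantee in the lifecycle diagram (init, init_worker, execute, teardown_worker, teardown) can be sketched with an async context manager. `SketchRunner` and `run_context` here are hypothetical stand-ins written for this example; they only record call order and are not the library's `Runner` implementation.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, List


class SketchRunner:
    """Hypothetical runner that only records the order of lifecycle calls."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def init(self, agent: Any) -> None:
        self.events.append("init")

    def init_worker(self, worker_id: int, store: Any) -> None:
        self.events.append("init_worker")

    def teardown_worker(self, worker_id: int) -> None:
        self.events.append("teardown_worker")

    def teardown(self) -> None:
        self.events.append("teardown")


@asynccontextmanager
async def run_context(
    runner: SketchRunner, agent: Any, store: Any, worker_id: int = 0
) -> AsyncIterator[SketchRunner]:
    """Set up agent then worker; tear down in reverse order, even on error."""
    runner.init(agent)
    try:
        runner.init_worker(worker_id, store)
        try:
            yield runner
        finally:
            runner.teardown_worker(worker_id)
    finally:
        runner.teardown()


async def demo() -> List[str]:
    runner = SketchRunner()
    async with run_context(runner, agent=object(), store={}):
        runner.events.append("execute")
    return runner.events


events = asyncio.run(demo())
print(events)
```

Nesting the two `try`/`finally` blocks means worker teardown always runs before agent teardown, including when task execution raises.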

Tracing and Instrumentation

Agent Lightning provides multiple tracing backends for capturing agent execution.

Supported Tracers

Tracer	Use Case	Backend
`OtelTracer`	OpenTelemetry-compatible tracing	OTLP endpoint
`AgentOpsTracer`	AgentOps platform integration	AgentOps service
Custom Tracer	Framework integration	Pluggable

Semantic Conventions

The framework defines semantic conventions in agentlightning/semconv.py for consistent span attributes:

Attribute	Description
`LightningSpanAttributes.REWARD`	Reward values for RL spans
`LightningSpanAttributes.LINK`	Span linking relationships
`LightningSpanAttributes.TAG`	Custom span tagging
`LightningResourceAttributes.ROLLOUT_ID`	Rollout identification
`LightningResourceAttributes.ATTEMPT_ID`	Attempt identification

Source: agentlightning/semconv.py:1-40

Trace Writing Example

The minimal examples demonstrate trace writing with LightningStore:

from agentlightning import AgentOpsTracer, InMemoryLightningStore, LightningStoreClient, OtelTracer, Span

# Write traces directly to an in-memory store
store = InMemoryLightningStore()
tracer = OtelTracer(store=store)

# Or connect to a server-side store (for example, one started with `agl store`)
client = LightningStoreClient(endpoint="http://localhost:45993")

Source: examples/minimal/write_traces.py:1-50

LightningStore

LightningStore is the central state management component that keeps all components synchronized.

Store Capabilities

graph LR
    A[Runners] -->|enqueue/dequeue| B[Rollouts]
    A -->|register| C[Workers]
    D[Tracers] -->|write spans| B
    E[Algorithms] -->|query traces| B
    F[Dashboard] -->|inspect state| B

Store Collections

Collection	Data Type	Access Pattern
`rollouts`	`Rollout`	Enqueue/dequeue by worker
`attempts`	`Attempt`	Link to rollout
`spans`	`Span`	Query by attempt
`workers`	`Worker`	Heartbeat management
`resources`	`ResourcesUpdate`	Model/prompt versioning

Source: dashboard/test-utils/python-server.py:1-100
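
The "enqueue/dequeue by worker" and "query by attempt" access patterns in the table can be sketched as follows. `SketchQueueStore` and its method names are hypothetical and chosen for this illustration; the real store interface may differ.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class SketchQueueStore:
    """Hypothetical store: FIFO rollout queue plus spans keyed by attempt."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()
        self.assigned: Dict[str, str] = {}  # rollout_id -> worker_id
        self._spans: Dict[Tuple[str, str], List[dict]] = {}

    def enqueue_rollout(self, rollout_id: str) -> None:
        self._queue.append(rollout_id)

    def dequeue_rollout(self, worker_id: str) -> Optional[str]:
        """Hand the oldest pending rollout to a worker, or None if empty."""
        if not self._queue:
            return None
        rollout_id = self._queue.popleft()
        self.assigned[rollout_id] = worker_id
        return rollout_id

    def add_span(self, rollout_id: str, attempt_id: str, span: dict) -> None:
        self._spans.setdefault((rollout_id, attempt_id), []).append(span)

    def query_spans(self, rollout_id: str, attempt_id: str) -> List[dict]:
        return list(self._spans.get((rollout_id, attempt_id), []))
```

Keying spans by the (rollout, attempt) pair keeps retries separate, so an algorithm can inspect one attempt without mixing in spans from earlier failed tries.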

Training Algorithms

Agent Lightning integrates with reinforcement learning algorithms to improve agent behavior.

Algorithm Integration

The framework supports pluggable algorithms defined in agentlightning/algorithm/. Algorithms consume traces from the LightningStore and compute policy updates.

Agent-OS Integration

For production safety-critical deployments, Agent Lightning integrates with Agent-OS:

from agentlightning.contrib.runner.agentos import AgentOSRunner
from agentlightning.contrib.reward.agentos import PolicyReward

# `kernel` is an Agent-OS kernel instance constructed elsewhere (not shown in the source).
runner = AgentOSRunner(kernel, fail_on_violation=False, emit_violations=True)
reward_fn = PolicyReward(kernel)

This integration provides:

Policy enforcement: Kernel-level safety during training
Violation penalties: Unsafe actions convert to negative RL rewards
Audit trail: Complete visibility from training to production

Source: contrib/recipes/agentos/README.md:1-60
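
The idea that unsafe actions turn into negative rewards can be illustrated with a simple shaping rule. This is a sketch under assumptions: the function `penalized_reward`, the linear penalty, and its default weight are invented for the example and do not describe how `PolicyReward` actually computes its value.

```python
def penalized_reward(task_reward: float, violations: int, penalty: float = 1.0) -> float:
    """Subtract a fixed penalty per policy violation from the task reward.

    The linear form and the default penalty of 1.0 are assumptions made
    for this sketch only.
    """
    if violations < 0:
        raise ValueError("violations must be non-negative")
    return task_reward - penalty * violations


# A rollout that solved the task (reward 1.0) but violated policy twice
# ends up with a negative training signal.
print(penalized_reward(1.0, violations=2))
```

With a rule of this shape, an agent that completes the task while breaking policy is scored worse than one that fails safely, which is the behavior the integration aims to encourage.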

Minimal Component Showcase

The examples/minimal/ directory provides isolated demonstrations of individual building blocks.

Available Examples

Component	File	Demonstrates
LightningStore + OTLP	`write_traces.py`	`OtelTracer`, `AgentOpsTracer`, rollout/span emission
MultiMetrics backend	`write_metrics.py`	Console and Prometheus metrics simultaneously
LLM proxying	`llm_proxy.py`	Request routing through `/rollout/<id>/attempt/<id>`
vLLM lifecycle	`vllm_server.py`	Server startup, readiness monitoring, teardown

Each example is self-documenting with CLI arguments and environment variables embedded in module docstrings.

Source: examples/minimal/README.md:1-30

Command-Line Interface

The agl CLI provides entry points for all major framework operations.

Available Subcommands

Command	Module	Description
`agl vllm`	`agentlightning.cli.vllm`	vLLM server with instrumentation
`agl store`	`agentlightning.cli.store`	LightningStore server
`agl prometheus`	`agentlightning.cli.prometheus`	Prometheus metrics endpoint
`agl agentops`	`agentlightning.cli.agentops_server`	AgentOps server manager

Starting a LightningStore Server

agl store --port 45993 --log-level DEBUG

The store server enables distributed execution where multiple workers can connect and synchronize state.

Source: agentlightning/cli/__init__.py:1-35

Dashboard

The Agent Lightning Dashboard is a React-based web application for inspecting store state and debugging experiments.

Features

Real-time state inspection: View rollouts, attempts, and spans
Worker monitoring: Track worker status and heartbeat statistics
Resource visualization: Inspect model configurations and prompts
Experiment debugging: Analyze trace sequences and reward flows

Technology Stack

Layer	Technology
Framework	React
UI Components	Mantine UI
Documentation	Storybook
Testing	Vitest

Source: dashboard/README.md:1-35

Project Structure

agent-lightning/
├── agentlightning/          # Core library
│   ├── algorithm/           # RL training algorithms
│   ├── cli/                # Command-line interface
│   ├── contrib/            # Third-party integrations
│   ├── runner/             # Execution runners
│   ├── store/              # LightningStore implementations
│   ├── tracer/             # Tracing backends
│   ├── types/              # Data models
│   └── semconv.py          # Semantic conventions
├── contrib/
│   └── recipes/            # Integration examples (webshop, agentos)
├── dashboard/              # React web application
├── docs/                   # Documentation (mkdocs)
├── examples/               # Runnable workflows
├── scripts/                # Automation scripts
└── tests/                  # Test suite

Source: CLAUDE.md:5-15

Development Workflow

Setup

uv sync --group dev

Testing

# Full test suite
uv run --no-sync pytest -v

# Specific tests
uv run --no-sync pytest -v tests/path/to/test.py
uv run --no-sync pytest -v -k "test_pattern"

Type Checking

uv run --no-sync pyright

Pre-commit Checks

uv run --no-sync pre-commit run --all-files --show-diff-on-failure

Documentation

uv run --no-sync mkdocs build --strict

Source: CLAUDE.md:18-30

Contributing

Agent Lightning welcomes contributions through a structured process:

Branch naming: feature/<slug>, fix/<slug>, docs/<slug>, or chore/<slug>
Commits: Imperative, scoped commits with issue references (e.g., Fixes #123)
Pre-submission: Run pre-commit hooks and relevant pytest/doc builds
CLA: Contributor License Agreement required (automatically prompted by CLA bot)

Source: README.md:50-70

Citation

If you use Agent Lightning in research, please cite:

@misc{luo2025agentlightningtrainai,
      title={Agent Lightning: Train ANY AI Agents with Reinforcement Learning},
      author={Xufang Luo and Yuge Zhang and Zhiyuan He and Zilong Wang and Siyun Zhao and Dongsheng Li and Luna K. Qiu and Yuqing Yang},
      year={2025},
      eprint={2508.03680},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2508.03680},
}

Source: README.md:15-25

Installation Guide

Related topics: Introduction to Agent Lightning, Tutorial: Train Your First Agent

This guide covers all supported methods for installing and configuring Agent Lightning in your environment. Agent Lightning is a reinforcement learning framework for training AI agents, with support for GPU acceleration, distributed training, and various algorithm backends.

Prerequisites

System Requirements

Component	Minimum	Recommended
Python	3.10+	3.11 or 3.12
OS	Linux (Ubuntu 20.04+), macOS	Linux with CUDA
RAM	8 GB	32 GB+
GPU	Optional	NVIDIA GPU with CUDA 11.8+
Disk Space	5 GB	20 GB+

Source: contrib/recipes/webshop/agl/requirements.txt:1-3

Required Tools

uv: Modern Python package manager (recommended)
Git: For cloning the repository
CUDA Toolkit (for GPU training): Version 11.8 or later

Installation Methods

Method 1: Install from Source

This is the recommended approach for development and contributing.

# Clone the repository
git clone https://github.com/microsoft/agent-lightning.git
cd agent-lightning

# Install all dependencies including development tools
uv sync --group dev

# Install optional GPU dependencies
uv sync --group GPU

Source: CLAUDE.md:20-22

Method 2: Install with Specific Algorithm Backends

Agent Lightning supports multiple reinforcement learning algorithms through optional dependency groups:

# Install with VERL backend (recommended for GPU training)
uv sync --group VERL

# Install with APO backend
uv sync --group APO

# Install with GPU optimizations
uv sync --group GPU

Source: CLAUDE.md:22

Method 3: Using setup.sh for GPU Training

For GPU-accelerated training with the webshop recipe:

# From the contrib/recipes/webshop directory
./setup.sh

This script installs VERL extras for GPU training support. Source: contrib/recipes/webshop/agl/requirements.txt:1-8

Dependency Groups

The pyproject.toml defines several optional dependency groups:

Group	Purpose	Installation Command
`dev`	Development tools (pytest, pyright, pre-commit)	`uv sync --group dev`
`GPU`	GPU acceleration packages	`uv sync --group GPU`
`VERL`	VERL algorithm backend	`uv sync --group VERL`
`APO`	APO algorithm backend	`uv sync --group APO`

Source: CLAUDE.md:20-23

Environment Setup

Creating a Virtual Environment

Using uv (recommended):

# Create and activate a new virtual environment
uv venv
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate     # Windows

Verifying Installation

Run the test suite to verify your installation:

# Run all tests
uv run --no-sync pytest -v

# Run specific test
uv run --no-sync pytest -v tests/path/to/test.py

# Run tests matching a pattern
uv run --no-sync pytest -v -k "test_pattern"

Source: CLAUDE.md:21

Type Checking

Verify type annotations are correct:

uv run --no-sync pyright

Source: CLAUDE.md:22

Pre-commit Hooks

Before committing code, run pre-commit checks:

uv run --no-sync pre-commit run --all-files --show-diff-on-failure

Source: CLAUDE.md:23

Dashboard Installation

The Agent Lightning Dashboard is a separate React application:

cd dashboard

# Install dependencies
npm install

# Start development server
npm run dev

# Build for production
npm run build

Source: dashboard/README.md:npm scripts section

Dashboard npm Scripts

Script	Purpose
`dev`	Start development server
`build`	Build production bundle
`preview`	Preview production build locally
`storybook`	Start Storybook dev server
`build-storybook`	Build Storybook bundle
`eslint`	Run ESLint
`stylelint`	Run Stylelint
`prettier`	Run Prettier
`typecheck`	Run TypeScript typecheck
`vitest`	Run vitest tests

Source: dashboard/README.md:npm scripts

Recipe-Specific Installation

Webshop Recipe

The webshop recipe has specific dependencies:

cd contrib/recipes/webshop/agl

# Install requirements
pip install -r requirements.txt

# For GPU training
./setup.sh

Required dependencies include:

pandas>=2.0.0 - Data manipulation
pyarrow>=14.0.0 - Parquet file support
rich>=13.0.0 - Terminal formatting
tqdm>=4.64.0 - Progress bars

Source: contrib/recipes/webshop/agl/requirements.txt:1-15

Development Workflow

Branching Conventions

Create feature branches from a fresh main:

Branch Type	Naming Convention
Feature	`feature/<slug>`
Fix	`fix/<slug>`
Documentation	`docs/<slug>`
Maintenance	`chore/<slug>`

Source: CLAUDE.md:8, AGENTS.md:8

Commit and PR Guidelines

Write imperative, scoped commit messages
Reference issues with Fixes #123
Rerun pre-commit and relevant pytest/doc builds before pushing
Include verification commands in PR descriptions
Update documentation via mkdocs.yml or examples/README.md

Source: CLAUDE.md:9-13, AGENTS.md:9-13

GPU Configuration

For optimal GPU training performance:

Install NVIDIA drivers (CUDA 11.8+)
Install the GPU dependency group
For VERL-based training, also install the VERL group: uv sync --group VERL

GPU metrics are tracked via heartbeat statistics in worker nodes:

heartbeat_stats={"queue_depth": 2, "gpu_utilization": 0.82}

Source: dashboard/test-utils/python-server.py:Worker class

Troubleshooting

Common Issues

Issue	Solution
`uv` command not found	Install uv: `pip install uv`
CUDA not found	Ensure NVIDIA drivers and CUDA toolkit are installed
Import errors	Run `uv sync` to ensure all dependencies are installed
Type checking failures	Run `uv run --no-sync pyright` to identify issues

Source: CLAUDE.md:26-30

Lock File Updates

When dependencies change, commit the refreshed uv.lock:

git add uv.lock
git commit -m "chore: update lock file"

Source: CLAUDE.md:24

Next Steps

After installation:

Explore Minimal Component Showcase to understand individual components
Set up the LightningStore for trace storage
Configure tracers for your agent execution
Review the Algorithm Documentation for training options

Source: https://github.com/microsoft/agent-lightning / Human Manual

System Architecture

Related topics: Trainer Component, Runner Component, LightningStore

Overview

Agent Lightning is a reinforcement learning framework for training AI agents, with a distributed system architecture that supports multi-worker training orchestration, resource management, and distributed tracing. The system consists of three primary layers: a Backend Training Engine, a State Store, and a Dashboard Frontend.

The architecture enables parallel training across multiple workers, centralized resource configuration, and real-time monitoring of training workflows through traces and metrics.

Source: dashboard/src/layouts/AppLayout.tsx:1-50

High-Level Architecture Components

The Agent Lightning system comprises the following core entities:

Component	Description	Key Attributes
Resources	Configuration templates for prompts, models, and sampling parameters	`resources_id`, `version`, `resources` (dict with PromptTemplate/LLM)
Workers	Runner processes that execute training rollouts	`worker_id`, `status`, `heartbeat_stats`, `current_rollout_id`
Rollouts	Complete training episodes with multiple attempts	`rollout_id`, `status`, `mode`, `attempts`
Attempts	Individual training attempts within a rollout	`attempt_id`, `status`, `metrics`
Spans	Distributed tracing spans for observability	`trace_id`, `span_id`, `status`, `attributes`, `start_time`, `end_time`

Source: dashboard/test-utils/python-server.py:1-300

Frontend Dashboard Architecture

The dashboard is a React-based frontend built with Mantine UI components that communicates with the backend via REST APIs.

The application uses a sidebar navigation layout with the following sections:

graph TD
    A[AppLayout] --> B[Navbar]
    A --> C[Main Content Area]
    B --> D[Rollouts]
    B --> E[Workers]
    B --> F[Resources]
    B --> G[Traces]
    B --> H[Settings]
    C --> I[Outlet Component]

Source: dashboard/src/layouts/AppLayout.tsx:20-50

Page Components

Page	File Path	Purpose
Rollouts	`dashboard/src/pages/Rollouts.page.tsx`	Display and manage training rollouts with status filtering
Workers	`dashboard/src/pages/Workers.page.tsx`	Monitor worker health and current assignments
Resources	`dashboard/src/pages/Resources.page.tsx`	View and manage configuration resources
Traces	`dashboard/src/components/TracesTable.component.tsx`	Analyze distributed tracing spans

Source: dashboard/src/pages/Rollouts.page.tsx:1-80

Data Flow Architecture

Worker Heartbeat Flow

Workers periodically send heartbeat signals to indicate their operational state. The dashboard monitors these heartbeats to determine worker availability.

sequenceDiagram
    participant W as Worker
    participant S as Store
    participant D as Dashboard
    
    W->>S: Heartbeat (status, queue_depth, gpu_utilization)
    S->>S: Update last_heartbeat_time
    D->>S: Poll /workers endpoint
    S-->>D: Worker list with status

Source: dashboard/test-utils/python-server.py:100-150
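
One way a store or dashboard might turn heartbeats into an availability status is sketched below. The status names `idle`, `busy`, and `offline` come from the Workers entity described in this manual; the function `worker_status` and the 30-second timeout are assumptions for illustration.

```python
def worker_status(
    last_heartbeat_time: float,
    now: float,
    has_current_rollout: bool,
    timeout_s: float = 30.0,
) -> str:
    """Classify a worker from its last heartbeat timestamp.

    A worker whose heartbeat is older than `timeout_s` (an assumed
    threshold) is treated as offline regardless of its assignment.
    """
    if now - last_heartbeat_time > timeout_s:
        return "offline"
    return "busy" if has_current_rollout else "idle"


# Timestamps are plain seconds; a real caller would pass time.time().
print(worker_status(last_heartbeat_time=100.0, now=110.0, has_current_rollout=True))
```

Passing `now` explicitly instead of reading the clock inside the function keeps the classification deterministic and easy to test.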

Rollout Execution Flow

Training rollouts follow a multi-attempt execution model:

graph LR
    A[Rollout Created] --> B[Attempt 1]
    B --> C{Success?}
    C -->|Yes| D[Rollout Complete]
    C -->|No| E[Attempt 2]
    E --> F{Success?}
    F -->|Yes| D
    F -->|No| G[Attempt N]
    G --> H[Max Attempts Reached]

Source: dashboard/src/components/TracesTable.component.tsx:50-150
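
The multi-attempt flow in the diagram amounts to a bounded retry loop. The sketch below assumes a hypothetical `run_rollout` helper and a caller-supplied `attempt_fn`; the default of three attempts is an assumption, not a documented setting.

```python
from typing import Callable, Tuple


def run_rollout(attempt_fn: Callable[[int], bool], max_attempts: int = 3) -> Tuple[str, int]:
    """Retry until an attempt succeeds or the attempt budget is exhausted.

    Returns the final status and the number of attempts that were made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt_number in range(1, max_attempts + 1):
        if attempt_fn(attempt_number):
            return ("succeeded", attempt_number)
    return ("failed", max_attempts)


# An attempt function that only succeeds on its second try.
print(run_rollout(lambda n: n == 2))
```

Returning the attempt count alongside the status lets a caller distinguish a first-try success from one that needed retries.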

Core Entity Schemas

Resources Entity

Resources define reusable configuration templates used by workers during training.

Field	Type	Description
`resources_id`	string	Unique identifier for the resource
`version`	integer	Version number for tracking changes
`create_time`	timestamp	Creation timestamp
`update_time`	timestamp	Last modification timestamp
`resources`	dict	Configuration dictionary (PromptTemplate, LLM configs)

Source: dashboard/test-utils/python-server.py:50-100

Workers Entity

Field	Type	Description
`worker_id`	string	Unique worker identifier
`status`	enum	Current status: `idle`, `busy`, `offline`
`heartbeat_stats`	dict	Metrics including `queue_depth`, `gpu_utilization`
`last_heartbeat_time`	timestamp	Time of last heartbeat
`current_rollout_id`	string	Currently assigned rollout (if busy)
`current_attempt_id`	string	Currently executing attempt

Source: dashboard/src/components/AppDrawer.component.tsx:1-60

Spans Entity (Distributed Tracing)

Field	Type	Description
`rollout_id`	string	Associated rollout
`attempt_id`	string	Associated attempt
`trace_id`	string	Distributed trace identifier
`span_id`	string	Unique span identifier
`parent_id`	string	Parent span ID for hierarchy
`name`	string	Operation name (e.g., `classification_pipeline`)
`status`	TraceStatus	Status with `status_code` (OK, ERROR) and description
`attributes`	dict	Key-value metadata (model, batch_size, accuracy)
`start_time`	timestamp	Span start time
`end_time`	timestamp	Span end time

Source: dashboard/src/components/TracesTable.component.tsx:50-120

Component Architecture (Frontend)

Table Components Pattern

The dashboard uses a consistent table component pattern across all pages:

graph TD
    A[Page Component] --> B[Table Component]
    B --> C[Column Definitions]
    B --> D[Filtering Logic]
    B --> E[Pagination Controls]
    A --> F[useQuery Hook]
    F --> G[API Endpoints]

Component	Props	Purpose
`RolloutTable`	`rollouts`, `totalRecords`, `statusFilters`, `onViewTraces`	Training rollout display
`WorkersTable`	`workers`, `onShowDetails`	Worker monitoring
`ResourcesTable`	`resourcesList`, `renderRowExpansion`	Resource configuration
`TracesTable`	`spans`, `onShowSpanDetail`	Trace analysis

Source: dashboard/src/components/WorkersTable.component.tsx:1-80

Drawer Container Pattern

The application uses an AppDrawerContainer for displaying detailed information:

graph TD
    A[AppDrawerContainer] --> B[Redux State]
    B --> C{Content Type}
    C -->|worker-detail| D[WorkerDrawerTitle]
    C -->|rollout-detail| E[RolloutDrawer]
    C -->|span-detail| F[SpanDetailDrawer]
    D --> G[ConnectionIndicator]
    G --> H[baseUrl, status, isRefreshing]

Source: dashboard/src/components/AppDrawer.component.tsx:60-120

State Management

The frontend uses Redux for state management with the following key selectors:

Selector	Purpose
`selectConfig`	Application configuration (baseUrl, autoRefreshMs)
`selectDrawerIsOpen`	Drawer visibility state
`selectDrawerContent`	Current drawer content type and data
`selectConnectionState`	Backend connection status

Source: dashboard/src/layouts/AppLayout.tsx:50-80

Connection Management

The dashboard includes a ConnectionIndicator component that displays the connection status to the backend:

Status	Description
`connected`	Successfully connected to backend
`disconnected`	Cannot reach backend
`refreshing`	Actively reconnecting

Source: dashboard/src/layouts/AppLayout.tsx:40-45

Training Workflow Integration

Status Lifecycle

Rollouts and attempts follow a defined status lifecycle:

Status	Description
`pending`	Initial state, not yet started
`running`	Currently executing
`succeeded`	Completed successfully
`failed`	Execution failed
`cancelled`	Manually cancelled

Mode Types

Mode	Description
`train`	Training mode with gradient updates
`eval`	Evaluation mode without updates
`inference`	Production inference mode

Source: dashboard/src/pages/Rollouts.page.tsx:30-60

Observability Architecture

Trace Hierarchy

Traces are organized in a hierarchical structure:

Trace
└── Span (root)
    ├── Span (child - preprocess)
    ├── Span (child - classifier)
    └── Span (child - formatter)

Each span captures:

Execution timing (start_time, end_time, duration)
Status and error information
Custom attributes (model, batch_size, accuracy)
Resource metadata (service name)

Source: dashboard/test-utils/python-server.py:200-300
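
The hierarchy above can be rebuilt from a flat list of spans by following `parent_id` links. This sketch uses plain dictionaries with the `span_id`, `parent_id`, and `name` fields from the Spans entity table; `build_tree` and `render` are hypothetical helpers, not dashboard or library functions.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_tree(spans: List[dict]) -> Dict[Optional[str], List[dict]]:
    """Group spans by parent_id; root spans are stored under the key None."""
    children: Dict[Optional[str], List[dict]] = defaultdict(list)
    for span in spans:
        children[span.get("parent_id")].append(span)
    return children


def render(
    children: Dict[Optional[str], List[dict]],
    parent_id: Optional[str] = None,
    depth: int = 0,
) -> List[str]:
    """Depth-first walk that indents each span name by its nesting level."""
    lines: List[str] = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + span["name"])
        lines.extend(render(children, span["span_id"], depth + 1))
    return lines


spans = [
    {"span_id": "s0", "parent_id": None, "name": "classification_pipeline"},
    {"span_id": "s1", "parent_id": "s0", "name": "preprocess"},
    {"span_id": "s2", "parent_id": "s0", "name": "classifier"},
    {"span_id": "s3", "parent_id": "s0", "name": "formatter"},
]
print("\n".join(render(build_tree(spans))))
```

Children are listed in the order they appear in the input, so sorting the spans by `start_time` beforehand would give a chronological view.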

Attribute Keys

Common span attributes include:

Attribute	Example Value	Description
`type`	`classification`	Operation type
`model`	`bert-classifier`	Model used
`batch_size`	`10`	Processing batch size
`accuracy`	`0.95`	Achieved accuracy
`timeout`	`true`	Whether operation timed out
`retry`	`true`	Whether this was a retry attempt

Source: dashboard/src/components/TracesTable.component.tsx:30-50

Resource Configuration Templates

Resources support multiple template engines:

Engine	Syntax	Example
`f-string`	`{variable}`	`"Classify: {ticket}"`
`jinja`	`{{ variable }}` or `{% for %}`	`"{% for r in results %}{{ r }}{% endfor %}"`

Source: dashboard/test-utils/python-server.py:50-90
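
For the `f-string` engine, the `{variable}` syntax matches Python's built-in `str.format` placeholders, so a template such as the one in the table can be filled in as sketched below. The helper name `render_fstring` and the sample ticket text are hypothetical; whether the framework uses `str.format` internally is not stated in the source. Jinja templates need the third-party `jinja2` package and are not shown here.

```python
def render_fstring(template: str, **variables: object) -> str:
    """Fill `{name}` placeholders using Python's built-in str.format."""
    return template.format(**variables)


rendered = render_fstring("Classify: {ticket}", ticket="Printer offline")
print(rendered)
```

A missing variable raises `KeyError`, which surfaces template and input mismatches early instead of silently producing an incomplete prompt.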

Summary

The Agent Lightning system architecture provides:

Distributed Training - Multiple workers executing rollouts in parallel
Centralized Configuration - Versioned resource templates for prompts and models
Real-time Monitoring - Worker heartbeat tracking and status dashboards
Full Observability - Distributed tracing with hierarchical spans
State Persistence - Store-based architecture for maintaining system state

The architecture is designed for horizontal scalability, allowing additional workers to be added to increase training throughput while maintaining centralized configuration management and monitoring through the dashboard frontend.

Source: https://github.com/microsoft/agent-lightning / Human Manual

Core Abstractions and Data Models

Related topics: System Architecture, Trainer Component

The Agent Lightning framework relies on a set of foundational abstractions and data models that enable the coordination between runners, tracers, the LightningStore, and training algorithms. These core types are defined in agentlightning/types/ and serve as the canonical data structures used throughout the system for representing tasks, rollouts, attempts, traces, and resources.

Architecture Overview

Agent Lightning operates through a continuous execution loop where multiple components interact. The core abstractions facilitate:

Trace Emission - Runners and tracers emit spans during execution
State Synchronization - LightningStore maintains synchronized state
Algorithm Consumption - Training algorithms in agentlightning/algorithm/ consume traces to improve agent behavior

graph TD
    A[Runners] -->|emit spans| B[Tracers]
    B --> C[LightningStore]
    C --> D[Algorithms]
    D -->|improve behavior| A
    C --> E[Dashboard]
    F[Resources] -->|configure| A

Source: CLAUDE.md

Task and Rollout Models

Task Representation

The Task and related classes define the fundamental unit of work in Agent Lightning. Tasks represent the objectives that agents attempt to accomplish during training and evaluation.

Class	Purpose
`Task`	Core task definition containing input and configuration
`TaskInput`	Input data passed to a task
`TaskIfAny`	Conditional task input supporting optional parameters
`Dataset`	Collection of tasks for batch processing

Source: agentlightning/types/core.py:1-50

Rollout Lifecycle

A rollout represents one complete execution of a task, which may span several attempts. The rollout model captures the entire lifecycle from enqueue to completion.

stateDiagram-v2
    [*] --> Enqueued: EnqueueRolloutRequest
    Enqueued --> InProgress: Runner picks up
    InProgress --> Attempted: First attempt completes
    Attempted --> InProgress: Retry triggered
    InProgress --> [*]: Final attempt
    Attempted --> [*]: Success/Failure

Class	Description
`Rollout`	Represents a single task execution instance
`RolloutConfig`	Configuration for rollout execution
`RolloutMode`	Execution mode (training, evaluation, etc.)
`RolloutStatus`	Current state of the rollout

Source: agentlightning/types/core.py:50-100
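
The transitions in the state diagram can be written down as a small lookup table. The state names below are lowercase versions of the diagram's labels plus a `finished` terminal state; they are assumptions for this sketch and are not the actual `RolloutStatus` values.

```python
from typing import Dict, Set

# Allowed transitions, transcribed from the state diagram.
ALLOWED: Dict[str, Set[str]] = {
    "enqueued": {"in_progress"},
    "in_progress": {"attempted", "finished"},
    "attempted": {"in_progress", "finished"},
    "finished": set(),
}


def advance(state: str, new_state: str) -> str:
    """Return new_state if the transition is allowed, else raise ValueError."""
    if new_state not in ALLOWED.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state


# Walk one rollout through a retry and then to completion.
state = "enqueued"
for nxt in ["in_progress", "attempted", "in_progress", "finished"]:
    state = advance(state, nxt)
print(state)
```

Keeping the transitions in a table makes illegal moves, such as jumping from enqueued straight to finished, fail loudly rather than corrupt the rollout's recorded history.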

Attempt Model

Attempts represent individual tries within a rollout, enabling retry mechanisms and granular progress tracking.

Property	Type	Description
`attempt_id`	`str`	Unique identifier for the attempt
`rollout_id`	`str`	Parent rollout identifier
`status`	`AttemptStatus`	Current attempt status
`sequence_id`	`int`	Order within the rollout

Source: agentlightning/types/core.py:100-150

AttemptedRollout

The AttemptedRollout class aggregates results from all attempts within a rollout:

class AttemptedRollout(BaseModel):
    rollout: Rollout
    attempts: List[Attempt]
    # Aggregated metrics and results

Source: agentlightning/types/core.py:150-180

Tracing Abstractions

OpenTelemetry Integration

Agent Lightning uses OpenTelemetry for distributed tracing. The tracer types provide serialization and interoperability with the broader observability ecosystem.

Class	Purpose
`Span`	Single unit of work in a trace
`SpanCoreFields`	Core fields shared across span implementations
`OtelResource`	Serializable OpenTelemetry resource representation
`TraceStatus`	Span completion status with error information

Source: agentlightning/types/tracer.py:1-80

Span Structure

Spans form the atomic tracing unit, capturing timing, status, attributes, and relationships:

graph LR
    subgraph Span
        A[name] --> B[status]
        B --> C[attributes]
        C --> D[start_time/end_time]
        D --> E[parent_id/span_id]
        E --> F[resource]
    end

Attribute	Description
`name`	Human-readable span identifier
`status`	`TraceStatus` with status_code and optional description
`attributes`	Key-value metadata dictionary
`parent_id`	Reference to parent span (None for root)
`resource`	`OtelResource` containing service metadata

Source: agentlightning/types/tracer.py:80-120
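
Because each span carries a `parent_id` (None for a root), a flat list of spans can be turned back into a hierarchy. The sketch below assumes only that relationship; the tuples, span names, and helper functions are invented for illustration and do not use the library's `Span` class.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Each tuple is (span_id, parent_id, name); the values are made up for illustration.
spans: List[Tuple[str, Optional[str], str]] = [
    ("s1", None, "root"),
    ("s2", "s1", "preprocess"),
    ("s3", "s1", "classify"),
    ("s4", "s3", "llm_call"),
]


def build_children(rows):
    """Map each parent_id to the span_ids of its direct children."""
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for span_id, parent_id, _name in rows:
        children[parent_id].append(span_id)
    return children


def depth_first(children, names, root=None, level=0):
    """Yield (depth, name) pairs in depth-first order, starting below root."""
    for span_id in children.get(root, []):
        yield level, names[span_id]
        yield from depth_first(children, names, span_id, level + 1)


names = {span_id: name for span_id, _parent, name in spans}
tree = list(depth_first(build_children(spans), names))
```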

OtelResource Model

The OtelResource class provides a serializable representation of OpenTelemetry resources:

class OtelResource(BaseModel):
    attributes: Attributes
    schema_url: str

This model avoids confusion with the application's Resource class and enables span serialization for store persistence.

Source: agentlightning/types/tracer.py:120-150

Span Creation Patterns

SpanCoreFields for Lightweight Creation

For span creators that don't require the full span model, SpanCoreFields provides a minimal interface:

class SpanCoreFields(BaseModel):
    name: str
    status: TraceStatus
    attributes: Attributes
    start_time: Optional[float]
    end_time: Optional[float]

Source: agentlightning/types/tracer.py:150-180

Weave Tracer Span Creation

The Weave tracer implementation demonstrates proper span construction with resource attributes:

resource=OtelResource(
    attributes={
        LightningResourceAttributes.ROLLOUT_ID.value: rollout_id,
        LightningResourceAttributes.ATTEMPT_ID.value: attempt_id,
        LightningResourceAttributes.SPAN_SEQUENCE_ID.value: sequence_id,
        LightningResourceAttributes.TRACER_NAME.value: "weave",
    },
    schema_url="",
)

Source: agentlightning/tracer/weave.py:1-50

Resource Management

ResourcesUpdate Model

Resources define configurable components that can be versioned and updated:

class ResourcesUpdate(BaseModel):
    resources_id: str
    version: int
    create_time: float
    update_time: float
    resources: Dict[str, Any]

Field	Type	Description
`resources_id`	`str`	Unique identifier for the resource set
`version`	`int`	Version number for optimistic concurrency
`create_time`	`float`	Unix timestamp of creation
`update_time`	`float`	Unix timestamp of last update
`resources`	`Dict[str, Any]`	Arbitrary resource configuration

Source: dashboard/test-utils/python-server.py:1-80
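
The table describes `version` as supporting optimistic concurrency. How the store enforces this is not shown in the source, so the following is only a sketch of the general technique under that assumption: `ResourcesRecord`, `VersionConflict`, and `apply_update` are hypothetical names, and the compare-then-increment rule is the standard pattern rather than confirmed library behavior.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ResourcesRecord:
    """Plain-dataclass stand-in mirroring the ResourcesUpdate fields."""

    resources_id: str
    version: int
    create_time: float
    update_time: float
    resources: Dict[str, Any] = field(default_factory=dict)


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


def apply_update(
    current: ResourcesRecord,
    expected_version: int,
    new_resources: Dict[str, Any],
) -> ResourcesRecord:
    """Accept the write only if the caller saw the latest version."""
    if current.version != expected_version:
        raise VersionConflict(
            f"expected version {expected_version}, store has {current.version}"
        )
    return ResourcesRecord(
        resources_id=current.resources_id,
        version=current.version + 1,
        create_time=current.create_time,  # creation time never changes
        update_time=time.time(),
        resources=new_resources,
    )


initial = ResourcesRecord("rs-1", 1, 100.0, 100.0, {"main_prompt": "v1"})
updated = apply_update(initial, expected_version=1, new_resources={"main_prompt": "v2"})
```

A writer that read version 1 and tries to write after someone else has already produced version 2 gets a `VersionConflict` and must re-read before retrying.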

Resource Types

Resources support flexible configuration through templates and model definitions:

Resource Type	Description
`PromptTemplate`	Templated prompts with jinja2 or f-string engines
`LLM`	Language model configuration with endpoint and sampling parameters
Custom `Dict[str, Any]`	Arbitrary configuration dictionaries

Source: dashboard/test-utils/python-server.py:80-150

Worker Abstraction

Workers represent execution agents that process rollouts:

classDiagram
    class Worker {
        +worker_id: str
        +status: WorkerStatus
        +heartbeat_stats: Dict
        +last_heartbeat_time: float
        +current_rollout_id: Optional[str]
        +current_attempt_id: Optional[str]
    }

Property	Type	Description
`worker_id`	`str`	Unique worker identifier
`status`	`WorkerStatus`	Current status (busy, idle, etc.)
`heartbeat_stats`	`Dict`	Runtime metrics (queue_depth, gpu_utilization)
`last_heartbeat_time`	`float`	Last check-in timestamp
`current_rollout_id`	`Optional[str]`	Currently executing rollout
`current_attempt_id`	`Optional[str]`	Currently executing attempt

Source: agentlightning/types/core.py:180-220

Worker Status States

stateDiagram-v2
    [*] --> Idle: Startup
    Idle --> Busy: Dequeue rollout
    Busy --> Idle: Complete
    Busy --> Busy: Heartbeat
    Idle --> [*]: Shutdown
    Busy --> [*]: Shutdown

Source: dashboard/test-utils/python-server.py:150-200
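
One common use of `last_heartbeat_time` is spotting workers that stopped checking in while busy. The source does not describe such a check, so treat this as a hypothetical sketch: `WorkerSketch`, `stale_workers`, the lowercase status strings, and the 30-second timeout are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkerSketch:
    worker_id: str
    status: str  # "idle" or "busy"; stand-in for WorkerStatus
    last_heartbeat_time: float
    current_rollout_id: Optional[str] = None


def stale_workers(
    workers: List[WorkerSketch], now: float, timeout_s: float = 30.0
) -> List[str]:
    """Return ids of busy workers whose last heartbeat is older than timeout_s."""
    return [
        w.worker_id
        for w in workers
        if w.status == "busy" and now - w.last_heartbeat_time > timeout_s
    ]


fleet = [
    WorkerSketch("w-1", "busy", last_heartbeat_time=95.0, current_rollout_id="ro-001"),
    WorkerSketch("w-2", "busy", last_heartbeat_time=40.0, current_rollout_id="ro-002"),
    WorkerSketch("w-3", "idle", last_heartbeat_time=10.0),
]
stale = stale_workers(fleet, now=100.0)
```

Idle workers are ignored here on the assumption that only a busy worker can strand a rollout.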

Filtering and Pagination

Query Models

The store supports filtered and paginated queries for efficient data access:

Class	Purpose
`FilterOptions`	Criteria for filtering results
`FilterField`	Individual filter condition
`SortOptions`	Sorting configuration
`PaginatedResult`	Paginated response wrapper

Source: agentlightning/types/core.py:220-260

Operation Context

The @operation decorator provides a simplified span creation interface for user code:

@operation(name="my_operation")
async def my_function():
    # Automatically creates and manages a span
    pass

OperationContext Parameters

Parameter	Type	Description
`propagate`	`bool`	Whether spans should use active span processor
`name`	`Optional[str]`	Alias populating `OPERATION_NAME` attribute

The decorator supports two usage patterns:

As a bare decorator: @operation
As a context manager factory: with operation(name="custom"):

Source: agentlightning/emitter/annotation.py:1-60
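
The dual decorator/context-manager behavior can be illustrated with the standard library's `contextlib.ContextDecorator`. This is a sketch of the pattern, not the code in `annotation.py`: `operation_sketch`, `_Op`, and the `RECORDED` list (a stand-in for a span exporter) are hypothetical, and only synchronous functions are handled.

```python
import time
from contextlib import ContextDecorator

RECORDED = []  # stand-in for a span exporter: (name, status, duration_s)


class _Op(ContextDecorator):
    """Records one entry per with-block or decorated call."""

    def __init__(self, name: str):
        self.name = name

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        status = "ERROR" if exc_type else "OK"
        RECORDED.append((self.name, status, time.perf_counter() - self._start))
        return False  # never swallow exceptions


def operation_sketch(fn=None, *, name=None):
    """Usable as @operation_sketch, @operation_sketch(name=...), or in a with-statement."""
    if callable(fn):
        # Bare decorator: the function itself was passed in.
        return _Op(name or fn.__name__)(fn)
    return _Op(name or "operation")


@operation_sketch
def add(a: int, b: int) -> int:
    return a + b


total = add(1, 2)

with operation_sketch(name="custom"):
    pass
```

The single entry point dispatches on whether it received a callable, which is what lets the bare `@operation_sketch` form and the `with operation_sketch(name=...)` form share one name.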

Data Flow Summary

graph TD
    subgraph Input
        A[Dataset] --> B[Task]
        B --> C[EnqueueRolloutRequest]
    end
    
    subgraph Execution
        C --> D[Runner]
        D --> E[Worker]
        E --> F[Attempt]
        F --> G[Span]
    end
    
    subgraph Storage
        G --> H[LightningStore]
        H --> I[PaginatedResult]
    end
    
    subgraph Training
        H --> J[Algorithm]
        J --> K[Improved Policy]
    end

Key Type Exports

The agentlightning/types/core.py module exports the following public API:

__all__ = [
    "Triplet",
    "RolloutLegacy",
    "Task",
    "TaskInput",
    "TaskIfAny",
    "RolloutRawResultLegacy",
    "RolloutRawResult",
    "RolloutMode",
    "GenericResponse",
    "ParallelWorkerBase",
    "Dataset",
    "AttemptStatus",
    "RolloutStatus",
    "RolloutConfig",
    "Rollout",
    "Attempt",
    "AttemptedRollout",
    "EnqueueRolloutRequest",
    "Hook",
    "Worker",
    "WorkerStatus",
    "PaginatedResult",
    "FilterOptions",
    "SortOptions",
    "FilterField",
]

Source: agentlightning/types/core.py:40-60

Usage Patterns

Creating a Rollout Request

request = EnqueueRolloutRequest(
    task_id="task-001",
    config=RolloutConfig(mode=RolloutMode.TRAINING),
    priority=1
)

Querying with Filters

filters = FilterOptions(
    fields=[FilterField(name="status", operator="eq", value="completed")],
    sort=SortOptions(field="create_time", direction="desc"),
    offset=0,
    limit=50
)

Source: agentlightning/types/core.py:260-300
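
The semantics of a filtered, sorted, paginated query can be sketched over an in-memory list. The `query` function, its parameters, and the shape of the returned dictionary are hypothetical; they mirror the filter, sort, offset, and limit concepts above but are not the store's API.

```python
from typing import Any, Dict, List

# Invented records for illustration.
rollouts: List[Dict[str, Any]] = [
    {"rollout_id": "ro-1", "status": "completed", "create_time": 10.0},
    {"rollout_id": "ro-2", "status": "failed", "create_time": 20.0},
    {"rollout_id": "ro-3", "status": "completed", "create_time": 30.0},
    {"rollout_id": "ro-4", "status": "completed", "create_time": 40.0},
]


def query(
    rows: List[Dict[str, Any]],
    *,
    field: str,
    value: Any,
    sort_by: str,
    descending: bool = False,
    offset: int = 0,
    limit: int = 50,
) -> Dict[str, Any]:
    """Apply an equality filter, then sort, then slice one page."""
    matched = [row for row in rows if row.get(field) == value]
    matched.sort(key=lambda row: row[sort_by], reverse=descending)
    return {
        "items": matched[offset : offset + limit],
        "total": len(matched),  # count before pagination
        "offset": offset,
        "limit": limit,
    }


page = query(
    rollouts,
    field="status",
    value="completed",
    sort_by="create_time",
    descending=True,
    offset=0,
    limit=2,
)
```

Reporting `total` separately from `items` lets a client compute how many pages remain without fetching them.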

Source: https://github.com/microsoft/agent-lightning / Human Manual

Tutorial: Train Your First Agent

Related topics: Tutorial: Writing Agents, Algorithm Zoo

Overview

This tutorial guides you through training your first AI agent with Agent Lightning. You will learn how to set up a training pipeline, define prompts and resources, create a dataset, and run the APO (Automatic Prompt Optimization) algorithm to improve your agent's behavior through feedback-driven learning.

Agent Lightning provides a complete training loop where runners and tracers emit spans, LightningStore keeps them synchronized, and algorithms consume those traces to improve behavior. Source: CLAUDE.md

Prerequisites

Before starting this tutorial, ensure you have:

Python 3.10+ installed
Agent Lightning installed following the installation guide
An OpenAI-compatible API service available
APO extra dependencies installed

Architecture Overview

Agent Lightning trains agents through a continuous feedback loop:

graph TD
    A[Runner - Executes Agent] --> B[Tracer - Emits Spans]
    B --> C[LightningStore - Synchronizes Data]
    C --> D[Algorithm - Consumes Traces]
    D --> E[Improved Agent Behavior]
    E --> A
    
    F[Dataset - Training Data] --> D
    G[Resources - Prompts/Models] --> A

Source: CLAUDE.md

Step 1: Create Your Agent

Begin by defining a simple room booking agent that uses function calling. The agent receives a user request and selects an appropriate room from available options.

# examples/apo/room_selector.py

from agentlightning import Runner, DataProto
from typing import Any
import json

class RoomSelector(Runner):
    """Room booking agent using function calling."""

    def run(self, task: str, context: dict | None = None) -> DataProto:
        # Define available rooms
        rooms = [
            {"id": "R001", "name": "Conference A", "capacity": 10},
            {"id": "R002", "name": "Meeting Room B", "capacity": 4},
            {"id": "R003", "name": "Board Room", "capacity": 20},
        ]
        
        # Mock LLM response selecting a room
        selected_room = rooms[1]  # Default to Meeting Room B
        
        return DataProto(
            data={
                "selected_room": selected_room["name"],
                "room_id": selected_room["id"],
            },
            raw_response=json.dumps(selected_room),
        )

Source: examples/apo/room_selector.py

Supported Agent Components

Component	Description	Usage
`Runner`	Base class for agent execution	Extend to define custom agent logic
`Trainer`	Training orchestration	Manages training loop and workers
`LightningStore`	Data synchronization	Stores traces and spans
`OtelTracer`	OpenTelemetry span emission	Records execution traces

Source: examples/apo/apo_debug.py

Step 2: Prepare Your Dataset

Create a training dataset with room booking scenarios. Each task should include the user request and expected room selection.

# examples/apo/room_selector_apo.py

def create_room_dataset() -> list[dict[str, str]]:
    """Create an in-memory dataset of room booking tasks."""
    
    # Example tasks for room booking
    tasks = [
        {
            "task": "I need to schedule a meeting for 3 people tomorrow at 2 PM",
            "expected_room": "Meeting Room B",
        },
        {
            "task": "We are hosting a team event for 15 team members",
            "expected_room": "Board Room",
        },
        {
            "task": "Quick 1-on-1 sync needed this afternoon",
            "expected_room": "Meeting Room B",
        },
    ]
    
    return tasks

Source: examples/apo/room_selector_apo.py

Step 3: Define Training Resources

Resources define the prompts and model configurations used by your agent during training. You can tune any resource—typically prompt templates—using reinforcement learning.

from agentlightning.prompts import PromptTemplate
from agentlightning.models import LLM

# Define a tunable prompt template
main_prompt = PromptTemplate(
    template="""You are a helpful assistant that helps users book meeting rooms.
    
    Available rooms:
    - Conference A: capacity 10
    - Meeting Room B: capacity 4
    - Board Room: capacity 20
    
    User request: {user_request}
    
    Select the most appropriate room and explain your choice.""",
    engine="f-string",
)

Source: examples/apo/apo_debug.py
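
To show what variable substitution with the `f-string` engine amounts to, here is a minimal stand-in built on `str.format`. `PromptTemplateSketch` and its `render` method are hypothetical and are not the library's `PromptTemplate`; the sketch only assumes that `{user_request}` placeholders are filled by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptTemplateSketch:
    """Hypothetical stand-in for a prompt resource; not the library class."""

    template: str
    engine: str = "f-string"

    def render(self, **variables: str) -> str:
        # Only the f-string style is sketched here, via str.format.
        if self.engine != "f-string":
            raise NotImplementedError(f"engine {self.engine!r} is not sketched")
        return self.template.format(**variables)


prompt = PromptTemplateSketch(
    template="User request: {user_request}\nSelect the most appropriate room.",
)
rendered = prompt.render(user_request="Meeting for 3 people at 2 PM")
```

A missing variable raises `KeyError` from `str.format`, which surfaces template/dataset mismatches early.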

Resource Types

Type	Description	Tunable
`PromptTemplate`	Text templates with variable substitution	Yes
`LLM`	Model configuration (endpoint, sampling params)	No
`SystemPrompt`	System-level instructions	Yes
`SamplingParameters`	Temperature, top_p, max_tokens	No

Source: examples/apo/README.md

Step 4: Configure the Trainer

The Trainer class orchestrates the training loop. It manages workers, coordinates with the LightningStore, and applies the optimization algorithm.

from agentlightning import Trainer

# Initialize trainer with one worker
trainer = Trainer(
    n_workers=1,
    # Resources to tune - only these will be optimized
    initial_resources={
        "main_prompt": main_prompt,
    },
)

# Configure the APO algorithm
trainer.configure(
    algorithm="APO",
    lr=1e-3,
    epochs=10,
)

Source: examples/apo/apo_debug.py

Trainer Configuration Parameters

Parameter	Type	Default	Description
`n_workers`	int	1	Number of parallel training workers
`initial_resources`	dict	Required	Resources to optimize
`algorithm`	str	Required	Optimization algorithm name
`lr`	float	1e-3	Learning rate
`epochs`	int	10	Number of training epochs

Source: examples/apo/apo_debug.py

Step 5: Implement Reward Function

The reward function evaluates agent outputs and provides feedback signals for reinforcement learning.

from typing import Any

def room_booking_reward(output: Any, expected: dict) -> float:
    """
    Calculate reward based on room selection accuracy.
    
    Args:
        output: Agent's room selection
        expected: Expected room from dataset
    
    Returns:
        float: Reward score between 0.0 and 1.0
    """
    if not output or not output.data:
        return 0.0
    
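    selected_room = output.data.get("selected_room", "")
    expected_room = expected.get("expected_room", "")

    # Without an expected room there is nothing to score against;
    # this also prevents "" from matching every selection below
    if not expected_room:
        return 0.0

    # Exact match gets full reward
    if selected_room == expected_room:
        return 1.0

    # Case-insensitive substring match gets partial reward
    if expected_room.lower() in selected_room.lower():
        return 0.5

    return 0.0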

Source: examples/apo/room_selector_apo.py
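
As a quick sanity check of the reward rule, the sketch below scores the Step 1 mock agent (which always picks "Meeting Room B") against the three Step 2 tasks. `score` is a compact restatement written for this example, including a guard for a missing expected room that is an addition of this sketch; `SimpleNamespace` stands in for the agent output object.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def score(output: Any, expected: Dict[str, str]) -> float:
    """Exact match scores 1.0, case-insensitive substring 0.5, otherwise 0.0."""
    selected = (output.data or {}).get("selected_room", "")
    target = expected.get("expected_room", "")
    if not target:
        return 0.0
    if selected == target:
        return 1.0
    return 0.5 if target.lower() in selected.lower() else 0.0


tasks: List[Dict[str, str]] = [
    {"task": "Meeting for 3 people", "expected_room": "Meeting Room B"},
    {"task": "Team event for 15", "expected_room": "Board Room"},
    {"task": "Quick 1-on-1 sync", "expected_room": "Meeting Room B"},
]

# The mock agent always answers "Meeting Room B"; only the .data field is mimicked.
mock_output = SimpleNamespace(data={"selected_room": "Meeting Room B"})

rewards = [score(mock_output, t) for t in tasks]
mean_reward = sum(rewards) / len(rewards)
```

The fixed answer is right for two of the three tasks, so the baseline mean reward is two thirds; any tuned prompt should be judged against that baseline.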

Step 6: Run the Training Loop

Execute the training with your runner, dataset, and reward function.

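import asyncio
from agentlightning import Trainer, setup_logging

# RoomSelector, create_room_dataset, main_prompt, and room_booking_reward
# come from Steps 1, 2, 3, and 5 above.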

async def train_room_selector():
    setup_logging()
    
    # Initialize agent and trainer
    agent = RoomSelector()
    dataset = create_room_dataset()
    
    trainer = Trainer(
        n_workers=1,
        initial_resources={"main_prompt": main_prompt},
    )
    
    # Run training
    results = await trainer.train(
        runner=agent,
        dataset=dataset,
        reward_fn=room_booking_reward,
        max_iterations=100,
    )
    
    print(f"Training completed: {results}")

if __name__ == "__main__":
    asyncio.run(train_room_selector())

Source: examples/apo/apo_debug.py

Understanding the Training Flow

sequenceDiagram
    participant User as User Code
    participant Trainer as Trainer
    participant Runner as RoomSelector
    participant Store as LightningStore
    participant Algo as APO Algorithm
    
    User->>Trainer: train(runner, dataset, reward_fn)
    Trainer->>Runner: execute_task(task)
    Runner->>Runner: select_room()
    Runner-->>Trainer: output
    Trainer->>Store: record_span(rollout_id, attempt_id)
    Trainer->>Trainer: calculate_reward(output, expected)
    Trainer->>Algo: optimize_step(rewards, traces)
    Algo-->>Trainer: updated_resources
    Trainer->>Runner: update_resources()
    Note over Trainer,Algo: Repeat for max_iterations

Debugging Your Training

Agent Lightning provides multiple debugging approaches:

Approach 1: Runner Mode

Direct execution without training to verify agent logic:

python apo_debug.py --mode runner

Source: examples/apo/apo_debug.py

Approach 2: Hook Mode

Debug with tracing hooks enabled:

python apo_debug.py --mode hook

Approach 3: Trainer Mode

Full training debug with detailed logging:

python apo_debug.py --mode trainer

Viewing Training Traces

During and after training, spans are recorded to the LightningStore. View them in the dashboard:

graph LR
    A[Training Run] --> B[Spans Emitted]
    B --> C[LightningStore]
    C --> D[Dashboard]
    D --> E[Trace Visualization]
    D --> F[Span Details]

The dashboard displays:

View	Description
Rollouts	Complete training iterations
Spans	Individual function calls and operations
Resources	Tunable prompt templates
Metrics	Reward scores and training statistics

Source: examples/minimal/README.md

Common Issues and Solutions

Issue: Tracer Conflicts

Running multiple modes consecutively in one process may cause tracer conflicts.

Solution: Run each mode in a separate process or ensure proper tracer cleanup between runs.

Source: examples/apo/apo_debug.py
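
Running each mode in its own process can be scripted with `subprocess`. This sketch assumes the `--mode` flag and the `apo_debug.py` script name shown above; `build_command` and `run_all` are hypothetical helpers, and `run_all` is defined but not invoked here.

```python
import subprocess
import sys
from typing import List

MODES = ("runner", "hook", "trainer")


def build_command(mode: str, script: str = "apo_debug.py") -> List[str]:
    """Build the argv for one debug mode, using the current interpreter."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    return [sys.executable, script, "--mode", mode]


def run_all(script: str = "apo_debug.py") -> None:
    """Run every mode sequentially, each in a fresh process."""
    for mode in MODES:
        # check=True stops at the first failing mode.
        subprocess.run(build_command(mode, script), check=True)


commands = [build_command(mode) for mode in MODES]
```

Each child process starts with a clean tracer state, so no cleanup between modes is needed.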

Issue: Missing Dependencies

APO requires additional dependencies not in the core installation.

Solution: Install with extras:

pip install agentlightning[apo]

Source: examples/apo/README.md

Next Steps

After completing this tutorial:

Advanced Algorithms: Explore custom algorithms in apo_custom_algorithm.py
Integration: Learn Agent-OS integration for policy-aware training
Dashboard: Use the dashboard to visualize training progress
Production: Scale training with multiple workers and distributed execution

Summary

This tutorial covered the essential steps to train your first agent with Agent Lightning:

Define a Runner implementing your agent logic
Prepare a dataset with tasks and expected outputs
Configure PromptTemplate resources for tuning
Implement a reward function for RL feedback
Use Trainer to orchestrate the training loop
Debug with multiple modes and visualize traces in the dashboard

The training loop continuously improves your agent by optimizing prompt resources based on reward signals, enabling agents to learn from feedback without manual prompt engineering.

Source: https://github.com/microsoft/agent-lightning / Human Manual

Tutorial: Writing Agents

Related topics: Tutorial: Train Your First Agent, Runner Component

This tutorial provides a comprehensive guide to building AI agents using the Agent Lightning framework. It covers the core concepts, architecture, and practical implementation patterns for creating agents that can be trained with reinforcement learning.

Overview

Agent Lightning is a framework designed to train AI agents using reinforcement learning. The framework provides a complete execution stack including tracing, storage, and algorithm components that work together in a continuous loop. Source: CLAUDE.md:1-5

Agents in this framework are built using the LightningStore architecture, which synchronizes data between runners, tracers, and algorithms. The tracers emit spans that capture the agent's execution behavior, and these spans are consumed by algorithms to improve the agent's performance over time. Source: AGENTS.md:1-5

Architecture Overview

The Agent Lightning framework follows a continuous loop architecture where multiple components interact to enable training of AI agents.

graph TD
    A[Agent / Runner] -->|Emits Spans| B[Tracer]
    B -->|Traces| C[LightningStore]
    C -->|Synchronized Data| D[Algorithms]
    D -->|Training Signals| A
    E[Dashboard] -->|Inspect & Debug| C

Core Components

Component	Purpose	Location
`LightningStore`	Central data store for traces and rollouts	`agentlightning/store/`
`OtelTracer`	OpenTelemetry-based span emission	Via `OtelTracer` class
`AgentOpsTracer`	AgentOps integration for tracing	Via `AgentOpsTracer` class
`Span`	Individual trace unit	Data model
`emit_reward`	Reward signal emission	API function

Source: examples/minimal/write_traces.py:1-40

Writing Your First Agent

Basic Agent Structure

An agent in Agent Lightning is built around the tracing and store infrastructure. The minimal component showcase in examples/minimal/ demonstrates how individual building blocks behave in isolation. Source: examples/minimal/README.md:1-10

Setting Up the Tracer

The framework supports two primary tracing mechanisms:

OtelTracer: OpenTelemetry-based tracing that can forward spans to a remote store client
AgentOpsTracer: AgentOps integration for agent operations tracking

from agentlightning import OtelTracer, LightningStoreClient, setup_logging

# Initialize logging
setup_logging()

# Create tracer with optional remote store client
tracer = OtelTracer(
    rollout_id="ro-001",
    attempt_id="at-001",
    store_client=None  # Or LightningStoreClient(endpoint="...")
)

Source: examples/minimal/write_traces.py:40-60

Opening Rollouts and Emitting Spans

A rollout represents one complete execution of a task by an agent. Each rollout contains one or more attempts, which make retries possible and allow progress to be tracked per try.

# Open a new rollout
tracer.open_rollout(rollout_id="ro-001", user_id="user-123")

# Open an attempt within the rollout
tracer.open_attempt(attempt_id="at-001", sequence_id=1)

# Emit spans during agent execution
tracer.emit_span(
    name="tool_execution",
    attributes={
        "tool": "web_search",
        "query": "onboarding summary"
    }
)

# Close attempt and rollout
tracer.close_attempt()
tracer.close_rollout()

Source: examples/minimal/write_traces.py:60-85

Span Data Model

Spans are the fundamental unit of tracing in Agent Lightning. Each span captures a discrete unit of work within an agent's execution.

Span Attributes

Attribute	Type	Description
`rollout_id`	string	Unique identifier for the rollout
`attempt_id`	string	Unique identifier for the attempt
`sequence_id`	integer	Order of the span within the attempt
`trace_id`	string	Trace grouping identifier
`span_id`	string	Unique span identifier
`parent_id`	string	Parent span ID for hierarchy
`name`	string	Human-readable span name
`status`	TraceStatus	Execution status (OK, ERROR)
`attributes`	dict	Key-value metadata
`start_time`	float	Span start time as a Unix timestamp (optional)
`end_time`	float	Span end time as a Unix timestamp (optional)

Source: dashboard/test-utils/python-server.py:1-100

Example Span Creation

import time

from agentlightning import Span, TraceStatus
from agentlightning.types.tracer import OtelResource  # defined in types/tracer.py

# SpanCoreFields declares start_time and end_time as Optional[float]
now = time.time()

span = Span(
    rollout_id="ro-story-001",
    attempt_id="at-story-010",
    sequence_id=3,
    trace_id="trace-001-main",
    span_id="span-003-tool",
    parent_id="span-001-root",
    name="tool_execution",
    status=TraceStatus(status_code="OK", description=None),
    attributes={"tool": "web_search", "query": "onboarding summary"},
    events=[],
    links=[],
    start_time=now,
    end_time=now,
    context=None,
    parent=None,
    resource=OtelResource(attributes={"service.name": "tool-service"}, schema_url="")
)

Source: dashboard/test-utils/python-server.py:100-130

Using Operations

The framework provides an operation decorator for recording synthetic operation spans with additional linking capabilities.

from agentlightning import emit_reward
from agentlightning.operation import operation
from agentlightning.utils.otel import make_link_attributes, make_tag_attributes

# `llm` below is a placeholder for your own model client.

# Record an operation span
@operation(name="classify_ticket")
def classify_ticket(ticket: str):
    with make_link_attributes(linked_rollout_id="ro-001", linked_attempt_id="at-001"):
        # Operation execution
        result = llm.classify(ticket)
    
    # Tag the reward
    make_tag_attributes(tags={"accuracy": 0.95})
    emit_reward(reward=0.95, name="classification_accuracy")
    
    return result

Source: examples/minimal/write_traces.py:20-35

LightningStore Integration

The LightningStore keeps tracers and runners synchronized, serving as the central data repository.

from agentlightning import LightningStoreClient
from agentlightning.store import InMemoryLightningStore

# Use in-memory store for local development
store = InMemoryLightningStore()

# Or connect to a remote store server
store = LightningStoreClient(endpoint="http://localhost:45993")

Source: examples/minimal/write_traces.py:25-35

Store Server CLI

Start a LightningStore server with OTLP enabled:

agl store --port 45993 --log-level DEBUG

Source: examples/minimal/write_traces.py:15-20

Workflow Execution Model

Agents in Agent Lightning follow a structured execution model with rollouts, attempts, and spans.

graph LR
    subgraph Rollout[Rollout: ro-001]
        subgraph Attempt1[Attempt: at-001]
            S1[Span: root]
            S2[Span: preprocess]
            S3[Span: classify]
            S1 --> S2
            S2 --> S3
        end
        subgraph Attempt2[Attempt: at-002]
            S4[Span: root]
            S5[Span: preprocess]
            S6[Span: classify]
            S4 --> S5
            S5 --> S6
        end
    end

State Transitions

State	Description
`pending`	Rollout/attempt created but not started
`running`	Currently executing
`completed`	Successfully finished
`failed`	Execution failed
`cancelled`	Manually cancelled

Source: dashboard/src/components/RolloutTable.component.tsx:1-50
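
The rollout, attempt, and span nesting described in this section can be recovered from a flat span list by grouping on `(rollout_id, attempt_id)` and ordering by `sequence_id`. The rows and the `group_by_attempt` helper below are invented for illustration and do not use the library's types.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (rollout_id, attempt_id, sequence_id, name); values invented for illustration
# and deliberately out of order.
rows = [
    ("ro-001", "at-002", 2, "preprocess"),
    ("ro-001", "at-001", 1, "root"),
    ("ro-001", "at-002", 1, "root"),
    ("ro-001", "at-001", 3, "classify"),
    ("ro-001", "at-001", 2, "preprocess"),
    ("ro-001", "at-002", 3, "classify"),
]


def group_by_attempt(spans) -> Dict[Tuple[str, str], List[str]]:
    """Group span names per (rollout, attempt), ordered by sequence_id."""
    grouped = defaultdict(list)
    for rollout_id, attempt_id, sequence_id, name in spans:
        grouped[(rollout_id, attempt_id)].append((sequence_id, name))
    return {
        key: [name for _seq, name in sorted(items)]
        for key, items in grouped.items()
    }


timeline = group_by_attempt(rows)
```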

Reward Emission

Agents emit reward signals that algorithms consume during training.

from agentlightning import emit_reward

# Emit a reward with metadata
emit_reward(
    reward=0.85,
    name="task_success",
    attributes={
        "task_id": "classification",
        "accuracy": 0.85,
        "latency_ms": 150
    }
)

Reward Span Attributes

Attribute	Type	Description
`reward.value`	float	Numeric reward value
`reward.name`	string	Reward signal identifier
`reward.attributes`	dict	Additional metadata

Dashboard Integration

The Agent Lightning Dashboard provides real-time inspection of store data and debugging capabilities for running experiments. Source: dashboard/README.md:1-10

Drawer Components

The dashboard uses drawer components to display detailed information:

// Worker detail drawer
if (content.type === 'worker-detail') {
    const { worker } = content;
    const title = <WorkerDrawerTitle worker={worker} />;
    const body = <JsonEditor value={worker} />;
    return { title, body };
}

// Trace detail drawer
if (content.type === 'trace-detail') {
    const { span } = content;
    const title = <TraceDrawerTitle span={span} />;
    const body = <JsonEditor value={span} />;
    return { title, body };
}

Source: dashboard/src/components/AppDrawer.component.tsx:1-50

Minimal Examples Reference

The examples/minimal/ directory provides documented examples for each building block:

Component	File	Purpose
LightningStore + OTLP	`write_traces.py`	Shows `OtelTracer` and `AgentOpsTracer` for rollouts and spans
MultiMetrics	`write_metrics.py`	Console and Prometheus metrics backends
LLM Proxying	`llm_proxy.py`	Request routing through `/rollout/<id>/attempt/<id>` namespaces
vLLM Lifecycle	`vllm_server.py`	Context manager for vLLM server lifecycle

Source: examples/minimal/README.md:10-30

Best Practices

Use descriptive span names: Names like tool_execution and classification_pipeline make debugging easier in the dashboard.
Set appropriate parent IDs: Maintain span hierarchy for better trace visualization.
Emit rewards consistently: Use emit_reward after each task completion to enable algorithm training.
Handle failures explicitly: Set appropriate TraceStatus codes and descriptions for failed spans.
Use operations for complex workflows: The @operation decorator simplifies recording complex multi-step processes.

Next Steps

Explore the API Reference for detailed method signatures
Learn about Training Algorithms that consume traces
Set up the Dashboard for real-time monitoring
Review Examples for complete agent implementations

Source: https://github.com/microsoft/agent-lightning / Human Manual

Trainer Component

Related topics: Runner Component, LightningStore, Algorithm Zoo

The Trainer is the core orchestration component in Agent Lightning responsible for managing the reinforcement learning training loop. It coordinates runners, algorithms, and the LightningStore to execute agent training with scalable execution strategies.

Overview

The Trainer serves as the central control plane that:

Manages worker processes for parallel rollout execution
Coordinates between the agent runner and learning algorithm
Persists training traces to the LightningStore
Provides pluggable execution strategies for different deployment scenarios

Source: agentlightning/trainer/registry.py:1-6

Architecture

Component Interactions

graph TD
    T[Trainer] --> R[Runner<br/>Agent Execution]
    T --> A[Algorithm<br/>Policy Update]
    T --> S[LightningStore<br/>Trace Storage]
    T --> E[ExecutionStrategy]
    
    E --> SHM[SharedMemory<br/>Local Workers]
    E --> CS[ClientServer<br/>Remote Workers]
    
    R --> S
    A --> S

Training Loop Flow

sequenceDiagram
    participant T as Trainer
    participant R as Runner
    participant S as LightningStore
    participant A as Algorithm
    
    T->>R: Initialize with config
    T->>A: Load algorithm
    T->>S: Connect store
    
    loop Training Steps
        T->>R: Execute rollouts
        R->>S: Emit spans
        T->>S: Retrieve traces
        T->>A: Process traces
        A->>T: Policy update
    end

Core Configuration

Constructor Parameters

Parameter	Type	Default	Description
`n_workers`	`int`	`1`	Number of parallel worker processes
`algorithm`	`Algorithm | str`	`None`	Learning algorithm (name or instance)
`runner`	`Runner | None`	`None`	Agent runner for execution
`reward_fn`	`RewardFn | None`	`None`	Reward function for training
`execution_strategy`	`str`	`"shm"`	Strategy: `"shm"`, `"cs"`

Source: examples/apo/apo_custom_algorithm_trainer.py:35-37

Execution Strategy Registry

The Trainer supports multiple execution strategies through a registry pattern:

ExecutionStrategyRegistry = {
    "shm": "agentlightning.execution.shared_memory.SharedMemoryExecutionStrategy",
    "cs": "agentlightning.execution.client_server.ClientServerExecutionStrategy",
}

Source: agentlightning/trainer/registry.py:1-6

Strategy	Description	Use Case
`shm`	Shared Memory - Local multi-process execution	Single-node GPU training
`cs`	Client-Server - Remote worker communication	Distributed deployments
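
A registry that maps short names to dotted import paths is typically resolved lazily with `importlib`. The sketch below shows that general technique under the assumption that the Trainer does something similar; `resolve` and `demo_registry` are hypothetical, and a standard-library class is used as the target so the example runs without Agent Lightning installed.

```python
import importlib
from typing import Any, Dict


def resolve(registry: Dict[str, str], key: str) -> Any:
    """Import and return the object named by the registry entry for key."""
    try:
        dotted = registry[key]
    except KeyError:
        raise ValueError(
            f"unknown strategy {key!r}; known: {sorted(registry)}"
        ) from None
    # Split "package.module.ClassName" into module path and attribute name.
    module_name, _, attr = dotted.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# A stdlib class stands in for a strategy class in this demo.
demo_registry = {"ordered": "collections.OrderedDict"}
cls = resolve(demo_registry, "ordered")
```

Storing strings rather than classes means a strategy's module is imported only when that strategy is actually selected.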

Usage Patterns

Basic Training with GRPO Algorithm

from agentlightning import Trainer

trainer = Trainer(
    runner=runner,
    reward_fn=reward_fn,
    algorithm="GRPO"
)

trainer.train()

Source: contrib/recipes/agentos/README.md:40-47

Custom Algorithm Integration

The Trainer accepts custom algorithms decorated with the @algo decorator:

from agentlightning import Trainer
from agentlightning.algorithm import algo
from agentlightning.store import LightningStore

@algo
async def custom_algorithm(*, store: LightningStore):
    # Process traces from store (placeholder: compute a real update here)
    policy_update = None
    return policy_update

trainer = Trainer(n_workers=1, algorithm=custom_algorithm)
trainer.fit(rollout_fn)

Source: examples/apo/apo_custom_algorithm_trainer.py:28-37

Parallel Training with Multiple Workers

from agentlightning import Trainer

trainer = Trainer(
    n_workers=4,           # 4 parallel workers
    execution_strategy="shm",  # Shared memory for local execution
    algorithm="PPO",
    runner=runner
)

trainer.train()

Integration with Agent-OS

The Trainer integrates with Agent-OS for policy-governed training:

from agentlightning import Trainer
from agentlightning.contrib.runner.agentos import AgentOSRunner
from agentlightning.contrib.reward.agentos import PolicyReward
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy

# Create governed kernel
kernel = KernelSpace(policy=SQLPolicy(deny=["DROP", "DELETE"]))

# Wrap in Agent-OS runner
runner = AgentOSRunner(kernel)

# Train with policy-aware rewards
trainer = Trainer(
    runner=runner,
    reward_fn=PolicyReward(kernel),
    algorithm="GRPO"
)

trainer.train()

Source: contrib/recipes/agentos/README.md:25-45

Workflow Phases

Phase	Description
Initialization	Load algorithm, connect store, spawn workers
Rollout	Execute agent episodes in parallel workers
Trace Collection	Retrieve spans from LightningStore
Algorithm Update	Process traces and update policy
Iteration	Repeat rollout-collect-update cycle

LightningStore Integration

The Trainer maintains bidirectional synchronization with LightningStore:

Span Emission: Workers emit execution traces during rollout
Trace Retrieval: Algorithm reads completed traces for learning
Persistence: Training state survives worker restarts

Source: CLAUDE.md:4-6

Command-Line Interface

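Training components can be started separately: the agl CLI launches the store, and the training script launches the algorithm and runner roles.

# Start the store, then the algorithm and the runner (one terminal each)
agl store
python my_training_script.py algo
python my_training_script.py runner

Or run everything in a single process through the script's integrated Trainer entry point: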

python my_training_script.py

Source: examples/apo/apo_custom_algorithm_trainer.py:12-20

Extending the Trainer

Custom Execution Strategy

Add new strategies to the registry:

# In agentlightning/trainer/registry.py
ExecutionStrategyRegistry["custom"] = "mymodule.CustomExecutionStrategy"
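
A registry that maps a name to a dotted import path is usually resolved lazily with `importlib`. The sketch below shows that general pattern only; `STRATEGY_REGISTRY`, `register`, and `resolve` are hypothetical names, and standard-library classes stand in for strategy classes so the example runs without Agent Lightning installed.

```python
import importlib

# Hypothetical registry: strategy name -> "module.ClassName" dotted path.
STRATEGY_REGISTRY: dict[str, str] = {
    "ordered": "collections.OrderedDict",  # stdlib class used as a stand-in
}


def register(name: str, dotted_path: str) -> None:
    """Add or override a registry entry."""
    STRATEGY_REGISTRY[name] = dotted_path


def resolve(name: str) -> type:
    """Import and return the class registered under `name`."""
    try:
        dotted_path = STRATEGY_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown strategy: {name!r}") from None
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

Storing paths instead of classes keeps the registry importable even when an optional backend is not installed; the import cost is paid only when a strategy is requested.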

Custom Algorithm

Decorate async functions with @algo:

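from agentlightning.algorithm import algo
from agentlightning.store import LightningStore

@algo
async def my_algorithm(*, store: LightningStore):
    traces = await store.traces.get_all()
    # Process traces and compute the update here
    update = None  # placeholder for the computed policy update
    return update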

Dependencies

Dependency	Purpose
`LightningStore`	Trace persistence and retrieval
`Algorithm`	Policy learning logic
`Runner`	Agent execution environment
`ExecutionStrategy`	Worker orchestration
`RewardFn`	Training signal computation

Runner Component

Related topics: Trainer Component, Tutorial: Writing Agents

The Runner component is the core execution engine in Agent Lightning responsible for managing agent lifecycle, task processing, and telemetry collection. Runners serve as the bridge between the high-level Trainer orchestration and the underlying LitAgent implementation, handling initialization, worker management, and graceful shutdown.

Overview

Runners execute agents in a continuous loop where they poll the LightningStore for tasks, execute agent logic, and emit tracing spans for algorithm consumption. The Runner architecture supports both standard execution through LitAgentRunner and legacy compatibility through LegacyAgentRunner.

Source: agentlightning/runner/__init__.py:1-11

from .agent import LitAgentRunner
from .base import Runner
from .legacy import LegacyAgentRunner

__all__ = [
    "Runner",
    "LegacyAgentRunner",
    "LitAgentRunner",
]

Architecture

graph TD
    A[Trainer] --> B[Runner Fleet]
    B --> C[LitAgentRunner]
    B --> D[LegacyAgentRunner]
    C --> E[LitAgent]
    D --> F[AgentLightningClient]
    E --> G[LightningStore]
    F --> G
    E --> H[Tracer]
    H --> G

Runner Hierarchy

Class	Purpose	Source
`Runner`	Abstract base class defining the runner interface	base.py
`LitAgentRunner`	Primary runner implementation for standard agent execution	agent.py
`LegacyAgentRunner`	Runner for backward compatibility with AgentOps integration	legacy.py

Runner Base Class

The Runner class defines the core interface that all runner implementations must follow. It establishes the lifecycle methods and execution patterns.

Source: agentlightning/runner/base.py:1-20

Lifecycle Methods

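The runner lifecycle moves through five phases: init and init_worker set up resources, iter/step performs the work, and teardown_worker and teardown release resources in reverse order: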

graph LR
    A[init] --> B[init_worker]
    B --> C[iter/step]
    C --> D[teardown_worker]
    D --> E[teardown]

Method	Purpose	Must Implement
`init(agent, hooks)`	Initialize runner with agent and hooks	Yes
`init_worker(worker_id, store)`	Per-worker initialization with store	Yes
`teardown()`	Release resources from init()	Yes
`teardown_worker(worker_id)`	Release per-worker resources	Yes
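
As a rough illustration of the interface in the table, the sketch below declares the four lifecycle methods on an abstract class and records the order in which a toy subclass receives them. `SketchRunner` and `RecordingRunner` are hypothetical; the parameter types are assumptions and not the signatures in base.py.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Sequence


class SketchRunner(ABC):
    """Illustrative interface only; mirrors the method names in the table."""

    @abstractmethod
    def init(self, agent: Any, hooks: Optional[Sequence[Any]] = None) -> None: ...

    @abstractmethod
    def init_worker(self, worker_id: int, store: Any) -> None: ...

    @abstractmethod
    def teardown_worker(self, worker_id: int) -> None: ...

    @abstractmethod
    def teardown(self) -> None: ...


class RecordingRunner(SketchRunner):
    """Concrete toy runner that records the order of lifecycle calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def init(self, agent, hooks=None):
        self.calls.append("init")

    def init_worker(self, worker_id, store):
        self.calls.append(f"init_worker:{worker_id}")

    def teardown_worker(self, worker_id):
        self.calls.append(f"teardown_worker:{worker_id}")

    def teardown(self):
        self.calls.append("teardown")


runner = RecordingRunner()
runner.init(agent=object())
runner.init_worker(worker_id=0, store={})
runner.teardown_worker(worker_id=0)
runner.teardown()
```

Because every method is marked abstract, instantiating `SketchRunner` directly raises `TypeError`, which is how the "Must Implement" column is enforced in this sketch.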

Context Manager Pattern

Runners support a context manager pattern for automatic resource management:

with runner.run_context(agent=agent, store=store, hooks=hooks) as runner:
    # Runner is initialized and ready
    await runner.iter()
# Automatic teardown on exit

Source: agentlightning/runner/base.py:52-86

The run_context helper ensures proper cleanup even when exceptions occur:

_initialized = False
_worker_initialized = False
try:
    self.init(agent=agent, hooks=hooks)
    _initialized = True
    self.init_worker(worker_id=0, store=store)
    _worker_initialized = True
    yield self
finally:
    try:
        if _worker_initialized:
            self.teardown_worker(worker_id=worker_id if worker_id is not None else 0)
    except Exception:
        logger.error("Error during runner worker teardown", exc_info=True)

    try:
        if _initialized:
            self.teardown()
    except Exception:
        logger.error("Error during runner teardown", exc_info=True)
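
The cleanup pattern in the excerpt can be tried in isolation. The following is a simplified, self-contained sketch that assumes a hypothetical `ToyRunner` with argument-free lifecycle methods; it is not the library's `run_context`, only the same guard-each-step idea built on `contextlib.contextmanager`.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger(__name__)


class ToyRunner:
    """Hypothetical runner that records which lifecycle calls happened."""

    def __init__(self, fail_worker_teardown: bool = False) -> None:
        self.calls: list[str] = []
        self.fail_worker_teardown = fail_worker_teardown

    def init(self) -> None:
        self.calls.append("init")

    def init_worker(self) -> None:
        self.calls.append("init_worker")

    def teardown_worker(self) -> None:
        self.calls.append("teardown_worker")
        if self.fail_worker_teardown:
            raise RuntimeError("worker cleanup failed")

    def teardown(self) -> None:
        self.calls.append("teardown")

    @contextmanager
    def run_context(self):
        initialized = worker_initialized = False
        try:
            self.init()
            initialized = True
            self.init_worker()
            worker_initialized = True
            yield self
        finally:
            # Each cleanup step is guarded separately so one failure
            # cannot prevent the next step from running.
            try:
                if worker_initialized:
                    self.teardown_worker()
            except Exception:
                logger.error("worker teardown failed", exc_info=True)
            try:
                if initialized:
                    self.teardown()
            except Exception:
                logger.error("teardown failed", exc_info=True)
```

If the body of the `with` block raises, both teardown steps still run, any cleanup error is logged, and the caller sees the original exception.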

Execution Methods

Method	Description	Behavior
`iter(event)`	Run continuously until event or no tasks	Abstract - subclasses implement
`step()`	Execute single unit of work	Abstract - subclasses implement
`run()`	Legacy run method	Raises `RuntimeError` - use `iter()` or `step()`

Source: agentlightning/runner/base.py:88-102

Warning: The run() method raises RuntimeError because its behavior is undefined. Always use iter() or step() instead.
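
To make the relationship between the execution methods concrete, here is a minimal sketch in which `iter()` repeatedly calls `step()` until a stop event is set or the queue is empty. `QueueRunner` is hypothetical: it uses an in-memory deque and an `asyncio.Event`, whereas the real runners poll a store and may use a different event type.

```python
import asyncio
from collections import deque
from typing import Optional


class QueueRunner:
    """Toy runner: step() handles one task, iter() loops over step()."""

    def __init__(self, tasks: list[str]) -> None:
        self.pending = deque(tasks)
        self.done: list[str] = []

    async def step(self) -> bool:
        """Process one task; return False when the queue is empty."""
        if not self.pending:
            return False
        task = self.pending.popleft()
        await asyncio.sleep(0)  # yield control, as real I/O would
        self.done.append(task.upper())
        return True

    async def iter(self, event: Optional[asyncio.Event] = None) -> None:
        """Run until the stop event is set or no tasks remain."""
        while not (event is not None and event.is_set()):
            if not await self.step():
                break

    def run(self) -> None:
        raise RuntimeError("run() is not supported; use iter() or step()")


runner = QueueRunner(["a", "b", "c"])
asyncio.run(runner.iter())
```

Keeping `step()` as the single unit of work makes it easy to drive a runner one task at a time while debugging.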

LitAgentRunner

LitAgentRunner is the primary runner implementation that manages the agent-runner relationship, hook registration, and tracer integration.

Source: agentlightning/runner/agent.py:1-30

Initialization Flow

sequenceDiagram
    participant Trainer
    participant LitAgentRunner
    participant LitAgent
    participant Tracer
    participant LightningStore

    Trainer->>LitAgentRunner: init(agent, hooks)
    LitAgentRunner->>LitAgent: set_runner(self)
    LitAgentRunner->>Tracer: init()
    Trainer->>LitAgentRunner: init_worker(worker_id, store)
    LitAgentRunner->>Tracer: init_worker(worker_id, store)

Key Properties

Property	Type	Description
`agent`	`LitAgent[T_task]`	The agent instance (via `get_agent()`)
`store`	`LightningStore`	The backing store (via `get_store()`)
`worker_id`	`Optional[int]`	Unique worker identifier
`tracer`	`Tracer`	Tracer for span emission

Source: agentlightning/runner/agent.py:90-110

Accessor Methods

def get_agent(self) -> LitAgent[T_task]:
    """Get the agent instance."""
    if self._agent is None:
        raise ValueError("Agent not initialized. Call init() first.")
    return self._agent

def get_store(self) -> LightningStore:
    """Get the store instance."""
    if self._store is None:
        raise ValueError("Store not initialized. Call init_worker() first.")
    return self._store

def get_worker_id(self) -> str:
    """Get the formatted worker ID string."""
    return f"Worker-{self.worker_id}" if self.worker_id is not None else "Worker-Unknown"

Logging Prefix

The _log_prefix() method generates consistent log prefixes for traceability:

def _log_prefix(self, rollout_id: Optional[str] = None) -> str:
    """Generate a standardized log prefix for the current worker."""
    # Returns format: "[Worker-{id}] [{rollout_id}]"
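
One way to produce the documented prefix format is sketched below as a standalone function. The name `log_prefix` is hypothetical, and omitting the rollout segment when no rollout ID is given is an assumption rather than confirmed behavior.

```python
from typing import Optional


def log_prefix(worker_id: Optional[int], rollout_id: Optional[str] = None) -> str:
    """Build "[Worker-{id}]" and append "[{rollout_id}]" when one is given."""
    worker = f"Worker-{worker_id}" if worker_id is not None else "Worker-Unknown"
    prefix = f"[{worker}]"
    if rollout_id is not None:
        prefix += f" [{rollout_id}]"
    return prefix
```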

LegacyAgentRunner

LegacyAgentRunner provides backward compatibility for workflows using the AgentOps integration and AgentLightningClient communication pattern.

Source: agentlightning/runner/legacy.py:1-35

Attributes

Attribute	Type	Description
`agent`	`LitAgent[Any]`	The agent instance
`client`	`AgentLightningClient`	Server communication client
`tracer`	`Tracer`	Tracer instance for span emission
`worker_id`	`Optional[str]`	Worker identifier
`max_tasks`	`Optional[int]`	Maximum tasks before stopping

Architecture

graph TD
    A[LegacyAgentRunner] --> B[LitAgent]
    A --> C[AgentLightningClient]
    A --> D[Tracer]
    C --> E[Server]
    D --> F[LightningStore]
    B --> F

Hook System Integration

Runners integrate with the hook system to provide extensibility at key lifecycle points:

Source: agentlightning/types/core.py:1-30

Hook	Timing	Purpose
`on_trace_start`	Before tracer enters trace context	Logging, metric collection, resource setup
`on_trace_end`	After rollout completes, before tracer exits	Logging, cleanup
`on_rollout_start`	Before rollout attempt begins	Per-attempt initialization
`on_rollout_end`	After rollout attempt completes	Result processing, cleanup

Hooks are registered during initialization and called by the runner at appropriate points during execution.
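
A hook can be modeled as an object with one no-op coroutine per lifecycle point, so subclasses override only what they need. The sketch below is hypothetical: the `Hook` base class, the single `rollout_id` argument, and the exact call order relative to the tracer context are assumptions for illustration, not the signatures in agentlightning/types/core.py.

```python
import asyncio


class Hook:
    """Hypothetical base hook with no-op callbacks named after the table."""

    async def on_rollout_start(self, rollout_id: str) -> None: ...

    async def on_trace_start(self, rollout_id: str) -> None: ...

    async def on_trace_end(self, rollout_id: str) -> None: ...

    async def on_rollout_end(self, rollout_id: str) -> None: ...


class RecordingHook(Hook):
    """Overrides two of the four callbacks and records when they fire."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def on_rollout_start(self, rollout_id: str) -> None:
        self.events.append(f"rollout_start:{rollout_id}")

    async def on_trace_end(self, rollout_id: str) -> None:
        self.events.append(f"trace_end:{rollout_id}")


async def run_rollout(rollout_id: str, hooks: list[Hook]) -> None:
    """Call every hook at each lifecycle point around a placeholder rollout."""

    async def fire(name: str) -> None:
        for hook in hooks:
            await getattr(hook, name)(rollout_id)

    await fire("on_rollout_start")
    await fire("on_trace_start")
    await asyncio.sleep(0)  # the agent's work would happen here
    await fire("on_trace_end")
    await fire("on_rollout_end")


hook = RecordingHook()
asyncio.run(run_rollout("ro-1", [hook]))
```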

Trainer Integration

Runners are instantiated and managed by the Trainer class, which orchestrates the entire training loop:

Source: agentlightning/trainer/trainer.py:40-60

class Trainer(TrainerLegacy):
    """High-level orchestration layer that wires Algorithm <-> Runner <-> Store."""
    
    # Runner fleet configuration
    n_runners: int  # Number of agent runners to run in parallel
    max_rollouts: Optional[int]  # Maximum rollouts per runner
    strategy: ExecutionStrategy  # Process management strategy
    tracer: Tracer  # Tracer instance for telemetry
    hooks: Sequence[Hook]  # Lifecycle callbacks

Training Configuration

Parameter	Type	Description
`n_runners`	`int`	Number of parallel agent runners
`max_rollouts`	`Optional[int]`	Stop after N rollouts (None = unlimited)
`strategy`	`ExecutionStrategy`	Spawning strategy (shared memory, client/server)
`tracer`	`Tracer`	Tracer class or config for span collection
`hooks`	`Sequence[Hook]`	Lifecycle callback instances

Execution Flow

graph TD
    A[Trainer.fit/dev] --> B[Spawn Runner Fleet]
    B --> C[For each Runner]
    C --> D[runner.run_context]
    D --> E[init + init_worker]
    E --> F[iter/event loop]
    F --> G{Tasks available?}
    G -->|Yes| H[Execute step]
    H --> I[Emit spans to Store]
    I --> F
    G -->|No| J[Exit loop]
    J --> K[teardown_worker]
    K --> L[teardown]

Context Manager Usage

For debugging or standalone usage outside the Trainer stack:

import asyncio

from agentlightning import AgentOpsTracer, InMemoryLightningStore, LitAgentRunner


async def main() -> None:
    # Create store and agent (MyLitAgent is your own LitAgent subclass)
    store = InMemoryLightningStore()
    agent = MyLitAgent()

    # Use the context manager so teardown runs automatically
    runner = LitAgentRunner(tracer=AgentOpsTracer())
    with runner.run_context(agent=agent, store=store) as active_runner:
        # Runner is initialized and ready
        worker_id = active_runner.get_worker_id()
        print(f"Running on {worker_id}")

        # Run until no tasks remain
        await active_runner.iter()
    # Automatic cleanup on exit


asyncio.run(main())

Source: agentlightning/runner/base.py:88-113

Error Handling

Runners implement robust error handling during teardown:

Phase	Error Behavior	Recovery
`teardown_worker`	Logged but doesn't propagate	Continue to `teardown`
`teardown`	Logged but doesn't propagate	Context manager completes

This ensures that multiple cleanup errors don't mask the original failure and that partial cleanup still occurs.

Summary

The Runner component provides:

Lifecycle Management - Consistent init/teardown patterns via context managers
Worker Isolation - Per-worker initialization with dedicated store connections
Hook Integration - Extensibility through lifecycle callbacks
Telemetry - Built-in tracer integration for span emission
Trainer Integration - Seamless orchestration within the training loop

Runners are the execution backbone of Agent Lightning, translating high-level training commands into agent task processing while maintaining observability through distributed tracing.

Source: https://github.com/microsoft/agent-lightning / Human Manual

LightningStore

Related topics: System Architecture, Trainer Component

LightningStore is the central data persistence and synchronization layer in Agent Lightning. It manages the lifecycle of AI agent training workflows, including rollouts, attempts, spans, resources, and worker state. The store serves as the backbone for the training loop, enabling distributed execution, tracing, and experiment tracking.

Overview

LightningStore provides a unified interface for:

Rollout Management: Tracking agent task executions from enqueue to completion
Span Recording: Capturing fine-grained traces of agent operations via OpenTelemetry
Resource Management: Storing and versioning agent configurations, prompts, and model definitions
Worker Coordination: Managing distributed worker states and heartbeats
Metrics Collection: Aggregating training metrics through Prometheus integration

Source: agentlightning/store/base.py

Architecture

LightningStore follows a pluggable backend architecture with a unified async interface.

graph TD
    subgraph "Client Layer"
        Runner[Runner] --> Tracer[Tracers<br/>OtelTracer<br/>AgentOpsTracer]
        Tracer --> Client[LightningStoreClient]
    end
    
    subgraph "Server Layer"
        Client --> |HTTP| Server[LightningStoreServer]
        Server --> Collections[LightningCollections]
    end
    
    subgraph "Storage Backends"
        Collections --> InMemory[InMemoryLightningStore]
        Collections --> SQLite[SQLiteLightningStore]
        Collections --> Mongo[MongoLightningStore]
    end
    
    subgraph "Thread Safety"
        Store[Any Store] --> Threaded[LightningStoreThreaded]
    end

Core Components

Component	Purpose
`LightningStore`	Abstract base class defining the store interface
`LightningStoreClient`	HTTP client for remote store communication
`LightningStoreServer`	FastAPI-based server handling store operations
`LightningCollections`	Organized data collections (rollouts, spans, resources, workers)
`LightningStoreThreaded`	Thread-safe wrapper for concurrent access

Source: agentlightning/store/client_server.py

Data Models

Core Types

The store operates on these fundamental data structures:

Model	Description
`Rollout`	A complete task execution with status, timestamps, and metadata
`Attempt`	A single attempt within a rollout (supports retries)
`Span`	Fine-grained trace data for agent operations
`TaskInput`	Input data for a task (prompt, parameters)
`Worker`	Worker node state and heartbeat information
`ResourcesUpdate`	Versioned resource configuration storage
`RolloutConfig`	Configuration for rollout execution

Source: agentlightning/types/core.py

Rollout Status Lifecycle

stateDiagram-v2
    [*] --> Pending: enqueue_rollout
    Pending --> Running: start_rollout
    Running --> Completed: finish_rollout
    Running --> Failed: fail_rollout
    Completed --> [*]
    Failed --> [*]
    
    Running --> Attempted: start_attempt
    Attempted --> Running: finish_attempt

The status values are:

pending - Queued for execution
running - Currently executing
completed - Successfully finished
failed - Execution failed

Source: agentlightning/store/base.py
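
The four status values and the main transitions in the diagram can be expressed as a small table-driven state machine. This is a sketch under assumptions: `Status`, `TRANSITIONS`, and `advance` are hypothetical names, and the attempt sub-state from the diagram is left out because it is not one of the listed status values.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions read off the state diagram; terminal states allow none.
TRANSITIONS: dict[Status, frozenset[Status]] = {
    Status.PENDING: frozenset({Status.RUNNING}),
    Status.RUNNING: frozenset({Status.COMPLETED, Status.FAILED}),
    Status.COMPLETED: frozenset(),
    Status.FAILED: frozenset(),
}


def advance(current: Status, target: Status) -> Status:
    """Return `target` if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the transitions in one mapping makes illegal moves, such as restarting a completed rollout, fail loudly instead of silently corrupting state.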

API Endpoints

The server exposes REST endpoints under /v1/agl:

Endpoint	Method	Description
`/rollouts`	POST	Enqueue new rollouts
`/rollouts/{id}`	GET	Retrieve rollout by ID
`/rollouts/{id}/start`	POST	Mark rollout as started
`/rollouts/{id}/finish`	POST	Complete a rollout
`/rollouts/{id}/attempt`	POST	Start a new attempt
`/rollouts/{id}/attempt/{aid}/finish`	POST	Finish an attempt
`/spans`	POST	Record span data
`/spans/search`	POST	Query spans with filters
`/resources`	POST	Add new resources
`/resources/{id}`	GET/PUT	Get or update resources
`/workers`	POST	Register worker
`/workers/{id}/heartbeat`	POST	Worker heartbeat
`/statistics`	GET	Store statistics

Source: agentlightning/store/client_server.py
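
When calling these endpoints from a custom client, the URL has to combine the server address, the /v1/agl prefix, and the path segments. The helper below is a hypothetical sketch of that step using only the standard library; `endpoint` is not part of Agent Lightning, and the host and port are placeholders.

```python
from urllib.parse import quote

API_PREFIX = "/v1/agl"  # prefix stated in the text above


def endpoint(base_url: str, *segments: str) -> str:
    """Join a base URL, the API prefix, and percent-encoded path segments."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base_url.rstrip('/')}{API_PREFIX}/{path}"
```

For example, `endpoint("http://localhost:45993", "rollouts", rollout_id, "start")` builds the address for marking a rollout as started, with any special characters in the ID escaped.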

Implementation Backends

In-Memory Store

The InMemoryLightningStore provides a lightweight, zero-dependency backend suitable for single-node execution and testing.

Key characteristics:

All data stored in process memory
Supports collections with atomic transactions
Built-in size estimation for memory monitoring
Fast for development and small-scale experiments

from agentlightning import InMemoryLightningStore

store = InMemoryLightningStore()

Source: agentlightning/store/memory.py

SQLite Store

SQLite backend provides persistent storage with ACID guarantees, suitable for single-node deployments requiring durability.

MongoDB Store

MongoDB backend supports distributed deployments with horizontal scaling, providing high throughput for large-scale training runs.

Thread Safety

The LightningStoreThreaded class wraps any store implementation to provide thread-safe access:

from agentlightning.store.threading import LightningStoreThreaded

# Wrap any store with thread safety
threaded_store = LightningStoreThreaded(store)

Thread safety features:

Uses threading.Lock for synchronization
Guarantees atomic operations across concurrent requests
Maintains all original store capabilities
Exposes thread_safe: True and async_safe: True in capabilities

Source: agentlightning/store/threading.py
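
The idea of wrapping an object so that every call holds a lock can be shown with a small proxy. This is a generic sketch, not the `LightningStoreThreaded` implementation: `Counter` and `LockedProxy` are hypothetical, and the proxy covers synchronous methods only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Counter:
    """Unsynchronized toy 'store' with a read-modify-write operation."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> None:
        current = self.value
        self.value = current + 1


class LockedProxy:
    """Forward method calls to `inner`, holding one lock per call."""

    def __init__(self, inner: object) -> None:
        self._inner = inner
        self._lock = threading.Lock()

    def __getattr__(self, name: str):
        # Only reached for names not defined on the proxy itself.
        attr = getattr(self._inner, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)

        return locked


safe = LockedProxy(Counter())
with ThreadPoolExecutor(max_workers=8) as pool:
    for _ in range(1000):
        pool.submit(safe.increment)
```

A single lock serializes all operations, which is simple and correct but limits throughput; finer-grained locking is a trade-off a real store may make differently.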

Collection Operations

LightningStore uses a collection-based data organization pattern:

# Atomic write operation
async with store.collections.atomic(mode="w", snapshot=..., labels=["resources"]) as collections:
    await collections.resources.insert([update])

Supported Collections

Collection	Purpose
`rollouts`	Task execution records
`attempts`	Individual attempt tracking
`spans`	OpenTelemetry trace spans
`resources`	Versioned configurations
`workers`	Worker state management

Source: agentlightning/store/collection_based.py

Decorators and Instrumentation

The store layer uses several decorators for observability and reliability:

Decorator	Purpose
`@tracked`	Records operation metrics and timing
`@healthcheck_before`	Validates store health before operations
`@_with_collections_execute`	Manages collection lifecycle and error handling

Integration with Tracers

LightningStore integrates with OpenTelemetry through tracers:

from agentlightning import OtelTracer, AgentOpsTracer

tracer = OtelTracer(store=store)

Tracing workflow:

sequenceDiagram
    participant Agent
    participant Tracer
    participant Store
    participant OTLP
    
    Agent->>Tracer: Create span
    Tracer->>Store: Record span data
    Store->>Store: Persist to backend
    Tracer->>OTLP: Export spans (optional)

Source: examples/minimal/write_traces.py

Usage Examples

Basic Store Operations

from agentlightning import InMemoryLightningStore

# Create store
store = InMemoryLightningStore()

# Enqueue a task
rollout = await store.enqueue_rollout(
    input={"prompt": "Solve this problem"},
    mode="train"
)

# Dequeue for processing
task = await store.dequeue_rollout(worker_id="worker-1")

# Complete the rollout
await store.finish_rollout(
    rollout_id=task.rollout.rollout_id,
    attempt_id=task.attempt.attempt_id,
    response={"answer": "42"}
)

Server Setup

# Start a LightningStore server
agl store --port 45993 --log-level DEBUG

Client Connection

from agentlightning import LightningStoreClient

client = LightningStoreClient(base_url="http://localhost:45993")

# All operations work through the client
rollouts = await client.list_rollouts()

Capabilities

The store reports its capabilities through the capabilities property:

Capability	Description
`async_safe`	Supports async operations
`thread_safe`	Supports concurrent thread access
`distributed`	Supports multi-node deployment
`persistence`	Data survives restarts

Source: agentlightning/store/threading.py

CLI Commands

The agl CLI provides store management:

# Start store server
agl store --port 45993 --log-level DEBUG

# Prometheus metrics endpoint
agl prometheus

Source: agentlightning/cli/__init__.py

Source: https://github.com/microsoft/agent-lightning / Human Manual

Algorithm Zoo

Related topics: Tutorial: Train Your First Agent

Overview

The Algorithm Zoo is a modular collection of training algorithms that consume execution traces from the Agent Lightning runtime to improve agent behavior through reinforcement learning and prompt optimization. Source: CLAUDE.md

Agent Lightning runs through a continuous loop where runners and tracers emit spans, LightningStore keeps them synchronized, and algorithms in agentlightning/algorithm/ consume those traces to improve behavior. Source: CLAUDE.md

Architecture

The Algorithm Zoo follows a producer-consumer pattern where the store acts as the central synchronization hub:

graph TD
    A[Runners] -->|emit spans| B[LightningStore]
    C[Tracers] -->|emit spans| B
    B -->|traces| D[Algorithm Zoo]
    D -->|policy updates| E[Improved Agent Behavior]
    B -->|traces| F[Dashboard]

Available Algorithms

APO (Automatic Prompt Optimization)

APO is a prompt optimization algorithm that iteratively refines prompt templates based on reward signals collected from agent rollouts.

#### How APO Works

The APO algorithm maintains a collection of prompt candidates and evaluates each one against task objectives. Based on the reward signals, it selects and refines the most effective prompts. Source: examples/apo/apo_custom_algorithm.py:34-37

async def apo_algorithm(*, store: agl.LightningStore):
    """
    An example of how a prompt optimization works.
    """
    prompt_candidates = [
        "You are a helpful assistant. {any_question}",
        "You are a knowledgeable AI. {any_question}",
        "You are a friendly chatbot. {any_question}",
    ]

    prompt_and_rewards: list[tuple[str, float]] = []

#### Custom APO Algorithm

To create a custom algorithm, wrap your async function with the @algo decorator. Source: examples/apo/apo_custom_algorithm_trainer.py:28-39

from agentlightning.algorithm import algo

@algo
async def apo_algorithm_usable_in_trainer(*, store: LightningStore):
    """
    You need to wrap the apo_algorithm in an algo decorator to make it usable in trainer.
    """
    return await apo_algorithm(store=store)

VERL (Volcano Engine Reinforcement Learning)

VERL is a full training algorithm that integrates with the VERL library for GPU-accelerated reinforcement learning. Source: examples/tinker/q20_train.py:43-52

algo_verl_parser = subparsers.add_parser("verl", help="Launch the full training algorithm with VERL.")
algo_verl_parser.add_argument("--port", type=int, default=4747, help="Port for the AgentLightning store.")
algo_verl_parser.add_argument(
    "--model",
    choices=("qwen25", "qwen3"),
    default="qwen3",
    help="Model variant to train.",
)
algo_verl_parser.add_argument("--search", action="store_true", help="Enable search tool.")

FAST (Fast Algorithm Suite Toolkit)

The FAST algorithm provides lightweight optimization capabilities for rapid experimentation.

Running Algorithms

Option A: Separate Components

Start the store, algorithm, and runner in three separate terminals: Source: examples/apo/README.md:10-24

# Terminal 1: Start the store
agl store

# Terminal 2: Run the algorithm
python apo_custom_algorithm.py algo

# Terminal 3: Run the rollout runner
python apo_custom_algorithm.py runner

Option B: Integrated Trainer

Use the integrated trainer that handles all components: Source: examples/apo/apo_custom_algorithm_trainer.py:47-49

from agentlightning import Trainer, setup_logging

trainer = Trainer(n_workers=1, algorithm=apo_algorithm_usable_in_trainer)
trainer.fit(apo_rollout)

Algorithm Decorator

The @algo decorator transforms any async algorithm function into a component that can be used with the Trainer. It injects the LightningStore as a keyword argument. Source: examples/apo/apo_custom_algorithm_trainer.py:28-39
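
A decorator with this behavior could plausibly be built as a thin wrapper object that holds a store and passes it to the function as a keyword argument. The sketch below is hypothetical throughout: `algo_sketch`, `FunctionalAlgorithm`, `set_store`, and `run` are illustrative names and do not describe how `@algo` is actually implemented.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class FunctionalAlgorithm:
    """Hypothetical wrapper that remembers a store and injects it on run()."""

    def __init__(self, func: Callable[..., Awaitable[Any]]) -> None:
        self._func = func
        self._store: Optional[Any] = None
        self.__name__ = getattr(func, "__name__", "algorithm")

    def set_store(self, store: Any) -> None:
        self._store = store

    async def run(self) -> Any:
        if self._store is None:
            raise ValueError("Store not set. Call set_store() first.")
        # The wrapped function receives the store as a keyword argument.
        return await self._func(store=self._store)


def algo_sketch(func: Callable[..., Awaitable[Any]]) -> FunctionalAlgorithm:
    """Decorator: turn an async function into a FunctionalAlgorithm."""
    return FunctionalAlgorithm(func)


@algo_sketch
async def count_traces(*, store: list) -> int:
    return len(store)


count_traces.set_store(["span-a", "span-b"])
result = asyncio.run(count_traces.run())
```

Here a plain list stands in for the store, so `result` is simply the number of items it holds.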

Algorithm Configuration

Common Parameters

Parameter	Type	Description
`store`	`LightningStore`	Central store for traces and resources
`n_workers`	`int`	Number of parallel workers
`port`	`int`	Port for store connection (default: 4747)

VERL-Specific Options

Option	Choices	Default	Description
`--model`	qwen25, qwen3	qwen3	Model variant to train
`--port`	int	4747	Store connection port
`--search`	flag	False	Enable search tool

Workflow

graph LR
    A[Define Prompt Candidates] --> B[Loop Through Candidates]
    B --> C[Update Resources in Store]
    C --> D[Run Rollout with Runner]
    D --> E[Collect Reward Signal]
    E --> F[Update Prompt Template]
    F --> B
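
The loop in the diagram reduces to "score each candidate, keep the best". The sketch below shows only that selection step with a made-up reward; `toy_reward` and `select_best_prompt` are hypothetical, and in a real run the reward would come from rollouts executed by a runner rather than from the prompt text.

```python
def toy_reward(prompt: str) -> float:
    """Made-up reward that favors shorter prompts; stands in for a rollout."""
    return -float(len(prompt))


def select_best_prompt(candidates: list[str]) -> tuple[str, float]:
    """Evaluate each candidate once and return the highest-reward pair."""
    prompt_and_rewards: list[tuple[str, float]] = []
    for prompt in candidates:
        # In the real loop the prompt would be pushed to the store as a
        # resource and a runner would produce the reward via a rollout.
        prompt_and_rewards.append((prompt, toy_reward(prompt)))
    return max(prompt_and_rewards, key=lambda pair: pair[1])


candidates = [
    "You are a helpful assistant. {any_question}",
    "You are a knowledgeable AI. {any_question}",
    "Answer briefly. {any_question}",
]
best_prompt, best_reward = select_best_prompt(candidates)
```

A refinement step would then generate new candidates from `best_prompt` and repeat the loop.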

Extending the Algorithm Zoo

Creating Custom Algorithms

Define an async function that takes store: LightningStore as a keyword argument
Wrap it with the @algo decorator
Implement your optimization logic
Use the trainer or run separately

Example pattern: Source: examples/apo/apo_custom_algorithm.py:54-72

async def apo_algorithm(*, store: agl.LightningStore):
    for prompt in prompt_candidates:
        # 1. The optimization algorithm updates the prompt template
        console.print(f"[Algo] Updating prompt template to: '{prompt}'")
        resources: agl.NamedResources = {
            # The "main_prompt" can be replaced with any name
        }
        # 2. Update resources in store
        # 3. Collect reward signals
        # 4. Refine prompt based on rewards

Requirements for Custom Algorithms

Must be async functions
Must accept store as keyword argument
Should be wrapped with @algo decorator for trainer integration
Must interact with LightningStore for state synchronization

Integration with RAG

The Algorithm Zoo can be extended to work with retrieval-augmented generation systems. See the RAG example for integrating FAISS-based retrieval with prompt optimization. Source: examples/rag/README.md

Doramagic Pitfall Log

Doramagic extracted 7 source-linked risk signals. Review them before installing or handing real data to the project.

1. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | README/documentation is current enough for a first validation pass.

2. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | last_activity_observed missing

3. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | no_demo; severity=medium

4. Security or permission risk: No sandbox install has been executed yet; downstream must verify before user use.

Severity: medium
Finding: No sandbox install has been executed yet; downstream must verify before user use.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.safety_notes | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | No sandbox install has been executed yet; downstream must verify before user use.

5. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.scoring_risks | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | no_demo; severity=medium

6. Maintenance risk: issue_or_pr_quality=unknown

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | issue_or_pr_quality=unknown

7. Maintenance risk: release_recency=unknown

Severity: low
Finding: release_recency=unknown.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | art_9b504779cfa046a894eeb7c9d3a298c6 | https://github.com/microsoft/agent-lightning#readme | release_recency=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using agent-lightning with real data or production workflows.

Intermittent missing openai.chat.completion spans from query_spans (RLin - github / github_issue
Question about Code Availability for EMPO^2 Paper - github / github_issue
calc-x example fails on next - github / github_issue
APO's TraceToMessages adapter fails with multi-turn agent rollouts (KeyE - github / github_issue
Installation Problem - github / github_issue
GRPO grouping in multi-turn agent RL: is it valid to mix samples with di - github / github_issue
Announcing Solantra: Next-Gen Blockchain on Solana - github / github_issue
Add interaction scripts and token utilities - github / github_issue
blockchain project - github / github_issue
Agent Lightning v0.3.0 - github / github_release
Agent Lightning v0.2.2 - github / github_release
Agent Lightning v0.2.1 - github / github_release

Source: Project Pack community evidence and pitfall evidence
