# https://github.com/langfuse/langfuse Project Documentation

Generated: 2026-05-15 10:18:59 UTC

## Table of Contents

- [Project Introduction](#project-introduction)
- [Project Structure](#project-structure)
- [Quickstart Guide](#quickstart-guide)
- [System Architecture](#system-architecture)
- [Monorepo Configuration](#monorepo-structure)
- [Database Schema (Prisma)](#database-schema)
- [ClickHouse Analytics Layer](#clickhouse-analytics)
- [Queue System (Redis/BullMQ)](#queue-system)
- [Worker Service](#worker-service)
- [API Layer](#api-layer)

<a id='project-introduction'></a>

## Project Introduction

### Related Pages

Related topics: [System Architecture](#system-architecture)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/langfuse/langfuse/blob/main/README.md)
- [packages/shared/src/constants/VERSION.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/constants/VERSION.ts)
- [web/src/components/ui/AdvancedJsonViewer/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/ui/AdvancedJsonViewer/README.md)
- [web/src/components/layouts/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/layouts/README.md)
- [web/src/features/mcp/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/mcp/README.md)
- [web/src/features/entitlements/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/entitlements/README.md)
- [web/src/features/feature-flags/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/feature-flags/README.md)
- [web/src/components/design-system/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/design-system/README.md)
</details>

# Project Introduction

Langfuse is an open-source **observability and analytics platform** designed for Large Language Model (LLM) applications. It provides comprehensive tracing, evaluation, and prompt management capabilities that enable developers to monitor, debug, and optimize their AI-powered applications.

## Overview

Langfuse serves as a centralized platform for capturing and analyzing interactions between AI models and end users. The project is MIT licensed and supports both cloud-hosted and self-hosted deployment models, making it accessible for teams of various sizes and requirements.

### Core Purpose

The platform addresses several critical needs in LLM application development:

- **Observability**: Track and visualize traces, observations, and model interactions in real-time
- **Evaluation**: Measure and analyze AI application performance through configurable scoring systems
- **Prompt Management**: Create, version, and manage prompts with support for complex dependency resolution
- **Collaboration**: Enable team collaboration through commenting and sharing features
- **Analytics**: Provide insights into AI application behavior through comprehensive analytics dashboards

### High-Level Architecture

Langfuse follows a modern microservices-inspired architecture with clear separation between frontend, backend processing, and data storage components.

```mermaid
graph TD
    subgraph Frontend["Frontend (Next.js/React)"]
        UI[User Interface]
        DesignSystem[Design System Components]
        FeatureFlags[Feature Flags]
    end
    
    subgraph Backend["Backend Services"]
        API[API Server]
        MCP[MCP Server]
        Worker[Worker/Queue Processing]
    end
    
    subgraph Storage["Data Layer"]
        Postgres[(PostgreSQL)]
        ClickHouse[(ClickHouse)]
        Redis[(Redis)]
        S3[(S3 Storage)]
    end
    
    UI --> API
    MCP --> API
    Worker --> API
    API --> Postgres
    API --> ClickHouse
    API --> Redis
    API --> S3
```

## Technology Stack

Langfuse is built using a modern technology stack optimized for performance and developer experience.

### Frontend Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | Next.js | Server-side rendering and routing |
| UI Library | React | Component-based UI development |
| State Management | React Context + Hooks | Local and global state |
| Virtualization | TanStack Virtual | Efficient rendering of large lists |
| Styling | Tailwind CSS + cva | Utility-first styling with variant handling |
| Validation | Zod | Runtime schema validation |
| Data Fetching | tRPC | Type-safe API communication |

Source: [web/src/components/design-system/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/design-system/README.md)

### Backend Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Runtime | Node.js/TypeScript | Server-side logic |
| Database | PostgreSQL | Primary data storage |
| Analytics | ClickHouse | High-performance analytics queries |
| Cache | Redis | Caching and queue management |
| Queue | BullMQ | Background job processing |
| Storage | S3-compatible | File and event storage |

### Key Frontend Components

The frontend architecture is organized around several key systems:

#### Design System

The design system (`web/src/components/design-system/`) provides reusable, primitive UI components following strict principles:

- **Presentational only**: No business logic in components
- **Explicit, typed APIs**: Strict TypeScript definitions
- **No className/style props**: Prevents style leakage
- **cva for variants**: Consistent variant handling

```mermaid
graph LR
    A[Design System] --> B[Button]
    A --> C[Input]
    A --> D[Modal]
    B --> E[Consistent Styling]
    C --> E
    D --> E
```

Source: [web/src/components/design-system/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/design-system/README.md)

#### Layout System

All pages use a standardized `Page` wrapper component that ensures:

- Consistent layout structure
- Sticky header behavior
- Proper scroll management (`"content-scroll"` or `"page-scroll"`)
- Breadcrumb navigation support
- Custom header actions

```mermaid
graph TD
    Page[Page Component] --> Header[Sticky Header]
    Page --> Content[Scrollable Content]
    Header --> Breadcrumb[Breadcrumb Navigation]
    Header --> Actions[Action Buttons]
```

Source: [web/src/components/layouts/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/layouts/README.md)

#### JSON Viewer Component

The `AdvancedJsonViewer` component provides efficient rendering of large JSON datasets:

- **Virtualization**: Uses TanStack Virtual for row-based rendering
- **Iterative algorithms**: Explicit stack-based iteration to prevent stack overflow
- **Client-side search**: In-memory matching with binary search navigation
- **Theme support**: Customizable JSON syntax highlighting

```mermaid
graph TD
    Input[Large JSON Data] --> Parser[JSON Parser]
    Parser --> TreeBuilder[Tree Builder]
    TreeBuilder --> Virtualizer[TanStack Virtual]
    Virtualizer --> Renderer[Row Renderer]
    
    Search[Search Query] --> Matcher[In-Memory Matcher]
    Matcher --> Navigator[Binary Search Navigator]
```

Source: [web/src/components/ui/AdvancedJsonViewer/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/ui/AdvancedJsonViewer/README.md)

## Core Features

### Tracing and Observability

Langfuse provides comprehensive tracing capabilities that capture the full lifecycle of AI interactions:

- **Traces**: Complete request/response cycles
- **Observations**: Individual components within a trace (spans, events, generations)
- **Metadata**: Custom metadata attachment for context
- **Tree Structure**: Hierarchical representation of nested observations

The tree-building system uses iterative algorithms to handle millions of observations without stack overflow:

```typescript
// Iterative depth-first traversal: an explicit stack instead of recursion
// keeps very deep observation trees from exhausting the call stack.
interface TreeNode {
  children: TreeNode[];
}

function traverse(rootNode: TreeNode, process: (node: TreeNode) => void) {
  const stack: TreeNode[] = [rootNode];
  while (stack.length > 0) {
    const node = stack.pop()!;
    process(node);
    node.children.forEach((child) => stack.push(child));
  }
}
```

Source: [web/src/components/trace/lib/tree-building.clienttest.ts](https://github.com/langfuse/langfuse/blob/main/web/src/components/trace/lib/tree-building.clienttest.ts)

### Prompt Management

Langfuse supports sophisticated prompt management with dependency resolution:

- **Prompt Stacking**: Compose prompts from multiple sources
- **Dependency Tags**: Reference other prompts using `@@@langfusePrompt:...@@@` syntax
- **Resolution Modes**:
  - `getPromptResolved`: Returns fully resolved prompt with dependencies inlined
  - `getPromptUnresolved`: Returns raw prompt with tags preserved for analysis
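The tag-resolution behaviour can be sketched as a small inliner. The in-memory store, the simplified tag body (a bare prompt name), and the function name below are illustrative assumptions, not the actual Langfuse implementation:

```typescript
// Hypothetical in-memory prompt store; real prompts live in the database,
// and real tags carry more fields (name, version/label) than shown here.
const prompts: Record<string, string> = {
  greeting: "Hello! @@@langfusePrompt:signature@@@",
  signature: "-- The Langfuse Team",
};

// getPromptResolved-style behaviour: inline every dependency tag, guarding
// against circular references along the current resolution path.
function resolvePrompt(name: string, path: Set<string> = new Set()): string {
  if (path.has(name)) throw new Error(`Circular prompt dependency: ${name}`);
  const raw = prompts[name];
  if (raw === undefined) throw new Error(`Unknown prompt: ${name}`);
  const nextPath = new Set(path).add(name);
  return raw.replace(/@@@langfusePrompt:([^@]+)@@@/g, (_match, dep: string) =>
    resolvePrompt(dep, nextPath),
  );
}
```

`getPromptUnresolved` would simply return `prompts[name]` with the `@@@langfusePrompt:...@@@` tags left intact.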

```mermaid
graph LR
    A[Prompt A] -->|references| B[Prompt B]
    A -->|references| C[Prompt C]
    B -->|references| D[Prompt D]
    A -->|resolved| E[Final Prompt]
```

Source: [web/src/features/mcp/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/mcp/README.md)

### Score Analytics

The scoring system enables quantitative evaluation of AI application performance:

- **Multiple Score Types**: Supports numeric, categorical, and boolean scores
- **Time Series Analysis**: Track score changes over configurable intervals
- **Distribution Analysis**: Visualize score distributions with bins and categories
- **Comparison Mode**: Compare two scores side-by-side

The analytics layer provides interpretive functions for common metrics:

| Metric | Interpretation | Threshold |
|--------|----------------|-----------|
| Agreement (Cohen's Kappa) | Excellent | ≥ 0.9 |
| Agreement (Cohen's Kappa) | Good | ≥ 0.8 |
| Agreement (Cohen's Kappa) | Fair | ≥ 0.6 |
| Agreement (Cohen's Kappa) | Poor | ≥ 0.4 |
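The threshold table maps directly onto a small interpretive function. This is a sketch; the actual helper in `statistics-utils.ts` may use different names, and the label for values below 0.4 is an assumption:

```typescript
type AgreementLabel = "Excellent" | "Good" | "Fair" | "Poor" | "Very Poor";

// Map a Cohen's Kappa value to the interpretation bands from the table above.
// The "Very Poor" band for kappa < 0.4 is an illustrative assumption.
function interpretKappa(kappa: number): AgreementLabel {
  if (kappa >= 0.9) return "Excellent";
  if (kappa >= 0.8) return "Good";
  if (kappa >= 0.6) return "Fair";
  if (kappa >= 0.4) return "Poor";
  return "Very Poor";
}
```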

Source: [web/src/features/score-analytics/lib/statistics-utils.ts](https://github.com/langfuse/langfuse/blob/main/web/src/features/score-analytics/lib/statistics-utils.ts)

### Entitlements System

Access control is managed through a hierarchical entitlements system:

```mermaid
graph TD
    Plan[Plan] -->|contains| Entitlements[Entitlements]
    Plan -->|contains| Limits[Entitlement Limits]
    
    Entitlements -->|grants| Features[Feature Access]
    Limits -->|restricts| Resources[Resource Quotas]
    
    PlanTypes[Plan Types] --> OSS[OSS]
    PlanTypes --> CloudPro[Cloud Pro]
    PlanTypes --> SelfHostedEnterprise[Self-Hosted Enterprise]
```

Available entitlements include:

- **Feature Flags**: Enable/disable features via `useIsFeatureEnabled` hook
- **Entitlement Limits**: Quotas on resources (e.g., annotation queue count)
- **Plan-based Access**: Cloud and self-hosted enterprise plans

Source: [web/src/features/entitlements/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/entitlements/README.md)

### Feature Flags

Feature flags control feature availability dynamically:

```typescript
const isFeatureEnabled = useIsFeatureEnabled("feature-flag-name");
```

A feature flag is enabled when any of the following is true:

1. Flag is in user's `feature_flags` list
2. `LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES` environment variable is set
3. User has admin privileges
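The three conditions combine as a logical OR. A minimal sketch of the check (the real `useIsFeatureEnabled` hook reads session state; the `FlagContext` shape below is an illustrative assumption):

```typescript
interface FlagContext {
  userFeatureFlags: string[];           // user's feature_flags list
  experimentalFeaturesEnabled: boolean; // LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES
  isAdmin: boolean;                     // admin privileges
}

// A flag is on if any one of the three conditions holds.
function isFeatureEnabled(flag: string, ctx: FlagContext): boolean {
  return (
    ctx.userFeatureFlags.includes(flag) ||
    ctx.experimentalFeaturesEnabled ||
    ctx.isAdmin
  );
}
```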

Source: [web/src/features/feature-flags/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/feature-flags/README.md)

### Collaboration Features

Langfuse includes team collaboration capabilities:

- **Mention Parser**: Extract and resolve user mentions in comments
- **User References**: Syntax `@[Display Name](user:userId)` for linking users
- **Sanitization**: Clean user-generated content for safe display

```typescript
// Mention format: @[Alice](user:alice123)
// Parser extracts: alice123
```
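A regex-based extractor for this mention syntax might look like the following sketch (not the actual `mentionParser` implementation):

```typescript
interface Mention {
  displayName: string;
  userId: string;
}

// Matches @[Display Name](user:userId); the display name may contain spaces.
const MENTION = /@\[([^\]]+)\]\(user:([^)]+)\)/g;

function extractMentions(text: string): Mention[] {
  return [...text.matchAll(MENTION)].map((m) => ({
    displayName: m[1],
    userId: m[2],
  }));
}
```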

Source: [web/src/features/comments/lib/mentionParser.clienttest.ts](https://github.com/langfuse/langfuse/blob/main/web/src/features/comments/lib/mentionParser.clienttest.ts)

## Data Flow Architecture

### Ingestion Pipeline

Events flow through the system in a structured pipeline:

```mermaid
graph LR
    S3[S3 Event Storage] --> Worker[Worker Processing]
    Worker -->|Standard| IngestionQueue[IngestionSecondaryQueue]
    Worker -->|OTEL| OtelQueue[OtelIngestionQueue]
    IngestionQueue --> Postgres[(PostgreSQL)]
    OtelQueue --> ClickHouse[(ClickHouse)]
```

Event processing includes:

- **Checkpointing**: Resume from failures using `.checkpoint` files
- **Rate Limiting**: Client-side and server-side throttling
- **Retry Logic**: Exponential backoff with jitter for transient failures
- **Error Logging**: Failed events appended to `errors.csv`
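The retry behaviour described above, exponential backoff with jitter, can be sketched generically. The attempt count, base delay, and jitter range below are illustrative, not the replay script's actual values:

```typescript
// Retry an async operation on failure, with exponential backoff plus jitter.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 250,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // permanent failure
      // Backoff: base * 2^(attempt-1), plus up to 100 ms of random jitter
      // so that concurrent clients do not retry in lockstep.
      const delay = baseDelayMs * 2 ** (attempt - 1) + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```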

Source: [worker/src/scripts/replayIngestionEventsV2/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/replayIngestionEventsV2/README.md)

### API Architecture

Langfuse uses tRPC for type-safe API communication:

- **Server-side validation**: Zod schemas for input validation
- **tRPC routers**: Organized endpoint handlers
- **API versioning**: V1 (legacy) and V2 (current) API support

| API Version | `traceId` on GET | Notes |
|-------------|------------------|-------|
| V1 | Required | Legacy, trace-focused |
| V2 | Optional | Adds `sessionId` support |

Source: [packages/shared/src/features/scores/interfaces/README.md](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/features/scores/interfaces/README.md)

### MCP Server Architecture

The Model Context Protocol (MCP) server provides external access to Langfuse:

- **Stateless per-request**: Fresh server instance for each request
- **Context via closures**: Authentication captured in handler closures
- **No session storage**: Request-disposable architecture

```mermaid
graph TD
    Request[MCP Request] --> Instance[New Server Instance]
    Instance --> Auth[Auth Context Closure]
    Auth --> Handler[Request Handler]
    Handler --> Response[Response]
    Response --> Discard[Instance Discarded]
```

Source: [web/src/features/mcp/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/mcp/README.md)

## Filtering System

Langfuse implements a sophisticated filtering system with type-safe encoding:

### Filter Types

| Type | Description | Example |
|------|-------------|---------|
| `string` | Simple string matching | Trace name |
| `number` | Numeric comparison | Latency values |
| `datetime` | Date/time filtering | Time ranges |
| `boolean` | True/false matching | Flag states |
| `arrayOptions` | Multi-value selection | Tags |
| `categoryOptions` | Categorical filtering | Status values |
| `positionInTrace` | Nested location | Span hierarchy |
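In TypeScript terms, the rows above suggest a type-tagged filter condition along these lines. This is a sketch; the actual field and type names in the codebase may differ:

```typescript
type FilterType =
  | "string"
  | "number"
  | "datetime"
  | "boolean"
  | "arrayOptions"
  | "categoryOptions"
  | "positionInTrace";

interface FilterCondition {
  column: string;   // e.g. "name", "latency", "tags"
  type: FilterType;
  operator: string; // e.g. "=", ">", "contains", "any of"
  value: string | number | boolean | Date | string[];
}

// Example: filter traces by tag membership.
const tagFilter: FilterCondition = {
  column: "tags",
  type: "arrayOptions",
  operator: "any of",
  value: ["production", "beta"],
};
```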

### State Management

Filters support multiple storage locations:

- **URL**: Persisted in query parameters
- **Session Storage**: In-memory per session
- **Peek Context**: Temporary state for preview panels

```typescript
const filterOptions: UseSidebarFilterStateOptions = {
  stateLocation: "urlAndSessionStorage",
  sessionFilterContextId: projectId,
  implicitDefaultConfig: DEFAULT_SIDEBAR_IMPLICIT_ENVIRONMENT_CONFIG,
};
```

Source: [web/src/components/table/peek/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/table/peek/README.md)

## Deployment Models

Langfuse supports multiple deployment configurations:

| Model | Plan Options | Authentication | License |
|-------|--------------|----------------|---------|
| Cloud | `cloud:pro` | JWT via NextAuth | Proprietary |
| Self-Hosted | `self-hosted:enterprise` | JWT + License Key | Proprietary |
| Open Source | `oss` | Basic auth | MIT |

### Self-Hosted Configuration

Self-hosted deployments require:

- PostgreSQL database
- ClickHouse for analytics
- Redis for caching/queues
- S3-compatible storage for events
- License key for enterprise features

Source: [web/src/features/entitlements/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/entitlements/README.md)

## Performance Considerations

### Large Dataset Handling

Langfuse handles large datasets through several mechanisms:

| Scale | Mechanism | Benefit |
|-------|-----------|---------|
| 10k+ rows | TanStack Virtual | Row-based rendering |
| 1M+ nodes | Iterative algorithms | No stack overflow |
| 10k+ search matches | Binary search | Efficient navigation |
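The binary-search navigation can be modelled simply: given a sorted array of row indices that contain a match, find the next match at or after the current row. This is a simplified sketch of the idea, not the viewer's actual code:

```typescript
// matchRows: sorted row indices that contain a search match.
// Returns the index (into matchRows) of the first match >= fromRow,
// wrapping around to 0 when no match follows.
function nextMatch(matchRows: number[], fromRow: number): number {
  let lo = 0;
  let hi = matchRows.length; // search window is [lo, hi)
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (matchRows[mid] < fromRow) lo = mid + 1;
    else hi = mid;
  }
  return lo < matchRows.length ? lo : 0;
}
```

Each jump is O(log n), which is why navigating 10k+ matches stays responsive.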

### Known Limitations

- No horizontal virtualization for wide rows
- Client-side search only (can be slow with many matches)
- Memory constraints at 1M+ nodes
- Read-only JSON viewer (no inline editing)
- Wrap mode may cause layout thrashing

Source: [web/src/components/ui/AdvancedJsonViewer/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/ui/AdvancedJsonViewer/README.md)

## Development Guidelines

### Testing

The project uses Jest with custom extensions:

| File Pattern | Purpose | Location |
|--------------|---------|----------|
| `.clienttest.ts` | Client-side tests | Colocated with components |
| `*.test.ts` | Standard unit tests | `__tests__` or inline |

```bash
# Run client tests
pnpm --filter=web run test-client --testPathPattern="ComponentName"
```

### Debugging

Enable debug logging via localStorage:

```typescript
localStorage.setItem("debug:ComponentName", "true");
```
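A component-scoped logger gated on that flag might look like the following (illustrative; the repository's actual debug utility is not shown here):

```typescript
// Log only when `debug:<component>` is set to "true" in localStorage.
// The typeof guard keeps the helper safe during server-side rendering.
function debugLog(component: string, ...args: unknown[]): void {
  if (typeof localStorage === "undefined") return; // SSR / Node guard
  if (localStorage.getItem(`debug:${component}`) === "true") {
    console.log(`[${component}]`, ...args);
  }
}
```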

## Summary

Langfuse is a comprehensive observability platform that bridges the gap between AI application development and operational monitoring. Its modular architecture, built on proven technologies like PostgreSQL, ClickHouse, and React, provides a scalable foundation for teams to understand, evaluate, and optimize their LLM applications.

The platform's emphasis on type safety, performance optimization, and developer experience makes it suitable for both small development teams and large-scale enterprise deployments.

---

<a id='project-structure'></a>

## Project Structure

### Related Pages

Related topics: [Monorepo Configuration](#monorepo-structure), [System Architecture](#system-architecture)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [pnpm-workspace.yaml](https://github.com/langfuse/langfuse/blob/main/pnpm-workspace.yaml)
- [turbo.json](https://github.com/langfuse/langfuse/blob/main/turbo.json)
- [packages/shared/package.json](https://github.com/langfuse/langfuse/blob/main/packages/shared/package.json)
- [web/package.json](https://github.com/langfuse/langfuse/blob/main/web/package.json)
- [worker/package.json](https://github.com/langfuse/langfuse/blob/main/worker/package.json)
</details>

# Project Structure

## Overview

Langfuse is a comprehensive observability and analytics platform for LLM applications, structured as a **monorepo** using pnpm workspaces and Turborepo for build orchestration. The repository contains multiple packages including the frontend web application, backend worker services, and shared libraries.

## Monorepo Architecture

### Workspace Configuration

Langfuse uses pnpm workspaces defined in `pnpm-workspace.yaml` to manage multiple packages within a single repository.

```yaml
packages:
  - "packages/*"
  - "web"
  - "worker"
```

Source: [pnpm-workspace.yaml](https://github.com/langfuse/langfuse/blob/main/pnpm-workspace.yaml)

### Build System (Turborepo)

The project uses Turborepo for efficient incremental builds and task caching. The turbo configuration defines the build pipeline and dependencies between packages.

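An illustrative (not verbatim) `turbo.json` shape, where `"^build"` means "build this package's workspace dependencies first":

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"]
    },
    "lint": {},
    "dev": { "cache": false, "persistent": true }
  }
}
```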
Source: [turbo.json](https://github.com/langfuse/langfuse/blob/main/turbo.json)

## Package Structure

### 1. Web Application (`/web`)

The frontend application built with Next.js and React, containing the user interface and client-side logic.

| Directory | Purpose |
|-----------|---------|
| `src/components/` | Reusable UI components including layouts, tables, and specialized viewers |
| `src/features/` | Feature-specific modules with their own logic and components |
| `src/lib/` | Utility functions and helpers |
| `src/hooks/` | Custom React hooks |

Source: [web/package.json](https://github.com/langfuse/langfuse/blob/main/web/package.json)

### 2. Worker Service (`/worker`)

Backend service handling background processing, event ingestion, and queue management.

| Directory | Purpose |
|-----------|---------|
| `src/scripts/` | Utility scripts for data operations and migrations |
| `src/queues/` | Queue handlers for async processing |

Source: [worker/package.json](https://github.com/langfuse/langfuse/blob/main/worker/package.json)

### 3. Shared Packages (`/packages/shared`)

Common utilities, types, and validation schemas shared across web and worker packages.

| Module | Purpose |
|--------|---------|
| Validation | Zod schemas for type-safe data validation |
| Types | Shared TypeScript type definitions |
| Utilities | Common helper functions |

Source: [packages/shared/package.json](https://github.com/langfuse/langfuse/blob/main/packages/shared/package.json)

## Web Application Structure

### Component Architecture

```mermaid
graph TD
    A[Web Application] --> B[Layout Components]
    A --> C[UI Components]
    A --> D[Feature Modules]
    
    B --> B1[Page Wrapper]
    B --> B2[ContainerPage]
    B --> B3[Breadcrumb]
    
    C --> C1[AdvancedJsonViewer]
    C --> C2[Table Components]
    C --> C3[Peek Components]
    
    D --> D1[MCP]
    D --> D2[Score Analytics]
    D --> D3[Comments]
    D --> D4[Filters]
    D --> D5[Entitlements]
```

### Layout System

The `Page` component is the standard wrapper for all pages, providing:

- **Sticky Header**: Consistent header across pages
- **Scroll Management**: Supports `"content-scroll"` and `"page-scroll"` modes
- **Breadcrumb Navigation**: Easy navigation path display
- **Custom Header Actions**: Flexible button/link placement

For content that doesn't scale well with page width (e.g., settings pages), use `ContainerPage` instead.

Source: [web/src/components/layouts/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/layouts/README.md)

### Key Feature Modules

#### AdvancedJsonViewer

A virtualized JSON tree viewer with the following characteristics:

- **Performance**: Uses TanStack Virtual for rendering large JSON structures
- **Search**: Client-side search with regex support
- **Theme Support**: Multiple color themes (GitHub, Monokai, Solarized)
- **Tree Navigation**: Binary search for efficient node access

Source: [web/src/components/ui/AdvancedJsonViewer/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/ui/AdvancedJsonViewer/README.md)

#### Score Analytics

Provides analytics dashboard capabilities for score data:

- **Score Comparison**: Compare two scores over time
- **Distribution Analysis**: Histogram and heatmap visualizations
- **Time Series**: Temporal trend analysis with configurable intervals
- **Data Transformation**: Pure functions for data processing

Source: [web/src/features/score-analytics/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/score-analytics/README.md)

#### MCP (Model Context Protocol)

Enables integration with external systems through MCP:

- **Stateless Architecture**: Fresh server instance per request
- **Prompt Management**: Support for `getPrompt` and `getPromptUnresolved`
- **Resource Handling**: MCP resources and tool support

Source: [web/src/features/mcp/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/mcp/README.md)

#### Entitlements

Feature availability control system:

- **Plan-based Access**: `oss`, `cloud:pro`, `self-hosted:enterprise`
- **Entitlement Limits**: Resource quotas per plan
- **Server/Client Support**: Available in both frontend hooks and backend

Source: [web/src/features/entitlements/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/entitlements/README.md)

### Table and Peek System

#### PeekTableStateProvider

Manages table state for peek views (slide-over panels showing item details):

```mermaid
graph LR
    A[PeekTableStateProvider] --> B[Filters]
    A --> C[Sorting]
    A --> D[Pagination]
    A --> E[Search]
```

**State Persistence**: Filter, sort, and pagination state persists across K/J navigation between items of the same type.

**State Reset**: State clears when the peek view closes (via X button, Escape, or click outside).

Source: [web/src/components/table/peek/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/table/peek/README.md)

## Worker Service Structure

### Scripts

The worker contains utility scripts for data operations:

| Script | Purpose |
|--------|---------|
| `replayIngestionEventsV2` | Replay events from CSV to ingestion queues |
| `refillQueueEvent` | Backfill queue with events from local files |

#### Replay Ingestion Events V2

Replays S3-stored events back to Langfuse:

- **Batch Processing**: Processes events in configurable batches
- **Checkpoint Support**: Resume capability via checkpoint files
- **Rate Limiting**: Respects server-side rate limits with exponential backoff
- **Error Handling**: Retries transient failures, logs permanent failures

Source: [worker/src/scripts/replayIngestionEventsV2/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/replayIngestionEventsV2/README.md)

#### Refill Queue Event

Backfills queues with events from local JSONL files:

1. Create `./worker/events.jsonl` with one JSON event per line
2. Configure Redis connection and supporting services
3. Run via `pnpm run --filter=worker refill-queue-event`

Source: [worker/src/scripts/refillQueueEvent/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/refillQueueEvent/README.md)

## Public API Architecture

### Adding New API Routes

The project follows a structured pattern for public API development:

1. **Implementation**: Wrap routes with `withMiddleware` and `createAuthedAPIRoute`
2. **Type Definition**: Add Zod types to `/features/public-api/types` using `coerce` for primitives
3. **Validation**: Use `validateZodSchema` for response validation
4. **Documentation**: Add to Fern with `docs` attributes
5. **SDK Updates**: Copy types to Python and JS SDKs

```typescript
// Response type example
const responseSchema = z.object({
  data: z.string(),
  timestamp: z.coerce.date(),
}).strict();
```

Source: [web/src/features/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/README.md)

## Testing Infrastructure

### Client-Side Testing

Tests use `.clienttest.ts` extension and are colocated with components:

```bash
pnpm --filter=web run test-client --testPathPattern="ComponentName"
```

Example: `AdvancedJsonViewer` tests cover tree building, navigation, expansion, and search operations.

Source: [web/src/components/ui/AdvancedJsonViewer/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/ui/AdvancedJsonViewer/README.md)

### Trace Tree Building Tests

Performance tests for large observation sets:

| Scale | Threshold | Structure Types |
|-------|-----------|-----------------|
| 10k | 500ms | flat, deep, balanced, realistic |
| 25k | 2s | flat, realistic |
| 50k | 5s | flat, realistic |
| 500k | 60s | realistic |
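The structures being benchmarked are flat lists of observations with parent pointers; the code under test links them into a tree without recursion. A minimal sketch of that approach, assuming a simple `{ id, parentId }` shape (the real observation type carries far more fields):

```typescript
interface Observation {
  id: string;
  parentId: string | null;
}

interface ObsNode extends Observation {
  children: ObsNode[];
}

// One pass to create nodes, a second pass to link children — no recursion,
// so even deep or very large structures cannot overflow the call stack.
function buildTree(observations: Observation[]): ObsNode[] {
  const nodes = new Map<string, ObsNode>();
  for (const o of observations) nodes.set(o.id, { ...o, children: [] });
  const roots: ObsNode[] = [];
  for (const node of nodes.values()) {
    const parent = node.parentId ? nodes.get(node.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // missing parent => treat as a root
  }
  return roots;
}
```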

Source: [web/src/components/trace/lib/tree-building.clienttest.ts](https://github.com/langfuse/langfuse/blob/main/web/src/components/trace/lib/tree-building.clienttest.ts)

## Development Workflow

### Component Development Guidelines

1. **Use Page Wrapper**: Always wrap pages with `<Page>` component
2. **Use ContainerPage**: For settings/setup pages with non-scalable content
3. **Follow Naming**: Use `.clienttest.ts` for client-side tests
4. **State Management**: Use `useSidebarFilterState` for filters, `useOrderByState` for sorting

### Prompt Composition

For MCP prompt features:

- **Resolved Prompt**: Use `getPrompt` for executable prompts with dependencies resolved
- **Unresolved Prompt**: Use `getPromptUnresolved` for debugging and analysis

Source: [web/src/features/mcp/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/mcp/README.md)

## Summary

The Langfuse project is organized as a well-structured monorepo with clear separation between the frontend web application, backend worker services, and shared packages. The architecture emphasizes:

- **Modularity**: Feature-based organization with isolated modules
- **Performance**: Virtualization and incremental builds
- **Type Safety**: Zod schemas and TypeScript throughout
- **Observability**: Built-in tracing and analytics capabilities

---

<a id='quickstart-guide'></a>

## Quickstart Guide

### Related Pages

Related topics: [Project Introduction](#project-introduction)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/langfuse/langfuse/blob/main/README.md)
- [docker-compose.yml](https://github.com/langfuse/langfuse/blob/main/docker-compose.yml)
- [CONTRIBUTING.md](https://github.com/langfuse/langfuse/blob/main/CONTRIBUTING.md)
- [web/src/components/layouts/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/layouts/README.md)
- [worker/src/scripts/refillQueueEvent/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/refillQueueEvent/README.md)
- [web/src/features/slack/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/slack/README.md)
</details>

# Quickstart Guide

Langfuse is an open-source LLM engineering platform that provides observability, analytics, and prompt management for LLM applications. This guide walks you through setting up a local development environment, understanding the core architecture, and deploying Langfuse for production use.

## Overview

Langfuse supports multiple deployment scenarios:

| Deployment Type | Use Case | Key Components |
|----------------|----------|----------------|
| Local Development | Testing and development | Docker Compose with all services |
| Self-Hosted | Production deployment | Docker/Kubernetes with externalized services |
| Cloud | Managed SaaS offering | Langfuse-hosted infrastructure |

Source: [README.md](https://github.com/langfuse/langfuse/blob/main/README.md)

## Architecture Overview

Langfuse consists of several core components:

```mermaid
graph TD
    A[Web UI] --> B[Server API]
    B --> C[(PostgreSQL)]
    B --> D[(ClickHouse)]
    B --> E[(Redis)]
    F[Workers] --> C
    F --> D
    F --> E
    G[S3 Storage] --> F
```

### Core Components

| Component | Purpose | Technology |
|-----------|---------|------------|
| `web/` | Frontend UI plus API routes (tRPC, REST) | Next.js, React, Prisma |
| `worker/` | Background job processing | BullMQ, Redis |
| `packages/shared/` | Shared types, schemas, and utilities | TypeScript, Zod |
| `clickhouse/` | Analytics storage and migrations | ClickHouse |

Source: [docker-compose.yml](https://github.com/langfuse/langfuse/blob/main/docker-compose.yml)

## Local Development Setup

### Prerequisites

- Node.js 20+ with pnpm as the package manager
- Docker and Docker Compose
- Git

### 1. Clone the Repository

```bash
git clone https://github.com/langfuse/langfuse.git
cd langfuse
```

### 2. Environment Configuration

Create a `.env` file in the root directory with the following required variables:

```bash
# Database
DATABASE_URL=postgresql://langfuse:langfuse@localhost:5432/langfuse

# ClickHouse
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USER=langfuse
CLICKHOUSE_PASSWORD=langfuse

# Redis
REDIS_URL=redis://localhost:6379

# Auth (NextAuth.js)
NEXTAUTH_SECRET=your-secret-key
NEXTAUTH_URL=http://localhost:3000

# S3 Storage (MinIO for local dev)
S3_ACCESS_KEY=langfuse
S3_SECRET_KEY=langfuse
S3_REGION=us-east-1
S3_ENDPOINT_URL=http://localhost:9000
S3_EVENT_UPLOAD_BUCKET=langfuse
```

Source: [docker-compose.yml](https://github.com/langfuse/langfuse/blob/main/docker-compose.yml)

### 3. Start Infrastructure Services

Launch all supporting services using Docker Compose:

```bash
docker compose up -d
```

This starts the following services:

| Service | Port | Purpose |
|---------|------|---------|
| postgres | 5432 | Primary database |
| clickhouse | 8123 | Analytics storage |
| redis | 6379 | Job queue broker |
| minio | 9000/9001 | S3-compatible storage |

Source: [docker-compose.yml](https://github.com/langfuse/langfuse/blob/main/docker-compose.yml)

### 4. Install Dependencies

```bash
pnpm install
```

### 5. Run Database Migrations

```bash
pnpm db:migrate
```

### 6. Start Development Servers

Langfuse uses a monorepo structure with multiple development servers:

```bash
# Start all services in development mode
pnpm run dev

# Or start individual services
pnpm --filter=web run dev       # Web UI and API routes
pnpm --filter=worker run dev    # Background workers
```

资料来源：[CONTRIBUTING.md:50-100]()

## Project Structure

```
langfuse/
├── web/                      # Next.js frontend
│   ├── src/
│   │   ├── components/       # Reusable UI components
│   │   ├── features/         # Feature modules
│   │   ├── pages/            # Next.js pages
│   │   └── lib/              # Utilities
│   └── public/               # Static assets
├── server/                   # API server
│   ├── src/
│   │   ├── api/              # tRPC routers
│   │   ├── services/         # Business logic
│   │   └── lib/              # Utilities
├── worker/                   # Background workers
│   └── src/
│       ├── workers/          # Queue processors
│       └── scripts/          # Utility scripts
├── clickhouse/               # ClickHouse migrations
└── docker-compose.yml        # Local infrastructure
```

### Page Component Pattern

The `Page` component is the standard wrapper for all pages in the application:

```tsx
import Page from "@/src/components/layouts/Page";

export default function MyPage() {
  return (
    <Page
      title="My Page"
      scrollable
      headerProps={{
        breadcrumb: [{ name: "Home", href: "/" }, { name: "My Page" }],
      }}
    >
      <div>Content here...</div>
    </Page>
  );
}
```

**Important**: Every page must be wrapped inside `<Page>`—do not use `<main>` directly.

资料来源：[web/src/components/layouts/README.md:1-40]()

## Development Workflow

### Running Tests

Langfuse uses different test patterns for client and server code:

```bash
# Run all tests
pnpm run test

# Client-side tests (Vitest with .clienttest.ts extension)
pnpm --filter=web run test-client --testPathPattern="ComponentName"

# Server-side tests (Jest)
pnpm --filter=server run test
```

#### Client Test Pattern

The tree-traversal code exercised by client tests uses an explicit stack instead of recursion, so deeply nested JSON cannot overflow the call stack:

```typescript
// ✅ Safe for deep trees - iterative approach
function traverse(rootNode: TreeNode) {
  const stack = [rootNode];
  while (stack.length > 0) {
    const node = stack.pop()!;
    process(node);
    node.children.forEach((child) => stack.push(child));
  }
}
```

资料来源：[web/src/components/ui/AdvancedJsonViewer/README.md:80-100]()

### Debug Mode

Enable detailed logging for specific components:

```javascript
localStorage.setItem("debug:AdvancedJsonViewer", "true");
```

### Code Quality

```bash
# Lint code
pnpm run lint

# Format code
pnpm run format

# Type check
pnpm run typecheck
```

资料来源：[CONTRIBUTING.md:100-150]()

## Feature Modules

Langfuse organizes functionality into feature modules under `web/src/features/`:

| Module | Purpose |
|--------|---------|
| `comments/` | User comments and mentions |
| `entitlements/` | Feature access control |
| `feature-flags/` | Feature toggle system |
| `filters/` | Query filtering and search |
| `mcp/` | Model Context Protocol integration |
| `score-analytics/` | Score analytics and visualization |
| `slack/` | Slack integration |
| `migrations/` | Database migrations |

### Feature Flags

Enable experimental features using the `useIsFeatureEnabled` hook:

```tsx
const isEnabled = useIsFeatureEnabled("feature-flag-name");
```

A feature is enabled when any of the following is true:
1. The flag is in `user.feature_flags`
2. `LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES` is set
3. The user has admin privileges
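The three conditions above combine as a logical OR. A minimal sketch of that check (the real logic lives in `useIsFeatureEnabled`; the function and field names here are illustrative assumptions, not the actual hook internals):

```typescript
// Hedged sketch: a feature is on if any of the three conditions holds.
function isFeatureEnabled(
  flag: string,
  user: { featureFlags: string[]; admin: boolean },
  experimentalFeaturesEnabled: boolean, // mirrors LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES
): boolean {
  return (
    user.featureFlags.includes(flag) || // 1. flag present on the user
    experimentalFeaturesEnabled || // 2. experimental features globally enabled
    user.admin // 3. admin privileges
  );
}
```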

资料来源：[web/src/features/feature-flags/README.md:1-15]()

### Entitlements System

Feature availability is controlled through entitlements:

- **Plans**: Tiers of features (`oss`, `cloud:pro`, `self-hosted:enterprise`)
- **Entitlements**: Available features per plan (e.g., `playground`)
- **EntitlementLimits**: Resource limits (e.g., `annotation-queue-count`)
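The plan-to-entitlement mapping can be pictured as a simple lookup. This is an illustrative sketch only: the plan and entitlement names come from the section above, but the map contents and helper name are assumptions, not the actual entitlements table:

```typescript
// Hypothetical plan → entitlements lookup (contents are placeholders).
type Plan = "oss" | "cloud:pro" | "self-hosted:enterprise";

const entitlementsByPlan: Record<Plan, string[]> = {
  oss: [],
  "cloud:pro": ["playground"],
  "self-hosted:enterprise": ["playground", "annotation-queues"],
};

function hasEntitlement(plan: Plan, entitlement: string): boolean {
  return entitlementsByPlan[plan].includes(entitlement);
}
```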

资料来源：[web/src/features/entitlements/README.md:1-25]()

## Worker Scripts

The worker module includes utility scripts for data operations:

### Refill Queue Event

Backfill any queue with events from local machines:

```bash
# 1. Create events file (./worker/events.jsonl)
{"projectId": "project-123", "orgId": "org-456"}

# 2. Configure environment
REDIS_CONNECTION_STRING=redis://:password@127.0.0.1:6379
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USER=clickhouse
CLICKHOUSE_PASSWORD=clickhouse

# 3. Run the script
pnpm run --filter=worker refill-queue-event
```

资料来源：[worker/src/scripts/refillQueueEvent/README.md:1-40]()

### Replay Ingestion Events V2

Re-process S3-stored ingestion events:

```bash
npx tsx worker/src/scripts/replayIngestionEventsV2/index.ts \
  --input=/path/to/events.csv \
  --batch-size=500 \
  --concurrency=4
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `--input` | - | Path to CSV file (required) |
| `--batch-size` | 500 | Keys per API request |
| `--concurrency` | 4 | Parallel API requests |
| `--rate-limit` | 50 | Requests per second |
| `--dry-run` | false | Validate without sending |
| `--resume` | false | Continue from checkpoint |

资料来源：[worker/src/scripts/replayIngestionEventsV2/README.md:1-50]()

## Slack Integration Setup

For local Slack OAuth development, HTTPS is required:

### 1. Generate SSL Certificates

```bash
# Install mkcert
brew install mkcert
mkcert -install
mkcert localhost 127.0.0.1

# Move certificates to web directory
mv localhost+1*.pem web/
```

### 2. Configure Environment

```bash
SLACK_CLIENT_ID=your_client_id
SLACK_CLIENT_SECRET=your_client_secret
SLACK_STATE_SECRET=your_state_secret
```

### 3. Start HTTPS Server

```bash
pnpm run dev:https
```

资料来源：[web/src/features/slack/README.md:1-50]()

## Production Deployment

### Docker Compose Production Mode

For production, use externalized services:

```yaml
services:
  web:
    image: langfuse/langfuse-web:latest
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - CLICKHOUSE_URL=${CLICKHOUSE_URL}
      - REDIS_URL=${REDIS_URL}
      - NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
      - S3_ACCESS_KEY=${S3_ACCESS_KEY}
      - S3_SECRET_KEY=${S3_SECRET_KEY}
    ports:
      - "3000:3000"

  server:
    image: langfuse/langfuse-server:latest
    # Configuration similar to web

  worker:
    image: langfuse/langfuse-worker:latest
    depends_on:
      - redis
      - postgres
      - clickhouse
```

### S3 Event Storage

If S3 server access logging is enabled on the event bucket, the logs can be queried through an external table (Athena/Hive DDL):

```sql
-- Example: Create external table for S3 access logs
CREATE EXTERNAL TABLE s3_access_logs (
  bucketowner STRING,
  bucket_name STRING,
  requestdatetime STRING,
  remoteip STRING,
  requester STRING,
  requestid STRING,
  operation STRING,
  key STRING,
  uri STRING,
  statuscode INT,
  errorcode STRING,
  bytessent BIGINT,
  objectsize BIGINT,
  totaltime STRING,
  turnaroundtime STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex'='([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ...'
)
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
LOCATION 's3://your-bucket/logs/'
```

资料来源：[docker-compose.yml:180-220]()

## Common Issues and Solutions

| Issue | Solution |
|-------|----------|
| Stack overflow in tree operations | Use iterative algorithms with explicit stacks |
| Large dataset performance | Enable virtualization (TanStack Virtual) |
| Horizontal scroll performance | Avoid wrap mode for wide datasets |
| Multiple tables in peek view | Share pagination state intentionally |

资料来源：[web/src/components/ui/AdvancedJsonViewer/README.md:40-60]()

## Next Steps

After setting up your development environment:

1. **Explore the UI** - Navigate through traces, observations, and evaluations
2. **Integrate SDK** - Connect your LLM application using [Langfuse Python/JS SDK](https://langfuse.com/docs)
3. **Configure Features** - Set up feature flags and entitlements for your organization
4. **Deploy** - Move to production using Docker Compose or Kubernetes

资料来源：[CONTRIBUTING.md:1-50]()

---

<a id='system-architecture'></a>

## System Architecture

### 相关页面

相关主题：[Project Structure](#project-structure), [Database Schema (Prisma)](#database-schema), [Queue System (Redis/BullMQ)](#queue-system)

<details>
<summary>相关源码文件</summary>

以下源码文件用于生成本页说明：

- [packages/shared/src/server/clickhouse/client.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/clickhouse/client.ts)
- [packages/shared/src/server/redis/redis.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/redis/redis.ts)
- [packages/shared/src/db.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/db.ts)
- [packages/shared/src/server/otel/index.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/otel/index.ts)
- [web/src/components/layouts/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/layouts/README.md)
- [web/src/components/design-system/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/design-system/README.md)
- [web/src/components/table/peek/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/table/peek/README.md)
- [web/src/features/entitlements/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/entitlements/README.md)
- [packages/shared/src/features/scores/interfaces/README.md](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/features/scores/interfaces/README.md)
- [worker/src/scripts/replayIngestionEventsV2/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/replayIngestionEventsV2/README.md)
</details>

# System Architecture

Langfuse is a comprehensive observability and analytics platform designed for Large Language Model (LLM) applications. The system architecture is built on a modern, modular design that separates concerns across frontend, backend worker services, and shared infrastructure layers.

## Overview

Langfuse follows a distributed architecture pattern with the following primary components:

| Layer | Technology Stack | Purpose |
|-------|-----------------|---------|
| Frontend | Next.js, React, TypeScript | User interface and visualization |
| Backend Worker | Node.js, BullMQ, TypeScript | Event processing and queue management |
| Shared Packages | TypeScript | Common utilities, types, and infrastructure clients |
| Database | PostgreSQL | Primary data storage |
| Cache/Queue | Redis | Queue management and caching |
| Analytics | ClickHouse | High-performance analytics queries |
| Observability | OpenTelemetry | Distributed tracing |

## High-Level Architecture

```mermaid
graph TD
    subgraph Client["Frontend (Next.js)"]
        UI[User Interface]
        Pages[Page Components]
        DesignSystem[Design System]
    end
    
    subgraph Shared["Shared Packages"]
        DB[(PostgreSQL)]
        Redis[(Redis)]
        ClickHouse[(ClickHouse)]
        Otel[OpenTelemetry]
    end
    
    subgraph Backend["Worker Service"]
        Queues[Queue Workers]
        Scripts[Utility Scripts]
    end
    
    Client <-->|tRPC API| Shared
    Backend <-->|Event Processing| Shared
    Client -->|Ingestion| Backend
```

## Infrastructure Layer

### Database Connection

The PostgreSQL database is the central data store for Langfuse, managed through a shared database client module. The connection is centralized in the `packages/shared/src/db.ts` module, which provides a unified interface for all database operations across the application.

资料来源：[packages/shared/src/db.ts:1-50]()
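A common pattern for such a centralized client module is a global singleton, which avoids exhausting database connections during Next.js hot reloads. The sketch below illustrates the pattern only, with the Prisma client stubbed out; it is not the actual contents of `db.ts`:

```typescript
// Hedged sketch of a shared-client singleton (Prisma client stubbed).
type DbClient = { query: (sql: string) => string };

const globalForDb = globalThis as unknown as { dbClient?: DbClient };

function createClient(): DbClient {
  // In Langfuse this would construct the real ORM client; stubbed here.
  return { query: (sql) => `executed: ${sql}` };
}

// Reuse the cached instance if one exists (e.g. across dev hot reloads).
const db: DbClient = globalForDb.dbClient ?? createClient();
if (process.env.NODE_ENV !== "production") globalForDb.dbClient = db;
```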

### Redis Client

Redis serves dual purposes in the Langfuse architecture:

1. **Queue Management**: BullMQ queues for asynchronous event processing
2. **Caching**: Session and temporary data caching

The Redis client is configured in `packages/shared/src/server/redis/redis.ts` and is shared across worker services.

资料来源：[packages/shared/src/server/redis/redis.ts:1-30]()

### ClickHouse Integration

ClickHouse provides the analytical query engine for high-performance aggregations. The client is initialized in `packages/shared/src/server/clickhouse/client.ts` and is primarily used for:

- Score analytics aggregations
- Time-series data analysis
- Large dataset transformations

资料来源：[packages/shared/src/server/clickhouse/client.ts:1-40]()

### OpenTelemetry

The OpenTelemetry integration (`packages/shared/src/server/otel/index.ts`) provides distributed tracing across all services. This enables:

- Request tracing across frontend and backend
- Event processing workflow visibility
- Performance monitoring

资料来源：[packages/shared/src/server/otel/index.ts:1-60]()

## Frontend Architecture

### Page Structure

All frontend pages use a standardized layout system defined in `web/src/components/layouts/`. The `Page` component is the required wrapper for all application pages, ensuring consistent layout behavior.

Key layout patterns:

| Pattern | Component | Use Case |
|---------|-----------|----------|
| Standard Pages | `Page` | Most application pages |
| Wide Content | `ContainerPage` | Settings, setup pages with wide content |

The page wrapper provides:
- Sticky header management
- Scroll behavior control (`content-scroll` or `page-scroll`)
- Breadcrumb navigation
- Custom header actions

资料来源：[web/src/components/layouts/README.md:1-60]()

### Design System

The design system (`web/src/components/design-system/`) provides primitive, reusable UI components following strict architectural principles:

**Principles:**
- Presentational only (no business logic)
- Explicit, strictly typed APIs
- Props over context (no React Context)

**Component Structure:**
```
design-system/
  Button/
    Button.tsx
    Button.stories.tsx
```

**Styling Rules:**
- No arbitrary CSS values
- Explicit enums for variants (`size: "sm" | "md" | "lg"`)
- CVA (Class Variance Authority) for variant management
- Boolean props use positive naming (`isLoading`, `shouldTruncate`)

资料来源：[web/src/components/design-system/README.md:1-80]()
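The variant rules above can be sketched without the CVA library itself: variants are an explicit union type, and a pure function maps them to class strings. The class values and function name below are illustrative assumptions, not the actual `Button` implementation:

```typescript
// Minimal CVA-style variant sketch: explicit enum in, class string out.
type Size = "sm" | "md" | "lg";

const sizeClasses: Record<Size, string> = {
  sm: "h-8 px-2 text-sm",
  md: "h-10 px-4 text-base",
  lg: "h-12 px-6 text-lg",
};

// Boolean props use positive naming, per the styling rules above.
function buttonClasses(size: Size, isLoading = false): string {
  return [sizeClasses[size], isLoading ? "opacity-50" : ""]
    .filter(Boolean)
    .join(" ");
}
```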

### State Management Patterns

Langfuse uses a sophisticated state management approach with the Peek Table State system:

```mermaid
graph TD
    A[Page Load] --> B{Peek Context?}
    B -->|Yes| C[PeekTableStateProvider]
    B -->|No| D[URL/Session State]
    C --> E[Table State Preserved]
    D --> F[Standard State]
    
    E --> G[K/J Navigation]
    G --> H[State Retained ✓]
```

The `PeekTableStateProvider` maintains table state (filters, sorting, pagination) across K/J keyboard navigation between items of the same type. State resets only when the peek view closes.

资料来源：[web/src/components/table/peek/README.md:1-100]()

## Feature Modules

### Entitlements System

The entitlements feature controls feature availability at the organization level:

| Concept | Definition |
|---------|------------|
| Plan | Feature tier (OSS, cloud:pro, self-hosted:enterprise) |
| Entitlement | Available feature (e.g., playground, score analytics) |
| EntitlementLimit | Resource limits (e.g., annotation-queue-count) |

**Plan Resolution:**
- Cloud: Added to organization via JWT from NextAuth
- Self-hosted: From license key or environment configuration

资料来源：[web/src/features/entitlements/README.md:1-50]()

### Score Analytics

The score analytics feature provides comprehensive statistical analysis of evaluation scores:

**Architecture Components:**

| Component | Location | Responsibility |
|-----------|----------|----------------|
| Provider | `ScoreAnalyticsProvider.tsx` | Context management |
| Hook | `useScoreAnalyticsQuery` | Data fetching and transformation |
| Transformers | `scoreAnalyticsTransformers.ts` | Data transformation pipeline |
| Router | `scoreAnalyticsRouter.ts` | tRPC API endpoint |

**Data Flow:**
```mermaid
graph LR
    A[API Request] --> B[tRPC Router]
    B --> C[ClickHouse Query]
    C --> D[Transformers]
    D --> E[ScoreAnalyticsProvider]
    E --> F[Chart Components]
```

资料来源：[packages/shared/src/features/scores/interfaces/README.md:1-80]()
资料来源：[web/src/features/score-analytics/README.md:1-100]()

### Score Interfaces Architecture

Langfuse maintains a versioned API structure for scores:

```
interfaces/
├── api/
│   ├── v1/    # Legacy API (trace-focused)
│   ├── v2/    # Current API (supports traces, sessions)
│   └── shared.ts
├── application/
├── ingestion/
└── ui/
```

**API Versioning Strategy:**
- POST/DELETE APIs: Support all score types across v1 and v2
- GET APIs:
  - V1: Requires `traceId`, trace-level only
  - V2: `traceId` optional, adds `sessionId` support
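The v1-versus-v2 GET contract can be expressed as a small validation function. This is a sketch of the rule described above, not the actual zod schemas; field names mirror the doc:

```typescript
// Hedged sketch of the GET-scores version rules: v1 requires traceId,
// v2 makes it optional and additionally accepts sessionId.
type GetScoresQuery = { traceId?: string; sessionId?: string };

function isValidGetScoresQuery(version: "v1" | "v2", q: GetScoresQuery): boolean {
  if (version === "v1") return q.traceId !== undefined;
  return true;
}
```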

资料来源：[packages/shared/src/features/scores/interfaces/README.md:1-50]()

## Backend Worker Architecture

### Event Processing

The worker service processes ingestion events asynchronously using BullMQ queues:

**Queue Types:**
| Queue | Purpose | Consumer |
|-------|---------|----------|
| `IngestionSecondaryQueue` | Standard event processing | Worker |
| `OtelIngestionQueue` | OpenTelemetry events | Worker |

**Event Flow:**
```mermaid
graph LR
    A[Ingestion API] --> B{Event Type?}
    B -->|Standard| C[S3 Key Parse]
    B -->|OTEL| D[OTEL Key Parse]
    C --> E[Queue Payload]
    D --> E
    E --> F[BullMQ]
    F --> G[Worker Processing]
    G --> H[(ClickHouse)]
    G --> I[(PostgreSQL)]
```

资料来源：[worker/src/scripts/replayIngestionEventsV2/README.md:1-60]()

### Replay Ingestion Events V2

The `replayIngestionEventsV2` script enables replaying historical events from S3 storage:

**Key Features:**
- Batch processing with configurable size
- Checkpoint/resume capability
- Rate limiting with exponential backoff
- Error handling with detailed logging

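Rate limiting with exponential backoff, mentioned in the feature list above, typically follows the pattern below. The delays and retry count here are illustrative, not the script's actual values:

```typescript
// Hedged sketch: retry an async operation, doubling the delay each attempt.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```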
**Differences from V1:**

| Aspect | V1 | V2 |
|--------|----|----|
| Infrastructure | Redis, ClickHouse, PostgreSQL, S3 | Langfuse host URL only |
| Setup | Full repo clone + `.env` | `npx tsx` + env vars |
| Event Delivery | BullMQ `addBulk` to Redis | HTTP POST to admin API |
| Resume | Manual | Built-in checkpoint |

**Event Transformation:**
- Standard keys: `{projectId}/{type}/{eventBodyId}/{eventId}.json`
- OTEL keys: `otel/{projectId}/{yyyy}/{mm}/{dd}/{hh}/{mm}/{eventId}.json`
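Parsing the standard key format above reduces to splitting on the path segments. The helper name and return shape below are illustrative assumptions, not the script's actual parser:

```typescript
// Hedged sketch: parse {projectId}/{type}/{eventBodyId}/{eventId}.json keys.
function parseStandardKey(key: string) {
  const match = key.match(/^([^/]+)\/([^/]+)\/([^/]+)\/([^/]+)\.json$/);
  if (!match) return null; // not a standard-format key
  const [, projectId, type, eventBodyId, eventId] = match;
  return { projectId, type, eventBodyId, eventId };
}
```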

资料来源：[worker/src/scripts/replayIngestionEventsV2/README.md:1-120]()

### Refill Queue Event

The `refillQueueEvent` utility script backfills queues with events from local files:

**Usage Pattern:**
```bash
pnpm run --filter=worker refill-queue-event
```

**Requirements:**
- `./worker/events.jsonl` file with JSON events
- Redis connection via `REDIS_CONNECTION_STRING`
- Supporting services: S3, ClickHouse

资料来源：[worker/src/scripts/refillQueueEvent/README.md:1-60]()

## MCP Server Architecture

Langfuse includes an MCP (Model Context Protocol) server for programmatic prompt management:

**Stateless Design:**
1. Fresh server instance per request
2. Authentication context captured in handler closures
3. Server discarded after request completes
4. No state between requests

**Available Tools:**
| Tool | Purpose |
|------|---------|
| `getPrompt` | Fetch resolved prompt with dependencies |
| `getPromptUnresolved` | Fetch raw prompt without resolution |
| `listPrompts` | List prompts with filtering |
| `createTextPrompt` | Create text prompt version |
| `createChatPrompt` | Create chat prompt version |
| `updatePromptLabels` | Manage prompt labels |

**Prompt Resolution:**
- Resolved: Recursively replaces `@@@langfusePrompt:...@@@` tags
- Unresolved: Returns raw content with tags intact
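Recursive tag replacement can be sketched as follows. Note the real tag payload is elided in the doc (`...`); this sketch assumes a bare prompt name purely for illustration, and the lookup table and depth limit are made up:

```typescript
// Illustrative sketch of recursive prompt resolution (not the MCP server code).
const prompts: Record<string, string> = {
  greeting: "Hi! @@@langfusePrompt:footer@@@",
  footer: "Sent via Langfuse",
};

function resolvePrompt(content: string, depth = 0): string {
  if (depth > 10) throw new Error("prompt dependency chain too deep");
  // Replace each tag with the (recursively resolved) referenced prompt.
  return content.replace(/@@@langfusePrompt:([^@]+)@@@/g, (_, name) =>
    resolvePrompt(prompts[name] ?? "", depth + 1),
  );
}
```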

资料来源：[web/src/features/mcp/README.md:1-100]()

## Component Communication Flow

```mermaid
graph TD
    subgraph Pages["Page Layer"]
        P[Page Component]
        PC[PeekTableStateProvider]
    end
    
    subgraph Features["Feature Layer"]
        FP[Feature Provider]
        FH[Feature Hook]
        FT[Feature Transformers]
    end
    
    subgraph Services["Service Layer"]
        TRPC[tRPC Router]
        API[API Routes]
    end
    
    subgraph Data["Data Layer"]
        Repo[Repositories]
        DB[(PostgreSQL)]
        CH[(ClickHouse)]
        Redis[(Redis)]
    end
    
    P --> PC
    PC --> FP
    FP --> FH
    FH --> FT
    FT --> TRPC
    TRPC --> Repo
    Repo --> DB
    Repo --> CH
    Repo --> Redis
```

## Configuration Management

Langfuse uses environment-based configuration across layers:

| Environment | Scope | Examples |
|-------------|-------|----------|
| `REDIS_CONNECTION_STRING` | Worker, Queue | Redis URL |
| `CLICKHOUSE_URL` | Analytics | ClickHouse connection |
| `LANGFUSE_S3_EVENT_UPLOAD_BUCKET` | Storage | S3 bucket name |
| `ADMIN_API_KEY` | Admin API | Authentication |

## Security Architecture

**Key Security Components:**

1. **JWT Authentication**: Organization and user context embedded in JWT tokens
2. **API Key Validation**: Admin API uses dedicated key authentication
3. **Scope-based Authorization**: Project-level access control
4. **Plan Entitlements**: Feature availability based on subscription tier

**Self-hosted Considerations:**
- License key validation for enterprise features
- Environment-based plan override capability

## Performance Optimizations

**Frontend Optimizations:**
- Memoized transformations in hooks
- Virtualized table rendering (TanStack Virtual)
- Iterative algorithms (no recursion, preventing stack overflow)

**Backend Optimizations:**
- Batch processing for event replay
- Checkpoint/resume for long-running operations
- Client-side rate limiting with exponential backoff

## Summary

The Langfuse architecture demonstrates a well-structured approach to observability platforms:

- **Separation of Concerns**: Clear boundaries between UI, business logic, and data layers
- **Scalability**: Asynchronous processing via queues enables horizontal scaling
- **Extensibility**: Feature modules and MCP server support programmatic access
- **Observability**: Built-in OpenTelemetry integration for distributed tracing
- **Performance**: ClickHouse for analytics, iterative algorithms, and batch processing

---

<a id='monorepo-structure'></a>

## Monorepo Configuration

### 相关页面

相关主题：[Project Structure](#project-structure), [System Architecture](#system-architecture)

<details>
<summary>相关源码文件</summary>

以下源码文件用于生成本页说明：

- [turbo.json](https://github.com/langfuse/langfuse/blob/main/turbo.json)
- [packages/config-eslint/index.js](https://github.com/langfuse/langfuse/blob/main/packages/config-eslint/index.js)
- [packages/config-typescript/base.json](https://github.com/langfuse/langfuse/blob/main/packages/config-typescript/base.json)
- [packages/shared/tsconfig.json](https://github.com/langfuse/langfuse/blob/main/packages/shared/tsconfig.json)
- [web/package.json](https://github.com/langfuse/langfuse/blob/main/web/package.json)
- [worker/src/scripts/replayIngestionEventsV2/README.md](https://github.com/langfuse/langfuse/blob/main/worker/src/scripts/replayIngestionEventsV2/README.md)
</details>

# Monorepo Configuration

Langfuse uses a **monorepo architecture** managed with [Turborepo](https://turbo.build/), [pnpm](https://pnpm.io/) workspaces, and shared configuration packages. This setup enables efficient builds, consistent code quality standards, and streamlined dependency management across the project's multiple packages.

## Architecture Overview

The Langfuse repository is organized as a pnpm workspace monorepo with the following core structure:

```yaml
langfuse/
├── web/                    # Next.js frontend application
├── worker/                 # Background job processing
├── packages/
│   ├── shared/            # Shared utilities and types
│   ├── config-eslint/     # Shared ESLint configuration
│   └── config-typescript/ # Shared TypeScript configurations
├── turbo.json             # Turborepo pipeline definition
└── pnpm-workspace.yaml    # Workspace package definitions
```

## Turborepo Pipeline Configuration

The `turbo.json` file defines the build pipeline and task orchestration across packages.

### Core Pipeline Tasks

| Task | Description | Cache Strategy |
|------|-------------|----------------|
| `build` | Compiles TypeScript and bundles assets | Enabled |
| `dev` | Starts development servers | Local only |
| `test` | Runs unit and integration tests | Enabled |
| `lint` | ESLint code quality checks | Enabled |
| `typecheck` | TypeScript type validation | Enabled |

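The tasks in the table above are declared in `turbo.json`. The fragment below is an illustrative shape only, not the repository's actual configuration; field names follow recent Turborepo versions (`tasks`; older versions use `pipeline`):

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"]
    },
    "dev": { "cache": false, "persistent": true },
    "lint": {},
    "typecheck": { "dependsOn": ["^build"] }
  }
}
```

`"^build"` means "build my workspace dependencies first", which is how Turborepo derives the dependency graph shown below.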
### Task Dependencies

Turborepo automatically resolves dependencies between packages. For example:

```mermaid
graph TD
    A[packages/shared] -->|build| B[Type definitions]
    B --> C[packages/config-*]
    C --> D[web]
    C --> E[worker]
    D --> F[Build output]
    E --> F
```

## Shared ESLint Configuration

The `packages/config-eslint/index.js` provides standardized ESLint rules across all packages.

### Configuration Features

- **React/Next.js support** for the web application
- **TypeScript-aware linting** via `@typescript-eslint`
- **Import ordering** rules for consistent module organization
- **JSX accessibility** checks

### Usage

Packages extend the shared configuration:

```javascript
// In package's .eslintrc.js
module.exports = {
  extends: ['@langfuse/config-eslint'],
  // Package-specific overrides
  rules: {
    // Custom rules
  }
};
```

资料来源：[packages/config-eslint/index.js]()

## Shared TypeScript Configuration

The `packages/config-typescript/` directory contains base TypeScript configurations.

### Base Configuration (`base.json`)

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  }
}
```

### Package-Specific Configurations

Individual packages extend the base configuration:

```json
// packages/shared/tsconfig.json
{
  "extends": "@langfuse/config-typescript/base.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src"
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}
```

资料来源：[packages/config-typescript/base.json](), [packages/shared/tsconfig.json]()

## Package Scripts and Commands

Each package defines its own scripts in `package.json`. The web package demonstrates the typical pattern:

### Build Commands

| Command | Purpose |
|---------|---------|
| `pnpm build` | Production build with `INLINE_RUNTIME_CHUNK=false` |
| `pnpm build:check` | Build without emitting (type-checking) |
| `pnpm dev` | Development server on localhost:3000 |
| `pnpm dev:https` | HTTPS development server for local testing |

### Quality Assurance Commands

| Command | Purpose |
|---------|---------|
| `pnpm lint` | ESLint with caching enabled |
| `pnpm lint:fix` | Auto-fix linting issues |
| `pnpm typecheck` | TypeScript validation with incremental compilation |
| `pnpm test` | Vitest with server and in-source test projects |

资料来源：[web/package.json]()

## Development Workflow

### Starting Development

```bash
# Install dependencies
pnpm install

# Start all dev servers based on turbo.json
pnpm dev

# Or start a specific package
cd web && pnpm dev
```

### Running Tests

```bash
# All tests
pnpm test

# Client-side tests only (e.g., AdvancedJsonViewer)
pnpm --filter=web run test-client --testPathPattern="AdvancedJsonViewer"

# Server-side tests
pnpm --filter=web run test --project server
```

### Build Pipeline

```mermaid
graph LR
    A[pnpm build] --> B[turbo build]
    B --> C{Cache hit?}
    C -->|Yes| D[Use cached output]
    C -->|No| E[Build dependencies]
    E --> F[packages/shared]
    F --> G[packages/config-*]
    G --> H[web/worker]
    H --> I[Save to cache]
    I --> J[Build artifacts]
```

## Environment Configuration

The project uses `.env` files for environment-specific configuration:

| File | Purpose |
|------|---------|
| `.env` | Default environment variables |
| `.env.local` | Local overrides (git-ignored) |
| `.env.test` | Test environment variables |

Scripts use `dotenv` to load these files:

```bash
dotenv -e ../.env -- next build
dotenv -e ../.env.test -e ../.env -- vitest run
```

资料来源：[worker/src/scripts/replayIngestionEventsV2/README.md]()

## Best Practices

### Adding a New Package

1. Create the package under `packages/` or `web/` directories
2. Extend the shared TypeScript and ESLint configurations
3. Add the package to `pnpm-workspace.yaml` if needed
4. Define tasks in `turbo.json` if custom pipeline is required
5. Add appropriate scripts to `package.json`

### Caching Strategy

- **Build caching**: Enabled by default via Turborepo
- **Lint caching**: ESLint caches to `.next/cache/eslint/`
- **TypeScript incremental**: Uses `.tsbuildinfo` files

### CI/CD Integration

In CI environments, clear caches to ensure fresh builds:

```bash
# Clear turbo cache
rm -rf .turbo node_modules/.cache

# Clear ESLint cache
rm -rf .next/cache/eslint/

# Fresh build
pnpm install --frozen-lockfile
pnpm build
```

## Key Configuration Files Summary

| File | Purpose |
|------|---------|
| `turbo.json` | Task pipeline and dependency graph |
| `pnpm-workspace.yaml` | Workspace package definitions |
| `packages/config-eslint/index.js` | Shared ESLint rules |
| `packages/config-typescript/base.json` | Shared TypeScript base config |
| `.eslintrc.js` (per package) | Package-specific lint overrides |
| `tsconfig.json` (per package) | Package-specific TypeScript config |

---

<a id='database-schema'></a>

## Database Schema (Prisma)

### 相关页面

相关主题：[System Architecture](#system-architecture), [ClickHouse Analytics Layer](#clickhouse-analytics)

<details>
<summary>相关源码文件</summary>

以下源码文件用于生成本页说明：

- [packages/shared/prisma/schema.prisma](https://github.com/langfuse/langfuse/blob/main/packages/shared/prisma/schema.prisma)
- [packages/shared/src/server/repositories/traces.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/repositories/traces.ts)
- [packages/shared/src/server/repositories/observations.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/repositories/observations.ts)
- [packages/shared/src/server/repositories/scores.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/repositories/scores.ts)
- [packages/shared/src/features/scores/interfaces/README.md](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/features/scores/interfaces/README.md)
</details>

# Database Schema (Prisma)

## Overview

Langfuse uses Prisma ORM to manage its PostgreSQL database schema. The schema defines the core data models for the application, including organizations, projects, users, traces, observations, and scores. Prisma serves as the primary interface between the application's business logic and the relational database.

The Prisma schema is located at `packages/shared/prisma/schema.prisma` and is shared across multiple packages in the monorepo structure. This centralized schema approach ensures consistency in data modeling across the web application, worker services, and shared libraries.

### Design Philosophy

The schema follows several key principles:

- **Normalized relationships**: Related entities are linked through foreign keys with proper cascading behaviors
- **Soft deletes**: Key entities support soft deletion for data recovery and audit purposes
- **Audit fields**: Most tables include `createdAt`, `updatedAt`, and `createdBy` fields
- **Multi-tenancy**: The schema supports multi-tenant architecture with organization and project isolation
- **Extensible metadata**: JSON fields allow flexible storage of custom attributes

## Core Data Models

### Organization and User Models

The foundation of Langfuse's multi-tenant architecture begins with the `Organization` model, which represents the top-level tenant entity. Each organization can have multiple users with different roles and permission levels.

The `User` model stores authentication and profile information, linked to organizations through the `Membership` junction table. This many-to-many relationship enables users to belong to multiple organizations with potentially different roles in each.

```mermaid
erDiagram
    Organization ||--o{ Project : "contains"
    Organization ||--o{ Membership : "has"
    User ||--o{ Membership : "has"
```

### Project Model

Projects serve as the primary container for observability data. Each project belongs to exactly one organization and contains all traces, observations, and scores related to a specific application or use case.

| Field | Type | Description |
|-------|------|-------------|
| `id` | `String` | UUID primary key |
| `name` | `String` | Project display name |
| `organizationId` | `String` | Foreign key to organization |
| `createdAt` | `DateTime` | Creation timestamp |
| `updatedAt` | `DateTime` | Last modification timestamp |
| `deletedAt` | `DateTime?` | Soft delete timestamp |
| `settings` | `Json` | Project-specific configuration |

资料来源：[packages/shared/prisma/schema.prisma]()

## Traces

Traces represent the top-level unit of observability in Langfuse. A trace encapsulates a complete interaction or request, typically corresponding to a single LLM call or a multi-step workflow.

### Trace Model Schema

```prisma
model Trace {
  id            String   @id @default(cuid())
  name          String?
  project       Project  @relation(fields: [projectId], references: [id])
  projectId     String
  user          String?
  metadata      Json?
  sessionId     String?
  release       String?
  version       String?
  tags          String[]
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Soft delete
  deletedAt     DateTime?
  
  // Relations
  observations  Observation[]
  scores        Score[]
  
  @@index([projectId])
  @@index([sessionId])
  @@index([createdAt])
}
```

资料来源：[packages/shared/prisma/schema.prisma]()

### Key Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | `String` | Unique identifier using CUID algorithm |
| `name` | `String?` | Optional human-readable trace name |
| `projectId` | `String` | Reference to parent project |
| `user` | `String?` | Identifier for the end user |
| `sessionId` | `String?` | Groups related traces into sessions |
| `release` | `String?` | Application release version |
| `version` | `String?` | Trace format version |
| `tags` | `String[]` | Array of string tags for categorization |

### Repository Pattern

Traces are accessed through the repository pattern defined in `packages/shared/src/server/repositories/traces.ts`. This abstraction provides a clean interface for CRUD operations while encapsulating query logic.

资料来源：[packages/shared/src/server/repositories/traces.ts]()

```typescript
// Repository interface pattern (simplified)
interface ITraceRepository {
  create(data: CreateTraceInput): Promise<Trace>;
  getById(id: string, projectId: string): Promise<Trace | null>;
  list(projectId: string, options?: ListTracesOptions): Promise<Trace[]>;
  update(id: string, data: UpdateTraceInput): Promise<Trace>;
  softDelete(id: string): Promise<void>;
}
```
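As an illustrative sketch (not the actual repository source), list options of this kind typically translate into a Prisma-style `where` filter; the `ListTracesOptions` shape and field names below are assumptions for illustration:

```typescript
// Hypothetical sketch: translating list options into a Prisma-style `where`
// filter. Field names are assumptions, not the actual repository code.
interface ListTracesOptions {
  sessionId?: string;
  tags?: string[];
  fromTimestamp?: Date;
}

function buildTraceListFilter(
  projectId: string,
  options: ListTracesOptions = {},
): Record<string, unknown> {
  return {
    projectId,
    deletedAt: null, // exclude soft-deleted traces
    ...(options.sessionId ? { sessionId: options.sessionId } : {}),
    // `hasEvery` is Prisma's scalar-list filter for "contains all of these tags"
    ...(options.tags?.length ? { tags: { hasEvery: options.tags } } : {}),
    ...(options.fromTimestamp ? { createdAt: { gte: options.fromTimestamp } } : {}),
  };
}
```

Centralizing filter construction like this keeps soft-delete and multi-tenancy rules (the `deletedAt: null` and `projectId` constraints) in one place.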

## Observations

Observations represent the individual components within a trace, such as LLM calls, retrievals, or custom events. They form a hierarchical structure that can be nested to represent complex workflows.

### Observation Model Schema

```prisma
model Observation {
  id            String   @id @default(cuid())
  
  // Type discrimination
  type          ObservationType
  
  // Relations
  trace         Trace    @relation(fields: [traceId], references: [id])
  traceId       String
  parent        Observation? @relation("ObservationHierarchy", fields: [parentId], references: [id])
  parentId      String?
  children      Observation[] @relation("ObservationHierarchy")
  
  // Project reference for efficient querying
  projectId     String
  
  // Core data
  name          String?
  startTime     DateTime
  endTime       DateTime?
  status        String?
  metadata      Json?
  
  // LLM-specific fields
  model         String?
  modelId       String?
  provider      String?
  promptTokens  Int?
  completionTokens Int?
  totalTokens   Int?
  unitPrice     Float?
  currency      String?
  calculatedUnitCost Float?
  
  // Retrieval-specific fields
  input         Json?
  output        Json?
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Soft delete
  deletedAt     DateTime?
  
  @@index([traceId])
  @@index([projectId])
  @@index([startTime])
}
```

资料来源：[packages/shared/prisma/schema.prisma]()
资料来源：[packages/shared/src/server/repositories/observations.ts]()

### Observation Types

Langfuse supports several observation types through an enum:

| Type | Description |
|------|-------------|
| `CHAT` | Chat completion calls |
| `GENERATION` | Text generation calls |
| `RETRIEVAL` | Retrieval-augmented generation steps |
| `EVENT` | Custom events and markers |
| `TOOL` | Tool/function calls |

### Hierarchical Structure

Observations support nested hierarchies through self-referential relationships. This enables representing complex multi-step workflows where parent observations contain child observations representing sub-tasks or parallel operations.

```mermaid
graph TD
    A[Trace] --> B[Observation: Chat]
    B --> C[Observation: Retrieval]
    B --> D[Observation: Generation]
    C --> E[Observation: Event: Cache Hit]
    D --> F[Observation: Tool: Calculator]
    D --> G[Observation: Tool: Search]
```

资料来源：[packages/shared/prisma/schema.prisma]()
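The self-referential `parentId` relationship means the database stores observations as flat rows; a consumer can rebuild the nested tree in memory. A minimal sketch (illustrative, not Langfuse source):

```typescript
// Illustrative sketch: reconstructing the nested observation hierarchy
// from the flat rows linked via `parentId`.
interface FlatObservation {
  id: string;
  parentId: string | null;
  name: string;
}

interface ObservationNode extends FlatObservation {
  children: ObservationNode[];
}

function buildObservationTree(rows: FlatObservation[]): ObservationNode[] {
  // Index every row by id, wrapping it with an empty children array.
  const byId = new Map<string, ObservationNode>(
    rows.map((r) => [r.id, { ...r, children: [] }]),
  );
  const roots: ObservationNode[] = [];
  for (const node of byId.values()) {
    const parent = node.parentId ? byId.get(node.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // top-level observation of the trace
  }
  return roots;
}
```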

## Scores

Scores provide a mechanism for evaluating trace and observation quality. They can be human-generated or automated evaluations attached to specific traces or observations.

### Score Model Schema

```prisma
model Score {
  id            String   @id @default(cuid())
  
  // Target discrimination
  traceId       String?
  observationId String?
  
  // Project reference
  projectId     String
  
  // Score data
  name          String
  value         Float
  dataType      ScoreDataType
  comment       String?
  
  // Source tracking
  source        String?
  
  // Author
  authorId      String?
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Relations
  trace         Trace?      @relation(fields: [traceId], references: [id])
  observation   Observation? @relation(fields: [observationId], references: [id])
  
  @@index([projectId])
  @@index([traceId])
  @@index([observationId])
  @@index([name, createdAt])
}
```

资料来源：[packages/shared/prisma/schema.prisma]()
资料来源：[packages/shared/src/server/repositories/scores.ts]()

### Score Data Types

The `ScoreDataType` enum defines the type of value stored:

| Data Type | Description |
|-----------|-------------|
| `NUMERIC` | Continuous numerical value |
| `CATEGORICAL` | Categorical label or classification |
| `BOOLEAN` | True/false indicator |
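Because `value` is stored as a single `Float` column for all three data types, each type implies its own validity rule. The sketch below shows one plausible encoding (booleans as 0/1, categories as non-negative label indexes); these rules are assumptions for illustration, not the exact ingestion-validation logic:

```typescript
// Hedged sketch: validating a score's numeric `value` against its declared
// data type. The encoding rules here are illustrative assumptions.
type ScoreDataType = "NUMERIC" | "CATEGORICAL" | "BOOLEAN";

function isValidScoreValue(dataType: ScoreDataType, value: number): boolean {
  switch (dataType) {
    case "NUMERIC":
      return Number.isFinite(value); // any finite float
    case "BOOLEAN":
      return value === 0 || value === 1; // true/false encoded numerically
    case "CATEGORICAL":
      return Number.isInteger(value) && value >= 0; // index into a label set
  }
}
```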

### Score Interfaces Architecture

Scores in Langfuse follow a layered interface architecture that separates concerns across different parts of the system:

```mermaid
graph LR
    A[UI Types] --> B[Application Validation]
    B --> C[Ingestion Validation]
    C --> D[API v1 Schemas]
    C --> E[API v2 Schemas]
    D --> F[Database Models]
    E --> F
```

资料来源：[packages/shared/src/features/scores/interfaces/README.md]()

## Indexing Strategy

The schema defines strategic indexes to optimize common query patterns:

| Table | Indexes | Purpose |
|-------|---------|---------|
| `Trace` | `projectId`, `sessionId`, `createdAt` | Fast project filtering and time-based queries |
| `Observation` | `traceId`, `projectId`, `startTime` | Trace traversal and time-series queries |
| `Score` | `projectId`, `traceId`, `observationId`, `name, createdAt` | Score lookups and time-series analytics |

The composite index on `Score(name, createdAt)` specifically supports the score analytics feature's need to retrieve scores by name over time intervals.

资料来源：[packages/shared/prisma/schema.prisma]()

## Prisma Client Usage

Prisma Client is generated from the schema and used throughout the application. The generated client provides type-safe access to all database operations.

### Client Configuration

```typescript
import { PrismaClient } from "@langfuse/shared/prisma";

const prisma = new PrismaClient({
  log: process.env.NODE_ENV === "development" ? ["query", "error"] : ["error"],
});
```

### Transaction Support

The schema supports atomic operations through Prisma's transaction API:

```typescript
await prisma.$transaction([
  prisma.trace.create({ data: traceData }),
  prisma.observation.createMany({ data: observations }),
  prisma.score.createMany({ data: scores }),
]);
```

## Migrations

Database migrations are managed through Prisma Migrate. Migration files are stored in `packages/shared/prisma/migrations/` and version-controlled alongside the schema.

### Running Migrations

```bash
# Apply pending migrations
pnpm --filter=langfuse-prisma migrate deploy

# Create a new migration
pnpm --filter=langfuse-prisma migrate dev --name add_new_field
```

## Related Components

### Repository Layer

The repository pattern abstracts database access behind domain-specific interfaces:

| Repository | File | Purpose |
|------------|------|---------|
| `TraceRepository` | `repositories/traces.ts` | Trace CRUD and querying |
| `ObservationRepository` | `repositories/observations.ts` | Observation management |
| `ScoreRepository` | `repositories/scores.ts` | Score operations |

资料来源：[packages/shared/src/server/repositories/traces.ts]()
资料来源：[packages/shared/src/server/repositories/scores.ts]()

### ClickHouse Integration

While PostgreSQL (via Prisma) stores transactional data like traces and scores, Langfuse also uses ClickHouse for analytics workloads. The Prisma schema defines PostgreSQL models for the primary application data, while ClickHouse handles high-volume analytical queries.

资料来源：[packages/shared/scripts/seeder/utils/README.md]()

## Summary

The Prisma schema forms the backbone of Langfuse's data layer, defining:

- **Multi-tenant structure**: Organizations, projects, and user memberships
- **Observability core**: Traces and observations with hierarchical support
- **Evaluation framework**: Scores with multiple data types and sources
- **Operational metadata**: Timestamps, soft deletes, and JSON fields for flexibility

The schema design prioritizes query performance through strategic indexing, data integrity through proper relationships, and extensibility through JSON metadata fields.

---

<a id='clickhouse-analytics'></a>

## ClickHouse Analytics Layer

### 相关页面

相关主题：[Database Schema (Prisma)](#database-schema), [Worker Service](#worker-service)

<details>
<summary>相关源码文件</summary>

以下源码文件用于生成本页说明：

- [packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql](https://github.com/langfuse/langfuse/blob/main/packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql)
- [packages/shared/src/server/clickhouse/schema.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/clickhouse/schema.ts)
- [packages/shared/src/server/repositories/clickhouse.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/repositories/clickhouse.ts)
- [worker/src/services/ClickhouseWriter/index.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/services/ClickhouseWriter/index.ts)
- [packages/shared/src/server/repositories/score-analytics.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/repositories/score-analytics.ts)
- [web/src/features/score-analytics/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/score-analytics/README.md)
- [packages/shared/scripts/seeder/utils/README.md](https://github.com/langfuse/langfuse/blob/main/packages/shared/scripts/seeder/utils/README.md)
</details>

# ClickHouse Analytics Layer

## Overview

The ClickHouse Analytics Layer is a core infrastructure component in Langfuse that provides high-performance analytical capabilities for processing and querying large-scale observability data. ClickHouse serves as the primary OLAP (Online Analytical Processing) database for storing traces, observations, and score analytics with optimized columnar storage and efficient aggregation queries.

Langfuse leverages ClickHouse for:

- High-throughput event ingestion during trace collection
- Complex analytical queries for score comparisons and distributions
- Time-series analysis with efficient aggregation
- Large dataset sampling and optimization strategies

资料来源：[packages/shared/scripts/seeder/utils/README.md]()

## Architecture Overview

```mermaid
graph TD
    subgraph Ingestion["Ingestion Layer"]
        W[Worker Service] --> CW[ClickhouseWriter]
        CW --> CH[ClickHouse Cluster]
    end
    
    subgraph Storage["Storage Layer"]
        CH --> TS[Traces Table]
        CH --> OS[Observations Table]
        CH --> SS[Scores Table]
    end
    
    subgraph Query["Query Layer"]
        CR[ClickHouse Repository] --> CH
        SA[Score Analytics] --> CR
        WEB[Web Frontend] --> SA
    end
    
    subgraph Optimization["Optimization Layer"]
        CR --> HASH[cityHash64 Sampling]
        CR --> FINAL[Adaptive FINAL]
        CR --> INTERVAL[Time Interval Alignment]
    end
```

### Components Overview

| Component | Location | Purpose |
|-----------|----------|---------|
| ClickhouseWriter | `worker/src/services/ClickhouseWriter/index.ts` | Writes ingestion events to ClickHouse |
| ClickHouse Repository | `packages/shared/src/server/repositories/clickhouse.ts` | Provides query interface and optimization |
| Score Analytics | `packages/shared/src/server/repositories/score-analytics.ts` | Specialized analytics queries |
| Schema Definitions | `packages/shared/src/server/clickhouse/schema.ts` | TypeScript types for ClickHouse data |
| Migrations | `packages/shared/clickhouse/migrations/clustered/` | Database schema migrations |

资料来源：[worker/src/services/ClickhouseWriter/index.ts]()
资料来源：[packages/shared/src/server/repositories/clickhouse.ts]()

## Data Schema

### Traces Table

The traces table stores the fundamental trace records with hierarchical observation data. The clustered migration defines the primary schema with optimized column types for analytical queries.

Key columns include:

| Column | Type | Description |
|--------|------|-------------|
| id | UUID | Unique trace identifier |
| project_id | String | Project association |
| timestamp | DateTime64 | Event timestamp with millisecond precision |
| name | String | Trace name |
| user_id | String | User identifier |
| metadata | JSON | Flexible metadata storage |
| tags | Array(String) | Tag-based categorization |
| input | Text | Input data |
| output | Text | Output data |

资料来源：[packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql]()

### Observations Table

Observations represent individual events within a trace, storing:

- Model inputs/outputs
- Function calls
- Embeddings
- Generation events

Each observation is linked to its parent trace via `trace_id` and supports nested hierarchies through `parent_observation_id`.

资料来源：[packages/shared/src/server/clickhouse/schema.ts]()

### Scores Table

Scores store evaluation metrics associated with traces and observations:

| Column | Type | Purpose |
|--------|------|---------|
| trace_id | UUID | Associated trace |
| observation_id | UUID | Optional observation link |
| name | String | Score identifier |
| value | Float64 | Numeric score value |
| data_type | Enum | NUMERIC, BOOLEAN, or CATEGORICAL |
| source | String | Score origin (e.g., "framework-trace") |

资料来源：[packages/shared/src/server/repositories/score-analytics.ts]()

## Ingestion Pipeline

### Event Flow

```mermaid
sequenceDiagram
    participant API as Ingestion API
    participant Queue as Redis Queue
    participant Worker as Worker Service
    participant Writer as ClickhouseWriter
    participant CH as ClickHouse
    
    API->>Queue: Enqueue OtelIngestionEvent
    Worker->>Queue: Dequeue Event
    Worker->>Writer: Process Event
    Writer->>CH: Insert Batch (ClickHouseQueryBuilder)
    CH-->>Writer: Confirmation
    Writer->>Worker: Acknowledge
```

### ClickhouseWriter Service

The `ClickhouseWriter` handles the actual data insertion into ClickHouse:

```typescript
// Simplified flow from worker/src/services/ClickhouseWriter/index.ts
class ClickhouseWriter {
  async writeBatch(events: IngestionEvent[]): Promise<void> {
    const queryBuilder = new ClickHouseQueryBuilder();
    
    for (const event of events) {
      queryBuilder.addEvent(event);
    }
    
    await this.executeQuery(queryBuilder.build());
  }
}
```

Key responsibilities:

1. **Batch Processing**: Aggregates multiple events for efficient insertion
2. **Schema Validation**: Ensures events match expected schema
3. **Query Building**: Uses `ClickHouseQueryBuilder` for optimized INSERT queries
4. **Error Recovery**: Handles failed insertions with retry logic

资料来源：[worker/src/services/ClickhouseWriter/index.ts]()

### ClickHouseQueryBuilder

The `ClickHouseQueryBuilder` class constructs optimized ClickHouse SQL queries with:

- Proper escaping for special characters
- Type-aware value formatting
- Batch insert optimization
- Efficient column mapping

资料来源：[packages/shared/scripts/seeder/utils/README.md]()

## Query Layer

### Repository Pattern

The `clickhouse.ts` repository provides a clean interface for all ClickHouse operations:

```typescript
// packages/shared/src/server/repositories/clickhouse.ts
class ClickHouseRepository {
  // Query execution with automatic connection management
  async query<T>(sql: string, params?: QueryParams): Promise<T[]>
  
  // Stream processing for large result sets
  async streamQuery<T>(sql: string, handler: (row: T) => void): Promise<void>
  
  // Batch inserts with transaction support
  async insertBatch(table: string, rows: Record<string, unknown>[]): Promise<void>
}
```

资料来源：[packages/shared/src/server/repositories/clickhouse.ts]()

### Score Analytics Queries

The score analytics module provides specialized queries for evaluating model performance:

```typescript
// packages/shared/src/server/repositories/score-analytics.ts
interface ScoreAnalyticsQuery {
  getScoreIdentifiers(projectId: string): Promise<ScoreIdentifier[]>;
  
  estimateScoreComparisonSize(
    projectId: string,
    score1Id: string,
    score2Id?: string
  ): Promise<QueryEstimate>;
  
  getScoreComparisonAnalytics(
    params: ScoreAnalyticsParams
  ): Promise<ScoreAnalyticsResult>;
}
```

#### Query Estimation

Before executing expensive analytics queries, the system estimates query size:

| Metric | Description |
|--------|-------------|
| scoreCount | Total number of scores matching criteria |
| matchedCount | Estimated rows that will match |
| willSample | Whether hash-based sampling is needed |
| estimatedQueryTime | Predicted query duration |

This estimation enables adaptive query optimization based on dataset size.

资料来源：[packages/shared/src/server/repositories/score-analytics.ts]()
资料来源：[web/src/features/score-analytics/README.md]()

## Optimization Strategies

### Hash-Based Sampling

For large datasets (>100,000 matches), Langfuse uses `cityHash64` for consistent sampling:

```sql
SELECT * FROM scores
WHERE cityHash64(trace_id) % 100 < 10  -- deterministic 10% sample
```

Benefits:

- Consistent sampling across query executions
- Reproducible results for the same query parameters
- Reduced query load while maintaining statistical validity
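Since `cityHash64` returns a UInt64, a sample *fraction* must first be converted into an integer threshold (or a modulo bucket) before it can be compared against the hash. A sketch of that conversion, assuming a `cityHash64(trace_id) < threshold` predicate:

```typescript
// Illustrative sketch: turning a sample fraction into a UInt64 threshold
// for a `cityHash64(trace_id) < threshold` comparison. Not Langfuse source.
const UINT64_MAX = 2n ** 64n - 1n;

function sampleThreshold(fraction: number): bigint {
  if (fraction <= 0 || fraction > 1) {
    throw new Error("fraction must be in (0, 1]");
  }
  // Scale the fraction into the UInt64 range using fixed-point arithmetic
  // (multiply before dividing so the BigInt division stays exact enough).
  return (UINT64_MAX * BigInt(Math.round(fraction * 1_000_000))) / 1_000_000n;
}
```

Because the hash of a given `trace_id` never changes, every query that uses the same threshold selects the same rows, which is what makes the sampling reproducible.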

### Adaptive FINAL Optimization

ClickHouse's FINAL modifier ensures up-to-date data but adds significant overhead. Langfuse uses adaptive application:

| Dataset Size | FINAL Applied |
|--------------|---------------|
| ≤ 70,000 scores | Yes |
| > 70,000 scores | No |

资料来源：[web/src/features/score-analytics/README.md]()

### Time Interval Alignment

Time series queries use proper interval alignment for accurate aggregation:

```typescript
// ISO 8601 weeks
const weekInterval = "1W";

// Calendar months
const monthInterval = "1MONTH";
```

Proper alignment ensures:

- Consistent bucket boundaries
- Accurate period-over-period comparisons
- Correct aggregation across daylight saving time transitions
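As a sketch of what alignment means in practice (an assumption, not the Langfuse implementation), the helper below snaps a timestamp to the start of its ISO-8601 week, i.e. Monday 00:00 UTC, mirroring the bucket boundary a ClickHouse `toStartOfWeek(timestamp, 1)` expression would produce:

```typescript
// Illustrative sketch: align a timestamp to the start of its ISO-8601 week
// (Monday 00:00 UTC). Not the actual Langfuse bucketing code.
function startOfIsoWeekUtc(date: Date): Date {
  // Truncate to midnight UTC first.
  const d = new Date(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()),
  );
  const day = d.getUTCDay(); // 0 = Sunday ... 6 = Saturday
  const daysSinceMonday = (day + 6) % 7; // Monday -> 0, Sunday -> 6
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d;
}
```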

## Seeding and Testing

The seeder utility (`packages/shared/scripts/seeder/`) generates realistic test data for ClickHouse:

### Data Types

| Type | Environment | Purpose |
|------|-------------|---------|
| Experiment Traces | `langfuse-prompt-experiment` | Realistic traces from actual datasets |
| Evaluation Data | `langfuse-evaluation` | Metrics and scoring for evaluations |
| Synthetic Data | `default` | Large-scale hierarchical test data |

### ID Patterns

- Experiment: `trace-dataset-{datasetName}-{itemIndex}-{projectId}-{runNumber}`
- Evaluation: `trace-eval-{index}-{projectId}`
- Synthetic: `trace-synthetic-{index}-{projectId}`

资料来源：[packages/shared/scripts/seeder/utils/README.md]()

### DataGenerator

The `DataGenerator` class creates realistic data for all three types:

| Method | Output |
|--------|--------|
| `generateDatasetTrace()` | Traces linked to dataset items |
| `generateSyntheticTraces()` | Hierarchical traces with scores |
| `generateEvaluationTraces()` | Evaluation-focused traces |

资料来源：[packages/shared/scripts/seeder/utils/README.md]()

## Framework Traces

Framework traces are real traces produced through official Langfuse framework instrumentation. They can be added to the system for UI testing and demo purposes.

### Adding New Framework Traces

1. Generate a trace using framework instrumentation
2. Download from UI using the download button
3. Convert to JSON format via "Log View (Beta)"
4. Merge observations using the provided script:
   ```bash
   npx ts-node merge-observations.ts trace-file.json observations.json trace-merged.json
   ```
5. Save the merged file with date-based naming

### Discovery

Framework traces use the ID pattern `framework-frameworkName-traceId`. Filter by:

- `source: "framework-trace"` in trace table
- "All Time" date range (timestamps not rewritten)

资料来源：[packages/shared/scripts/seeder/utils/framework-traces/README.md]()

## TypeScript Integration

### Schema Types

The `schema.ts` file provides TypeScript type definitions:

```typescript
// packages/shared/src/server/clickhouse/schema.ts
interface ClickHouseTrace {
  id: string;
  project_id: string;
  timestamp: Date;
  name: string;
  user_id?: string;
  metadata?: Record<string, unknown>;
  tags?: string[];
  input?: string;
  output?: string;
  session_id?: string;
}

interface ClickHouseObservation {
  id: string;
  trace_id: string;
  parent_observation_id?: string;
  type: ObservationType;
  timestamp: Date;
  name?: string;
  // ... additional fields
}
```

These types ensure compile-time safety when interacting with ClickHouse data.

资料来源：[packages/shared/src/server/clickhouse/schema.ts]()

## Configuration

### Required Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `CLICKHOUSE_URL` | ClickHouse server URL | `http://localhost:8123` |
| `CLICKHOUSE_USER` | Database user | `clickhouse` |
| `CLICKHOUSE_PASSWORD` | User password | `clickhouse` |
| `CLICKHOUSE_DATABASE` | Target database | `default` |

### Cluster Configuration

Migrations support clustered deployments:

```bash
# Clustered migration path
packages/shared/clickhouse/migrations/clustered/
```

The clustered migrations ensure schema consistency across all nodes in a ClickHouse cluster.

资料来源：[packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql]()

## Best Practices

### Query Optimization

1. **Use projections** for frequently accessed columns
2. **Leverage data-skipping indexes** for high-cardinality columns
3. **Batch inserts** to reduce overhead
4. **Filter early** to minimize data processed

### Data Management

1. **Partition by date** for efficient time-range queries
2. **Use TTL policies** for automatic data expiration
3. **Compress data** using ClickHouse's native compression

### Integration Guidelines

1. Always use the repository pattern for query abstraction
2. Implement query estimation before expensive operations
3. Use hash-based sampling for large analytical queries
4. Consider adaptive FINAL optimization for query performance

## See Also

- [Score Analytics Module](../features/score-analytics/README.md)
- [Ingestion API Documentation](../ingestion/README.md)
- [Worker Service Documentation](../worker/README.md)
- [Seeding Utilities](../packages/shared/scripts/seeder/utils/README.md)

---

<a id='queue-system'></a>

## Queue System (Redis/BullMQ)

### 相关页面

相关主题：[System Architecture](#system-architecture), [Worker Service](#worker-service)

<details>
<summary>Relevant Source Files</summary>

以下源码文件用于生成本页说明：

- [packages/shared/src/server/redis/ingestionQueue.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/redis/ingestionQueue.ts)
- [packages/shared/src/server/redis/evalExecutionQueue.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/redis/evalExecutionQueue.ts)
- [packages/shared/src/server/redis/batchActionQueue.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/redis/batchActionQueue.ts)
- [packages/shared/src/server/redis/webhookQueue.ts](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/server/redis/webhookQueue.ts)
- [worker/src/queues/workerManager.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/queues/workerManager.ts)
</details>

# Queue System (Redis/BullMQ)

Langfuse employs a distributed queue system built on **Redis** for storage and **BullMQ** for job orchestration. This architecture enables asynchronous processing of high-volume operations including event ingestion, evaluation execution, batch actions, and webhook delivery.

## Architecture Overview

The queue system follows a producer-consumer pattern where the web application enqueues jobs and worker processes consume them asynchronously.

```mermaid
graph TD
    subgraph "Langfuse Web Application"
        A[API Request] --> B[Queue Client]
        B --> C[Redis Queue]
    end
    
    subgraph "Langfuse Worker"
        C --> D[Worker Manager]
        D --> E1[Ingestion Worker]
        D --> E2[Eval Worker]
        D --> E3[Batch Worker]
        D --> E4[Webhook Worker]
    end
    
    subgraph "Redis"
        C --> F[(Redis Cluster)]
    end
    
    subgraph "External Services"
        E1 --> G[(ClickHouse)]
        E1 --> H[(PostgreSQL)]
        E2 --> H
        E3 --> H
        E4 --> I[(External APIs)]
    end
```

## Queue Types

Langfuse defines multiple specialized queues for different workloads:

| Queue Name | Purpose | Processing Type | Priority |
|------------|---------|-----------------|----------|
| `ingestion` | Event ingestion and processing | Async batch | Medium |
| `evalExecution` | LLM evaluation execution | Async | Medium |
| `batchAction` | Bulk operations on data | Async batch | Low |
| `webhook` | Outbound webhook delivery | Async | High |
| `OtelIngestion` | OpenTelemetry event ingestion | Async | Medium |
| `IngestionSecondary` | Secondary ingestion processing | Async | Medium |

资料来源：[worker/src/queues/workerManager.ts]()

## Queue Configuration

### Redis Connection

All queues rely on a Redis connection string configured via environment variables:

```bash
REDIS_CONNECTION_STRING=redis://:myredissecret@127.0.0.1:6379
```

### Queue Initialization

Each queue is initialized with specific BullMQ configuration:

```typescript
const myQueue = new Queue<T>(queueName, {
  connection: {
    host: redisConfig.host,
    port: redisConfig.port,
    password: redisConfig.password,
  },
  defaultJobOptions: {
    attempts: 3,
    backoff: {
      type: "exponential",
      delay: 1000,
    },
    removeOnComplete: true,
    removeOnFail: false,
  },
});
```

资料来源：[packages/shared/src/server/redis/ingestionQueue.ts]()

## Queue Implementations

### Ingestion Queue

The ingestion queue handles event processing from SDK clients and the OpenTelemetry protocol.

```mermaid
graph LR
    A[SDK Events] --> B[API Endpoint]
    B --> C[ingestionQueue]
    C --> D[Validate Events]
    D --> E[Parse & Transform]
    E --> F[(ClickHouse)]
    E --> G[(PostgreSQL)]
```

**Key Features:**

- Batch processing with configurable batch size
- Retry with exponential backoff
- Event validation against schema
- S3 file-based storage for large payloads

资料来源：[packages/shared/src/server/redis/ingestionQueue.ts]()

### Evaluation Execution Queue

Handles asynchronous execution of LLM-based evaluations:

```mermaid
graph TD
    A[Create Eval Job] --> B[evalExecutionQueue]
    B --> C[Worker Pickup]
    C --> D[Fetch Traces]
    D --> E[Run LLM Evaluation]
    E --> F[Store Results]
    F --> G[(PostgreSQL)]
```

**Job Options:**

```typescript
{
  attempts: 3,
  backoff: {
    type: "exponential",
    delay: 2000,
  },
  removeOnComplete: 100, // Keep last 100 completed
  removeOnFail: 1000,   // Keep last 1000 failed
}
```

资料来源：[packages/shared/src/server/redis/evalExecutionQueue.ts]()

### Batch Action Queue

Processes bulk operations such as batch updates and deletions:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `batchSize` | 100 | Items per batch |
| `concurrency` | 5 | Parallel workers |
| `attempts` | 3 | Retry count |
| `backoffDelay` | 1000 | Initial backoff ms |

资料来源：[packages/shared/src/server/redis/batchActionQueue.ts]()

### Webhook Queue

Manages outbound webhook deliveries with priority handling:

```mermaid
graph TD
    A[Trigger Event] --> B[webhookQueue]
    B --> C{Has Retry Config?}
    C -->|Yes| D[Schedule Retry]
    C -->|No| E[Immediate Delivery]
    D --> E
    E --> F{HTTP Response}
    F -->|2xx| G[Log Success]
    F -->|4xx| H[Log Failure]
    F -->|5xx| D
```

资料来源：[packages/shared/src/server/redis/webhookQueue.ts]()

## Worker Manager

The `WorkerManager` orchestrates all queue workers within the worker process:

```typescript
export class WorkerManager {
  private workers: Map<string, Worker>;

  async initialize(): Promise<void> {
    // Initialize all queue workers
  }

  async gracefulShutdown(): Promise<void> {
    // Gracefully close all workers
  }
}
```

资料来源：[worker/src/queues/workerManager.ts]()
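The orchestration pattern can be sketched with a minimal registry, assuming only a `Closable` shape compatible with BullMQ's `Worker#close()`; this is an illustration of the shutdown sequencing, not the actual `WorkerManager`:

```typescript
// Minimal sketch of the worker-registry pattern. `Closable` is an assumed
// shape compatible with BullMQ's Worker#close(); not Langfuse source.
interface Closable {
  close(): Promise<void>;
}

class SimpleWorkerManager {
  private workers = new Map<string, Closable>();

  register(name: string, worker: Closable): void {
    this.workers.set(name, worker);
  }

  // Close every worker in parallel, letting in-flight jobs finish first.
  async gracefulShutdown(): Promise<void> {
    await Promise.all([...this.workers.values()].map((w) => w.close()));
    this.workers.clear();
  }
}
```

Closing workers before the Redis connection (as in the lifecycle diagram below) ensures no job is dequeued after the connection starts tearing down.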

### Worker Lifecycle

```mermaid
graph TD
    A[Start Worker Process] --> B[Load Configuration]
    B --> C[Initialize Redis Connection]
    C --> D[Create Queue Instances]
    D --> E[Create Worker Instances]
    E --> F[Register Event Handlers]
    F --> G[Workers Ready]
    
    H[Shutdown Signal] --> I[Close Workers]
    I --> J[Process Pending Jobs]
    J --> K[Close Redis Connection]
    K --> L[Exit]
```

### Event Handling

Workers register handlers for job lifecycle events:

| Event | Handler Purpose |
|-------|------------------|
| `completed` | Log successful job completion |
| `failed` | Handle job failures and retries |
| `progress` | Track job progress updates |
| `stalled` | Detect and requeue stalled jobs |

## Job Data Flow

### Standard Event Ingestion

Events flow through the system as follows:

```mermaid
sequenceDiagram
    participant SDK
    participant API
    participant Redis
    participant Worker
    participant DB
    
    SDK->>API: POST /api/public/ingestion
    API->>Redis: Add to IngestionQueue
    API-->>SDK: 202 Accepted
    Worker->>Redis: Dequeue Job
    Worker->>DB: Validate & Store
    Worker->>Redis: Job Complete
```

### Event Transformation

The ingestion endpoint transforms S3 keys into queue payloads:

**Standard format:**
```json
{
  "authCheck": {
    "validKey": true,
    "scope": { "projectId": "<projectId>" }
  },
  "data": {
    "eventBodyId": "<eventBodyId>",
    "fileKey": "<eventId>",
    "type": "<type>-create"
  }
}
```

**OTEL format:**
```json
{
  "authCheck": {
    "validKey": true,
    "scope": { "projectId": "<projectId>", "accessLevel": "project" }
  },
  "data": {
    "fileKey": "otel/<projectId>/<yyyy>/<mm>/<dd>/<hh>/<mm>/<eventId>.json"
  }
}
```

资料来源：[worker/src/scripts/replayIngestionEventsV2/README.md]()
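The OTEL `fileKey` layout above can be produced with a small helper; the function below is an illustrative assumption (not Langfuse source) that follows the `otel/<projectId>/<yyyy>/<mm>/<dd>/<hh>/<mm>/<eventId>.json` pattern shown:

```typescript
// Illustrative helper: build an OTEL file key in the documented layout.
// Uses UTC components; zero-pads month/day/hour/minute to two digits.
function buildOtelFileKey(projectId: string, ts: Date, eventId: string): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  return [
    "otel",
    projectId,
    ts.getUTCFullYear(),
    pad(ts.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(ts.getUTCDate()),
    pad(ts.getUTCHours()),
    pad(ts.getUTCMinutes()),
    `${eventId}.json`,
  ].join("/");
}
```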

## Error Handling and Retries

### Retry Strategy

All queues implement exponential backoff retry:

```typescript
const jobOptions = {
  attempts: 3,
  backoff: {
    type: "exponential",
    delay: 1000, // 1s, 2s, 4s
  },
};
```
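The schedule implied by these options doubles the delay on each attempt, i.e. `delay * 2^(attempt - 1)`. A sketch of that formula (an illustration of the documented behavior, not BullMQ's internal code):

```typescript
// Sketch of the exponential backoff schedule: base delay doubled per attempt.
// attemptsMade is 1-based, so attempts 1..3 with a 1000 ms base yield
// 1s, 2s, 4s as noted in the options above.
function exponentialBackoffDelay(attemptsMade: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attemptsMade - 1);
}
```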

### Error Classification

| HTTP Status | Behavior | Retry |
|-------------|----------|-------|
| 2xx | Success | No |
| 429 | Rate limited | Yes (with backoff) |
| 5xx | Server error | Yes (up to 3 times) |
| 4xx (not 429) | Client error | No (logged and skipped) |
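The classification table translates into a small pure function. This mirrors the documented policy rather than the actual worker source:

```typescript
// Sketch of the error-classification policy above as a pure function.
function shouldRetry(status: number): boolean {
  if (status >= 200 && status < 300) return false; // success: no retry
  if (status === 429) return true;                 // rate limited: back off and retry
  if (status >= 500) return true;                  // server error: retry up to the attempt limit
  return false;                                    // other 4xx: log and skip
}
```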

## Monitoring and Debugging

### Progress Tracking

The replay scripts provide progress updates:

```
[1200/45000] 2.7% — 498 queued, 2 skipped
```
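Assuming the format shown, a hypothetical helper could produce that progress line; `formatProgress` and its signature are illustrative:

```typescript
// Hypothetical helper reproducing the documented progress-line format.
function formatProgress(done: number, total: number, queued: number, skipped: number): string {
  const pct = ((done / total) * 100).toFixed(1); // one decimal place, as shown above
  return `[${done}/${total}] ${pct}% — ${queued} queued, ${skipped} skipped`;
}
```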

### Checkpoint System

Scripts write checkpoints to enable resume after failures:

```bash
# Checkpoint file location
./worker/.checkpoint

# Resume from checkpoint
pnpm run --filter=worker replay-ingestion --resume
```

### Error Logging

Failed jobs are logged to `errors.csv` for manual inspection:

```csv
"operation","key","error"
"REST.PUT.OBJECT","projectId/type/eventBodyId/eventId.json","Connection timeout"
```

## Admin API for Queue Management

### `POST /api/admin/ingestion-replay`

Enqueues batches of S3 keys for reprocessing:

**Request:**
```json
{
  "keys": [
    "projectId/trace/eventBodyId/eventId.json",
    "otel/projectId/2025/07/09/14/30/some-uuid.json"
  ]
}
```

**Response:**
```json
{
  "queued": 498,
  "skipped": 2,
  "errors": []
}
```

### Authentication

Requires `Authorization: Bearer {ADMIN_API_KEY}` header validated by `AdminApiAuthService`.
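A minimal request builder for this endpoint might look as follows; only the path and header shape come from the docs above, while the function name and return structure are assumptions:

```typescript
// Hypothetical builder for an admin replay request. Not the real client;
// only the endpoint path and Authorization header follow the docs.
function buildReplayRequest(host: string, adminApiKey: string, keys: string[]) {
  return {
    url: `${host}/api/admin/ingestion-replay`,
    method: "POST" as const,
    headers: {
      Authorization: `Bearer ${adminApiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ keys }),
  };
}
```

The returned object can be passed to any HTTP client (e.g. `fetch`) by the caller.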

## Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `REDIS_CONNECTION_STRING` | Redis connection URL | Yes |
| `LANGFUSE_S3_EVENT_UPLOAD_BUCKET` | S3 bucket for event storage | Yes |
| `CLICKHOUSE_URL` | ClickHouse connection URL | Yes |
| `CLICKHOUSE_USER` | ClickHouse username | Yes |
| `CLICKHOUSE_PASSWORD` | ClickHouse password | Yes |
| `ADMIN_API_KEY` | Admin API authentication key | Yes (admin endpoints) |

## Utility Scripts

### Replay Ingestion Events V2

A streamlined replacement for v1 with improved features:

| Feature | v1 | v2 |
|---------|----|----|
| Infrastructure | Redis, ClickHouse, PostgreSQL, S3 | Langfuse host URL only |
| Setup | Full repo clone | `npx tsx` + env vars |
| Event delivery | Direct BullMQ `addBulk` | HTTP POST to admin API |
| Resume support | Manual | Built-in checkpoint |
| Rate limiting | None | Client + server side |

### Refill Queue Event

Backfills queues with events from local machines:

```bash
# 1. Create events file
echo '{"projectId": "project-123", "orgId": "org-456"}' > ./worker/events.jsonl

# 2. Configure environment
# Create .env with REDIS_CONNECTION_STRING and supporting services

# 3. Run the script
pnpm run --filter=worker refill-queue-event
```

## Best Practices

1. **Connection Pooling**: Reuse Redis connections across queue operations
2. **Graceful Shutdown**: Always drain active jobs before stopping workers
3. **Monitoring**: Track queue depth and processing times
4. **Error Boundaries**: Isolate queue failures to prevent cascade
5. **Backoff Tuning**: Adjust retry delays based on workload characteristics

## Related Documentation

- [Event Ingestion System](../ingestion/overview)
- [Evaluation Framework](../evaluation/overview)
- [Webhook Configuration](../webhooks/setup)

---

<a id='worker-service'></a>

## Worker Service

### Related Pages

Related topics: [System Architecture](#system-architecture), [Queue System (Redis/BullMQ)](#queue-system)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [worker/src/app.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/app.ts)
- [worker/src/index.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/index.ts)
- [worker/src/queues/workerManager.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/queues/workerManager.ts)
- [worker/src/features/evaluation/evalService.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/features/evaluation/evalService.ts)
- [worker/src/features/batchAction/handleBatchActionJob.ts](https://github.com/langfuse/langfuse/blob/main/worker/src/features/batchAction/handleBatchActionJob.ts)
</details>

# Worker Service

## Overview

The Worker Service is a core backend component in Langfuse responsible for asynchronous processing of long-running tasks. It operates as a separate Node.js process that communicates with the main Langfuse server through message queues, primarily using BullMQ backed by Redis.

```mermaid
graph TB
    subgraph "Langfuse Server"
        A[API Endpoints]
        B[tRPC Routers]
    end
    
    subgraph "Redis"
        C[(Ingestion Queues)]
        D[(Evaluation Queues)]
        E[(Batch Action Queues)]
    end
    
    subgraph "Worker Service"
        F[Worker Manager]
        G[Evaluation Service]
        H[Batch Action Handler]
        I[Queue Processors]
    end
    
    A -->|"Enqueue Jobs"| C
    B -->|"Dispatch Tasks"| C
    F -->|"Process"| C
    G -->|"Execute"| D
    H -->|"Execute"| E
    C --> F
    D --> G
    E --> H
```

### Purpose and Scope

The Worker Service handles the following categories of work:

- **Event Ingestion Processing**: Processing and persisting trace events, observations, and spans from the ingestion queues
- **Evaluation Execution**: Running LLM-based evaluations on traces and observations
- **Batch Actions**: Executing bulk operations on datasets, traces, and other resources
- **Queue Replay**: Replaying historical ingestion events for data recovery or reprocessing

Source: [worker/src/app.ts]()

---

## Architecture

### Entry Point

The worker service is bootstrapped through `worker/src/index.ts`, which initializes the application context and starts the queue processors. The main application logic resides in `worker/src/app.ts`, which sets up the BullMQ workers and registers job handlers.

```typescript
// Simplified worker initialization flow
const app = new WorkerApp();
await app.initialize();
await app.start();
```

Source: [worker/src/index.ts]()

### Worker Manager

The `WorkerManager` is the central orchestrator that manages all queue workers. It is responsible for:

- Registering queue processors for different job types
- Configuring concurrency settings per queue
- Handling job failures and retries
- Graceful shutdown coordination

```mermaid
graph LR
    A[Job Enqueued] --> B[Worker Manager]
    B --> C{Job Type?}
    C -->|Ingestion| D[Ingestion Processor]
    C -->|Evaluation| E[Eval Service]
    C -->|Batch Action| F[Batch Handler]
    
    D --> G[Success]
    E --> G
    F --> G
    
    D --> H[Retry/Fail]
    E --> H
    F --> H
```

Source: [worker/src/queues/workerManager.ts]()

### Queue Architecture

Langfuse uses multiple queues for different purposes:

| Queue Name | Purpose | Priority | Concurrency |
|------------|---------|----------|-------------|
| `IngestionSecondaryQueue` | Standard event ingestion | Medium | Configurable |
| `OtelIngestionQueue` | OpenTelemetry-format events | High | Configurable |
| `EvalQueue` | LLM evaluation jobs | Low | Configurable |
| `BatchActionQueue` | Bulk operations | Medium | Configurable |

Source: [worker/src/app.ts]()
Source: [worker/src/queues/workerManager.ts]()

---

## Core Components

### Evaluation Service

The Evaluation Service (`evalService.ts`) handles asynchronous evaluation of traces and observations using LLM-based judges. It supports:

- **Trace-level evaluations**: Running multiple evaluation criteria against a complete trace
- **Observation-level evaluations**: Evaluating individual spans or generations
- **Dataset-based evaluations**: Running evaluations against dataset items
- **Configurable retry logic**: Handling transient failures gracefully

```mermaid
graph TD
    A[Evaluation Job Received] --> B[Load Trace/Observation]
    B --> C[Fetch Eval Config]
    C --> D[Prepare Prompt]
    D --> E[Call LLM Provider]
    E --> F{Success?}
    F -->|Yes| G[Score Result]
    F -->|No| H{Retry < 3?}
    H -->|Yes| E
    H -->|No| I[Mark Failed]
    G --> J[Persist Score]
    J --> K[Job Complete]
    I --> K
```

Source: [worker/src/features/evaluation/evalService.ts]()
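The retry branch in the flow above can be illustrated with a simplified synchronous loop. Real evaluation calls are asynchronous LLM requests; `callJudge` and the attempt limit here are stand-ins:

```typescript
// Simplified synchronous sketch of the retry loop in the evaluation flow above.
// Transient failures are retried until the attempt budget is exhausted.
function evaluateWithRetry(callJudge: () => number, maxAttempts = 3): number | null {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return callJudge(); // success: score result
    } catch {
      // transient failure: fall through and retry
    }
  }
  return null; // mark failed after exhausting retries
}
```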

### Batch Action Handler

The Batch Action Handler processes bulk operations requested through the admin API or scheduled jobs. Common batch actions include:

- Bulk dataset operations (import, export, deletion)
- Mass updates to traces or observations
- Batch score calculations
- Export operations

```typescript
interface BatchActionJob {
  id: string;
  type: BatchActionType;
  payload: BatchActionPayload;
  priority?: number;
  createdAt: Date;
}
```

Source: [worker/src/features/batchAction/handleBatchActionJob.ts]()

---

## Queue Processing

### Job Flow

```mermaid
sequenceDiagram
    participant API as API Server
    participant Redis as Redis Queue
    participant Worker as Worker Service
    
    API->>Redis: Enqueue Job
    Worker->>Redis: Poll for Jobs
    Redis-->>Worker: Available Jobs
    Worker->>Worker: Process Job
    Worker->>Redis: Mark Complete
    Worker->>API: Update Status (optional)
```

### Error Handling and Retries

The worker implements exponential backoff for transient failures:

1. **Transient failures** (network timeouts, rate limits): Retry up to 3 times with exponential backoff
2. **Permanent failures** (validation errors, missing data): Mark as failed, log error details
3. **Rate limiting**: Respects server-side rate limits and backs off accordingly

| Error Type | HTTP Code | Retry Behavior |
|------------|-----------|----------------|
| Rate Limited | 429 | Exponential backoff |
| Server Error | 5xx | Retry up to 3 times |
| Client Error | 4xx (not 429) | No retry, log and skip |
| Validation | N/A | No retry, mark failed |

Source: [worker/src/scripts/replayIngestionEventsV2/README.md]()

### Concurrency Configuration

Workers can be configured with different concurrency levels per queue:

```typescript
const workerConfig = {
  ingestion: {
    concurrency: 10,
    maxRetries: 3,
  },
  evaluation: {
    concurrency: 5,
    maxRetries: 2,
  },
  batchAction: {
    concurrency: 3,
    maxRetries: 1,
  },
};
```

Source: [worker/src/queues/workerManager.ts]()

---

## Utility Scripts

### Replay Ingestion Events V2

The replay script allows reprocessing of historical ingestion events from S3 storage.

**Usage:**
```bash
pnpm --filter=worker run replay-ingestion-events-v2 \
  --input ./events.csv \
  --batch-size 500 \
  --concurrency 4
```

**Key Features:**
- Checkpoint/resume support for interrupted runs
- Configurable batch size and concurrency
- Rate limiting with exponential backoff
- Progress tracking with percentage and ETA

| Parameter | Default | Description |
|-----------|---------|-------------|
| `--input` | Required | Path to CSV file with S3 keys |
| `--batch-size` | 500 | Keys per API request |
| `--concurrency` | 4 | Parallel API requests |
| `--rate-limit` | 50 | Max requests per second |
| `--dry-run` | false | Validate without sending |
| `--resume` | false | Continue from checkpoint |

Source: [worker/src/scripts/replayIngestionEventsV2/README.md]()

### Refill Queue Event

The refill script backfills queues with events from local files for testing and development.

**Requirements:**
1. Create `./worker/events.jsonl` with JSON events
2. Configure `.env` with Redis and supporting service credentials
3. Run: `pnpm run --filter=worker refill-queue-event`

**Event Format:**
```jsonl
{"projectId": "project-123", "orgId": "org-456"}
{"projectId": "project-789", "orgId": "org-101"}
```
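A hypothetical parser for this JSONL format, assuming one JSON object per non-empty line:

```typescript
// Illustrative JSONL parser for the event format shown above.
interface RefillEvent {
  projectId: string;
  orgId: string;
}

function parseEvents(jsonl: string): RefillEvent[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)       // skip blank lines
    .map((line) => JSON.parse(line) as RefillEvent);
}
```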

Source: [worker/src/scripts/refillQueueEvent/README.md]()

---

## Environment Configuration

### Required Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `REDIS_CONNECTION_STRING` | Redis connection URL | Yes |
| `LANGFUSE_S3_EVENT_UPLOAD_BUCKET` | S3 bucket for event uploads | Yes |
| `CLICKHOUSE_URL` | ClickHouse connection URL | Yes |
| `CLICKHOUSE_USER` | ClickHouse username | Yes |
| `CLICKHOUSE_PASSWORD` | ClickHouse password | Yes |
| `ADMIN_API_KEY` | Admin API key for replay scripts | For replay only |
| `LANGFUSE_HOST` | Target Langfuse instance URL | For replay only |

### Optional Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `WORKER_CONCURRENCY` | 10 | Default queue concurrency |
| `EVALUATION_CONCURRENCY` | 5 | Evaluation queue concurrency |
| `BATCH_ACTION_CONCURRENCY` | 3 | Batch action concurrency |

Source: [worker/src/scripts/refillQueueEvent/README.md]()
Source: [worker/src/app.ts]()

---

## Health and Monitoring

### Debug Mode

Detailed logging can be enabled for the AdvancedJsonViewer (a frontend UI component, not a worker setting; useful when debugging how worker-processed payloads render):

```javascript
localStorage.setItem("debug:AdvancedJsonViewer", "true");
```

### Logging Categories

The worker service emits structured logs for:

- Queue processing events
- Job duration and throughput
- Error rates and failure reasons
- Resource utilization metrics

### Graceful Shutdown

The worker service supports graceful shutdown to prevent job loss:

1. Stop accepting new jobs
2. Wait for in-progress jobs to complete (with timeout)
3. Release Redis connections
4. Exit cleanly
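The four phases can be sketched as an ordered sequencer. The real worker triggers this from SIGTERM/SIGINT and awaits asynchronous operations; this synchronous sketch and all names in it are illustrative, fixing only the ordering contract:

```typescript
// Illustrative sequencer for the four shutdown phases listed above.
function gracefulShutdown(phases: {
  stopIntake: () => void; // 1. stop accepting new jobs
  drainJobs: () => void;  // 2. wait for in-progress jobs (with timeout)
  closeRedis: () => void; // 3. release Redis connections
}): string[] {
  const log: string[] = [];
  phases.stopIntake();
  log.push("intake stopped");
  phases.drainJobs();
  log.push("jobs drained");
  phases.closeRedis();
  log.push("redis closed");
  log.push("exit"); // 4. exit cleanly
  return log;
}
```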

---

## Performance Considerations

### Scalability

- Multiple worker instances can run in parallel
- Each queue can have independent concurrency settings
- Redis acts as a shared job queue, enabling horizontal scaling

### Memory Management

| Strategy | Implementation |
|----------|----------------|
| Iterative Processing | Uses explicit stack-based iteration instead of recursion to prevent stack overflow |
| Batch Processing | Processes events in configurable batch sizes |
| Lazy Loading | Loads trace/observation data on-demand within job handlers |

### Known Limitations

These limitations apply to the AdvancedJsonViewer UI component used to inspect payloads, not to the worker service itself:
1. **No horizontal virtualization in UI**: Wide JSON rows render fully
2. **Client-side search only**: All matches computed in memory
3. **Memory constraints**: 1M+ nodes may cause issues despite virtualization
4. **Wrap mode performance**: Long strings require height measurement

Source: [web/src/components/ui/AdvancedJsonViewer/README.md]()

---

## Summary

The Worker Service is essential to Langfuse's architecture, providing asynchronous processing capabilities that keep the main API server responsive. Key takeaways:

- **BullMQ + Redis**: Provides reliable job queuing with built-in retry logic
- **Specialized processors**: Dedicated handlers for ingestion, evaluation, and batch actions
- **Configurable concurrency**: Fine-tune throughput per queue type
- **Replay capabilities**: Built-in utilities for reprocessing historical events
- **Horizontal scaling**: Multiple worker instances share work through Redis

---

<a id='api-layer'></a>

## API Layer

### Related Pages

Related topics: [System Architecture](#system-architecture)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [web/src/server/api/root.ts](https://github.com/langfuse/langfuse/blob/main/web/src/server/api/root.ts)
- [web/src/server/api/routers/traces.ts](https://github.com/langfuse/langfuse/blob/main/web/src/server/api/routers/traces.ts)
- [web/src/server/api/routers/scores.ts](https://github.com/langfuse/langfuse/blob/main/web/src/server/api/routers/scores.ts)
- [web/src/server/api/routers/observations.ts](https://github.com/langfuse/langfuse/blob/main/web/src/server/api/routers/observations.ts)
- [web/src/server/api/trpc.ts](https://github.com/langfuse/langfuse/blob/main/web/src/server/api/trpc.ts)
- [packages/shared/src/features/scores/interfaces/README.md](https://github.com/langfuse/langfuse/blob/main/packages/shared/src/features/scores/interfaces/README.md)
- [web/src/features/score-analytics/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/score-analytics/README.md)
- [web/src/features/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/features/README.md)
</details>

# API Layer

The API Layer is the central communication bridge between the Langfuse frontend and backend services. Built on **tRPC** (TypeScript RPC), it provides end-to-end type safety, enabling the web application to interact with server-side logic through strongly-typed procedure calls.

## Overview

The API Layer serves multiple critical functions:

- **Type-Safe Communication**: All API calls are fully typed from server to client
- **Authentication & Authorization**: Every procedure is wrapped with auth middleware
- **Business Logic Isolation**: Procedures delegate to repository layer for data access
- **Input Validation**: Zod schemas validate all incoming requests
- **Feature Organization**: Procedures are grouped by domain (traces, observations, scores)

```mermaid
graph TD
    subgraph Frontend
        UI[React Components]
    end
    
    subgraph API Layer
        TRPC[tRPC Client]
        Procedures[tRPC Procedures]
        Middleware[Auth Middleware]
    end
    
    subgraph Backend
        Repositories[Repositories]
        Database[(Database)]
    end
    
    UI --> TRPC
    TRPC --> Procedures
    Procedures --> Middleware
    Middleware --> Repositories
    Repositories --> Database
```

## Architecture

### Core Components

| Component | File | Purpose |
|-----------|------|---------|
| tRPC Instance | `trpc.ts` | Initialize tRPC with middleware and context |
| Root Router | `root.ts` | Register all feature routers |
| Trace Router | `routers/traces.ts` | Trace CRUD and query operations |
| Observation Router | `routers/observations.ts` | Span/generation/event operations |
| Score Router | `routers/scores.ts` | Score management and analytics |
| Score Analytics | `features/score-analytics/server/scoreAnalyticsRouter.ts` | Score aggregation and statistics |

### Router Registration Flow

The root router aggregates all feature routers under a namespace:

```typescript
// Simplified from web/src/server/api/root.ts
export const rootRouter = createTRPCRouter({
  trace: traceRouter,
  observation: observationRouter,
  score: scoreRouter,
  scoreAnalytics: scoreAnalyticsRouter,
  // ... other routers
});
```

Source: [web/src/server/api/root.ts:1-50]()

## tRPC Configuration

### Initialization

The tRPC instance is initialized in `trpc.ts` with:

1. **Context Creation**: Builds request-scoped context with authentication
2. **Middleware Chain**: Applies auth, rate limiting, and logging
3. **Error Handling**: Transforms errors into HTTP-compatible responses

```typescript
// From web/src/server/api/trpc.ts
export const createTRPCContext = async (opts: CreateNextContextOptions) => {
  return {
    session: await getServerSession(authOptions),
    // ... additional context
  };
};

const t = initTRPC.context<typeof createTRPCContext>().create();
```

Source: [web/src/server/api/trpc.ts:1-30]()

### Middleware Stack

| Middleware | Purpose |
|------------|---------|
| `isAuthed` | Validates user session and project access |
| `isProjectMember` | Ensures user belongs to the project scope |
| `isOwnerOrMember` | Allows owner or member roles |
| `rateLimit` | Prevents abuse with configurable limits |
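In the spirit of the table above, a minimal sketch of the auth guards. The real middleware lives in `web/src/server/api/trpc.ts` and raises typed `TRPCError`s; the plain functions and types below are illustrative:

```typescript
// Illustrative auth guards mirroring the middleware table above.
interface Session {
  userId: string;
  projectIds: string[];
}
interface Ctx {
  session: Session | null;
}

// isAuthed: reject requests without a valid session.
function isAuthed(ctx: Ctx): Ctx & { session: Session } {
  if (!ctx.session) {
    throw new Error("UNAUTHORIZED"); // tRPC raises a typed TRPCError here
  }
  return { ...ctx, session: ctx.session };
}

// isProjectMember: ensure the user belongs to the requested project scope.
function isProjectMember(ctx: Ctx & { session: Session }, projectId: string): void {
  if (!ctx.session.projectIds.includes(projectId)) {
    throw new Error("FORBIDDEN");
  }
}
```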

## API Routers

### Trace Router

Handles all trace-related operations including retrieval, creation, and updates.

**Key Procedures:**

| Procedure | Type | Description |
|-----------|------|-------------|
| `getById` | Query | Fetch single trace with full details |
| `list` | Query | Paginated trace listing with filters |
| `create` | Mutation | Create new trace record |
| `update` | Mutation | Update trace metadata/tags |
| `delete` | Mutation | Soft-delete trace |

Source: [web/src/server/api/routers/traces.ts:1-100]()

### Observation Router

Manages spans, generations, and events that belong to traces.

**Key Procedures:**

| Procedure | Type | Description |
|-----------|------|-------------|
| `getById` | Query | Fetch single observation |
| `list` | Query | List observations with trace/session filters |
| `create` | Mutation | Create observation linked to trace |
| `update` | Mutation | Update observation metadata |

Source: [web/src/server/api/routers/observations.ts:1-100]()

### Score Router

Provides score management with support for multiple API versions (v1 and v2).

**API Versioning Strategy:**

| Version | traceId Required | Session Support | Dataset Run Support |
|---------|------------------|-----------------|---------------------|
| v1 | Yes | No | No |
| v2 | Optional | Yes | Yes |

The Score router supports both trace-level and session-level scores through different API versions.

**Key Procedures:**

| Procedure | Type | Description |
|-----------|------|-------------|
| `create` | Mutation | Create score (POST endpoint) |
| `delete` | Mutation | Delete score |
| `getById` | Query | Fetch single score |
| `list` | Query | List scores with filters |

Source: [web/src/server/api/routers/scores.ts:1-100]()

### Score Analytics Router

Provides aggregated statistics and time-series data for scores.

```typescript
// From web/src/features/score-analytics/server/scoreAnalyticsRouter.ts
export const scoreAnalyticsRouter = createTRPCRouter({
  timeSeries: protectedProcedure.query(...),
  statistics: protectedProcedure.query(...),
  heatmapData: protectedProcedure.query(...),
});
```

**Key Procedures:**

| Procedure | Type | Description |
|-----------|------|-------------|
| `timeSeries` | Query | Time-series score data with gap filling |
| `statistics` | Query | Statistical summaries (count, mean, p50/p95/p99) |
| `heatmapData` | Query | Heatmap matrix for visualization |

Source: [web/src/features/score-analytics/README.md]()

## Request Flow

```mermaid
sequenceDiagram
    participant Client
    participant TRPC as tRPC Server
    participant Middleware
    participant Router
    participant Repository
    participant DB as Database

    Client->>TRPC: Procedure Call
    TRPC->>Middleware: Apply Chain
    Middleware->>Middleware: Auth Check
    Middleware->>Router: Validated Input
    Router->>Repository: Domain Operation
    Repository->>DB: SQL/Query
    DB-->>Repository: Result
    Repository-->>Router: Domain Object
    Router-->>TRPC: Response
    TRPC-->>Client: Typed Response
```

## Input Validation

All procedures use **Zod schemas** for runtime validation:

```typescript
// Example pattern from routers
const createTraceSchema = z.object({
  name: z.string().optional(),
  userId: z.string().optional(),
  metadata: z.record(z.unknown()).optional(),
  tags: z.array(z.string()).optional(),
});

protectedProcedure
  .input(createTraceSchema)
  .mutation(async ({ input, ctx }) => {
    return ctx.repo.trace.create(input);
  });
```

## Type Flow

The API Layer maintains type consistency across the stack:

```mermaid
graph LR
    Client[Client Input] --> InputZ[Zod Schema]
    InputZ --> InputTS[TypeScript Type]
    InputTS --> Handler[Procedure Handler]
    Handler --> Repo[Repository Return]
    Repo --> OutputZ[Zod Response Schema]
    OutputZ --> OutputTS[API Response Type]
    OutputTS --> ClientResponse[Client]
```

**Type Transformation Points:**

| Stage | Location | Purpose |
|-------|----------|---------|
| Input | Routers | Zod validation + type inference |
| Domain | Repositories | Database models to domain objects |
| Output | Routers | Zod response schema validation |
| Client | React Hooks | Full type safety for UI |

## Score Interface Architecture

Scores have a multi-layer type system:

| Layer | Location | Purpose |
|-------|----------|---------|
| API v1 | `interfaces/api/v1/` | Legacy trace-only scores |
| API v2 | `interfaces/api/v2/` | Current with session/dataset support |
| Application | `interfaces/application/` | Internal validation |
| UI | `interfaces/ui/` | Simplified frontend types |

Source: [packages/shared/src/features/scores/interfaces/README.md]()

## Public API Extension

The Langfuse Public API extends the internal API layer for external consumption:

The pattern for adding new public API routes, from web/src/features/README.md:

1. Wrap the handler with `withMiddleware`
2. Create a type-safe route with `createAuthedAPIRoute`
3. Add Zod types to `/features/public-api/types`
4. Use `coerce` for date handling
5. Use `strict()` on response objects

**SDK Generation Pipeline:**

```mermaid
graph TD
    Fern[Fern Definition] --> PythonSDK[Python SDK]
    Fern --> JSSDK[JS/TS SDK]
    Fern --> Docs[API Documentation]
```

## Best Practices

### Procedure Design

- Use `protectedProcedure` for authenticated endpoints
- Apply input validation at the procedure level
- Return consistent response structures
- Handle errors with typed error classes

### Error Handling

| Error Type | HTTP Code | Usage |
|------------|-----------|-------|
| `UNAUTHORIZED` | 401 | Missing/invalid session |
| `FORBIDDEN` | 403 | Insufficient permissions |
| `NOT_FOUND` | 404 | Resource doesn't exist |
| `BAD_REQUEST` | 400 | Invalid input |
| `INTERNAL_SERVER_ERROR` | 500 | Unexpected errors |
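The mapping above, expressed as a lookup for illustration (these are standard tRPC error codes and their conventional HTTP statuses):

```typescript
// Error-code to HTTP-status lookup mirroring the table above.
const TRPC_HTTP_STATUS = {
  UNAUTHORIZED: 401,
  FORBIDDEN: 403,
  NOT_FOUND: 404,
  BAD_REQUEST: 400,
  INTERNAL_SERVER_ERROR: 500,
} as const;

type TrpcErrorCode = keyof typeof TRPC_HTTP_STATUS;

function httpStatus(code: TrpcErrorCode): number {
  return TRPC_HTTP_STATUS[code];
}
```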

### Performance Considerations

- Use cursor-based pagination for large datasets
- Leverage repository-level caching where applicable
- Batch database operations in mutations
- Limit response sizes with maxTake parameters
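Cursor-based pagination can be sketched in memory to show the contract. Real routers execute this as a database query; the helper below is an assumption, not Langfuse code:

```typescript
// In-memory sketch of cursor-based pagination over an id-sorted collection.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

function paginate<T extends { id: string }>(
  sorted: T[],           // assumed sorted ascending by id
  cursor: string | null, // exclusive: return items after this id
  take: number,          // page size; real routers cap this with a maxTake
): Page<T> {
  const start = cursor === null ? 0 : sorted.findIndex((t) => t.id === cursor) + 1;
  const items = sorted.slice(start, start + take);
  const last = items[items.length - 1];
  const nextCursor = start + take < sorted.length && last ? last.id : null;
  return { items, nextCursor };
}
```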

---

---

## Doramagic Pitfall Log

Project: langfuse/langfuse

Summary: 17 potential pitfall items found, 4 of them high/blocking. Highest priority: installation pitfall, source evidence: bug: Using client with context manager breaks the scoring.

## 1. Installation Pitfall · Source evidence: bug: Using client with context manager breaks the scoring

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: bug: Using client with context manager breaks the scoring
- User impact: may increase the cost of first trials and production onboarding for new users.
- Suggested check: the source issue is still open; the Pack Agent should re-verify whether it still affects the current version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_5afee24537ba47369cc4621f7fb18122 | https://github.com/langfuse/langfuse/issues/8138 | the source discussion mentions python-related conditions; re-verify before installing/trying.

## 2. Installation Pitfall · Source evidence: bug: unnamed trace name in Langfuse UI

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: bug: unnamed trace name in Langfuse UI
- User impact: may increase the cost of first trials and production onboarding for new users.
- Suggested check: the source issue is still open; the Pack Agent should re-verify whether it still affects the current version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_a219c99fe99c4b7dab002e2b3a6296c2 | https://github.com/langfuse/langfuse/issues/13416 | the source discussion mentions node-related conditions; re-verify before installing/trying.

## 3. Security/Permissions Pitfall · Source evidence: bug: AsyncStream' object has no attribute 'usage' when integrated with Semantic Kernel and Openlit

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified security/permissions-related issue in this project: bug: AsyncStream' object has no attribute 'usage' when integrated with Semantic Kernel and Openlit
- User impact: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source issue is still open; the Pack Agent should re-verify whether it still affects the current version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_8657d86702904e90b9d448770e618256 | https://github.com/langfuse/langfuse/issues/8173 | unverified usage conditions surfaced by source type github_issue.

## 4. Security/Permissions Pitfall · Source evidence: bug: Worker shutdown takes ~1 hour in self hosted kubernetes

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified security/permissions-related issue in this project: bug: Worker shutdown takes ~1 hour in self hosted kubernetes
- User impact: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source issue is still open; the Pack Agent should re-verify whether it still affects the current version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_cff1b1d1a1ca4eb892563c33d3aa62e9 | https://github.com/langfuse/langfuse/issues/8156 | the source discussion mentions python-related conditions; re-verify before installing/trying.

## 5. Installation Pitfall · Source evidence: bug: Socket timeout. Expecting data, but didn't receive any in 30000ms on idle BullMQ queues

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: bug: Socket timeout. Expecting data, but didn't receive any in 30000ms on idle BullMQ queues
- User impact: may increase the cost of first trials and production onboarding for new users.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_49a69075a1c346789a28db93c9ec6f3f | https://github.com/langfuse/langfuse/issues/13601 | the source discussion mentions node-related conditions; re-verify before installing/trying.

## 6. Installation Pitfall · Source evidence: v3.169.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: v3.169.0
- User impact: may increase the cost of first trials and production onboarding for new users.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_864f213fd7694eba9a4d2fe2bb9267ab | https://github.com/langfuse/langfuse/releases/tag/v3.169.0 | the source discussion mentions npm-related conditions; re-verify before installing/trying.

## 7. Installation Pitfall · Source evidence: v3.172.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: v3.172.0
- User impact: may affect upgrades, migration, or version selection.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_14588986ba9945eeb40cbc0508e3fed0 | https://github.com/langfuse/langfuse/releases/tag/v3.172.0 | the source discussion mentions npm-related conditions; re-verify before installing/trying.

## 8. Installation Pitfall · Source evidence: v3.173.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: v3.173.0
- User impact: may affect upgrades, migration, or version selection.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_7560a954846b4f35aedb74de1291c9a4 | https://github.com/langfuse/langfuse/releases/tag/v3.173.0 | the source discussion mentions docker-related conditions; re-verify before installing/trying.

## 9. Capability Pitfall · Capability assessment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: if the assumption does not hold, users will not get the promised capability.
- Suggested check: convert the assumptions into a downstream verification checklist.
- Safeguard: assumptions must become verification items; they must not be stated as fact before verification results exist.
- Evidence: capability.assumptions | github_repo:642497346 | https://github.com/langfuse/langfuse | README/documentation is current enough for a first validation pass.

## 10. Maintenance Pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: new, abandoned, and active projects get mixed together, lowering trust in recommendations.
- Suggested check: collect recent GitHub commit, release, and issue/PR response signals.
- Safeguard: while maintenance activity is unknown, recommendation strength must not be marked high-trust.
- Evidence: evidence.maintainer_signals | github_repo:642497346 | https://github.com/langfuse/langfuse | last_activity_observed missing

## 11. Security/Permissions Pitfall · Downstream validation found a risk item

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: downstream has already requested review; this must not be downplayed on the page.
- Suggested check: enter the security/permissions governance review queue.
- Safeguard: while downstream risks exist, the review/recommendation downgrade must be kept in place.
- Evidence: downstream_validation.risk_items | github_repo:642497346 | https://github.com/langfuse/langfuse | no_demo; severity=medium

## 12. Security/Permissions Pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: the risk affects whether the project is suitable for ordinary users to install.
- Suggested check: write the risk into the boundary card and confirm whether manual review is needed.
- Safeguard: scoring risks must go into the boundary card, not remain an internal score only.
- Evidence: risks.scoring_risks | github_repo:642497346 | https://github.com/langfuse/langfuse | no_demo; severity=medium

## 13. Security/Permissions Pitfall · Source evidence: v3.168.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified security/permissions-related issue in this project: v3.168.0
- User impact: may affect upgrades, migration, or version selection.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_202b4e8c2c1f4b3790315098d1530297 | https://github.com/langfuse/langfuse/releases/tag/v3.168.0 | the source discussion mentions node-related conditions; re-verify before installing/trying.

## 14. Security/Permissions Pitfall · Source evidence: v3.170.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified security/permissions-related issue in this project: v3.170.0
- User impact: may block installation or first run.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_9ed6f994e1424878aa4559a73d72fc52 | https://github.com/langfuse/langfuse/releases/tag/v3.170.0 | the source discussion mentions docker-related conditions; re-verify before installing/trying.

## 15. Security/Permissions Pitfall · Source evidence: v3.174.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified security/permissions-related issue in this project: v3.174.0
- User impact: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source suggests a fix, workaround, or version change may exist; the manual must annotate the applicable version.
- Safeguard: do not amplify this into a definitive conclusion detached from the source link; annotate the applicable version and review status.
- Evidence: community_evidence:github | cevd_f9cb7b7232ff4cce96f0c020fe48c7f4 | https://github.com/langfuse/langfuse/releases/tag/v3.174.0 | the source discussion mentions api-key-related conditions; re-verify before installing/trying.

## 16. Maintenance Pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: users cannot tell whether anyone will respond when they hit problems.
- Suggested check: sample recent issues/PRs to see whether they go unaddressed long-term.
- Safeguard: while issue/PR responsiveness is unknown, the maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:642497346 | https://github.com/langfuse/langfuse | issue_or_pr_quality=unknown

## 17. Maintenance Pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: install commands and docs may lag behind the code, raising the chance users hit pitfalls.
- Suggested check: confirm that the latest release/tag and the README install commands agree.
- Safeguard: while release cadence is unknown or stale, install instructions must be annotated as possibly drifted.
- Evidence: evidence.maintainer_signals | github_repo:642497346 | https://github.com/langfuse/langfuse | release_recency=unknown

<!-- canonical_name: langfuse/langfuse; human_manual_source: deepwiki_human_wiki -->
