Doramagic Project Pack · Human Manual

langfuse

Related topics: System Architecture

Project Introduction

Langfuse is an open-source observability and analytics platform designed for Large Language Model (LLM) applications. It provides comprehensive tracing, evaluation, and prompt management capabilities that enable developers to monitor, debug, and optimize their AI-powered applications.

Overview

Langfuse serves as a centralized platform for capturing and analyzing interactions between AI models and end users. The project is MIT licensed and supports both cloud-hosted and self-hosted deployment models, making it accessible for teams of various sizes and requirements.

Core Purpose

The platform addresses several critical needs in LLM application development:

  • Observability: Track and visualize traces, observations, and model interactions in real-time
  • Evaluation: Measure and analyze AI application performance through configurable scoring systems
  • Prompt Management: Create, version, and manage prompts with support for complex dependency resolution
  • Collaboration: Enable team collaboration through commenting and sharing features
  • Analytics: Provide insights into AI application behavior through comprehensive analytics dashboards

High-Level Architecture

Langfuse follows a modern microservices-inspired architecture with clear separation between frontend, backend processing, and data storage components.

```mermaid
graph TD
    subgraph Frontend["Frontend (Next.js/React)"]
        UI[User Interface]
        DesignSystem[Design System Components]
        FeatureFlags[Feature Flags]
    end

    subgraph Backend["Backend Services"]
        API[API Server]
        MCP[MCP Server]
        Worker[Worker/Queue Processing]
    end

    subgraph Storage["Data Layer"]
        Postgres[(PostgreSQL)]
        ClickHouse[(ClickHouse)]
        Redis[(Redis)]
        S3[(S3 Storage)]
    end

    UI --> API
    MCP --> API
    Worker --> API
    API --> Postgres
    API --> ClickHouse
    API --> Redis
    API --> S3
```

Technology Stack

Langfuse is built using a modern technology stack optimized for performance and developer experience.

Frontend Stack

| Component | Technology | Purpose |
|---|---|---|
| Framework | Next.js | Server-side rendering and routing |
| UI Library | React | Component-based UI development |
| State Management | React Context + Hooks | Local and global state |
| Virtualization | TanStack Virtual | Efficient rendering of large lists |
| Styling | Tailwind CSS + cva | Utility-first styling with variant handling |
| Forms | Zod | Schema validation |
| Data Fetching | tRPC | Type-safe API communication |

Sources: web/src/components/design-system/README.md

Backend Stack

| Component | Technology | Purpose |
|---|---|---|
| Runtime | Node.js/TypeScript | Server-side logic |
| Database | PostgreSQL | Primary data storage |
| Analytics | ClickHouse | High-performance analytics queries |
| Cache | Redis | Caching and queue management |
| Queue | BullMQ | Background job processing |
| Storage | S3-compatible | File and event storage |

Key Frontend Components

The frontend architecture is organized around several key systems:

#### Design System

The design system (web/src/components/design-system/) provides reusable, primitive UI components following strict principles:

  • Presentational only: No business logic in components
  • Explicit, typed APIs: Strict TypeScript definitions
  • No className/style props: Prevents style leakage
  • cva for variants: Consistent variant handling
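
The variant pattern the design system relies on can be sketched without the library. The snippet below is illustrative only: it re-implements the core idea of cva (class-variance-authority) in a few lines; the variant names and class strings are invented for the example, and the real design system uses the library itself.

```typescript
// Illustrative only: a minimal variant resolver in the spirit of cva.
// "intent"/"size" and the class strings are invented for this sketch.
type ButtonVariants = {
  intent: "primary" | "secondary";
  size: "sm" | "md";
};

const base = "inline-flex items-center rounded";
const variantClasses: { [K in keyof ButtonVariants]: Record<string, string> } = {
  intent: {
    primary: "bg-blue-600 text-white",
    secondary: "bg-gray-100 text-gray-900",
  },
  size: { sm: "px-2 py-1 text-sm", md: "px-4 py-2" },
};

function buttonClass(v: ButtonVariants): string {
  // Concatenate the base classes with one class string per variant axis
  return [base, variantClasses.intent[v.intent], variantClasses.size[v.size]].join(" ");
}

console.log(buttonClass({ intent: "primary", size: "sm" }));
// → "inline-flex items-center rounded bg-blue-600 text-white px-2 py-1 text-sm"
```

Because every variant axis maps to exactly one class string, components get an explicit, typed API without exposing className or style props.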
```mermaid
graph LR
    A[Design System] --> B[Button]
    A --> C[Input]
    A --> D[Modal]
    B --> E[Consistent Styling]
    C --> E
    D --> E
```

Sources: web/src/components/design-system/README.md

#### Layout System

All pages use a standardized Page wrapper component that ensures:

  • Consistent layout structure
  • Sticky header behavior
  • Proper scroll management ("content-scroll" or "page-scroll")
  • Breadcrumb navigation support
  • Custom header actions
```mermaid
graph TD
    Page[Page Component] --> Header[Sticky Header]
    Page --> Content[Scrollable Content]
    Header --> Breadcrumb[Breadcrumb Navigation]
    Header --> Actions[Action Buttons]
```

Sources: web/src/components/layouts/README.md

#### JSON Viewer Component

The AdvancedJsonViewer component provides efficient rendering of large JSON datasets:

  • Virtualization: Uses TanStack Virtual for row-based rendering
  • Iterative algorithms: Explicit stack-based iteration to prevent stack overflow
  • Client-side search: In-memory matching with binary search navigation
  • Theme support: Customizable JSON syntax highlighting
```mermaid
graph TD
    Input[Large JSON Data] --> Parser[JSON Parser]
    Parser --> TreeBuilder[Tree Builder]
    TreeBuilder --> Virtualizer[TanStack Virtual]
    Virtualizer --> Renderer[Row Renderer]

    Search[Search Query] --> Matcher[In-Memory Matcher]
    Matcher --> Navigator[Binary Search Navigator]
```

Sources: web/src/components/ui/AdvancedJsonViewer/README.md

Core Features

Tracing and Observability

Langfuse provides comprehensive tracing capabilities that capture the full lifecycle of AI interactions:

  • Traces: Complete request/response cycles
  • Observations: Individual components within a trace (spans, events, generations)
  • Metadata: Custom metadata attachment for context
  • Tree Structure: Hierarchical representation of nested observations

The tree-building system uses iterative algorithms to handle millions of observations without stack overflow:

```typescript
// Iterative traversal pattern
interface TreeNode {
  children: TreeNode[];
}

function traverse(rootNode: TreeNode, process: (node: TreeNode) => void) {
  const stack: TreeNode[] = [rootNode];
  while (stack.length > 0) {
    const node = stack.pop()!;
    process(node);
    node.children.forEach((child) => stack.push(child));
  }
}
```

Sources: web/src/components/trace/lib/tree-building.clienttest.ts

Prompt Management

Langfuse supports sophisticated prompt management with dependency resolution:

  • Prompt Stacking: Compose prompts from multiple sources
  • Dependency Tags: Reference other prompts using @@@langfusePrompt:...@@@ syntax
  • Resolution Modes:
      • getPromptResolved: Returns fully resolved prompt with dependencies inlined
      • getPromptUnresolved: Returns raw prompt with tags preserved for analysis
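
The resolved/unresolved distinction can be sketched with a few lines of string handling. Only the `@@@langfusePrompt:...@@@` wrapper comes from the docs above; the inner `name=...|version=...` layout, the `registry` lookup, and both helper names are assumptions for illustration, and real resolution would also handle nested references.

```typescript
// Sketch of dependency-tag extraction and inlining. The tag payload format
// ("name=...|version=...") and the registry shape are assumed for illustration.
const TAG_RE = /@@@langfusePrompt:(.+?)@@@/g;

function extractPromptDependencies(prompt: string): string[] {
  // Unresolved mode: tags stay in place, but we can still list them
  return [...prompt.matchAll(TAG_RE)].map((m) => m[1]);
}

function resolvePrompt(prompt: string, registry: Record<string, string>): string {
  // Resolved mode: inline each referenced prompt body
  return prompt.replace(TAG_RE, (_, ref) => registry[ref] ?? `<missing:${ref}>`);
}

const registry = { "name=greeting|version=1": "Hello, {{user}}!" };
const raw = "@@@langfusePrompt:name=greeting|version=1@@@ How can I help?";
console.log(extractPromptDependencies(raw)); // → ["name=greeting|version=1"]
console.log(resolvePrompt(raw, registry)); // → "Hello, {{user}}! How can I help?"
```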
```mermaid
graph LR
    A[Prompt A] -->|references| B[Prompt B]
    A -->|references| C[Prompt C]
    B -->|references| D[Prompt D]
    A -->|resolved| E[Final Prompt]
```

Sources: web/src/features/mcp/README.md

Score Analytics

The scoring system enables quantitative evaluation of AI application performance:

  • Multiple Score Types: Supports numeric, categorical, and boolean scores
  • Time Series Analysis: Track score changes over configurable intervals
  • Distribution Analysis: Visualize score distributions with bins and categories
  • Comparison Mode: Compare two scores side-by-side

The analytics layer provides interpretive functions for common metrics:

| Metric | Interpretation | Threshold |
|---|---|---|
| Agreement (Cohen's Kappa) | Excellent | ≥ 0.9 |
| Agreement (Cohen's Kappa) | Good | ≥ 0.8 |
| Agreement (Cohen's Kappa) | Fair | ≥ 0.6 |
| Agreement (Cohen's Kappa) | Poor | ≥ 0.4 |
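
As a sketch, an interpretive helper for these bands reduces to a cascade of threshold checks. This is not the actual code in statistics-utils.ts; the function name and the "Very poor" label for values below every listed threshold are assumptions.

```typescript
// Maps a Cohen's Kappa value to the interpretation bands in the table above.
// Function name and the fall-through label are illustrative.
function interpretKappa(kappa: number): string {
  if (kappa >= 0.9) return "Excellent";
  if (kappa >= 0.8) return "Good";
  if (kappa >= 0.6) return "Fair";
  if (kappa >= 0.4) return "Poor";
  return "Very poor"; // below every listed threshold
}

console.log(interpretKappa(0.85)); // → "Good"
```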

Sources: web/src/features/score-analytics/lib/statistics-utils.ts

Entitlements System

Access control is managed through a hierarchical entitlements system:

```mermaid
graph TD
    Plan[Plan] -->|contains| Entitlements[Entitlements]
    Plan -->|contains| Limits[Entitlement Limits]

    Entitlements -->|grants| Features[Feature Access]
    Limits -->|restricts| Resources[Resource Quotas]

    PlanTypes[Plan Types] --> OSS[OSS]
    PlanTypes --> CloudPro[Cloud Pro]
    PlanTypes --> SelfHostedEnterprise[Self-Hosted Enterprise]
```

Available entitlements include:

  • Feature Flags: Enable/disable features via useIsFeatureEnabled hook
  • Entitlement Limits: Quotas on resources (e.g., annotation queue count)
  • Plan-based Access: Cloud and self-hosted enterprise plans

Sources: web/src/features/entitlements/README.md

Feature Flags

Feature flags control feature availability dynamically:

const isFeatureEnabled = useIsFeatureEnabled("feature-flag-name");

A feature flag is enabled when:

  1. Flag is in user's feature_flags list
  2. LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES environment variable is set
  3. User has admin privileges
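
The three conditions above can be sketched as a single predicate, assuming they are OR'ed (any one suffices). The real logic lives behind the useIsFeatureEnabled hook; the context shape here is invented for illustration.

```typescript
// Sketch of the enablement rule above, assuming the conditions are OR'ed.
interface FlagContext {
  userFeatureFlags: string[]; // the user's feature_flags list
  experimentalFeaturesEnabled: boolean; // LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES set
  isAdmin: boolean;
}

function isFeatureEnabled(flag: string, ctx: FlagContext): boolean {
  return (
    ctx.userFeatureFlags.includes(flag) ||
    ctx.experimentalFeaturesEnabled ||
    ctx.isAdmin
  );
}

console.log(
  isFeatureEnabled("my-flag", {
    userFeatureFlags: ["my-flag"],
    experimentalFeaturesEnabled: false,
    isAdmin: false,
  }),
); // → true
```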

Sources: web/src/features/feature-flags/README.md

Collaboration Features

Langfuse includes team collaboration capabilities:

  • Mention Parser: Extract and resolve user mentions in comments
  • User References: Syntax @Display Name for linking users
  • Sanitization: Clean user-generated content for safe display
```typescript
// Mention format: @[Alice](user:alice123)
// Parser extracts: alice123
```
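
Extraction of the user id from that format comes down to a regular expression over the comment text. This is a sketch of the idea, not the actual mentionParser implementation.

```typescript
// Illustrative parser for the @[Display Name](user:id) mention format above.
const MENTION_RE = /@\[([^\]]+)\]\(user:([^)]+)\)/g;

function extractMentionedUserIds(text: string): string[] {
  // Capture group 2 is the user id; group 1 is the display name
  return [...text.matchAll(MENTION_RE)].map((m) => m[2]);
}

console.log(extractMentionedUserIds("Thanks @[Alice](user:alice123)!")); // → ["alice123"]
```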

Sources: web/src/features/comments/lib/mentionParser.clienttest.ts

Data Flow Architecture

Ingestion Pipeline

Events flow through the system in a structured pipeline:

```mermaid
graph LR
    S3[S3 Event Storage] --> Worker[Worker Processing]
    Worker -->|Standard| IngestionQueue[IngestionSecondaryQueue]
    Worker -->|OTEL| OtelQueue[OtelIngestionQueue]
    IngestionQueue --> Postgres[(PostgreSQL)]
    OtelQueue --> ClickHouse[(ClickHouse)]
```

Event processing includes:

  • Checkpointing: Resume from failures using .checkpoint files
  • Rate Limiting: Client-side and server-side throttling
  • Retry Logic: Exponential backoff with jitter for transient failures
  • Error Logging: Failed events appended to errors.csv
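
"Exponential backoff with jitter" can be sketched as a small delay function. The base delay, cap, and "full jitter" flavor below are assumptions for illustration; the worker's concrete retry parameters are not documented here.

```typescript
// One way to compute retry delays with exponential backoff and full jitter.
// baseMs/maxMs values are illustrative, not the worker's actual settings.
function backoffDelayMs(
  attempt: number, // 0-based retry attempt
  baseMs = 500,
  maxMs = 30_000,
  random: () => number = Math.random,
): number {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt); // exponential growth, capped
  return random() * exp; // "full jitter": uniform in [0, exp)
}
```

Injecting the random source makes the function deterministic in tests; jitter spreads out retries so many failing clients do not hammer the server in lockstep.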

Sources: worker/src/scripts/replayIngestionEventsV2/README.md

API Architecture

Langfuse uses tRPC for type-safe API communication:

  • Server-side validation: Zod schemas for input validation
  • tRPC routers: Organized endpoint handlers
  • API versioning: V1 (legacy) and V2 (current) API support
| API Version | GET Support | Notes |
|---|---|---|
| V1 | traceId required | Legacy, trace-focused |
| V2 | traceId optional | Adds sessionId support |
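
The V1-vs-V2 difference can be sketched as two parsers over the GET parameters: V1 rejects requests without traceId, V2 accepts either identifier. Everything beyond the traceId/sessionId fields (type names, error message) is invented for the example.

```typescript
// Sketch of the version difference described in the table above.
interface ScoreQueryV1 { traceId: string }
interface ScoreQueryV2 { traceId?: string; sessionId?: string }

function parseV1Query(params: Record<string, string>): ScoreQueryV1 {
  // V1 is trace-focused: traceId is mandatory
  if (!params.traceId) throw new Error("traceId is required in the V1 API");
  return { traceId: params.traceId };
}

function parseV2Query(params: Record<string, string>): ScoreQueryV2 {
  // V2 relaxes the requirement and adds sessionId support
  return { traceId: params.traceId, sessionId: params.sessionId };
}
```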

Sources: packages/shared/src/features/scores/interfaces/README.md

MCP Server Architecture

The Model Context Protocol (MCP) server provides external access to Langfuse:

  • Stateless per-request: Fresh server instance for each request
  • Context via closures: Authentication captured in handler closures
  • No session storage: Request-disposable architecture
```mermaid
graph TD
    Request[MCP Request] --> Instance[New Server Instance]
    Instance --> Auth[Auth Context Closure]
    Auth --> Handler[Request Handler]
    Handler --> Response[Response]
    Response --> Discard[Instance Discarded]
```

Sources: web/src/features/mcp/README.md

Filtering System

Langfuse implements a sophisticated filtering system with type-safe encoding:

Filter Types

| Type | Description | Example |
|---|---|---|
| string | Simple string matching | Trace name |
| number | Numeric comparison | Latency values |
| datetime | Date/time filtering | Time ranges |
| boolean | True/false matching | Flag states |
| arrayOptions | Multi-value selection | Tags |
| categoryOptions | Categorical filtering | Status values |
| positionInTrace | Nested location | Span hierarchy |

State Management

Filters support multiple storage locations:

  • URL: Persisted in query parameters
  • Session Storage: In-memory per session
  • Peek Context: Temporary state for preview panels
```typescript
const filterOptions: UseSidebarFilterStateOptions = {
  stateLocation: "urlAndSessionStorage",
  sessionFilterContextId: projectId,
  implicitDefaultConfig: DEFAULT_SIDEBAR_IMPLICIT_ENVIRONMENT_CONFIG,
};
```

Sources: web/src/components/table/peek/README.md

Deployment Models

Langfuse supports multiple deployment configurations:

| Model | Plan Options | Authentication | License |
|---|---|---|---|
| Cloud | cloud:pro | JWT via NextAuth | Proprietary |
| Self-Hosted | self-hosted:enterprise | JWT + License Key | Proprietary |
| Open Source | oss | Basic auth | MIT |

Self-Hosted Configuration

Self-hosted deployments require:

  • PostgreSQL database
  • ClickHouse for analytics
  • Redis for caching/queues
  • S3-compatible storage for events
  • License key for enterprise features

Sources: web/src/features/entitlements/README.md

Performance Considerations

Large Dataset Handling

Langfuse handles large datasets through several mechanisms:

| Scale | Mechanism | Notes |
|---|---|---|
| 10k+ rows | TanStack Virtual | Row-based rendering |
| 1M+ nodes | Iterative algorithms | No stack overflow |
| 10k+ search matches | Binary search | Efficient navigation |
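
The binary-search navigation row can be made concrete: given the sorted row indices of all search matches, the next match at or after the current row is found in O(log n), so jumping between 10k+ matches stays cheap. The function below is an illustrative sketch, not the viewer's actual code.

```typescript
// Sketch: find the first match at or after fromRow via lower-bound binary search.
function nextMatchIndex(sortedMatches: number[], fromRow: number): number {
  let lo = 0;
  let hi = sortedMatches.length; // invariant: answer lies in [lo, hi)
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sortedMatches[mid] < fromRow) lo = mid + 1;
    else hi = mid;
  }
  // Wrap around to the first match when past the last one; -1 if no matches
  return sortedMatches.length === 0 ? -1 : lo % sortedMatches.length;
}

const matches = [3, 17, 42, 99];
console.log(matches[nextMatchIndex(matches, 20)]); // → 42
```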

Known Limitations

  • No horizontal virtualization for wide rows
  • Client-side search only (can be slow with many matches)
  • Memory constraints at 1M+ nodes
  • Read-only JSON viewer (no inline editing)
  • Wrap mode may cause layout thrashing

Sources: web/src/components/ui/AdvancedJsonViewer/README.md

Development Guidelines

Testing

The project uses Jest with custom extensions:

| File Pattern | Purpose | Location |
|---|---|---|
| .clienttest.ts | Client-side tests | Colocated with components |
| *.test.ts | Standard unit tests | __tests__ or inline |

```shell
# Run client tests
pnpm --filter=web run test-client --testPathPattern="ComponentName"
```

Debugging

Enable debug logging via localStorage:

localStorage.setItem("debug:ComponentName", "true");

Summary

Langfuse is a comprehensive observability platform that bridges the gap between AI application development and operational monitoring. Its modular architecture, built on proven technologies like PostgreSQL, ClickHouse, and React, provides a scalable foundation for teams to understand, evaluate, and optimize their LLM applications.

The platform's emphasis on type safety, performance optimization, and developer experience makes it suitable for both small development teams and large-scale enterprise deployments.

Sources: [web/src/components/design-system/README.md](https://github.com/langfuse/langfuse/blob/main/web/src/components/design-system/README.md)

Project Structure

Related topics: Monorepo Configuration, System Architecture


Overview

Langfuse is a comprehensive observability and analytics platform for LLM applications, structured as a monorepo using pnpm workspaces and Turborepo for build orchestration. The repository contains multiple packages including the frontend web application, backend worker services, and shared libraries.

Monorepo Architecture

Workspace Configuration

Langfuse uses pnpm workspaces defined in pnpm-workspace.yaml to manage multiple packages within a single repository.

packages:
  - "packages/*"
  - "web"
  - "worker"

Sources: pnpm-workspace.yaml

Build System (Turborepo)

The project uses Turborepo for efficient incremental builds and task caching. The turbo configuration defines the build pipeline and dependencies between packages.

Sources: turbo.json

Package Structure

1. Web Application (`/web`)

The frontend application built with Next.js and React, containing the user interface and client-side logic.

| Directory | Purpose |
|---|---|
| src/components/ | Reusable UI components including layouts, tables, and specialized viewers |
| src/features/ | Feature-specific modules with their own logic and components |
| src/lib/ | Utility functions and helpers |
| src/hooks/ | Custom React hooks |

Sources: web/package.json

2. Worker Service (`/worker`)

Backend service handling background processing, event ingestion, and queue management.

| Directory | Purpose |
|---|---|
| src/scripts/ | Utility scripts for data operations and migrations |
| src/queues/ | Queue handlers for async processing |

Sources: worker/package.json

3. Shared Packages (`/packages/shared`)

Common utilities, types, and validation schemas shared across web and worker packages.

| Module | Purpose |
|---|---|
| Validation | Zod schemas for type-safe data validation |
| Types | Shared TypeScript type definitions |
| Utilities | Common helper functions |

Sources: packages/shared/package.json

Web Application Structure

Component Architecture

```mermaid
graph TD
    A[Web Application] --> B[Layout Components]
    A --> C[UI Components]
    A --> D[Feature Modules]

    B --> B1[Page Wrapper]
    B --> B2[ContainerPage]
    B --> B3[Breadcrumb]

    C --> C1[AdvancedJsonViewer]
    C --> C2[Table Components]
    C --> C3[Peek Components]

    D --> D1[MCP]
    D --> D2[Score Analytics]
    D --> D3[Comments]
    D --> D4[Filters]
    D --> D5[Entitlements]
```

Layout System

The Page component is the standard wrapper for all pages, providing:

  • Sticky Header: Consistent header across pages
  • Scroll Management: Supports "content-scroll" and "page-scroll" modes
  • Breadcrumb Navigation: Easy navigation path display
  • Custom Header Actions: Flexible button/link placement

For content that doesn't scale well with page width (e.g., settings pages), use ContainerPage instead.

Sources: web/src/components/layouts/README.md

Key Feature Modules

#### AdvancedJsonViewer

A virtualized JSON tree viewer with the following characteristics:

  • Performance: Uses TanStack Virtual for rendering large JSON structures
  • Search: Client-side search with regex support
  • Theme Support: Multiple color themes (GitHub, Monokai, Solarized)
  • Tree Navigation: Binary search for efficient node access

Sources: web/src/components/ui/AdvancedJsonViewer/README.md

#### Score Analytics

Provides analytics dashboard capabilities for score data:

  • Score Comparison: Compare two scores over time
  • Distribution Analysis: Histogram and heatmap visualizations
  • Time Series: Temporal trend analysis with configurable intervals
  • Data Transformation: Pure functions for data processing

Sources: web/src/features/score-analytics/README.md

#### MCP (Model Context Protocol)

Enables integration with external systems through MCP:

  • Stateless Architecture: Fresh server instance per request
  • Prompt Management: Support for getPrompt and getPromptUnresolved
  • Resource Handling: MCP resources and tool support

Sources: web/src/features/mcp/README.md

#### Entitlements

Feature availability control system:

  • Plan-based Access: oss, cloud:pro, self-hosted:enterprise
  • Entitlement Limits: Resource quotas per plan
  • Server/Client Support: Available in both frontend hooks and backend

Sources: web/src/features/entitlements/README.md

Table and Peek System

#### PeekTableStateProvider

Manages table state for peek views (slide-over panels showing item details):

```mermaid
graph LR
    A[PeekTableStateProvider] --> B[Filters]
    A --> C[Sorting]
    A --> D[Pagination]
    A --> E[Search]
```

State Persistence: Filter, sort, and pagination state persists across K/J navigation between items of the same type.

State Reset: State clears when the peek view closes (via X button, Escape, or click outside).

Sources: web/src/components/table/peek/README.md

Worker Service Structure

Scripts

The worker contains utility scripts for data operations:

| Script | Purpose |
|---|---|
| replayIngestionEventsV2 | Replay events from CSV to ingestion queues |
| refillQueueEvent | Backfill queue with events from local files |

#### Replay Ingestion Events V2

Replays S3-stored events back to Langfuse:

  • Batch Processing: Processes events in configurable batches
  • Checkpoint Support: Resume capability via checkpoint files
  • Rate Limiting: Respects server-side rate limits with exponential backoff
  • Error Handling: Retries transient failures, logs permanent failures
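
The checkpoint mechanism can be sketched in a few lines: persist a cursor after each batch so a rerun resumes instead of restarting. The file shape (a JSON object with a nextBatch field) and helper names are illustrative, not the script's actual checkpoint format.

```typescript
// Minimal checkpoint sketch; the real .checkpoint file format may differ.
import { existsSync, readFileSync, writeFileSync } from "node:fs";

function loadCheckpoint(path: string): number {
  if (!existsSync(path)) return 0; // fresh run: start at the first batch
  return JSON.parse(readFileSync(path, "utf8")).nextBatch;
}

function saveCheckpoint(path: string, nextBatch: number): void {
  // Written after each successfully processed batch
  writeFileSync(path, JSON.stringify({ nextBatch }));
}
```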

Sources: worker/src/scripts/replayIngestionEventsV2/README.md

#### Refill Queue Event

Backfills queues with events from local JSONL files:

  1. Create ./worker/events.jsonl with one JSON event per line
  2. Configure Redis connection and supporting services
  3. Run via pnpm run --filter=worker refill-queue-event

Sources: worker/src/scripts/refillQueueEvent/README.md

Public API Architecture

Adding New API Routes

The project follows a structured pattern for public API development:

  1. Implementation: Wrap routes with withMiddleware and createAuthedAPIRoute
  2. Type Definition: Add Zod types to /features/public-api/types using coerce for primitives
  3. Validation: Use validateZodSchema for response validation
  4. Documentation: Add to Fern with docs attributes
  5. SDK Updates: Copy types to Python and JS SDKs
```typescript
// Response type example
import { z } from "zod";

const responseSchema = z
  .object({
    data: z.string(),
    timestamp: z.coerce.date(),
  })
  .strict();
```

Sources: web/src/features/README.md

Testing Infrastructure

Client-Side Testing

Tests use .clienttest.ts extension and are colocated with components:

pnpm --filter=web run test-client --testPathPattern="ComponentName"

Example: AdvancedJsonViewer tests cover tree building, navigation, expansion, and search operations.

Sources: web/src/components/ui/AdvancedJsonViewer/README.md

Trace Tree Building Tests

Performance tests for large observation sets:

| Scale | Threshold | Structure Types |
|---|---|---|
| 10k | 500ms | flat, deep, balanced, realistic |
| 25k | 2s | flat, realistic |
| 50k | 5s | flat, realistic |
| 500k | 60s | realistic |

Sources: web/src/components/trace/lib/tree-building.clienttest.ts

Development Workflow

Component Development Guidelines

  1. Use Page Wrapper: Always wrap pages with <Page> component
  2. Use ContainerPage: For settings/setup pages with non-scalable content
  3. Follow Naming: Use .clienttest.ts for client-side tests
  4. State Management: Use useSidebarFilterState for filters, useOrderByState for sorting

Prompt Composition

For MCP prompt features:

  • Resolved Prompt: Use getPrompt for executable prompts with dependencies resolved
  • Unresolved Prompt: Use getPromptUnresolved for debugging and analysis

Sources: web/src/features/mcp/README.md

Summary

The Langfuse project is organized as a well-structured monorepo with clear separation between the frontend web application, backend worker services, and shared packages. The architecture emphasizes:

  • Modularity: Feature-based organization with isolated modules
  • Performance: Virtualization and incremental builds
  • Type Safety: Zod schemas and TypeScript throughout
  • Observability: Built-in tracing and analytics capabilities

Sources: [pnpm-workspace.yaml](https://github.com/langfuse/langfuse/blob/main/pnpm-workspace.yaml)

Quickstart Guide

Related topics: Project Introduction


Langfuse is an open-source LLM engineering platform that provides observability, analytics, and prompt management for LLM applications. This guide walks you through setting up a local development environment, understanding the core architecture, and deploying Langfuse for production use.

Overview

Langfuse supports multiple deployment scenarios:

| Deployment Type | Use Case | Key Components |
|---|---|---|
| Local Development | Testing and development | Docker Compose with all services |
| Self-Hosted | Production deployment | Docker/Kubernetes with externalized services |
| Cloud | Managed SaaS offering | Langfuse-hosted infrastructure |

Sources: README.md:1-20

Architecture Overview

Langfuse consists of three main components:

```mermaid
graph TD
    A[Web UI] --> B[Server API]
    B --> C[(PostgreSQL)]
    B --> D[(ClickHouse)]
    B --> E[(Redis)]
    F[Workers] --> C
    F --> D
    F --> E
    G[S3 Storage] --> F
```

Core Components

| Component | Purpose | Technology |
|---|---|---|
| web/ | Frontend UI application | Next.js, React, TanStack |
| server/ | API server and business logic | Node.js, tRPC, Prisma |
| worker/ | Background job processing | BullMQ, Redis |
| clickhouse/ | Analytics storage | ClickHouse |

Sources: docker-compose.yml:1-50

Local Development Setup

Prerequisites

  • Node.js 20+ (using pnpm as package manager)
  • Docker and Docker Compose
  • Git

1. Clone the Repository

git clone https://github.com/langfuse/langfuse.git
cd langfuse

2. Environment Configuration

Create a .env file in the root directory with the following required variables:

# Database
DATABASE_URL=postgresql://langfuse:langfuse@localhost:5432/langfuse

# ClickHouse
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USER=langfuse
CLICKHOUSE_PASSWORD=langfuse

# Redis
REDIS_URL=redis://localhost:6379

# Auth (NextAuth.js)
NEXTAUTH_SECRET=your-secret-key
NEXTAUTH_URL=http://localhost:3000

# S3 Storage (MinIO for local dev)
S3_ACCESS_KEY=langfuse
S3_SECRET_KEY=langfuse
S3_REGION=us-east-1
S3_ENDPOINT_URL=http://localhost:9000
S3_EVENT_UPLOAD_BUCKET=langfuse

Sources: docker-compose.yml:50-120

3. Start Infrastructure Services

Launch all supporting services using Docker Compose:

docker compose up -d

This starts the following services:

| Service | Port | Purpose |
|---|---|---|
| postgres | 5432 | Primary database |
| clickhouse | 8123 | Analytics storage |
| redis | 6379 | Job queue broker |
| minio | 9000/9001 | S3-compatible storage |

Sources: docker-compose.yml:120-180

4. Install Dependencies

pnpm install

5. Run Database Migrations

pnpm db:migrate

6. Start Development Servers

Langfuse uses a monorepo structure with multiple development servers:

# Start all services in development mode
pnpm run dev

# Or start individual services
pnpm --filter=server run dev    # API server
pnpm --filter=web run dev       # Web UI
pnpm --filter=worker run dev    # Background workers

Sources: CONTRIBUTING.md:50-100

Project Structure

```
langfuse/
├── web/                      # Next.js frontend
│   ├── src/
│   │   ├── components/       # Reusable UI components
│   │   ├── features/         # Feature modules
│   │   ├── pages/            # Next.js pages
│   │   └── lib/              # Utilities
│   └── public/               # Static assets
├── server/                   # API server
│   └── src/
│       ├── api/              # tRPC routers
│       ├── services/         # Business logic
│       └── lib/              # Utilities
├── worker/                   # Background workers
│   └── src/
│       ├── workers/          # Queue processors
│       └── scripts/          # Utility scripts
├── clickhouse/               # ClickHouse migrations
└── docker-compose.yml        # Local infrastructure
```

Page Component Pattern

The Page component is the standard wrapper for all pages in the application:

import Page from "@/src/components/layouts/Page";

export default function MyPage() {
  return (
    <Page
      title="My Page"
      scrollable
      headerProps={{
        breadcrumb: [{ name: "Home", href: "/" }, { name: "My Page" }],
      }}
    >
      <div>Content here...</div>
    </Page>
  );
}

Important: Every page must be wrapped inside <Page>; do not use <main> directly.

Sources: web/src/components/layouts/README.md:1-40

Development Workflow

Running Tests

Langfuse uses different test patterns for client and server code:

# Run all tests
pnpm run test

# Client-side tests (Vitest with .clienttest.ts extension)
pnpm --filter=web run test-client --testPathPattern="ComponentName"

# Server-side tests (Jest)
pnpm --filter=server run test

#### Client Test Pattern

Client tests use stack-based iteration to avoid stack overflow:

```typescript
// ✅ Safe for deep trees - iterative approach
interface TreeNode {
  children: TreeNode[];
}

function traverse(rootNode: TreeNode, process: (node: TreeNode) => void) {
  const stack: TreeNode[] = [rootNode];
  while (stack.length > 0) {
    const node = stack.pop()!;
    process(node);
    node.children.forEach((child) => stack.push(child));
  }
}
```

Sources: web/src/components/ui/AdvancedJsonViewer/README.md:80-100

Debug Mode

Enable detailed logging for specific components:

localStorage.setItem("debug:AdvancedJsonViewer", "true");

Code Quality

# Lint code
pnpm run lint

# Format code
pnpm run format

# Type check
pnpm run typecheck

Sources: CONTRIBUTING.md:100-150

Feature Modules

Langfuse organizes functionality into feature modules under web/src/features/:

| Module | Purpose |
|---|---|
| comments/ | User comments and mentions |
| entitlements/ | Feature access control |
| feature-flags/ | Feature toggle system |
| filters/ | Query filtering and search |
| mcp/ | Model Context Protocol integration |
| score-analytics/ | Score analytics and visualization |
| slack/ | Slack integration |
| migrations/ | Database migrations |

Feature Flags

Enable experimental features using the useIsFeatureEnabled hook:

const isEnabled = useIsFeatureEnabled("feature-flag-name");

A feature is enabled when:

  1. The flag is in user.feature_flags
  2. LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES is set
  3. The user has admin privileges

Sources: web/src/features/feature-flags/README.md:1-15

Entitlements System

Feature availability is controlled through entitlements:

  • Plans: Tiers of features (oss, cloud:pro, self-hosted:enterprise)
  • Entitlements: Available features per plan (e.g., playground)
  • EntitlementLimits: Resource limits (e.g., annotation-queue-count)

Sources: web/src/features/entitlements/README.md:1-25

Worker Scripts

The worker module includes utility scripts for data operations:

Refill Queue Event

Backfill any queue with events from local machines:

# 1. Create events file (./worker/events.jsonl)
{"projectId": "project-123", "orgId": "org-456"}

# 2. Configure environment
REDIS_CONNECTION_STRING=redis://:[email protected]:6379
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USER=clickhouse
CLICKHOUSE_PASSWORD=clickhouse

# 3. Run the script
pnpm run --filter=worker refill-queue-event

Sources: worker/src/scripts/refillQueueEvent/README.md:1-40

Replay Ingestion Events V2

Re-process S3-stored ingestion events:

```shell
npx tsx worker/src/scripts/replayIngestionEventsV2/index.ts \
  --input=/path/to/events.csv \
  --batch-size=500 \
  --concurrency=4
```

| Parameter | Default | Description |
|---|---|---|
| --input | - | Path to CSV file (required) |
| --batch-size | 500 | Keys per API request |
| --concurrency | 4 | Parallel API requests |
| --rate-limit | 50 | Requests per second |
| --dry-run | false | Validate without sending |
| --resume | false | Continue from checkpoint |
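
As a sketch, flags like these can be read with Node's built-in util.parseArgs. This is illustrative only: the script's actual argument handling may differ, and the hard-coded args array below stands in for process.argv.

```typescript
// Illustrative flag parsing with Node's util.parseArgs (Node 18.3+).
import { parseArgs } from "node:util";

const { values } = parseArgs({
  // In a real CLI, omit `args` to read from process.argv
  args: ["--input=/path/to/events.csv", "--batch-size=500", "--dry-run"],
  options: {
    input: { type: "string" },
    "batch-size": { type: "string", default: "500" },
    concurrency: { type: "string", default: "4" },
    "dry-run": { type: "boolean", default: false },
  },
});

console.log(values.input); // → "/path/to/events.csv"
console.log(Number(values["batch-size"])); // → 500
```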

Sources: worker/src/scripts/replayIngestionEventsV2/README.md:1-50

Slack Integration Setup

For local Slack OAuth development, HTTPS is required:

1. Generate SSL Certificates

# Install mkcert
brew install mkcert
mkcert -install
mkcert localhost 127.0.0.1

# Move certificates to web directory
mv localhost+1*.pem web/

2. Configure Environment

SLACK_CLIENT_ID=your_client_id
SLACK_CLIENT_SECRET=your_client_secret
SLACK_STATE_SECRET=your_state_secret

3. Start HTTPS Server

pnpm run dev:https

Sources: web/src/features/slack/README.md:1-50

Production Deployment

Docker Compose Production Mode

For production, use externalized services:

services:
  web:
    image: langfuse/langfuse-web:latest
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - CLICKHOUSE_URL=${CLICKHOUSE_URL}
      - REDIS_URL=${REDIS_URL}
      - NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
      - S3_ACCESS_KEY=${S3_ACCESS_KEY}
      - S3_SECRET_KEY=${S3_SECRET_KEY}
    ports:
      - "3000:3000"

  server:
    image: langfuse/langfuse-server:latest
    # Configuration similar to web

  worker:
    image: langfuse/langfuse-worker:latest
    depends_on:
      - redis
      - postgres
      - clickhouse

S3 Event Storage

Configure S3 bucket for event storage:

-- Example: Create external table for S3 access logs
CREATE EXTERNAL TABLE s3_access_logs (
  bucketowner STRING,
  bucket_name STRING,
  requestdatetime STRING,
  remoteip STRING,
  requester STRING,
  requestid STRING,
  operation STRING,
  key STRING,
  uri STRING,
  statuscode INT,
  errorcode STRING,
  bytessent BIGINT,
  objectsize BIGINT,
  totaltime STRING,
  turnaroundtime STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex'='([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ...'
)
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
LOCATION 's3://your-bucket/logs/'

Sources: docker-compose.yml:180-220

Common Issues and Solutions

| Issue | Solution |
| --- | --- |
| Stack overflow in tree operations | Use iterative algorithms with explicit stacks |
| Large dataset performance | Enable virtualization (TanStack Virtual) |
| Horizontal scroll performance | Avoid wrap mode for wide datasets |
| Multiple tables in peek view | Share pagination state intentionally |

Sources: web/src/components/ui/AdvancedJsonViewer/README.md:40-60

Next Steps

After setting up your development environment:

  1. Explore the UI - Navigate through traces, observations, and evaluations
  2. Integrate SDK - Connect your LLM application using Langfuse Python/JS SDK
  3. Configure Features - Set up feature flags and entitlements for your organization
  4. Deploy - Move to production using Docker Compose or Kubernetes

Sources: CONTRIBUTING.md:1-50

Sources: README.md:1-20

System Architecture

Related topics: Project Structure, Database Schema (Prisma), Queue System (Redis/BullMQ)

Langfuse is a comprehensive observability and analytics platform designed for Large Language Model (LLM) applications. The system architecture is built on a modern, modular design that separates concerns across frontend, backend worker services, and shared infrastructure layers.

Overview

Langfuse follows a distributed architecture pattern with the following primary components:

| Layer | Technology Stack | Purpose |
| --- | --- | --- |
| Frontend | Next.js, React, TypeScript | User interface and visualization |
| Backend Worker | Node.js, BullMQ, TypeScript | Event processing and queue management |
| Shared Packages | TypeScript | Common utilities, types, and infrastructure clients |
| Database | PostgreSQL | Primary data storage |
| Cache/Queue | Redis | Queue management and caching |
| Analytics | ClickHouse | High-performance analytics queries |
| Observability | OpenTelemetry | Distributed tracing |

High-Level Architecture

graph TD
    subgraph Client["Frontend (Next.js)"]
        UI[User Interface]
        Pages[Page Components]
        DesignSystem[Design System]
    end
    
    subgraph Shared["Shared Packages"]
        DB[(PostgreSQL)]
        Redis[(Redis)]
        ClickHouse[(ClickHouse)]
        Otel[OpenTelemetry]
    end
    
    subgraph Backend["Worker Service"]
        Queues[Queue Workers]
        Scripts[Utility Scripts]
    end
    
    Client <-->|tRPC API| Shared
    Backend <-->|Event Processing| Shared
    Client -->|Ingestion| Backend

Infrastructure Layer

Database Connection

The PostgreSQL database is the central data store for Langfuse, managed through a shared database client module. The connection is centralized in the packages/shared/src/db.ts module, which provides a unified interface for all database operations across the application.

Sources: packages/shared/src/db.ts:1-50

Redis Client

Redis serves dual purposes in the Langfuse architecture:

  1. Queue Management: BullMQ queues for asynchronous event processing
  2. Caching: Session and temporary data caching

The Redis client is configured in packages/shared/src/server/redis/redis.ts and is shared across worker services.

Sources: packages/shared/src/server/redis/redis.ts:1-30
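
The shared client follows a process-wide singleton pattern so that web and worker code reuse one connection. The sketch below shows that pattern generically; `RedisLike` and the hard-coded URL are illustrative stand-ins, not the actual ioredis setup in redis.ts.

```typescript
// Sketch of a lazy-singleton factory, the pattern typically used for a
// process-wide Redis client. `RedisLike` stands in for the real client type.
type RedisLike = { url: string; connected: boolean };

function lazySingleton<T>(factory: () => T): () => T {
  let instance: T | undefined;
  return () => {
    // Create the client on first access only, then reuse it everywhere.
    if (instance === undefined) instance = factory();
    return instance;
  };
}

const getRedis = lazySingleton<RedisLike>(() => ({
  url: "redis://localhost:6379", // in the real module this comes from REDIS_CONNECTION_STRING
  connected: true,
}));
```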

ClickHouse Integration

ClickHouse provides the analytical query engine for high-performance aggregations. The client is initialized in packages/shared/src/server/clickhouse/client.ts and is primarily used for:

  • Score analytics aggregations
  • Time-series data analysis
  • Large dataset transformations

Sources: packages/shared/src/server/clickhouse/client.ts:1-40

OpenTelemetry

The OpenTelemetry integration (packages/shared/src/server/otel/index.ts) provides distributed tracing across all services. This enables:

  • Request tracing across frontend and backend
  • Event processing workflow visibility
  • Performance monitoring

Sources: packages/shared/src/server/otel/index.ts:1-60

Frontend Architecture

Page Structure

All frontend pages use a standardized layout system defined in web/src/components/layouts/. The Page component is the required wrapper for all application pages, ensuring consistent layout behavior.

Key layout patterns:

PatternComponentUse Case
Standard PagesPageMost application pages
Wide ContentContainerPageSettings, setup pages with wide content

The page wrapper provides:

  • Sticky header management
  • Scroll behavior control (content-scroll or page-scroll)
  • Breadcrumb navigation
  • Custom header actions

Sources: web/src/components/layouts/README.md:1-60

Design System

The design system (web/src/components/design-system/) provides primitive, reusable UI components following strict architectural principles:

Principles:

  • Presentational only (no business logic)
  • Explicit, strictly typed APIs
  • Props over context (no React Context)

Component Structure:

design-system/
  Button/
    Button.tsx
    Button.stories.tsx

Styling Rules:

  • No arbitrary CSS values
  • Explicit enums for variants (size: "sm" | "md" | "lg")
  • CVA (Class Variance Authority) for variant management
  • Boolean props use positive naming (isLoading, shouldTruncate)
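
These rules can be illustrated without the cva dependency. The sketch below applies them in plain TypeScript; the class strings are hypothetical, only the `"sm" | "md" | "lg"` enum comes from the rules above.

```typescript
// CVA-style variant handling in plain TypeScript: explicit enums for
// variants, positive boolean props, no arbitrary values. Class strings
// are illustrative placeholders.
type Size = "sm" | "md" | "lg";

interface ButtonVariants {
  size: Size;
  isLoading?: boolean;
}

const sizeClasses: Record<Size, string> = {
  sm: "h-8 px-2 text-sm",
  md: "h-10 px-4 text-base",
  lg: "h-12 px-6 text-lg",
};

function buttonClassName({ size, isLoading = false }: ButtonVariants): string {
  const classes = ["inline-flex items-center", sizeClasses[size]];
  if (isLoading) classes.push("opacity-50 pointer-events-none");
  return classes.join(" ");
}
```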

Sources: web/src/components/design-system/README.md:1-80

State Management Patterns

Langfuse uses a sophisticated state management approach with the Peek Table State system:

graph TD
    A[Page Load] --> B{Peek Context?}
    B -->|Yes| C[PeekTableStateProvider]
    B -->|No| D[URL/Session State]
    C --> E[Table State Preserved]
    D --> F[Standard State]
    
    E --> G[K/J Navigation]
    G --> H[State Retained ✓]

The PeekTableStateProvider maintains table state (filters, sorting, pagination) across K/J keyboard navigation between items of the same type. State resets only when the peek view closes.
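
A minimal reducer sketch of that lifecycle, assuming illustrative action names (`NAVIGATE`, `CLOSE_PEEK`) rather than the provider's actual API:

```typescript
// State survives K/J navigation between items of the same type and
// resets only when the peek view closes.
interface TableState { page: number; sortBy?: string }

type PeekAction =
  | { type: "NAVIGATE"; itemId: string }   // K/J navigation: keep state
  | { type: "CLOSE_PEEK" }                 // closing the peek: reset state
  | { type: "SET_PAGE"; page: number };

const initialState: TableState = { page: 0 };

function peekReducer(state: TableState, action: PeekAction): TableState {
  switch (action.type) {
    case "NAVIGATE":
      return state; // preserved across items of the same type
    case "CLOSE_PEEK":
      return initialState; // reset only when the peek view closes
    case "SET_PAGE":
      return { ...state, page: action.page };
  }
}
```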

Sources: web/src/components/table/peek/README.md:1-100

Feature Modules

Entitlements System

The entitlements feature controls feature availability at the organization level:

| Concept | Definition |
| --- | --- |
| Plan | Feature tier (OSS, cloud:pro, self-hosted:enterprise) |
| Entitlement | Available feature (e.g., playground, score analytics) |
| EntitlementLimit | Resource limits (e.g., annotation-queue-count) |

Plan Resolution:

  • Cloud: Added to organization via JWT from NextAuth
  • Self-hosted: From license key or environment configuration
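
Once the plan is resolved, entitlement checks reduce to a lookup. The mapping below is hypothetical for illustration; the real plan names and entitlements are defined in web/src/features/entitlements.

```typescript
// Illustrative plan-to-entitlement mapping; not the actual feature matrix.
type Plan = "oss" | "cloud:pro" | "self-hosted:enterprise";
type Entitlement = "playground" | "score-analytics";

const planEntitlements: Record<Plan, Entitlement[]> = {
  oss: [],
  "cloud:pro": ["playground", "score-analytics"],
  "self-hosted:enterprise": ["playground", "score-analytics"],
};

function hasEntitlement(plan: Plan, entitlement: Entitlement): boolean {
  return planEntitlements[plan].includes(entitlement);
}
```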

Sources: web/src/features/entitlements/README.md:1-50

Score Analytics

The score analytics feature provides comprehensive statistical analysis of evaluation scores:

Architecture Components:

| Component | Location | Responsibility |
| --- | --- | --- |
| Provider | ScoreAnalyticsProvider.tsx | Context management |
| Hook | useScoreAnalyticsQuery | Data fetching and transformation |
| Transformers | scoreAnalyticsTransformers.ts | Data transformation pipeline |
| Router | scoreAnalyticsRouter.ts | tRPC API endpoint |

Data Flow:

graph LR
    A[API Request] --> B[tRPC Router]
    B --> C[ClickHouse Query]
    C --> D[Transformers]
    D --> E[ScoreAnalyticsProvider]
    E --> F[Chart Components]

Sources: packages/shared/src/features/scores/interfaces/README.md:1-80
Sources: web/src/features/score-analytics/README.md:1-100

Score Interfaces Architecture

Langfuse maintains a versioned API structure for scores:

interfaces/
├── api/
│   ├── v1/    # Legacy API (trace-focused)
│   ├── v2/    # Current API (supports traces, sessions)
│   └── shared.ts
├── application/
├── ingestion/
└── ui/

API Versioning Strategy:

  • POST/DELETE APIs: Support all score types across v1 and v2
  • GET APIs:
    • V1: Requires traceId, trace-level only
    • V2: traceId optional, adds sessionId support
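
The GET contract above can be sketched as a small validator; the query shape is trimmed to the two fields discussed here and is not the actual API schema:

```typescript
// v1 requires traceId and is trace-level only; v2 makes traceId optional
// and adds sessionId support.
interface GetScoresQuery { traceId?: string; sessionId?: string }

function validateGetScores(version: "v1" | "v2", q: GetScoresQuery): string[] {
  const errors: string[] = [];
  if (version === "v1") {
    if (!q.traceId) errors.push("traceId is required in v1");
    if (q.sessionId) errors.push("sessionId is not supported in v1");
  }
  // v2: both traceId and sessionId are optional
  return errors;
}
```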

Sources: packages/shared/src/features/scores/interfaces/README.md:1-50

Backend Worker Architecture

Event Processing

The worker service processes ingestion events asynchronously using BullMQ queues:

Queue Types:

| Queue | Purpose | Consumer |
| --- | --- | --- |
| IngestionSecondaryQueue | Standard event processing | Worker |
| OtelIngestionQueue | OpenTelemetry events | Worker |

Event Flow:

graph LR
    A[Ingestion API] --> B{Event Type?}
    B -->|Standard| C[S3 Key Parse]
    B -->|OTEL| D[OTEL Key Parse]
    C --> E[Queue Payload]
    D --> E
    E --> F[BullMQ]
    F --> G[Worker Processing]
    G --> H[(ClickHouse)]
    G --> I[(PostgreSQL)]

Sources: worker/src/scripts/replayIngestionEventsV2/README.md:1-60

Replay Ingestion Events V2

The replayIngestionEventsV2 script enables replaying historical events from S3 storage:

Key Features:

  • Batch processing with configurable size
  • Checkpoint/resume capability
  • Rate limiting with exponential backoff
  • Error handling with detailed logging
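
Batching and checkpoint/resume can be sketched as two small helpers, assuming the script persists the count of completed batches between runs (the actual checkpoint format is not shown here):

```typescript
// Split the input keys into batches of --batch-size.
function chunk<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Skip batches already completed in a previous run (--resume).
function resumeFrom<T>(batches: T[][], completedBatches: number): T[][] {
  return batches.slice(completedBatches);
}
```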

Differences from V1:

| Aspect | V1 | V2 |
| --- | --- | --- |
| Infrastructure | Redis, ClickHouse, PostgreSQL, S3 | Langfuse host URL only |
| Setup | Full repo clone + .env | npx tsx + env vars |
| Event Delivery | BullMQ addBulk to Redis | HTTP POST to admin API |
| Resume | Manual | Built-in checkpoint |

Sources: worker/src/scripts/replayIngestionEventsV2/README.md:1-120

Refill Queue Event

The refillQueueEvent utility script backfills queues with events from local files:

Usage Pattern:

pnpm --filter=worker run refill-queue-event

Requirements:

  • ./worker/events.jsonl file with JSON events
  • Redis connection via REDIS_CONNECTION_STRING
  • Supporting services: S3, ClickHouse

Sources: worker/src/scripts/refillQueueEvent/README.md:1-60

MCP Server Architecture

Langfuse includes an MCP (Model Context Protocol) server for programmatic prompt management:

Stateless Design:

  1. Fresh server instance per request
  2. Authentication context captured in handler closures
  3. Server discarded after request completes
  4. No state between requests
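
A minimal sketch of that per-request pattern, with a hypothetical `AuthContext` and simplified tool handlers standing in for the real MCP server:

```typescript
// Each request builds a fresh handler set; the caller's auth context is
// captured in closures and nothing persists between requests.
interface AuthContext { projectId: string; apiKey: string }

function createMcpHandlers(auth: AuthContext) {
  return {
    // Each handler closes over `auth`; no shared mutable state.
    getPrompt: (name: string) => `prompt:${name}@${auth.projectId}`,
    listPrompts: () => [`project:${auth.projectId}`],
  };
}

function handleRequest(auth: AuthContext, promptName: string): string {
  const handlers = createMcpHandlers(auth); // fresh instance per request
  const result = handlers.getPrompt(promptName);
  return result; // handlers go out of scope; server is discarded
}
```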

Available Tools:

| Tool | Purpose |
| --- | --- |
| getPrompt | Fetch resolved prompt with dependencies |
| getPromptUnresolved | Fetch raw prompt without resolution |
| listPrompts | List prompts with filtering |
| createTextPrompt | Create text prompt version |
| createChatPrompt | Create chat prompt version |
| updatePromptLabels | Manage prompt labels |

Prompt Resolution:

  • Resolved: Recursively replaces @@@langfusePrompt:...@@@ tags
  • Unresolved: Returns raw content with tags intact
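
Resolution can be sketched as a recursive string replacement. The tag payload here is simplified to a bare prompt name; the real tags carry additional attributes such as version or label:

```typescript
// Recursively replace @@@langfusePrompt:<name>@@@ tags with the referenced
// prompt's content, guarding against dependency cycles with a depth limit.
function resolvePrompt(
  content: string,
  prompts: Record<string, string>,
  depth = 0,
): string {
  if (depth > 10) throw new Error("prompt dependency too deep (cycle?)");
  const tag = /@@@langfusePrompt:([^@]+)@@@/g;
  return content.replace(tag, (_match, name: string) =>
    resolvePrompt(prompts[name] ?? "", prompts, depth + 1),
  );
}
```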

Sources: web/src/features/mcp/README.md:1-100

Component Communication Flow

graph TD
    subgraph Pages["Page Layer"]
        P[Page Component]
        PC[PeekTableStateProvider]
    end
    
    subgraph Features["Feature Layer"]
        FP[Feature Provider]
        FH[Feature Hook]
        FT[Feature Transformers]
    end
    
    subgraph Services["Service Layer"]
        TRPC[tRPC Router]
        API[API Routes]
    end
    
    subgraph Data["Data Layer"]
        Repo[Repositories]
        DB[(PostgreSQL)]
        CH[(ClickHouse)]
        Redis[(Redis)]
    end
    
    P --> PC
    PC --> FP
    FP --> FH
    FH --> FT
    FT --> TRPC
    TRPC --> Repo
    Repo --> DB
    Repo --> CH
    Repo --> Redis

Configuration Management

Langfuse uses environment-based configuration across layers:

| Environment Variable | Scope | Example |
| --- | --- | --- |
| REDIS_CONNECTION_STRING | Worker, Queue | Redis URL |
| CLICKHOUSE_URL | Analytics | ClickHouse connection |
| LANGFUSE_S3_EVENT_UPLOAD_BUCKET | Storage | S3 bucket name |
| ADMIN_API_KEY | Admin API | Authentication |

Security Architecture

Key Security Components:

  1. JWT Authentication: Organization and user context embedded in JWT tokens
  2. API Key Validation: Admin API uses dedicated key authentication
  3. Scope-based Authorization: Project-level access control
  4. Plan Entitlements: Feature availability based on subscription tier

Self-hosted Considerations:

  • License key validation for enterprise features
  • Environment-based plan override capability

Performance Optimizations

Frontend Optimizations:

  • Memoized transformations in hooks
  • Virtualized table rendering (TanStack Virtual)
  • Iterative algorithms (no recursion, preventing stack overflow)

Backend Optimizations:

  • Batch processing for event replay
  • Checkpoint/resume for long-running operations
  • Client-side rate limiting with exponential backoff
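
The backoff calculation can be sketched as follows; the base delay and cap are illustrative values, not the actual settings:

```typescript
// Exponential backoff with a cap: attempt 0 -> base, attempt 1 -> 2x base,
// doubling until capMs is reached.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```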

Summary

The Langfuse architecture demonstrates a well-structured approach to observability platforms:

  • Separation of Concerns: Clear boundaries between UI, business logic, and data layers
  • Scalability: Asynchronous processing via queues enables horizontal scaling
  • Extensibility: Feature modules and MCP server support programmatic access
  • Observability: Built-in OpenTelemetry integration for distributed tracing
  • Performance: ClickHouse for analytics, iterative algorithms, and batch processing

Sources: packages/shared/src/db.ts:1-50

Monorepo Configuration

Related topics: Project Structure, System Architecture

Langfuse uses a monorepo architecture managed with Turborepo, pnpm workspaces, and shared configuration packages. This setup enables efficient builds, consistent code quality standards, and streamlined dependency management across the project's multiple packages.

Architecture Overview

The Langfuse repository is organized as a pnpm workspace monorepo with the following core structure:

langfuse/
├── web/                    # Next.js frontend application
├── worker/                 # Background job processing
├── packages/
│   ├── shared/            # Shared utilities and types
│   ├── config-eslint/     # Shared ESLint configuration
│   └── config-typescript/ # Shared TypeScript configurations
├── turbo.json             # Turborepo pipeline definition
└── pnpm-workspace.yaml    # Workspace package definitions

Turborepo Pipeline Configuration

The turbo.json file defines the build pipeline and task orchestration across packages.

Core Pipeline Tasks

| Task | Description | Cache Strategy |
| --- | --- | --- |
| build | Compiles TypeScript and bundles assets | Enabled |
| dev | Starts development servers | Local only |
| test | Runs unit and integration tests | Enabled |
| lint | ESLint code quality checks | Enabled |
| typecheck | TypeScript type validation | Enabled |

Task Dependencies

Turborepo automatically resolves dependencies between packages. For example:

graph TD
    A[packages/shared] -->|build| B[Type definitions]
    B --> C[packages/config-*]
    C --> D[web]
    C --> E[worker]
    D --> F[Build output]
    E --> F

Shared ESLint Configuration

The packages/config-eslint/index.js provides standardized ESLint rules across all packages.

Configuration Features

  • React/Next.js support for the web application
  • TypeScript-aware linting via @typescript-eslint
  • Import ordering rules for consistent module organization
  • JSX accessibility checks

Usage

Packages extend the shared configuration:

// In package's .eslintrc.js
module.exports = {
  extends: ['@langfuse/config-eslint'],
  // Package-specific overrides
  rules: {
    // Custom rules
  }
};

Sources: packages/config-eslint/index.js

Shared TypeScript Configuration

The packages/config-typescript/ directory contains base TypeScript configurations.

Base Configuration (`base.json`)

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  }
}

Package-Specific Configurations

Individual packages extend the base configuration:

// packages/shared/tsconfig.json
{
  "extends": "@langfuse/config-typescript/base.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src"
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

Sources: packages/config-typescript/base.json, packages/shared/tsconfig.json

Package Scripts and Commands

Each package defines its own scripts in package.json. The web package demonstrates the typical pattern:

Build Commands

| Command | Purpose |
| --- | --- |
| pnpm build | Production build with INLINE_RUNTIME_CHUNK=false |
| pnpm build:check | Build without emitting (type-checking) |
| pnpm dev | Development server on localhost:3000 |
| pnpm dev:https | HTTPS development server for local testing |

Quality Assurance Commands

| Command | Purpose |
| --- | --- |
| pnpm lint | ESLint with caching enabled |
| pnpm lint:fix | Auto-fix linting issues |
| pnpm typecheck | TypeScript validation with incremental compilation |
| pnpm test | Vitest with server and in-source test projects |

Sources: web/package.json

Development Workflow

Starting Development

# Install dependencies
pnpm install

# Start all dev servers based on turbo.json
pnpm dev

# Or start a specific package
cd web && pnpm dev

Running Tests

# All tests
pnpm test

# Client-side tests only (e.g., AdvancedJsonViewer)
pnpm --filter=web run test-client --testPathPattern="AdvancedJsonViewer"

# Server-side tests
pnpm --filter=web run test --project server

Build Pipeline

graph LR
    A[pnpm build] --> B[turbo build]
    B --> C{Cache hit?}
    C -->|Yes| D[Use cached output]
    C -->|No| E[Build dependencies]
    E --> F[packages/shared]
    F --> G[packages/config-*]
    G --> H[web/worker]
    H --> I[Save to cache]
    I --> J[Build artifacts]

Environment Configuration

The project uses .env files for environment-specific configuration:

| File | Purpose |
| --- | --- |
| .env | Default environment variables |
| .env.local | Local overrides (git-ignored) |
| .env.test | Test environment variables |

Scripts use dotenv to load these files:

dotenv -e ../.env -- next build
dotenv -e ../.env.test -e ../.env -- vitest run

Sources: worker/src/scripts/replayIngestionEventsV2/README.md

Best Practices

Adding a New Package

  1. Create the package under packages/ or web/ directories
  2. Extend the shared TypeScript and ESLint configurations
  3. Add the package to pnpm-workspace.yaml if needed
  4. Define tasks in turbo.json if custom pipeline is required
  5. Add appropriate scripts to package.json

Caching Strategy

CI/CD Integration

In CI environments, clear caches to ensure fresh builds:

# Clear turbo cache
rm -rf .turbo node_modules/.cache

# Clear ESLint cache
rm -rf .next/cache/eslint/

# Fresh build
pnpm install --frozen-lockfile
pnpm build

Key Configuration Files Summary

| File | Purpose |
| --- | --- |
| turbo.json | Task pipeline and dependency graph |
| pnpm-workspace.yaml | Workspace package definitions |
| packages/config-eslint/index.js | Shared ESLint rules |
| packages/config-typescript/base.json | Shared TypeScript base config |
| .eslintrc.js (per package) | Package-specific lint overrides |
| tsconfig.json (per package) | Package-specific TypeScript config |

Sources: packages/config-eslint/index.js

Database Schema (Prisma)

Related topics: System Architecture, ClickHouse Analytics Layer

Overview

Langfuse uses Prisma ORM to manage its PostgreSQL database schema. The schema defines the core data models for the application, including organizations, projects, users, traces, observations, and scores. Prisma serves as the primary interface between the application's business logic and the relational database.

The Prisma schema is located at packages/shared/prisma/schema.prisma and is shared across multiple packages in the monorepo structure. This centralized schema approach ensures consistency in data modeling across the web application, worker services, and shared libraries.

Design Philosophy

The schema follows several key principles:

  • Normalized relationships: Related entities are linked through foreign keys with proper cascading behaviors
  • Soft deletes: Key entities support soft deletion for data recovery and audit purposes
  • Audit fields: Most tables include createdAt, updatedAt, and createdBy fields
  • Multi-tenancy: The schema supports multi-tenant architecture with organization and project isolation
  • Extensible metadata: JSON fields allow flexible storage of custom attributes

Core Data Models

Organization and User Models

The foundation of Langfuse's multi-tenant architecture begins with the Organization model, which represents the top-level tenant entity. Each organization can have multiple users with different roles and permission levels.

The User model stores authentication and profile information, linked to organizations through the Membership junction table. This many-to-many relationship enables users to belong to multiple organizations with potentially different roles in each.

erDiagram
    Organization ||--o{ User : "contains"
    Organization ||--o{ Project : "contains"
    User ||--o{ Membership : "has"
    Membership }o--|| Organization : "belongs to"
    Membership }o--|| User : "belongs to"

Project Model

Projects serve as the primary container for observability data. Each project belongs to exactly one organization and contains all traces, observations, and scores related to a specific application or use case.

| Field | Type | Description |
| --- | --- | --- |
| id | String | UUID primary key |
| name | String | Project display name |
| organizationId | String | Foreign key to organization |
| createdAt | DateTime | Creation timestamp |
| updatedAt | DateTime | Last modification timestamp |
| deletedAt | DateTime? | Soft delete timestamp |
| settings | Json | Project-specific configuration |

Sources: packages/shared/prisma/schema.prisma

Traces

Traces represent the top-level unit of observability in Langfuse. A trace encapsulates a complete interaction or request, typically corresponding to a single LLM call or a multi-step workflow.

Trace Model Schema

model Trace {
  id            String   @id @default(cuid())
  name          String?
  project       Project  @relation(fields: [projectId], references: [id])
  projectId     String
  user          String?
  metadata      Json?
  sessionId     String?
  release       String?
  version       String?
  tags          String[]
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Soft delete
  deletedAt     DateTime?
  
  // Relations
  observations  Observation[]
  scores        Score[]
  
  @@index([projectId])
  @@index([sessionId])
  @@index([createdAt])
}

Sources: packages/shared/prisma/schema.prisma

Key Fields

| Field | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier using CUID algorithm |
| name | String? | Optional human-readable trace name |
| projectId | String | Reference to parent project |
| user | String? | Identifier for the end user |
| sessionId | String? | Groups related traces into sessions |
| release | String? | Application release version |
| version | String? | Trace format version |
| tags | String[] | Array of string tags for categorization |

Repository Pattern

Traces are accessed through the repository pattern defined in packages/shared/src/server/repositories/traces.ts. This abstraction provides a clean interface for CRUD operations while encapsulating query logic.

Sources: packages/shared/src/server/repositories/traces.ts

// Repository interface pattern (simplified)
interface ITraceRepository {
  create(data: CreateTraceInput): Promise<Trace>;
  getById(id: string, projectId: string): Promise<Trace | null>;
  list(projectId: string, options?: ListTracesOptions): Promise<Trace[]>;
  update(id: string, data: UpdateTraceInput): Promise<Trace>;
  softDelete(id: string): Promise<void>;
}

Observations

Observations represent the individual components within a trace, such as LLM calls, retrievals, or custom events. They form a hierarchical structure that can be nested to represent complex workflows.

Observation Model Schema

model Observation {
  id            String   @id @default(cuid())
  
  // Type discrimination
  type          ObservationType
  
  // Relations
  trace         Trace    @relation(fields: [traceId], references: [id])
  traceId       String
  parent        Observation? @relation("ObservationHierarchy", fields: [parentId], references: [id])
  parentId      String?
  children      Observation[] @relation("ObservationHierarchy")
  
  // Project reference for efficient querying
  projectId     String
  
  // Core data
  name          String?
  startTime     DateTime
  endTime       DateTime?
  status        String?
  metadata      Json?
  
  // LLM-specific fields
  model         String?
  modelId       String?
  provider      String?
  promptTokens  Int?
  completionTokens Int?
  totalTokens   Int?
  unitPrice     Float?
  currency      String?
  calculatedUnitCost Float?
  
  // Retrieval-specific fields
  input         Json?
  output        Json?
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Soft delete
  deletedAt     DateTime?
  
  @@index([traceId])
  @@index([projectId])
  @@index([startTime])
}

Sources: packages/shared/prisma/schema.prisma
Sources: packages/shared/src/server/repositories/observations.ts

Observation Types

Langfuse supports several observation types through an enum:

| Type | Description |
| --- | --- |
| CHAT | Chat completion calls |
| GENERATION | Text generation calls |
| RETRIEVAL | Retrieval augmented generation steps |
| EVENT | Custom events and markers |
| TOOL | Tool/function calls |

Hierarchical Structure

Observations support nested hierarchies through self-referential relationships. This enables representing complex multi-step workflows where parent observations contain child observations representing sub-tasks or parallel operations.
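
Reconstructing the hierarchy from flat rows is a single pass over the self-referential parentId; the row shape below is trimmed to the fields needed for the sketch:

```typescript
// Build an observation tree from flat rows using the parentId
// self-reference; rows without a parent become roots of the trace.
interface ObservationRow { id: string; parentId: string | null }
interface ObservationNode extends ObservationRow { children: ObservationNode[] }

function buildTree(rows: ObservationRow[]): ObservationNode[] {
  const byId = new Map<string, ObservationNode>();
  for (const row of rows) byId.set(row.id, { ...row, children: [] });
  const roots: ObservationNode[] = [];
  for (const node of byId.values()) {
    const parent = node.parentId ? byId.get(node.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent: top-level observation of the trace
  }
  return roots;
}
```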

graph TD
    A[Trace] --> B[Observation: Chat]
    B --> C[Observation: Retrieval]
    B --> D[Observation: Generation]
    C --> E[Observation: Event: Cache Hit]
    D --> F[Observation: Tool: Calculator]
    D --> G[Observation: Tool: Search]

Sources: packages/shared/prisma/schema.prisma

Scores

Scores provide a mechanism for evaluating trace and observation quality. They can be human-generated or automated evaluations attached to specific traces or observations.

Score Model Schema

model Score {
  id            String   @id @default(cuid())
  
  // Target discrimination
  traceId       String?
  observationId String?
  
  // Project reference
  projectId     String
  
  // Score data
  name          String
  value         Float
  dataType      ScoreDataType
  comment       String?
  
  // Source tracking
  source        String?
  
  // Author
  authorId      String?
  
  // Timestamps
  createdAt     DateTime @default(now())
  updatedAt     DateTime @updatedAt
  
  // Relations
  trace         Trace?      @relation(fields: [traceId], references: [id])
  observation   Observation? @relation(fields: [observationId], references: [id])
  
  @@index([projectId])
  @@index([traceId])
  @@index([observationId])
  @@index([name, createdAt])
}

Sources: packages/shared/prisma/schema.prisma
Sources: packages/shared/src/server/repositories/scores.ts

Score Data Types

The ScoreDataType enum defines the type of value stored:

| Data Type | Description |
| --- | --- |
| NUMERIC | Continuous numerical value |
| CATEGORICAL | Categorical label or classification |
| BOOLEAN | True/false indicator |
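
A hedged sketch of validating a score value against its declared data type; the numeric encodings for BOOLEAN and CATEGORICAL below are assumptions for illustration, not the ingestion layer's actual rules:

```typescript
// Check that a score's numeric value is plausible for its declared type.
type ScoreDataType = "NUMERIC" | "CATEGORICAL" | "BOOLEAN";

function isValidScoreValue(dataType: ScoreDataType, value: number): boolean {
  switch (dataType) {
    case "NUMERIC":
      return Number.isFinite(value);
    case "BOOLEAN":
      return value === 0 || value === 1; // assumed 0/1 encoding
    case "CATEGORICAL":
      // assumed: categorical scores map a label to a non-negative index
      return Number.isInteger(value) && value >= 0;
  }
}
```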

Score Interfaces Architecture

Scores in Langfuse follow a layered interface architecture that separates concerns across different parts of the system:

graph LR
    A[UI Types] --> B[Application Validation]
    B --> C[Ingestion Validation]
    C --> D[API v1 Schemas]
    C --> E[API v2 Schemas]
    D --> F[Database Models]
    E --> F

Sources: packages/shared/src/features/scores/interfaces/README.md

Indexing Strategy

The schema defines strategic indexes to optimize common query patterns:

| Table | Indexes | Purpose |
| --- | --- | --- |
| Trace | projectId, sessionId, createdAt | Fast project filtering and time-based queries |
| Observation | traceId, projectId, startTime | Trace traversal and time-series queries |
| Score | projectId, traceId, observationId, (name, createdAt) | Score lookups and time-series analytics |

The composite index on Score(name, createdAt) specifically supports the score analytics feature's need to retrieve scores by name over time intervals.

Sources: packages/shared/prisma/schema.prisma

Prisma Client Usage

Prisma Client is generated from the schema and used throughout the application. The generated client provides type-safe access to all database operations.

Client Configuration

import { PrismaClient } from "@langfuse/shared/prisma";

const prisma = new PrismaClient({
  log: process.env.NODE_ENV === "development" ? ["query", "error"] : ["error"],
});

Transaction Support

The schema supports atomic operations through Prisma's transaction API:

await prisma.$transaction([
  prisma.trace.create({ data: traceData }),
  prisma.observation.createMany({ data: observations }),
  prisma.score.createMany({ data: scores }),
]);

Migrations

Database migrations are managed through Prisma Migrate. Migration files are stored in packages/shared/prisma/migrations/ and version-controlled alongside the schema.

Running Migrations

# Apply pending migrations
pnpm --filter=langfuse-prisma migrate deploy

# Create a new migration
pnpm --filter=langfuse-prisma migrate dev --name add_new_field

Repository Layer

The repository pattern abstracts database access behind domain-specific interfaces:

| Repository | File | Purpose |
| --- | --- | --- |
| TraceRepository | repositories/traces.ts | Trace CRUD and querying |
| ObservationRepository | repositories/observations.ts | Observation management |
| ScoreRepository | repositories/scores.ts | Score operations |

Sources: packages/shared/src/server/repositories/traces.ts
Sources: packages/shared/src/server/repositories/scores.ts

ClickHouse Integration

While PostgreSQL (via Prisma) stores transactional data like traces and scores, Langfuse also uses ClickHouse for analytics workloads. The Prisma schema defines PostgreSQL models for the primary application data, while ClickHouse handles high-volume analytical queries.

Sources: packages/shared/scripts/seeder/utils/README.md

Summary

The Prisma schema forms the backbone of Langfuse's data layer, defining:

  • Multi-tenant structure: Organizations, projects, and user memberships
  • Observability core: Traces and observations with hierarchical support
  • Evaluation framework: Scores with multiple data types and sources
  • Operational metadata: Timestamps, soft deletes, and JSON fields for flexibility

The schema design prioritizes query performance through strategic indexing, data integrity through proper relationships, and extensibility through JSON metadata fields.

Sources: packages/shared/prisma/schema.prisma

ClickHouse Analytics Layer

Related topics: Database Schema (Prisma), Worker Service

Overview

The ClickHouse Analytics Layer is a core infrastructure component in Langfuse that provides high-performance analytical capabilities for processing and querying large-scale observability data. ClickHouse serves as the primary OLAP (Online Analytical Processing) database for storing traces, observations, and score analytics with optimized columnar storage and efficient aggregation queries.

Langfuse leverages ClickHouse for:

Sources: packages/shared/scripts/seeder/utils/README.md

  • High-throughput event ingestion during trace collection
  • Complex analytical queries for score comparisons and distributions
  • Time-series analysis with efficient aggregation
  • Large dataset sampling and optimization strategies

Architecture Overview

graph TD
    subgraph Ingestion["Ingestion Layer"]
        W[Worker Service] --> CW[ClickhouseWriter]
        CW --> CH[ClickHouse Cluster]
    end
    
    subgraph Storage["Storage Layer"]
        CH --> TS[Traces Table]
        CH --> OS[Observations Table]
        CH --> SS[Scores Table]
    end
    
    subgraph Query["Query Layer"]
        CR[ClickHouse Repository] --> CH
        SA[Score Analytics] --> CR
        WEB[Web Frontend] --> SA
    end
    
    subgraph Optimization["Optimization Layer"]
        CR --> HASH[cityHash64 Sampling]
        CR --> FINAL[Adaptive FINAL]
        CR --> INTERVAL[Time Interval Alignment]
    end

Components Overview

| Component | Location | Purpose |
| --- | --- | --- |
| ClickhouseWriter | worker/src/services/ClickhouseWriter/index.ts | Writes ingestion events to ClickHouse |
| ClickHouse Repository | packages/shared/src/server/repositories/clickhouse.ts | Provides query interface and optimization |
| Score Analytics | packages/shared/src/server/repositories/score-analytics.ts | Specialized analytics queries |
| Schema Definitions | packages/shared/src/server/clickhouse/schema.ts | TypeScript types for ClickHouse data |
| Migrations | packages/shared/clickhouse/migrations/clustered/ | Database schema migrations |

Sources: worker/src/services/ClickhouseWriter/index.ts, packages/shared/src/server/repositories/clickhouse.ts

Data Schema

Traces Table

The traces table stores the fundamental trace records with hierarchical observation data. The clustered migration defines the primary schema with optimized column types for analytical queries.

Key columns include:

| Column | Type | Description |
| --- | --- | --- |
| id | UUID | Unique trace identifier |
| project_id | String | Project association |
| timestamp | DateTime64 | Event timestamp with millisecond precision |
| name | String | Trace name |
| user_id | String | User identifier |
| metadata | JSON | Flexible metadata storage |
| tags | Array(String) | Tag-based categorization |
| input | Text | Input data |
| output | Text | Output data |

Sources: packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql

Observations Table

Observations represent individual events within a trace, storing:

  • Model inputs/outputs
  • Function calls
  • Embeddings
  • Generation events

Each observation is linked to its parent trace via trace_id and supports nested hierarchies through parent_observation_id.

Sources: packages/shared/src/server/clickhouse/schema.ts

Scores Table

Scores store evaluation metrics associated with traces and observations:

| Column | Type | Purpose |
| --- | --- | --- |
| trace_id | UUID | Associated trace |
| observation_id | UUID | Optional observation link |
| name | String | Score identifier |
| value | Float64 | Numeric score value |
| data_type | Enum | NUMERIC, BOOLEAN, or CATEGORICAL |
| source | String | Score origin (e.g., "framework-trace") |

Sources: packages/shared/src/server/repositories/score-analytics.ts

Ingestion Pipeline

Event Flow

sequenceDiagram
    participant API as Ingestion API
    participant Queue as Redis Queue
    participant Worker as Worker Service
    participant Writer as ClickhouseWriter
    participant CH as ClickHouse
    
    API->>Queue: Enqueue OtelIngestionEvent
    Worker->>Queue: Dequeue Event
    Worker->>Writer: Process Event
    Writer->>CH: Insert Batch (ClickHouseQueryBuilder)
    CH-->>Writer: Confirmation
    Writer->>Worker: Acknowledge

ClickhouseWriter Service

The ClickhouseWriter handles the actual data insertion into ClickHouse:

// Simplified flow from worker/src/services/ClickhouseWriter/index.ts
class ClickhouseWriter {
  async writeBatch(events: IngestionEvent[]): Promise<void> {
    const queryBuilder = new ClickHouseQueryBuilder();
    
    for (const event of events) {
      queryBuilder.addEvent(event);
    }
    
    await this.executeQuery(queryBuilder.build());
  }
}

Key responsibilities:

  1. Batch Processing: Aggregates multiple events for efficient insertion
  2. Schema Validation: Ensures events match expected schema
  3. Query Building: Uses ClickHouseQueryBuilder for optimized INSERT queries
  4. Error Recovery: Handles failed insertions with retry logic

Sources: worker/src/services/ClickhouseWriter/index.ts

ClickHouseQueryBuilder

The ClickHouseQueryBuilder class constructs optimized ClickHouse SQL queries with:

  • Proper escaping for special characters
  • Type-aware value formatting
  • Batch insert optimization
  • Efficient column mapping

Sources: packages/shared/scripts/seeder/utils/README.md

Query Layer

Repository Pattern

The clickhouse.ts repository provides a clean interface for all ClickHouse operations:

// packages/shared/src/server/repositories/clickhouse.ts
class ClickHouseRepository {
  // Query execution with automatic connection management
  async query<T>(sql: string, params?: QueryParams): Promise<T[]>
  
  // Stream processing for large result sets
  async streamQuery(sql: string, handler: (row: T) => void): Promise<void>
  
  // Batch inserts with transaction support
  async insertBatch(table: string, rows: Record<string, unknown>[]): Promise<void>
}

Sources: packages/shared/src/server/repositories/clickhouse.ts

Score Analytics Queries

The score analytics module provides specialized queries for evaluating model performance:

// packages/shared/src/server/repositories/score-analytics.ts
interface ScoreAnalyticsQuery {
  getScoreIdentifiers(projectId: string): Promise<ScoreIdentifier[]>;
  
  estimateScoreComparisonSize(
    projectId: string,
    score1Id: string,
    score2Id?: string
  ): Promise<QueryEstimate>;
  
  getScoreComparisonAnalytics(
    params: ScoreAnalyticsParams
  ): Promise<ScoreAnalyticsResult>;
}

Query Estimation

Before executing expensive analytics queries, the system estimates query size:

| Metric | Description |
| --- | --- |
| scoreCount | Total number of scores matching criteria |
| matchedCount | Estimated rows that will match |
| willSample | Whether hash-based sampling is needed |
| estimatedQueryTime | Predicted query duration |

This estimation enables adaptive query optimization based on dataset size.
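
The estimation-driven decisions described here can be sketched as a small planner. This is a hypothetical illustration, not the production logic; the threshold values mirror the 100,000-row sampling cutoff and 70,000-row FINAL cutoff documented in this section, and the function name is an assumption.

```typescript
// Illustrative sketch of adaptive query planning based on an estimated row
// count. Thresholds follow the values documented for sampling and FINAL.
interface QueryPlan {
  useFinal: boolean;      // apply ClickHouse FINAL only for smaller datasets
  sampleFraction: number; // 1 = no sampling
}

const FINAL_MAX_ROWS = 70_000;     // FINAL is skipped above this size
const SAMPLING_MIN_ROWS = 100_000; // sampling kicks in above this size
const TARGET_ROWS = 100_000;       // aim to scan roughly this many rows

function planScoreQuery(matchedCount: number): QueryPlan {
  return {
    useFinal: matchedCount < FINAL_MAX_ROWS,
    sampleFraction:
      matchedCount > SAMPLING_MIN_ROWS ? TARGET_ROWS / matchedCount : 1,
  };
}
```

For example, an estimated one million matching scores would skip FINAL and sample roughly 10% of rows, while 10,000 matches would run unsampled with FINAL applied.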

Sources: packages/shared/src/server/repositories/score-analytics.ts, web/src/features/score-analytics/README.md

Optimization Strategies

Hash-Based Sampling

For large datasets (>100,000 matches), Langfuse uses cityHash64 for consistent sampling:

SELECT * FROM scores
WHERE cityHash64(trace_id) % 10 = 0  -- deterministic 10% sample

Benefits:

  • Consistent sampling across query executions
  • Reproducible results for the same query parameters
  • Reduced query load while maintaining statistical validity

Adaptive FINAL Optimization

ClickHouse's FINAL modifier ensures up-to-date data but adds significant overhead. Langfuse uses adaptive application:

| Dataset Size | FINAL Applied |
| --- | --- |
| < 70,000 scores | Yes |
| > 70,000 scores | No |

Sources: web/src/features/score-analytics/README.md

Time Interval Alignment

Time series queries use proper interval alignment for accurate aggregation:

// ISO 8601 weeks
const weekInterval = "1W";

// Calendar months
const monthInterval = "1MONTH";

Proper alignment ensures:

  • Consistent bucket boundaries
  • Accurate period-over-period comparisons
  • Correct aggregation across daylight saving time transitions
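
A minimal sketch of what interval alignment means in practice, assuming UTC bucketing: every timestamp within the same ISO 8601 week maps to the same Monday 00:00 boundary, so bucket edges stay stable across queries. The helper name is illustrative.

```typescript
// Align a timestamp to the start of its ISO 8601 week (Monday, 00:00 UTC),
// so all events in the same week land in the same bucket.
function alignToIsoWeekUtc(date: Date): Date {
  const aligned = new Date(Date.UTC(
    date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate(),
  ));
  const day = aligned.getUTCDay();       // 0 = Sunday … 6 = Saturday
  const daysSinceMonday = (day + 6) % 7; // ISO weeks start on Monday
  aligned.setUTCDate(aligned.getUTCDate() - daysSinceMonday);
  return aligned;
}
```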

Seeding and Testing

The seeder utility (packages/shared/scripts/seeder/) generates realistic test data for ClickHouse:

Data Types

| Type | Environment | Purpose |
| --- | --- | --- |
| Experiment Traces | langfuse-prompt-experiment | Realistic traces from actual datasets |
| Evaluation Data | langfuse-evaluation | Metrics and scoring for evaluations |
| Synthetic Data | default | Large-scale hierarchical test data |

ID Patterns

  • Experiment: trace-dataset-{datasetName}-{itemIndex}-{projectId}-{runNumber}
  • Evaluation: trace-eval-{index}-{projectId}
  • Synthetic: trace-synthetic-{index}-{projectId}
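
The three ID patterns above, expressed as small template helpers. The patterns come from the README; the function names are illustrative, and the actual seeder may construct IDs differently.

```typescript
// Template helpers for the documented seeder ID patterns.
const experimentTraceId = (
  datasetName: string, itemIndex: number, projectId: string, runNumber: number,
) => `trace-dataset-${datasetName}-${itemIndex}-${projectId}-${runNumber}`;

const evaluationTraceId = (index: number, projectId: string) =>
  `trace-eval-${index}-${projectId}`;

const syntheticTraceId = (index: number, projectId: string) =>
  `trace-synthetic-${index}-${projectId}`;
```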

Sources: packages/shared/scripts/seeder/utils/README.md

DataGenerator

The DataGenerator class creates realistic data for all three types:

| Method | Output |
| --- | --- |
| generateDatasetTrace() | Traces linked to dataset items |
| generateSyntheticTraces() | Hierarchical traces with scores |
| generateEvaluationTraces() | Evaluation-focused traces |

Sources: packages/shared/scripts/seeder/utils/README.md

Framework Traces

Framework traces are real traces produced through official Langfuse framework instrumentation. They can be added to the system for UI testing and demo purposes.

Adding New Framework Traces

  1. Generate a trace using framework instrumentation
  2. Download from UI using the download button
  3. Convert to JSON format via "Log View (Beta)"
  4. Merge observations using the provided script:

npx ts-node merge-observations.ts trace-file.json observations.json trace-merged.json

  5. Save the merged file with date-based naming

Discovery

Framework traces use the ID pattern framework-frameworkName-traceId. Filter by:

  • source: "framework-trace" in trace table
  • "All Time" date range (timestamps not rewritten)
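
A hypothetical parser for the documented `framework-frameworkName-traceId` pattern, useful when filtering seeded data. It assumes the framework name itself contains no dashes, while the trailing trace id may.

```typescript
// Parse IDs of the form `framework-<frameworkName>-<traceId>`.
// Returns null when the string does not match the pattern.
function parseFrameworkTraceId(
  id: string,
): { frameworkName: string; traceId: string } | null {
  const match = /^framework-([^-]+)-(.+)$/.exec(id);
  return match ? { frameworkName: match[1], traceId: match[2] } : null;
}
```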

Sources: packages/shared/scripts/seeder/utils/framework-traces/README.md

TypeScript Integration

Schema Types

The schema.ts file provides TypeScript type definitions:

// packages/shared/src/server/clickhouse/schema.ts
interface ClickHouseTrace {
  id: string;
  project_id: string;
  timestamp: Date;
  name: string;
  user_id?: string;
  metadata?: Record<string, unknown>;
  tags?: string[];
  input?: string;
  output?: string;
  session_id?: string;
}

interface ClickHouseObservation {
  id: string;
  trace_id: string;
  parent_observation_id?: string;
  type: ObservationType;
  timestamp: Date;
  name?: string;
  // ... additional fields
}

These types ensure compile-time safety when interacting with ClickHouse data.

Sources: packages/shared/src/server/clickhouse/schema.ts

Configuration

Required Environment Variables

| Variable | Description | Example |
| --- | --- | --- |
| CLICKHOUSE_URL | ClickHouse server URL | http://localhost:8123 |
| CLICKHOUSE_USER | Database user | clickhouse |
| CLICKHOUSE_PASSWORD | User password | clickhouse |
| CLICKHOUSE_DATABASE | Target database | default |
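
A small sketch of failing fast when the required variables above are missing at startup. The variable names come from the table; the helper name and the fail-fast approach are assumptions about how one might wire this up.

```typescript
// Report which required ClickHouse environment variables are unset or empty.
const REQUIRED_CLICKHOUSE_VARS = [
  "CLICKHOUSE_URL",
  "CLICKHOUSE_USER",
  "CLICKHOUSE_PASSWORD",
  "CLICKHOUSE_DATABASE",
] as const;

function missingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_CLICKHOUSE_VARS.filter((name) => !env[name]);
}
```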

Cluster Configuration

Migrations support clustered deployments:

# Clustered migration path
packages/shared/clickhouse/migrations/clustered/

The clustered migrations ensure schema consistency across all nodes in a ClickHouse cluster.

Sources: packages/shared/clickhouse/migrations/clustered/0001_traces.up.sql

Best Practices

Query Optimization

  1. Use projections for frequently accessed columns
  2. Leverage skipping indexes for high-cardinality columns
  3. Batch inserts to reduce overhead
  4. Filter early to minimize data processed

Data Management

  1. Partition by date for efficient time-range queries
  2. Use TTL policies for automatic data expiration
  3. Compress data using ClickHouse's native compression

Integration Guidelines

  1. Always use the repository pattern for query abstraction
  2. Implement query estimation before expensive operations
  3. Use hash-based sampling for large analytical queries
  4. Consider adaptive FINAL optimization for query performance

Sources: packages/shared/scripts/seeder/utils/README.md

Queue System (Redis/BullMQ)

Related topics: System Architecture, Worker Service


Langfuse employs a distributed queue system built on Redis for storage and BullMQ for job orchestration. This architecture enables asynchronous processing of high-volume operations including event ingestion, evaluation execution, batch actions, and webhook delivery.

Architecture Overview

The queue system follows a producer-consumer pattern where the web application enqueues jobs and worker processes consume them asynchronously.

graph TD
    subgraph "Langfuse Web Application"
        A[API Request] --> B[Queue Client]
        B --> C[Redis Queue]
    end
    
    subgraph "Langfuse Worker"
        C --> D[Worker Manager]
        D --> E1[Ingestion Worker]
        D --> E2[Eval Worker]
        D --> E3[Batch Worker]
        D --> E4[Webhook Worker]
    end
    
    subgraph "Redis"
        C --> F[(Redis Cluster)]
    end
    
    subgraph "External Services"
        E1 --> G[(ClickHouse)]
        E1 --> H[(PostgreSQL)]
        E2 --> H
        E3 --> H
        E4 --> I[(External APIs)]
    end

Queue Types

Langfuse defines multiple specialized queues for different workloads:

| Queue Name | Purpose | Processing Type | Priority |
| --- | --- | --- | --- |
| ingestion | Event ingestion and processing | Async batch | Medium |
| evalExecution | LLM evaluation execution | Async | Medium |
| batchAction | Bulk operations on data | Async batch | Low |
| webhook | Outbound webhook delivery | Async | High |
| OtelIngestion | OpenTelemetry event ingestion | Async | Medium |
| IngestionSecondary | Secondary ingestion processing | Async | Medium |

Sources: worker/src/queues/workerManager.ts

Queue Configuration

Redis Connection

All queues rely on a Redis connection string configured via environment variables:

REDIS_CONNECTION_STRING=redis://:<password>@<host>:6379

Queue Initialization

Each queue is initialized with specific BullMQ configuration:

const myQueue = new Queue<T>(queueName, {
  connection: {
    host: redisConfig.host,
    port: redisConfig.port,
    password: redisConfig.password,
  },
  defaultJobOptions: {
    attempts: 3,
    backoff: {
      type: "exponential",
      delay: 1000,
    },
    removeOnComplete: true,
    removeOnFail: false,
  },
});

Sources: packages/shared/src/server/redis/ingestionQueue.ts

Queue Implementations

Ingestion Queue

The ingestion queue handles event processing from SDK clients and the OpenTelemetry protocol.

graph LR
    A[SDK Events] --> B[API Endpoint]
    B --> C[ingestionQueue]
    C --> D[Validate Events]
    D --> E[Parse & Transform]
    E --> F[(ClickHouse)]
    E --> G[(PostgreSQL)]

Key Features:

  • Batch processing with configurable batch size
  • Retry with exponential backoff
  • Event validation against schema
  • S3 file-based storage for large payloads

Sources: packages/shared/src/server/redis/ingestionQueue.ts

Evaluation Execution Queue

Handles asynchronous execution of LLM-based evaluations:

graph TD
    A[Create Eval Job] --> B[evalExecutionQueue]
    B --> C[Worker Pickup]
    C --> D[Fetch Traces]
    D --> E[Run LLM Evaluation]
    E --> F[Store Results]
    F --> G[(PostgreSQL)]

Job Options:

{
  attempts: 3,
  backoff: {
    type: "exponential",
    delay: 2000,
  },
  removeOnComplete: 100, // Keep last 100 completed
  removeOnFail: 1000,   // Keep last 1000 failed
}

Sources: packages/shared/src/server/redis/evalExecutionQueue.ts

Batch Action Queue

Processes bulk operations such as batch updates and deletions:

| Parameter | Default | Description |
| --- | --- | --- |
| batchSize | 100 | Items per batch |
| concurrency | 5 | Parallel workers |
| attempts | 3 | Retry count |
| backoffDelay | 1000 | Initial backoff ms |
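
Splitting a bulk action into batches of `batchSize` items can be sketched as a generic chunking helper. This is an illustration of the batching behavior, not the handler's actual code; the default of 100 follows the table above.

```typescript
// Split a list of items into batches of at most `batchSize` elements.
function chunk<T>(items: T[], batchSize = 100): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```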

Sources: packages/shared/src/server/redis/batchActionQueue.ts

Webhook Queue

Manages outbound webhook deliveries with priority handling:

graph TD
    A[Trigger Event] --> B[webhookQueue]
    B --> C{Has Retry Config?}
    C -->|Yes| D[Schedule Retry]
    C -->|No| E[Immediate Delivery]
    D --> E
    E --> F{HTTP Response}
    F -->|2xx| G[Log Success]
    F -->|4xx| H[Log Failure]
    F -->|5xx| D

Sources: packages/shared/src/server/redis/webhookQueue.ts

Worker Manager

The WorkerManager orchestrates all queue workers within the worker process:

export class WorkerManager {
  private workers: Map<string, Worker>;

  async initialize(): Promise<void> {
    // Initialize all queue workers
  }

  async gracefulShutdown(): Promise<void> {
    // Gracefully close all workers
  }
}

Sources: worker/src/queues/workerManager.ts

Worker Lifecycle

graph TD
    A[Start Worker Process] --> B[Load Configuration]
    B --> C[Initialize Redis Connection]
    C --> D[Create Queue Instances]
    D --> E[Create Worker Instances]
    E --> F[Register Event Handlers]
    F --> G[Workers Ready]
    
    H[Shutdown Signal] --> I[Close Workers]
    I --> J[Process Pending Jobs]
    J --> K[Close Redis Connection]
    K --> L[Exit]

Event Handling

Workers register handlers for job lifecycle events:

| Event | Handler Purpose |
| --- | --- |
| completed | Log successful job completion |
| failed | Handle job failures and retries |
| progress | Track job progress updates |
| stalled | Detect and requeue stalled jobs |

Job Data Flow

Standard Event Ingestion

Events flow through the system as follows:

sequenceDiagram
    participant SDK
    participant API
    participant Redis
    participant Worker
    participant DB
    
    SDK->>API: POST /api/public/ingestion
    API->>Redis: Add to IngestionQueue
    API-->>SDK: 202 Accepted
    Worker->>Redis: Dequeue Job
    Worker->>DB: Validate & Store
    Worker->>Redis: Job Complete

Event Transformation

The ingestion endpoint transforms S3 keys into queue payloads:

Standard format:

{
  "authCheck": {
    "validKey": true,
    "scope": { "projectId": "<projectId>" }
  },
  "data": {
    "eventBodyId": "<eventBodyId>",
    "fileKey": "<eventId>",
    "type": "<type>-create"
  }
}

OTEL format:

{
  "authCheck": {
    "validKey": true,
    "scope": { "projectId": "<projectId>", "accessLevel": "project" }
  },
  "data": {
    "fileKey": "otel/<projectId>/<yyyy>/<mm>/<dd>/<hh>/<mm>/<eventId>.json"
  }
}
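
The two payload shapes above can be captured in types, and the standard format constructed with a small builder. The type and function names are illustrative assumptions; only the payload shape itself comes from the documentation.

```typescript
// Typed sketch of the standard-format ingestion queue payload shown above.
type IngestionQueuePayload = {
  authCheck: { validKey: true; scope: { projectId: string } };
  data: { eventBodyId: string; fileKey: string; type: string };
};

function buildStandardPayload(
  projectId: string,
  eventBodyId: string,
  eventId: string,
  entityType: string, // e.g. "trace", yielding type "trace-create"
): IngestionQueuePayload {
  return {
    authCheck: { validKey: true, scope: { projectId } },
    data: { eventBodyId, fileKey: eventId, type: `${entityType}-create` },
  };
}
```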

Sources: worker/src/scripts/replayIngestionEventsV2/README.md

Error Handling and Retries

Retry Strategy

All queues implement exponential backoff retry:

const jobOptions = {
  attempts: 3,
  backoff: {
    type: "exponential",
    delay: 1000, // 1s, 2s, 4s
  },
};
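
The delay schedule in the comment above (1s, 2s, 4s) follows directly from exponential backoff with a 1000 ms base. A one-line sketch, assuming a 1-based attempt counter:

```typescript
// Exponential backoff: base delay doubled for each subsequent attempt.
function backoffDelayMs(attempt: number, baseDelayMs = 1000): number {
  return baseDelayMs * 2 ** (attempt - 1); // attempt 1 -> 1000, 2 -> 2000, 3 -> 4000
}
```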

Error Classification

| HTTP Status | Behavior | Retry |
| --- | --- | --- |
| 2xx | Success | No |
| 429 | Rate limited | Yes (with backoff) |
| 5xx | Server error | Yes (up to 3 times) |
| 4xx (not 429) | Client error | No (logged and skipped) |
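
The classification above reduces to a small predicate. A sketch of the decision, not the exact production logic:

```typescript
// Decide whether an HTTP response status warrants a retry.
function shouldRetry(status: number): boolean {
  if (status >= 200 && status < 300) return false; // success
  if (status === 429) return true;                 // rate limited: retry with backoff
  if (status >= 500) return true;                  // server error: retry up to the limit
  return false;                                    // other 4xx: log and skip
}
```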

Monitoring and Debugging

Progress Tracking

The replay scripts provide progress updates:

[1200/45000] 2.7% — 498 queued, 2 skipped

Checkpoint System

Scripts write checkpoints to enable resume after failures:

# Checkpoint file location
./worker/.checkpoint

# Resume from checkpoint
pnpm run --filter=worker replay-ingestion --resume

Error Logging

Failed jobs are logged to errors.csv for manual inspection:

"operation","key","error"
"REST.PUT.OBJECT","projectId/type/eventBodyId/eventId.json","Connection timeout"

Admin API for Queue Management

`POST /api/admin/ingestion-replay`

Enqueues batches of S3 keys for reprocessing:

Request:

{
  "keys": [
    "projectId/trace/eventBodyId/eventId.json",
    "otel/projectId/2025/07/09/14/30/some-uuid.json"
  ]
}

Response:

{
  "queued": 498,
  "skipped": 2,
  "errors": []
}

Authentication

Requires Authorization: Bearer {ADMIN_API_KEY} header validated by AdminApiAuthService.

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| REDIS_CONNECTION_STRING | Redis connection URL | Yes |
| LANGFUSE_S3_EVENT_UPLOAD_BUCKET | S3 bucket for event storage | Yes |
| CLICKHOUSE_URL | ClickHouse connection URL | Yes |
| CLICKHOUSE_USER | ClickHouse username | Yes |
| CLICKHOUSE_PASSWORD | ClickHouse password | Yes |
| ADMIN_API_KEY | Admin API authentication key | Yes (admin endpoints) |

Utility Scripts

Replay Ingestion Events V2

A streamlined replacement for v1 with improved features:

| Feature | v1 | v2 |
| --- | --- | --- |
| Infrastructure | Redis, ClickHouse, PostgreSQL, S3 | Langfuse host URL only |
| Setup | Full repo clone | npx tsx + env vars |
| Event delivery | Direct BullMQ addBulk | HTTP POST to admin API |
| Resume support | Manual | Built-in checkpoint |
| Rate limiting | None | Client + server side |

Refill Queue Event

Backfills queues with events from local machines:

# 1. Create events file
echo '{"projectId": "project-123", "orgId": "org-456"}' > ./worker/events.jsonl

# 2. Configure environment
# Create .env with REDIS_CONNECTION_STRING and supporting services

# 3. Run the script
pnpm run --filter=worker refill-queue-event

Best Practices

  1. Connection Pooling: Reuse Redis connections across queue operations
  2. Graceful Shutdown: Always drain active jobs before stopping workers
  3. Monitoring: Track queue depth and processing times
  4. Error Boundaries: Isolate queue failures to prevent cascade
  5. Backoff Tuning: Adjust retry delays based on workload characteristics

Sources: worker/src/queues/workerManager.ts

Worker Service

Related topics: System Architecture, Queue System (Redis/BullMQ)


Overview

The Worker Service is a core backend component in Langfuse responsible for asynchronous processing of long-running tasks. It operates as a separate Node.js process that communicates with the main Langfuse server through message queues, primarily using BullMQ backed by Redis.

graph TB
    subgraph "Langfuse Server"
        A[API Endpoints]
        B[tRPC Routers]
    end
    
    subgraph "Redis"
        C[(Ingestion Queues)]
        D[(Evaluation Queues)]
        E[(Batch Action Queues)]
    end
    
    subgraph "Worker Service"
        F[Worker Manager]
        G[Evaluation Service]
        H[Batch Action Handler]
        I[Queue Processors]
    end
    
    A -->|"Enqueue Jobs"| C
    B -->|"Dispatch Tasks"| C
    F -->|"Process"| C
    G -->|"Execute"| D
    H -->|"Execute"| E
    C --> F
    D --> G
    E --> H

Purpose and Scope

The Worker Service handles the following categories of work:

  • Event Ingestion Processing: Processing and persisting trace events, observations, and spans from the ingestion queues
  • Evaluation Execution: Running LLM-based evaluations on traces and observations
  • Batch Actions: Executing bulk operations on datasets, traces, and other resources
  • Queue Replay: Replaying historical ingestion events for data recovery or reprocessing

Sources: worker/src/app.ts


API Layer

Related topics: System Architecture


The API Layer is the central communication bridge between the Langfuse frontend and backend services. Built on tRPC (TypeScript RPC), it provides end-to-end type safety, enabling the web application to interact with server-side logic through strongly-typed procedure calls.

Overview

The API Layer serves multiple critical functions:

  • Type-Safe Communication: All API calls are fully typed from server to client
  • Authentication & Authorization: Every procedure is wrapped with auth middleware
  • Business Logic Isolation: Procedures delegate to repository layer for data access
  • Input Validation: Zod schemas validate all incoming requests
  • Feature Organization: Procedures are grouped by domain (traces, observations, scores)

graph TD
    subgraph Frontend
        UI[React Components]
    end
    
    subgraph API Layer
        TRPC[tRPC Client]
        Procedures[tRPC Procedures]
        Middleware[Auth Middleware]
    end
    
    subgraph Backend
        Repositories[Repositories]
        Database[(Database)]
    end
    
    UI --> TRPC
    TRPC --> Procedures
    Procedures --> Middleware
    Middleware --> Repositories
    Repositories --> Database

Architecture

Core Components

| Component | File | Purpose |
| --- | --- | --- |
| tRPC Instance | trpc.ts | Initialize tRPC with middleware and context |
| Root Router | root.ts | Register all feature routers |
| Trace Router | routers/traces.ts | Trace CRUD and query operations |
| Observation Router | routers/observations.ts | Span/generation/event operations |
| Score Router | routers/scores.ts | Score management and analytics |
| Score Analytics | features/score-analytics/server/scoreAnalyticsRouter.ts | Score aggregation and statistics |

Router Registration Flow

The root router aggregates all feature routers under a namespace:

// Simplified from web/src/server/api/root.ts
export const rootRouter = createTRPCRouter({
  trace: traceRouter,
  observation: observationRouter,
  score: scoreRouter,
  scoreAnalytics: scoreAnalyticsRouter,
  // ... other routers
});

Sources: web/src/server/api/root.ts:1-50

tRPC Configuration

Initialization

The tRPC instance is initialized in trpc.ts with:

  1. Context Creation: Builds request-scoped context with authentication
  2. Middleware Chain: Applies auth, rate limiting, and logging
  3. Error Handling: Transforms errors into HTTP-compatible responses

// From web/src/server/api/trpc.ts
export const createTRPCContext = async (opts: CreateNextContextOptions) => {
  return {
    session: await getServerSession(authOptions),
    // ... additional context
  };
};

const t = initTRPC.context<typeof createTRPCContext>().create();

Sources: web/src/server/api/trpc.ts:1-30

Middleware Stack

| Middleware | Purpose |
| --- | --- |
| isAuthed | Validates user session and project access |
| isProjectMember | Ensures user belongs to the project scope |
| isOwnerOrMember | Allows owner or member roles |
| rateLimit | Prevents abuse with configurable limits |

API Routers

Trace Router

Handles all trace-related operations including retrieval, creation, and updates.

Key Procedures:

| Procedure | Type | Description |
| --- | --- | --- |
| getById | Query | Fetch single trace with full details |
| list | Query | Paginated trace listing with filters |
| create | Mutation | Create new trace record |
| update | Mutation | Update trace metadata/tags |
| delete | Mutation | Soft-delete trace |

Sources: web/src/server/api/routers/traces.ts:1-100

Observation Router

Manages spans, generations, and events that belong to traces.

Key Procedures:

| Procedure | Type | Description |
| --- | --- | --- |
| getById | Query | Fetch single observation |
| list | Query | List observations with trace/session filters |
| create | Mutation | Create observation linked to trace |
| update | Mutation | Update observation metadata |

Sources: web/src/server/api/routers/observations.ts:1-100

Score Router

Provides score management with support for multiple API versions (v1 and v2).

API Versioning Strategy:

| Version | traceId Required | Session Support | Dataset Run Support |
| --- | --- | --- | --- |
| v1 | Yes | No | No |
| v2 | Optional | Yes | Yes |

The Score router supports both trace-level and session-level scores through different API versions.

Key Procedures:

| Procedure | Type | Description |
| --- | --- | --- |
| create | Mutation | Create score (POST endpoint) |
| delete | Mutation | Delete score |
| getById | Query | Fetch single score |
| list | Query | List scores with filters |

Sources: web/src/server/api/routers/scores.ts:1-100

Score Analytics Router

Provides aggregated statistics and time-series data for scores.

// From web/src/features/score-analytics/server/scoreAnalyticsRouter.ts
export const scoreAnalyticsRouter = createTRPCRouter({
  timeSeries: protectedProcedure.query(...),
  statistics: protectedProcedure.query(...),
  heatmapData: protectedProcedure.query(...),
});

Key Procedures:

| Procedure | Type | Description |
| --- | --- | --- |
| timeSeries | Query | Time-series score data with gap filling |
| statistics | Query | Statistical summaries (count, mean, p50/p95/p99) |
| heatmapData | Query | Heatmap matrix for visualization |

Sources: web/src/features/score-analytics/README.md

Request Flow

sequenceDiagram
    participant Client
    participant TRPC as tRPC Server
    participant Middleware
    participant Router
    participant Repository
    participant DB as Database

    Client->>TRPC: Procedure Call
    TRPC->>Middleware: Apply Chain
    Middleware->>Middleware: Auth Check
    Middleware->>Router: Validated Input
    Router->>Repository: Domain Operation
    Repository->>DB: SQL/Query
    DB-->>Repository: Result
    Repository-->>Router: Domain Object
    Router-->>TRPC: Response
    TRPC-->>Client: Typed Response

Input Validation

All procedures use Zod schemas for runtime validation:

// Example pattern from routers
const createTraceSchema = z.object({
  name: z.string().optional(),
  userId: z.string().optional(),
  metadata: z.record(z.unknown()).optional(),
  tags: z.array(z.string()).optional(),
});

protectedProcedure
  .input(createTraceSchema)
  .mutation(async ({ input, ctx }) => {
    return ctx.repo.trace.create(input);
  });

Type Flow

The API Layer maintains type consistency across the stack:

graph LR
    Client[Client Input] --> InputZ[Zod Schema]
    InputZ --> InputTS[TypeScript Type]
    InputTS --> Handler[Procedure Handler]
    Handler --> Repo[Repository Return]
    Repo --> OutputZ[Zod Response Schema]
    OutputZ --> OutputTS[API Response Type]
    OutputTS --> ClientResponse[Client]

Type Transformation Points:

| Stage | Location | Purpose |
| --- | --- | --- |
| Input | Routers | Zod validation + type inference |
| Domain | Repositories | Database models to domain objects |
| Output | Routers | Zod response schema validation |
| Client | React Hooks | Full type safety for UI |

Score Interface Architecture

Scores have a multi-layer type system:

| Layer | Location | Purpose |
| --- | --- | --- |
| API v1 | interfaces/api/v1/ | Legacy trace-only scores |
| API v2 | interfaces/api/v2/ | Current with session/dataset support |
| Application | interfaces/application/ | Internal validation |
| UI | interfaces/ui/ | Simplified frontend types |

Sources: packages/shared/src/features/scores/interfaces/README.md

Public API Extension

The Langfuse Public API extends the internal API layer for external consumption:

Pattern for adding new public API routes (from web/src/features/README.md):

  1. Wrap with withMiddleware
  2. Type-safe route with createAuthedAPIRoute
  3. Add Zod types to /features/public-api/types
  4. Use coerce for date handling
  5. Use strict() on response objects

SDK Generation Pipeline:

graph TD
    Fern[Fern Definition] --> PythonSDK[Python SDK]
    Fern --> JSSDK[JS/TS SDK]
    Fern --> Docs[API Documentation]

Best Practices

Procedure Design

  • Use protectedProcedure for authenticated endpoints
  • Apply input validation at the procedure level
  • Return consistent response structures
  • Handle errors with typed error classes

Error Handling

| Error Type | HTTP Code | Usage |
| --- | --- | --- |
| UNAUTHORIZED | 401 | Missing/invalid session |
| FORBIDDEN | 403 | Insufficient permissions |
| NOT_FOUND | 404 | Resource doesn't exist |
| BAD_REQUEST | 400 | Invalid input |
| INTERNAL_SERVER_ERROR | 500 | Unexpected errors |
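
The mapping above can be expressed as a lookup. This is a sketch covering only the codes in the table; tRPC ships its own complete mapping, and the fallback to 500 for unknown codes is an assumption.

```typescript
// Map the tRPC error codes from the table above to HTTP status codes.
const httpStatusByTrpcCode: Record<string, number> = {
  UNAUTHORIZED: 401,
  FORBIDDEN: 403,
  NOT_FOUND: 404,
  BAD_REQUEST: 400,
  INTERNAL_SERVER_ERROR: 500,
};

const httpStatusFor = (code: string): number =>
  httpStatusByTrpcCode[code] ?? 500; // unknown codes fall back to 500
```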

Performance Considerations

  • Use cursor-based pagination for large datasets
  • Leverage repository-level caching where applicable
  • Batch database operations in mutations
  • Limit response sizes with maxTake parameters

Sources: web/src/server/api/root.ts:1-50


Doramagic Pitfall Log

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: bug: Using client with context manager breaks the scoring

  • Severity: high
  • Finding: Installation risk is backed by a source signal: bug: Using client with context manager breaks the scoring. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/issues/8138

2. Installation risk: bug: unnamed trace name in Langfuse UI

  • Severity: high
  • Finding: Installation risk is backed by a source signal: bug: unnamed trace name in Langfuse UI. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/issues/13416

3. Security or permission risk: bug: 'AsyncStream' object has no attribute 'usage' when integrated with Semantic Kernel and Openlit

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: bug: 'AsyncStream' object has no attribute 'usage' when integrated with Semantic Kernel and Openlit. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/issues/8173

4. Security or permission risk: bug: Worker shutdown takes ~1 hour in self hosted kubernetes

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: bug: Worker shutdown takes ~1 hour in self hosted kubernetes. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/issues/8156

5. Installation risk: bug: Socket timeout. Expecting data, but didn't receive any in 30000ms on idle BullMQ queues

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: bug: Socket timeout. Expecting data, but didn't receive any in 30000ms on idle BullMQ queues. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/issues/13601

6. Installation risk: v3.169.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v3.169.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/releases/tag/v3.169.0

7. Installation risk: v3.172.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v3.172.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/releases/tag/v3.172.0

8. Installation risk: v3.173.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v3.173.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/langfuse/langfuse/releases/tag/v3.173.0

9. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:642497346 | https://github.com/langfuse/langfuse | README/documentation is current enough for a first validation pass.

10. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:642497346 | https://github.com/langfuse/langfuse | last_activity_observed missing

11. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:642497346 | https://github.com/langfuse/langfuse | no_demo; severity=medium

12. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:642497346 | https://github.com/langfuse/langfuse | no_demo; severity=medium

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready. This manual page exposes 12 project-level external discussion links; open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using langfuse with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence