Doramagic Project Pack · Human Manual
open-webui
Project Introduction
Related topics: Installation Guide, Architecture Overview
Open WebUI is an extensible, self-hosted AI interface designed to provide a powerful and user-friendly chat experience for Large Language Models (LLMs). It serves as a comprehensive web-based frontend that seamlessly integrates with various LLM backends, enabling users to interact with AI models through a modern, feature-rich interface.
Overview
Open WebUI is an open-source project that prioritizes offline functionality and user privacy. The platform is built with extensibility in mind, allowing users to customize and extend its capabilities through a modular architecture. The project supports multiple installation methods and integrates with popular LLM providers like Ollama, OpenAI, and various other AI services.
The system operates as a full-stack application with a Svelte-based frontend and a Python FastAPI backend, communicating through RESTful APIs and WebSocket connections for real-time interactions.
Architecture
Open WebUI follows a client-server architecture with clear separation between the frontend presentation layer and the backend API layer.
graph TD
subgraph Frontend["Frontend (Svelte)"]
UI[User Interface]
State[State Management]
API[API Client]
end
subgraph Backend["Backend (Python/FastAPI)"]
Routes[API Routes]
Services[Business Logic]
DB[(Database)]
Auth[Authentication]
end
subgraph External["External Services"]
Ollama[Ollama]
OpenAI[OpenAI API]
RAG[RAG Providers]
end
UI --> State
State --> API
API --> Routes
Routes --> Services
Services --> DB
Routes --> Auth
Services --> Ollama
Services --> OpenAI
Services --> RAG
Frontend Layer
The frontend is built using Svelte and SvelteKit, providing a reactive and performant user interface. Key components include:
| Component | Location | Purpose |
|---|---|---|
| Constants | src/lib/constants.ts | Application-wide configuration values |
| Utilities | src/lib/utils/index.ts | Content processing and sanitization |
| API Clients | src/lib/apis/ | Communication with backend services |
Sources: src/lib/constants.ts:1-20
The frontend defines API base URLs for various services:
export const WEBUI_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1`;
export const OLLAMA_API_BASE_URL = `${WEBUI_BASE_URL}/ollama`;
export const OPENAI_API_BASE_URL = `${WEBUI_BASE_URL}/openai`;
export const AUDIO_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/audio`;
export const IMAGES_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/images`;
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
Sources: src/lib/constants.ts:8-15
Backend Layer
The backend is built with Python using FastAPI, providing a robust and scalable API layer. The backend handles authentication, data management, and communication with external AI services.
#### Core Dependencies
| Package | Version | Purpose |
|---|---|---|
| fastapi | 0.135.1 | Web framework |
| uvicorn | 0.41.0 | ASGI server |
| pydantic | 2.12.5 | Data validation |
| sqlalchemy | 2.0.48 | ORM framework |
| python-socketio | 5.16.1 | WebSocket support |
| pycrdt | 0.12.47 | CRDT for real-time collaboration |
Sources: backend/requirements-min.txt:1-35
Features
Open WebUI provides a comprehensive set of features designed to enhance the AI chat experience:
Supported File Types
The system supports various document formats for upload and processing:
| Category | File Types |
|---|---|
| Documents | PDF, EPUB, DOCX, TXT |
| Code | Python, JavaScript, CSS, XML |
| Data | CSV, Markdown |
| Media | MP3, WAV (audio) |
| Other | HTML, Octet-stream |
Sources: src/lib/constants.ts:18-32
Key Capabilities
- Multi-Model Support: Engage with multiple AI models simultaneously through the MOA (Mixture of Agents) architecture
- Code Interpreter: Execute Python code in sandboxed environments using Pyodide or Jupyter
- Voice Mode: Voice-activated interactions with customizable prompts
- RAG Integration: Retrieval-augmented generation with support for 15+ search providers
- Web Browsing: Extract and integrate web content directly into conversations
- Image Generation: Integration with DALL-E, Gemini, ComfyUI, and AUTOMATIC1111
- Role-Based Access Control (RBAC): Granular permission management
Configuration System
Open WebUI uses a persistent configuration system to manage application settings. Configuration values are stored in the database and can be overridden by environment variables.
Code Execution Configuration
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| ENABLE_CODE_EXECUTION | ENABLE_CODE_EXECUTION | True | Enable code execution feature |
| CODE_EXECUTION_ENGINE | CODE_EXECUTION_ENGINE | pyodide | Execution engine (pyodide/jupyter) |
| JUPYTER_URL | CODE_EXECUTION_JUPYTER_URL | - | Jupyter server URL |
| JUPYTER_AUTH | CODE_EXECUTION_JUPYTER_AUTH | - | Jupyter authentication |
Sources: backend/open_webui/config.py:1-50
Voice Mode Configuration
| Parameter | Description |
|---|---|
| VOICE_MODE_PROMPT_TEMPLATE | Template for voice interaction prompts |
| ENABLE_VOICE_MODE_PROMPT | Enable voice-specific prompt handling |
Security Features
Authentication System
The backend implements comprehensive authentication using:
- JWT tokens via PyJWT
- Argon2 password hashing
- Session management with Redis support
- Role-based access control (RBAC)
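A minimal sketch of how these pieces typically fit together, using PyJWT and argon2-cffi as named above (the secret key, payload fields, and function names here are illustrative assumptions, not the project's exact schema):

import time
import jwt                      # PyJWT
from argon2 import PasswordHasher

SECRET_KEY = "change-me"        # assumption: stands in for WEBUI_SECRET_KEY

ph = PasswordHasher()

def register(password: str) -> str:
    # Store only the Argon2 hash, never the plaintext password
    return ph.hash(password)

def login(stored_hash: str, password: str, user_id: str) -> str:
    # Raises VerifyMismatchError on a bad password, so a failed
    # login never reaches token issuance
    ph.verify(stored_hash, password)
    # Issue a signed JWT carrying the user id
    return jwt.encode({"id": user_id, "iat": int(time.time())}, SECRET_KEY, algorithm="HS256")

def authenticate(token: str) -> dict:
    # Returns the decoded claims, or raises if the signature is invalid
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])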
Content Processing
The system includes middleware for processing and sanitizing AI responses:
graph LR
Response[AI Response] --> Middleware[Middleware Layer]
Middleware --> Sanitize[Content Sanitization]
Middleware --> CodeBlock[Code Block Processing]
Middleware --> Reasoning[Reasoning Display]
Sanitize --> Render[Rendered Response]
CodeBlock --> Render
Reasoning --> Render
The middleware handles special content types including:
- Code interpreter blocks
- Reasoning/thinking blocks
- HTML content rendering
Sources: backend/open_webui/utils/middleware.py:1-40
Installation Methods
Python pip Installation
pip install open-webui
open-webui serve
The server runs on http://localhost:8080 by default.
Docker Installation
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:latest
[!IMPORTANT]
The volume mount -v open-webui:/app/backend/data is crucial for database persistence.
Development Branch
For testing unstable features:
docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui --add-host=host.docker.internal:host-gateway --restart always ghcr.io/open-webui/open-webui:dev
Sources: README.md:1-80
Technology Stack Summary
| Layer | Technology | Key Libraries |
|---|---|---|
| Frontend Framework | Svelte/SvelteKit | - |
| Backend Framework | Python/FastAPI | Pydantic, SQLAlchemy |
| Database | SQLite/PostgreSQL | aiosqlite, psycopg |
| Real-time | WebSocket | python-socketio, pycrdt |
| Caching | Redis | starsessions |
| Authentication | JWT/Argon2 | PyJWT, argon2-cffi |
| HTTP Client | httpx | With SOCKS, HTTP/2 support |
| Task Scheduling | APScheduler | - |
System Requirements
- Python Version: 3.11+ (required for compatibility)
- Node.js: For frontend development
- Database: SQLite (default), PostgreSQL (production)
- Memory: Minimum 4GB RAM recommended
- Storage: Depends on models and data usage
Related Documentation
Sources: [src/lib/constants.ts:1-20]()
Installation Guide
Open WebUI provides multiple installation methods to accommodate different use cases, from simple Docker deployments to development environments. This guide covers all supported installation approaches, configuration options, and environment variables required for a successful setup.
Prerequisites
System Requirements
| Component | Minimum | Recommended |
|---|---|---|
| Python | 3.11 | 3.11+ |
| RAM | 4 GB | 8 GB+ |
| Disk | 10 GB | 20 GB+ |
| Docker | 20.10+ | Latest |
| GPU | Optional | NVIDIA GPU with CUDA |
Required Dependencies
The backend requires the following core packages for basic operation:
fastapi==0.135.1
uvicorn[standard]==0.41.0
pydantic==2.12.5
python-multipart==0.0.22
itsdangerous==2.2.0
python-socketio==5.16.1
python-jose==3.5.0
cryptography
sqlalchemy==2.0.48
aiosqlite==0.21.0
Sources: backend/requirements-min.txt:1-15
Installation Methods
Docker Installation (Recommended)
Docker is the recommended installation method for production use. Open WebUI provides multiple official images with different configurations.
#### Docker Image Variants
| Tag | Description | Use Case |
|---|---|---|
main | Base Open WebUI | Standard deployment |
cuda | With CUDA support | NVIDIA GPU acceleration |
ollama | Bundled with Ollama | Local model inference |
dev | Development build | Testing latest features |
#### Basic Docker Installation
For connecting to Ollama on localhost:
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:42-47
#### NVIDIA GPU Support
To enable GPU acceleration:
docker run -d -p 3000:8080 \
--gpus all \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:cuda
Sources: README.md:53-59
#### Bundled Ollama Installation
For a streamlined setup with both Open WebUI and Ollama in a single container:
With GPU Support:
docker run -d -p 3000:8080 --gpus=all \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
CPU Only:
docker run -d -p 3000:8080 \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
Sources: README.md:64-79
#### OpenAI API Only
For environments using only the OpenAI API:
docker run -d -p 3000:8080 \
-e OPENAI_API_KEY=your_secret_key \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:50-56
#### Remote Ollama Server
To connect to Ollama on a different server:
docker run -d -p 3000:8080 \
-e OLLAMA_BASE_URL=https://example.com \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:40-46
Python pip Installation
Open WebUI can be installed directly via pip for environments without Docker.
#### Requirements
- Python 3.11 or higher
- pip package manager
#### Installation Steps
- Install Open WebUI package:
pip install open-webui
- Start the server:
open-webui serve
The server will be accessible at http://localhost:8080.
Sources: README.md:12-25
Development Installation
#### Using the Dev Branch
[!WARNING]
The :dev branch contains unstable features. Use at your own risk.
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:dev
Sources: README.md:27-34
Environment Configuration
Core Environment Variables
| Variable | Description | Default |
|---|---|---|
OLLAMA_BASE_URL | Ollama server URL | http://localhost:11434 |
OPENAI_API_KEY | OpenAI API key | - |
WEBUI_SECRET_KEY | Session encryption key | Auto-generated |
WEBUI_SESSION_COOKIE_SECURE | Secure cookie flag | True |
WEBUI_SESSION_COOKIE_SAME_SITE | Cookie SameSite policy | Lax |
Sources: backend/open_webui/main.py:18-35
Database Configuration
| Variable | Description | Default |
|---|---|---|
DATABASE_URL | Database connection string | SQLite |
ENABLE_DATABASE_ENCRYPTION | Enable SQLite encryption | False |
#### Supported Databases
- SQLite: Default, requires no configuration
- PostgreSQL: Set DATABASE_URL to a PostgreSQL connection string
- Redis: For session management and caching
Sources: backend/open_webui/env.py
Redis Configuration
REDIS_URL=redis://localhost:6379
REDIS_KEY_PREFIX=open-webui
REDIS_SENTINEL_HOSTS=host1:26379,host2:26379
REDIS_SENTINEL_PORT=26379
Sources: backend/open_webui/main.py:15-18
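A sketch of how a client might consume these values with redis-py (only REDIS_URL and REDIS_KEY_PREFIX come from the table above; the health-check key is a hypothetical example):

import os
import redis

# Connect using the REDIS_URL from the table above
r = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379"))

# Namespace keys with the configured prefix
prefix = os.environ.get("REDIS_KEY_PREFIX", "open-webui")
r.set(f"{prefix}:health", "ok")
print(r.get(f"{prefix}:health"))  # b'ok'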
Security Configuration
| Variable | Description | Default |
|---|---|---|
ENABLE_SIGNUP_PASSWORD_CONFIRMATION | Require password confirmation | True |
WEBUI_AUTH_TRUSTED_EMAIL_HEADER | Trusted email header for SSO | - |
WEBUI_AUTH_SIGNOUT_REDIRECT_URL | Signout redirect URL | - |
Sources: backend/open_webui/main.py:36-38
Audit Logging
| Variable | Description | Default |
|---|---|---|
ENABLE_AUDIT_GET_REQUESTS | Log GET requests | False |
AUDIT_INCLUDED_PATHS | Paths to include | - |
AUDIT_EXCLUDED_PATHS | Paths to exclude | - |
AUDIT_LOG_LEVEL | Logging verbosity | INFO |
Sources: backend/open_webui/env.py:12-15
Observability
| Variable | Description | Default |
|---|---|---|
ENABLE_OTEL | Enable OpenTelemetry | False |
ENABLE_VERSION_UPDATE_CHECK | Check for updates | True |
Sources: backend/open_webui/main.py:48-51
Data Persistence
[!IMPORTANT]
Always mount the volume -v open-webui:/app/backend/data to prevent database loss.
The data directory contains:
- SQLite database file
- Uploaded files
- Configuration cache
- User sessions (if Redis not used)
-v open-webui:/app/backend/data
Sources: README.md:19-22
Offline Installation
For air-gapped environments, set the Hugging Face offline mode:
export HF_HUB_OFFLINE=1
Sources: README.md:36-38
Installation Architecture
graph TD
A[User Request] --> B{Installation Method}
B -->|Docker| C[Official Docker Image]
B -->|pip| D[PyPI Package]
C --> E{Configuration}
D --> E
E -->|OLLAMA_BASE_URL| F[Ollama Server]
E -->|OPENAI_API_KEY| G[OpenAI API]
E -->|Database Config| H[(Database)]
F --> I[Model Inference]
G --> J[API Processing]
H --> K[Application State]
I --> L[Response]
J --> L
K --> L
Docker Compose Installation
For production deployments, use Docker Compose with persistent storage:
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "3000:8080"
volumes:
- open-webui:/app/backend/data
environment:
- OLLAMA_BASE_URL=http://host.docker.internal:11434
extra_hosts:
- "host.docker.internal:host-gateway"
restart: unless-stopped
volumes:
open-webui:
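Bring the stack up from the directory containing the compose file:
docker compose up -d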
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Connection refused to Ollama | Check OLLAMA_BASE_URL and ensure Ollama is running |
| Database errors | Verify volume mount is correct |
| GPU not detected | Ensure NVIDIA Container Toolkit is installed |
| Port conflicts | Change host port mapping |
Verification
After installation, verify the service is running:
curl http://localhost:3000/api/v1/models
The server should respond with available models from the configured backend.
Sources: README.md:40-60
Next Steps
After successful installation:
- Access the web interface at http://localhost:3000
- Configure additional models and backends
- Set up user authentication and RBAC
- Configure retrieval and RAG pipelines
- Integrate additional tools and extensions
Sources: [backend/requirements-min.txt:1-15](https://github.com/open-webui/open-webui/blob/main/backend/requirements-min.txt)
Architecture Overview
Related topics: Data Models, API Routers, Frontend Structure
Open WebUI is a self-hosted, extensible AI interface designed to provide a unified chat experience with various LLM backends. The architecture follows a modern full-stack pattern with a Python-based backend and a Svelte-based frontend, communicating via REST APIs and WebSocket connections.
System Architecture
Open WebUI employs a layered architecture that separates concerns between presentation, business logic, and data access:
graph TD
subgraph Frontend["Frontend (Svelte/SvelteKit)"]
UI["UI Components<br/>(+layout.svelte)"]
Utils["Utilities<br/>(src/lib/utils)"]
APIs["API Client<br/>(src/lib/apis)"]
Const["Constants<br/>(src/lib/constants)"]
end
subgraph Backend["Backend (Python/FastAPI)"]
Main["Main Application<br/>(main.py)"]
Socket["WebSocket Server<br/>(socket/main.py)"]
Config["Configuration<br/>(config.py)"]
Env["Environment<br/>(env.py)"]
Middleware["Middleware<br/>(middleware.py)"]
Retrieval["Retrieval System<br/>(retrieval/)"]
end
subgraph External["External Services"]
Ollama["Ollama API"]
OpenAI["OpenAI API"]
VectorDB["Vector Databases"]
Redis["Redis Session Store"]
DB["SQLite/PostgreSQL"]
end
UI --> Utils
UI --> APIs
Utils --> Const
APIs --> Const
APIs --> Main
UI --> Socket
Main --> Config
Main --> Env
Main --> Middleware
Main --> Retrieval
Main --> Socket
Main --> Ollama
Main --> OpenAI
Main --> VectorDB
Main --> Redis
Main --> DB
Directory Structure
The repository is organized into two main components:
| Directory | Purpose |
|---|---|
backend/ | Python/FastAPI backend application |
src/ | Svelte/SvelteKit frontend application |
Backend Structure
| Path | Description |
|---|---|
backend/open_webui/ | Main application package |
backend/open_webui/main.py | FastAPI application entry point |
backend/open_webui/socket/main.py | Socket.IO WebSocket handler |
backend/open_webui/config.py | Persistent configuration system |
backend/open_webui/env.py | Environment variable loading |
backend/open_webui/utils/middleware.py | Response processing middleware |
backend/open_webui/retrieval/ | RAG and document retrieval system |
Frontend Structure
| Path | Description |
|---|---|
src/routes/ | SvelteKit routes and page components |
src/lib/ | Shared libraries and utilities |
src/lib/apis/ | API client implementations |
src/lib/utils/ | Utility functions |
src/lib/constants.ts | Application constants and configuration |
API Architecture
API Endpoint Structure
Open WebUI exposes multiple API bases for different services:
graph LR
subgraph Gateway["API Gateway"]
Base["/"]
end
subgraph Services["Service Endpoints"]
API["/api/v1<br/>REST API"]
Ollama["/ollama<br/>Ollama Proxy"]
OpenAI["/openai<br/>OpenAI Proxy"]
Audio["/api/v1/audio<br/>Audio Processing"]
Images["/api/v1/images<br/>Image Processing"]
Retrieval["/api/v1/retrieval<br/>RAG Retrieval"]
end
Base --> API
Base --> Ollama
Base --> OpenAI
Base --> Audio
Base --> Images
Base --> Retrieval
API Constants Configuration
API base URLs are defined in src/lib/constants.ts:
| Constant | Default Value | Purpose |
|---|---|---|
WEBUI_BASE_URL | Dynamic (dev/prod) | Base application URL |
WEBUI_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1 | Main REST API |
OLLAMA_API_BASE_URL | ${WEBUI_BASE_URL}/ollama | Ollama API proxy |
OPENAI_API_BASE_URL | ${WEBUI_BASE_URL}/openai | OpenAI API proxy |
AUDIO_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/audio | Audio processing |
IMAGES_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/images | Image generation |
RETRIEVAL_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/retrieval | RAG retrieval |
Sources: src/lib/constants.ts:1-15
API Client Pattern
The frontend uses a consistent API client pattern implemented in src/lib/apis/:
// Pattern used across all API clients
const res = await fetch(`${WEBUI_API_BASE_URL}/endpoint`, {
method: 'METHOD',
headers: {
Accept: 'application/json',
'Content-Type': 'application/json',
authorization: `Bearer ${token}`
},
body: JSON.stringify({ /* payload */ })
})
.then(async (res) => {
if (!res.ok) throw await res.json();
return res.json();
});
Sources: src/lib/apis/knowledge/index.ts:1-35
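The same request shape from Python, for comparison (a sketch: the base URL, token, endpoint path, and payload are placeholders, not actual endpoints):

import requests

WEBUI_API_BASE_URL = "http://localhost:8080/api/v1"  # assumption: default pip-install port
token = "YOUR_API_TOKEN"                             # placeholder

res = requests.post(
    f"{WEBUI_API_BASE_URL}/endpoint",                # placeholder endpoint
    headers={
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    },
    json={},                                         # request payload
)
res.raise_for_status()                               # mirrors the `if (!res.ok) throw` check
data = res.json()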
Configuration System
Environment Setup
The backend loads configuration from environment variables and .env files using the following hierarchy defined in backend/open_webui/env.py:
| Variable | Description |
|---|---|
OPEN_WEBUI_DIR | Application directory (location of env.py) |
BACKEND_DIR | Parent of open_webui/ |
BASE_DIR | Repository root |
DOCKER | Docker environment flag |
USE_CUDA_DOCKER | CUDA/GPU acceleration flag |
Sources: backend/open_webui/env.py:1-45
Persistent Configuration
Configuration values are stored persistently using the PersistentConfig system:
ENABLE_CODE_EXECUTION = PersistentConfig(
'ENABLE_CODE_EXECUTION',
'code_execution.enable',
os.environ.get('ENABLE_CODE_EXECUTION', 'True').lower() == 'true',
)
CODE_EXECUTION_ENGINE = PersistentConfig(
'CODE_EXECUTION_ENGINE',
'code_execution.engine',
os.environ.get('CODE_EXECUTION_ENGINE', 'pyodide'),
)
Sources: backend/open_webui/config.py:1-50
Supported File Types
The application supports various file upload types:
| Category | MIME Types |
|---|---|
| Documents | application/pdf, application/epub+zip, application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| Text | text/plain, text/csv, text/xml, text/html, text/x-python, text/css, text/markdown |
| Code | text/x-python, text/css, application/x-javascript |
| Media | audio/mpeg, audio/wav |
| Other | application/octet-stream |
Sources: src/lib/constants.ts:20-35
WebSocket Communication
Real-time communication uses Socket.IO for bidirectional messaging:
sequenceDiagram
participant Client as Frontend
participant Socket as Socket.IO Server
participant Main as Main Application
Client->>Socket: Connect with auth token
Socket->>Main: Validate session
Main->>Socket: Session valid
Socket->>Client: Connection established
Client->>Socket: Send message event
Socket->>Main: Forward message
Main->>Main: Process with LLM
Main->>Socket: Stream response
Socket->>Client: Stream chunks
Client->>Socket: Disconnect
Socket->>Client: Connection closed
Sources: backend/open_webui/socket/main.py
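A client-side sketch of this handshake using python-socketio (the event name 'chat-events' and the auth payload shape are assumptions for illustration):

import socketio

sio = socketio.Client()

@sio.event
def connect():
    print("Connection established")

@sio.on("chat-events")          # assumed event name
def on_chat_event(data):
    # Streamed response chunks arrive as individual events
    print("chunk:", data)

# python-socketio passes `auth` to the server's connect handler,
# mirroring the "Connect with auth token" step above
sio.connect("http://localhost:8080", auth={"token": "YOUR_JWT"})
sio.wait()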
Middleware Pipeline
The middleware system processes responses and transforms content for the frontend. The build_output() function in backend/open_webui/utils/middleware.py handles special content types:
Content Type Processing
| Content Type | Rendering | Description |
|---|---|---|
| Code interpreter blocks | Code block processing | Executable code segments handled by the code interpreter |
| Reasoning/thinking blocks | Reasoning display | Model reasoning content rendered as a distinct block |
| HTML content | Sanitized rendering | HTML passed through content sanitization before display |
Sources: [backend/open_webui/utils/middleware.py:1-80]()
### Deep Merge Utility
The middleware also provides a `deep_merge()` function for combining configuration:
| Behavior | Description |
|----------|-------------|
| Dicts | Recursive merge |
| Strings | Concatenation |
| Others | Overwrite |
Sources: [backend/open_webui/utils/middleware.py:75-85]()
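A minimal sketch matching the semantics in the table (an assumed implementation for illustration, not the project's exact code):

def deep_merge(base: dict, update: dict) -> dict:
    """Merge `update` into `base` following the rules above."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)          # dicts: recursive merge
        elif isinstance(value, str) and isinstance(base.get(key), str):
            base[key] = base[key] + value         # strings: concatenation
        else:
            base[key] = value                     # others: overwrite
    return base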
## Frontend Application Structure
### Layout System
The main layout is defined in `src/routes/+layout.svelte` which serves as the root component:
graph TD
Layout["+layout.svelte<br/>Root Layout"]
Splash["Splash Screen<br/>(#splash-screen)"]
Progress["Progress Bar<br/>(#progress-bar)"]
Logo["Logo Elements<br/>(#logo, #logo-her)"]
Theme["Theme Detection<br/>(.dark, .her)"]
Layout --> Splash
Layout --> Progress
Layout --> Logo
Layout --> Theme
Sources: [src/app.html:1-60]()
### Utility Libraries
| Library | Purpose |
|---------|---------|
| `src/lib/utils/index.ts` | Content processing, sanitization, Chinese language handling |
| `src/lib/utils/codeHighlight.ts` | Code syntax highlighting with Shiki |
| `src/lib/apis/index.ts` | API client exports |
### Content Processing Pipeline
The `processResponseContent()` function handles special content transformations:
export const processResponseContent = (content: string) => {
    content = processChineseContent(content);
    return content.trim();
};

export const sanitizeResponseContent = (content: string) => {
    return content
        .replace(/<\|[a-z]*$/, '')
        .replace(/<\|[a-z]+\|$/, '')
        .replace(/<$/, '')
        .replaceAll(/<\|[a-z]+\|>/g, ' ')
        .replaceAll('<', '&lt;')
        .replaceAll('>', '&gt;')
        .trim();
};
Sources: [src/lib/utils/index.ts:1-50]()
## Retrieval System
The RAG (Retrieval-Augmented Generation) system supports multiple document loaders and search engines:
### Supported Document Sources
| Source | Configuration |
|--------|--------------|
| External Document Loader | `EXTERNAL_DOCUMENT_LOADER_URL`, `EXTERNAL_DOCUMENT_LOADER_API_KEY` |
| Apache TIKA | `TIKA_SERVER_URL` |
| Docling | `DOCLING_SERVER_URL`, `DOCLING_API_KEY`, `DOCLING_PARAMS` |
| Mistral OCR | `MISTRAL_OCR_API_BASE_URL`, `MISTRAL_OCR_API_KEY` |
| PaddleOCR VL | `PADDLEOCR_VL_BASE_URL`, `PADDLEOCR_VL_TOKEN` |
| MinerU | `MINERU_API_URL`, `MINERU_API_KEY`, `MINERU_PARAMS` |
### Supported Search Providers
| Provider | Notes |
|----------|-------|
| SearXNG | Self-hosted metasearch |
| Google PSE | Programmable Search Engine |
| Brave Search | Privacy-focused search |
| Ollama Cloud | LLM provider search |
| Azure AI Search | Enterprise search |
Sources: [backend/open_webui/retrieval/utils.py:1-60]()
## Code Execution Engine
Open WebUI supports code execution with configurable backends:
### Configuration Options
| Setting | Default | Description |
|---------|---------|-------------|
| `ENABLE_CODE_EXECUTION` | `True` | Enable/disable code execution |
| `CODE_EXECUTION_ENGINE` | `pyodide` | Execution engine (pyodide/jupyter) |
| `CODE_EXECUTION_JUPYTER_URL` | `''` | Jupyter server URL |
| `CODE_EXECUTION_JUPYTER_AUTH` | `''` | Jupyter authentication |
| `CODE_EXECUTION_JUPYTER_AUTH_TOKEN` | `''` | Jupyter auth token |
### Execution Environments
| Engine | Environment | Constraints |
|--------|-------------|-------------|
| Pyodide | Browser-based | Cannot install packages, `pip install` unavailable |
| Jupyter | External server | Requires URL and optional authentication |
Sources: [backend/open_webui/config.py:50-100]()
## Technology Stack
### Backend Dependencies
Key packages from `backend/requirements-min.txt`:
| Package | Version | Purpose |
|---------|---------|---------|
| `fastapi` | 0.135.1 | Web framework |
| `uvicorn[standard]` | 0.41.0 | ASGI server |
| `pydantic` | 2.12.5 | Data validation |
| `python-multipart` | 0.0.22 | Form parsing |
| `python-socketio` | 5.16.1 | WebSocket support |
| `sqlalchemy` | 2.0.48 | ORM |
| `aiosqlite` | 0.21.0 | Async SQLite |
| `psycopg[binary]` | 3.2.9 | PostgreSQL driver |
| `httpx[socks,http2,zstd,cli,brotli]` | 0.28.1 | HTTP client |
| `redis` | latest | Session storage |
| `pycrdt` | 0.12.47 | CRDT for collaboration |
| `RestrictedPython` | 8.1 | Safe Python execution |
Sources: [backend/requirements-min.txt:1-40]()
### Frontend Architecture
| Technology | Purpose |
|------------|---------|
| SvelteKit | Frontend framework |
| TypeScript | Type safety |
| Shiki | Code syntax highlighting |
## Security Considerations
### Authentication Flow
The system uses Bearer token authentication for API requests:
headers: { authorization: `Bearer ${token}` }
### Role-Based Access Control (RBAC)
Open WebUI implements RBAC for:
- Ollama endpoint access
- Model creation/pulling rights
- Knowledge base permissions
Sources: [README.md]()
## Deployment Modes
### Docker Deployment
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:latest
### Python pip Installation
pip install open-webui
open-webui serve
### Environment Variables
| Variable | Values | Description |
|----------|--------|-------------|
| `DOCKER` | `True`/`False` | Docker environment detection |
| `USE_CUDA_DOCKER` | `true`/`false` | GPU acceleration |
| `HF_HUB_OFFLINE` | `1` | Offline mode (prevent downloads) |
Sources: [README.md](), [backend/open_webui/env.py:30-40](), [src/lib/constants.ts:1-15]()
Data Models
Related topics: Architecture Overview, API Routers
Overview
The Open WebUI project implements a comprehensive data modeling layer that manages persistent storage for all core application entities. The data models are built using SQLAlchemy ORM and follow a structured approach to storing user interactions, configurations, and content within the application.
The data model architecture serves as the foundation for:
- User Management: Authentication, authorization, and user preferences
- Chat Persistence: Message history and conversation state
- Knowledge Bases: RAG (Retrieval-Augmented Generation) document storage
- File Management: Document uploads and attachments
- Access Control: Permission management through groups and grants
Sources: backend/open_webui/internal/db.py:1-50
Architecture Overview
Open WebUI uses a layered data access architecture where models are defined as SQLAlchemy ORM classes and accessed through service layers.
graph TD
A[API Routers] --> B[Service Layer]
B --> C[Data Models]
C --> D[SQLAlchemy ORM]
D --> E[(SQLite Database)]
F[ChatMessages Table] --> C
G[Chats Table] --> C
H[Users Table] --> C
I[Knowledge Table] --> C
J[Files Table] --> C
Core Data Models
User Model
The User model manages user accounts, authentication, and preferences.
class UserModel(BaseModel):
id: str
name: str
email: Optional[str]
role: str # admin, user, guest
email_verified: bool
created_at: datetime
updated_at: datetime
settings: dict
keys: list
| Field | Type | Description |
|---|---|---|
id | String | Unique user identifier (UUID) |
name | String | Display name |
email | String (nullable) | User email address |
role | Enum | User role: admin, user, guest |
email_verified | Boolean | Email verification status |
created_at | DateTime | Account creation timestamp |
updated_at | DateTime | Last modification timestamp |
settings | JSON | User preferences and configurations |
Sources: backend/open_webui/models/users.py:1-100
Chat Model
The Chat model stores conversation sessions and their associated metadata.
graph LR
A[User] -->|has many| B[Chats]
B -->|contains| C[Messages]
B -->|references| D[ChatMessages Table]
D -->|links to| E[Messages JSON]
The Chat model structure:
class ChatModel(BaseModel):
id: str
user_id: str
title: str
chat: dict # Contains history, messages, metadata
created_at: datetime
updated_at: datetime
share_id: Optional[str]
archived: bool
| Field | Type | Description |
|---|---|---|
id | String | Unique chat identifier |
user_id | String | Owner user ID |
title | String | Chat title |
chat | JSON | Full chat history and state |
share_id | String (nullable) | Public sharing identifier |
archived | Boolean | Archive status |
The chat field contains a nested JSON structure:
{
"history": {
"messages": {
"message_id": {
"id": "...",
"type": "human|ai|system",
"content": "...",
"created_at": "..."
}
}
},
"metadata": {}
}
Sources: backend/open_webui/models/chats.py:1-150
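Walking this nested structure in Python (a sketch using the field names shown above):

chat = {
    "history": {
        "messages": {
            "msg-1": {"id": "msg-1", "type": "human", "content": "Hello"},
        }
    },
    "metadata": {},
}

# Messages are keyed by id inside history.messages
for message_id, message in chat["history"]["messages"].items():
    print(message_id, message["type"], message["content"])
# msg-1 human Hello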
Message Model
The Message model represents individual messages within a chat conversation.
graph TD
A[Message] --> B[type]
A --> C[content]
A --> D[role]
A --> E[timestamp]
B --> F[human|ai|system|tool]
C --> G[text|images|files]
class MessageModel(BaseModel):
id: str
chat_id: str
message_id: str
type: str # human, ai, system, tool
role: str
content: str
files: list
images: list
created_at: datetime
| Field | Type | Description |
|---|---|---|
id | String | Unique message ID |
chat_id | String | Parent chat ID |
message_id | String | Message identifier within chat |
type | Enum | Message type |
role | String | Role: user, assistant, system, tool |
content | String | Message content |
files | List | Attached file references |
images | List | Embedded image data |
Sources: backend/open_webui/models/messages.py:1-100
Knowledge Model
The Knowledge model manages RAG knowledge bases for document retrieval.
class KnowledgeModel(BaseModel):
id: str
user_id: str
name: str
description: str
created_at: datetime
updated_at: datetime
data: dict # Contains documents and vectors
| Field | Type | Description |
|---|---|---|
id | String | Knowledge base ID |
user_id | String | Owner user ID |
name | String | Knowledge base name |
description | String | Knowledge base description |
data | JSON | Documents and vector embeddings |
Sources: backend/open_webui/models/knowledge.py:1-100
File Model
The File model handles file uploads and attachments.
class FileModel(BaseModel):
id: str
user_id: str
filename: str
path: str
type: str
size: int
created_at: datetime
data: dict # Metadata
| Field | Type | Description |
|---|---|---|
id | String | File identifier |
user_id | String | Owner user ID |
filename | String | Original filename |
path | String | Storage path |
type | String | MIME type |
size | Integer | File size in bytes |
data | JSON | Additional metadata |
Sources: backend/open_webui/models/files.py:1-100
Database Schema
Entity Relationship Diagram
erDiagram
USERS ||--o{ CHATS : "owns"
USERS ||--o{ FILES : "owns"
USERS ||--o{ KNOWLEDGE : "owns"
USERS ||--o{ MESSAGES : "sends"
CHATS ||--o{ CHAT_MESSAGES : "contains"
CHAT_MESSAGES ||--|| MESSAGES : "references"
KNOWLEDGE ||--o{ DOCUMENTS : "contains"
USERS ||--o{ GROUPS : "belongs to"
GROUPS ||--o{ ACCESS_GRANTS : "grants"
CHATS ||--o| SHARES : "can be shared"
Database Tables
| Table Name | Primary Key | Description |
|---|---|---|
users | id | User accounts and settings |
chats | id | Chat session storage |
chat_messages | id, chat_id, message_id | Normalized message storage |
messages | id | Message content (embedded in chats) |
knowledge | id | Knowledge base definitions |
documents | id | Knowledge base documents |
files | id | File metadata |
folders | id | Folder organization |
groups | id | User groups |
access_grants | id | Permission grants |
memories | id | User memory storage |
channels | id | Communication channels |
notes | id | User notes |
Sources: backend/open_webui/migrations/versions/7e5b5dc7342b_init.py:1-500
Access Control Models
User Groups
class GroupModel(BaseModel):
id: str
name: str
description: str
created_at: datetime
user_id: str # Creator/owner
Access Grants
graph TD
A[User] -->|belongs to| B[Groups]
B -->|grants| C[Access Grants]
C -->|applies to| D[Resource]
D --> E[Model]
D --> F[Knowledge]
D --> G[Tool]
D --> H[Function]
| Field | Type | Description |
|---|---|---|
id | String | Grant identifier |
user_id | String | User receiving access |
group_id | String | Group granting access |
resource_type | Enum | Type: model, knowledge, tool, function |
resource_id | String | Target resource ID |
permission | String | Permission level: read, write, admin |
Sources: backend/open_webui/utils/access_control/__init__.py:1-50
Service Layer Integration
Chat Service Pattern
The Chat model provides methods for message management:
async def get_messages_map_by_chat_id(id: str) -> dict:
"""Get message map for walking history."""
async def get_message_by_id_and_message_id(
id: str,
message_id: str
) -> Optional[dict]:
"""Retrieve specific message from chat."""
async def upsert_message_to_chat_by_id_and_message_id(
id: str,
message_id: str,
message: dict
) -> Optional[ChatModel]:
"""Update or insert message in chat."""
Message Sanitization
Before database operations, message content is sanitized to prevent issues:
def sanitize_text_for_db(text: str) -> str:
"""Remove null characters and invalid sequences."""
This ensures database compatibility and prevents JSON parsing errors when loading chat history.
Sources: backend/open_webui/models/chats.py:100-180
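A sketch of what such sanitization might look like (an assumption; the project's implementation may strip additional sequences):

def sanitize_text_for_db(text: str) -> str:
    """Remove null characters, which text columns cannot store."""
    # Assumed minimal behavior: strip NUL bytes before persisting
    return text.replace("\x00", "")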
Model Operations
CRUD Operations
| Operation | Method | Description |
|---|---|---|
| Create | Model.create() | Insert new record |
| Read | Model.get() | Retrieve by ID |
| Update | Model.update() | Modify existing record |
| Delete | Model.delete() | Remove record |
| List | Model.get_all() | Retrieve all records |
| Filter | Model.filter_by() | Query with conditions |
Async Database Access
Open WebUI uses async database operations for improved performance:
async def get_chat_by_id(id: str) -> Optional[ChatModel]:
"""Async retrieval of chat by ID."""
async def upsert_message_to_chat_by_id_and_message_id(
id: str,
message_id: str,
message: dict
) -> Optional[ChatModel]:
"""Async upsert operation."""
Data Storage Locations
Database File
By default, Open WebUI uses SQLite stored at:
backend/data/webui.db
File Storage
Uploaded files are stored in:
backend/data/uploads/
Configuration
Database and storage paths are configured via environment variables:
| Variable | Default | Description |
|---|---|---|
DATA_DIR | backend/data | Base data directory |
DATABASE_URL | sqlite:///data/webui.db | Database connection string |
Sources: backend/open_webui/env.py:1-80
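A sketch of how such a connection string is typically consumed with SQLAlchemy (an illustration under the defaults above, not the project's exact engine setup):

import os
from sqlalchemy import create_engine

# Falls back to the default SQLite path from the table above
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///data/webui.db")
engine = create_engine(DATABASE_URL)

with engine.connect() as conn:
    print(conn.closed)  # False while the connection is open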
Migration System
Open WebUI uses Alembic for database migrations:
graph LR
A[Migration Scripts] --> B[Alembic]
B --> C[Database Schema]
C --> D[Model Definitions]
D --> E[Application]
Migration files are located in:
backend/open_webui/migrations/versions/
Sources: backend/open_webui/migrations/versions/7e5b5dc7342b_init.py:1-500
Summary
The Open WebUI data model layer provides a robust foundation for:
- User Management: Complete user lifecycle including authentication and authorization
- Chat Persistence: Flexible JSON-based chat storage with normalized message tables
- Knowledge Management: RAG-capable knowledge bases for document retrieval
- File Handling: Secure file upload and storage with metadata tracking
- Access Control: Fine-grained permissions through groups and resource grants
The architecture prioritizes:
- Performance: Async database operations and message normalization
- Flexibility: JSON-based storage for variable content structures
- Security: Text sanitization and access control enforcement
- Extensibility: Modular model design for future features
Sources: [backend/open_webui/internal/db.py:1-50]()
API Routers
Related topics: Architecture Overview, Data Models
Overview
The Open WebUI project implements a comprehensive API routing architecture built on FastAPI. API Routers serve as the primary mechanism for organizing and exposing RESTful endpoints across the application. Each router encapsulates a specific functional domain (e.g., authentication, chat management, file handling, knowledge bases) and is mounted at a defined prefix under the /api/v1/ base path.
The router architecture follows a modular design pattern where related endpoints are grouped into dedicated router modules located in backend/open_webui/routers/. This separation of concerns enables maintainability, testability, and clear API boundaries.
Sources: backend/open_webui/main.py:1-60
Router Registration Architecture
Central Router Assembly
All routers are registered in backend/open_webui/main.py using FastAPI's include_router() method. Each router receives a unique URL prefix and OpenAPI tag for documentation and routing purposes.
app.include_router(auths.router, prefix='/api/v1/auths', tags=['auths'])
app.include_router(users.router, prefix='/api/v1/users', tags=['users'])
app.include_router(chats.router, prefix='/api/v1/chats', tags=['chats'])
app.include_router(models.router, prefix='/api/v1/models', tags=['models'])
app.include_router(knowledge.router, prefix='/api/v1/knowledge', tags=['knowledge'])
app.include_router(files.router, prefix='/api/v1/files', tags=['files'])
Sources: backend/open_webui/main.py:35-55
Router Prefix Mapping
| Functional Domain | Router Module | API Prefix | OpenAPI Tag |
|---|---|---|---|
| Authentication | auths | /api/v1/auths | auths |
| User Management | users | /api/v1/users | users |
| Chat Operations | chats | /api/v1/chats | chats |
| Model Management | models | /api/v1/models | models |
| Knowledge Bases | knowledge | /api/v1/knowledge | knowledge |
| File Handling | files | /api/v1/files | files |
| Prompts | prompts | /api/v1/prompts | prompts |
| Tools | tools | /api/v1/tools | tools |
| Skills | skills | /api/v1/skills | skills |
| Memories | memories | /api/v1/memories | memories |
| Folders | folders | /api/v1/folders | folders |
| Groups | groups | /api/v1/groups | groups |
| Functions | functions | /api/v1/functions | functions |
| Evaluations | evaluations | /api/v1/evaluations | evaluations |
| Audio Processing | audio | /api/v1/audio | audio |
| Image Processing | images | /api/v1/images | images |
| Retrieval | retrieval | /api/v1/retrieval | retrieval |
| Configurations | configs | /api/v1/configs | configs |
| Channels | channels | /api/v1/channels | channels |
| Notes | notes | /api/v1/notes | notes |
| Tasks | tasks | /api/v1/tasks | tasks |
| Utils | utils | /api/v1/utils | utils |
| Terminals | terminals | /api/v1/terminals | terminals |
| Automations | automations | /api/v1/automations | automations |
| Calendars | calendar | /api/v1/calendars | calendars |
| SCIM Identity | scim | /api/v1/scim/v2 | scim |
| Analytics | analytics | /api/v1/analytics | analytics |
Sources: backend/open_webui/main.py:35-65
Request Flow and Middleware Pipeline
Middleware Stack
The API request lifecycle involves multiple middleware layers that process requests before they reach individual route handlers.
graph TD
A[HTTP Request] --> B[ASGI Middleware]
B --> C[Authentication Middleware]
C --> D[Token Extraction<br/>API Key/Cookie/Bearer]
D --> E[Audit Logging Middleware<br/>Conditional]
E --> F[Pipeline Inlet Filter]
F --> G[Route Handler]
G --> H[Pipeline Outlet Filter]
H --> I[Response]
Sources: backend/open_webui/utils/asgi_middleware.py:1-30
Authentication Middleware
The ASGI middleware (asgi_middleware.py) handles credential extraction from multiple sources:
- Bearer Token: Extracted from the Authorization header
- Cookie Token: Retrieved from the token cookie
- API Key: Retrieved from the custom header specified by the CUSTOM_API_KEY_HEADER environment variable
The extracted credentials are stored in request.state.token for downstream route handlers.
Sources: backend/open_webui/utils/asgi_middleware.py:12-40
Pipeline Filter System
The pipelines.py module implements a filter system that allows middleware-like processing at the inlet and outlet of request handling. This enables transformation and validation of payloads through user-defined pipeline stages.
def get_sorted_filters(model_id, models):
filters = [
model
for model in models.values()
if 'pipeline' in model
and 'type' in model['pipeline']
and model['pipeline']['type'] == 'filter'
and (
model['pipeline']['pipelines'] == ['*']
or any(model_id == target_model_id for target_model_id in model['pipeline']['pipelines'])
)
]
sorted_filters = sorted(filters, key=lambda x: x['pipeline']['priority'])
return sorted_filters
Sources: backend/open_webui/routers/pipelines.py:30-45
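Usage of the function above with two hypothetical wildcard filters (the model entries and the "llama3" id are made up for illustration) shows the ordering guarantee:

models = {
    "guard": {"pipeline": {"type": "filter", "pipelines": ["*"], "priority": 5}},
    "logger": {"pipeline": {"type": "filter", "pipelines": ["*"], "priority": 0}},
}
print([m["pipeline"]["priority"] for m in get_sorted_filters("llama3", models)])
# [0, 5] -- filters run in ascending priority order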
Router Module Structure
Standard Router Pattern
Each router module follows a consistent pattern:
from fastapi import APIRouter, Depends, HTTPException, Request, status
from pydantic import BaseModel
from typing import Optional
from open_webui.utils.auth import get_verified_user, get_admin_user
router = APIRouter()
class EndpointForm(BaseModel):
    # Request payload schema (fields omitted in this excerpt)
    ...
@router.post('/endpoint')
async def endpoint_handler(
request: Request,
form_data: EndpointForm,
user=Depends(get_verified_user)
):
    # Handler implementation (omitted in this excerpt)
    ...
Sources: backend/open_webui/routers/prompts.py:1-30
Authentication Dependencies
| Dependency | Purpose | Access Level |
|---|---|---|
get_verified_user | Validates authenticated user | Authenticated users |
get_admin_user | Validates admin privileges | Admin only |
Sources: backend/open_webui/routers/prompts.py:25-30
Core Router Modules
Tasks Router
The tasks router (tasks.py) handles asynchronous operations for chat-related tasks including title generation, follow-up generation, query generation, and image prompt generation.
Task Types Available:
| Task | Purpose | Template Function |
|---|---|---|
| Title Generation | Create chat titles | title_generation_template() |
| Follow-up Generation | Generate follow-up questions | follow_up_generation_template() |
| Query Generation | Create search queries | query_generation_template() |
| Image Prompt Generation | Generate image prompts | image_prompt_generation_template() |
| Autocomplete | Autocomplete suggestions | autocomplete_generation_template() |
| Tags Generation | Generate content tags | tags_generation_template() |
| Emoji Generation | Generate emoji suggestions | emoji_generation_template() |
| MoA Response | Mixture of Agents response | moa_response_generation_template() |
Sources: backend/open_webui/routers/tasks.py:1-40
Prompts Router
The prompts router manages user-defined prompt templates with command-based activation. It implements access control based on user roles and resource grants.
Access Control Logic:
write_access=(
(user.role == 'admin' and BYPASS_ADMIN_ACCESS_CONTROL)
or user.id == prompt.user_id
or await AccessGrants.has_access(
user_id=user.id,
resource_type='prompt',
resource_id=prompt.id,
permission='write',
db=db,
)
)
Sources: backend/open_webui/routers/prompts.py:50-70
Conditional Router Loading
Some routers are conditionally loaded based on configuration flags:
SCIM Router
The SCIM 2.0 router for identity management is enabled via the ENABLE_SCIM environment variable:
if ENABLE_SCIM:
app.include_router(scim.router, prefix='/api/v1/scim/v2', tags=['scim'])
Analytics Router
The analytics router is loaded when admin analytics are enabled:
if ENABLE_ADMIN_ANALYTICS:
app.include_router(analytics.router, prefix='/api/v1/analytics', tags=['analytics'])
Audit Logging Middleware
Audit logging is conditionally applied based on the AUDIT_LOG_LEVEL configuration:
try:
audit_level = AuditLevel(AUDIT_LOG_LEVEL)
except ValueError as e:
logger.error(f'Invalid audit level: {AUDIT_LOG_LEVEL}. Error: {e}')
audit_level = AuditLevel.NONE
if audit_level != AuditLevel.NONE:
app.add_middleware(
AuditLoggingMiddleware,
audit_level=audit_level,
excluded_paths=AUDIT_EXCLUDED_PATHS,
)
Sources: backend/open_webui/main.py:55-70
Utility Functions and Helpers
Middleware Utility Imports
The middleware.py module aggregates utility functions from multiple sources for use by route handlers:
from open_webui.utils.chat import generate_chat_completion
from open_webui.utils.task import get_task_model_id, rag_template
from open_webui.utils.tools import get_tools, get_terminal_tools
from open_webui.utils.misc import (
deep_update, extract_urls, get_message_list,
add_or_update_system_message, merge_system_messages
)
from open_webui.utils.files import (
convert_markdown_base64_images,
get_file_url_from_base64,
get_image_base64_from_url,
)
Sources: backend/open_webui/utils/middleware.py:1-35
Security Architecture
Token-Based Authentication
sequenceDiagram
participant C as Client
participant M as ASGI Middleware
participant R as Route Handler
C->>M: Request + Credentials
M->>M: Extract Bearer/Cookie/API-Key
M->>R: Set request.state.token
R->>R: Verify with get_verified_user
alt Invalid Token
R-->>C: 401 Unauthorized
else Valid Token
R->>R: Process Request
R-->>C: Response
end
Sources: backend/open_webui/utils/asgi_middleware.py:20-50
Frontend API Integration
The frontend TypeScript codebase in src/lib/apis/ provides typed interfaces for all major routers:
| Router Domain | Frontend Module |
|---|---|
| Knowledge Bases | src/lib/apis/knowledge/index.ts |
| Skills | src/lib/apis/skills/index.ts |
| OpenAI Config | src/lib/apis/openai/index.ts |
| Tool Servers | src/lib/apis/index.ts |
The frontend uses WEBUI_API_BASE_URL constant (${WEBUI_BASE_URL}/api/v1) as the base for all API calls.
Sources: src/lib/constants.ts:1-20
Summary
The API Routers system in Open WebUI implements a well-organized, FastAPI-based architecture with:
- Modular Design: 26+ functional router modules organized by domain
- Consistent Patterns: Standardized router structure with Pydantic models and authentication dependencies
- Middleware Pipeline: Request processing through ASGI middleware, authentication, audit logging, and pipeline filters
- Conditional Loading: Feature flags for SCIM, analytics, and audit logging
- Access Control: Role-based and grant-based authorization at the router and endpoint levels
- Frontend Integration: TypeScript API clients aligned with backend router structure
Sources: [backend/open_webui/main.py:1-60]()
Retrieval System
Related topics: Ollama Integration, RAG Pipeline
The Retrieval System in Open WebUI is a comprehensive framework for document loading, web searching, and vector-based information retrieval. It enables users to ingest documents, perform web searches, and leverage retrieval-augmented generation (RAG) capabilities to enhance LLM responses with contextual information.
Architecture Overview
The retrieval system is composed of three primary subsystems:
graph TD
subgraph Retrieval["Retrieval System"]
subgraph Loaders["Document Loaders"]
PDF[PDF Loader]
OCR[OCR Loaders]
WebLoader[Web Loader]
end
subgraph WebSearch["Web Search Providers"]
SearXNG[SearXNG]
DuckDuckGo[DuckDuckGo]
GooglePSE[Google PSE]
Brave[Brave Search]
YouDC[You.com]
end
subgraph VectorDB["Vector Stores"]
Chroma[Chroma]
FAISS[FAISS]
Milvus[Milvus]
Qdrant[Qdrant]
PGVector[pgvector]
end
end
API[API Router] --> Loaders
API --> WebSearch
API --> VectorDB
Core Components
| Component | Purpose | Location |
|---|---|---|
| Document Loaders | Ingest various file formats into the system | backend/open_webui/retrieval/loaders/ |
| Web Search | Query external search engines for information | backend/open_webui/retrieval/web/ |
| Vector Database | Store and query embeddings for semantic search | backend/open_webui/retrieval/vector/ |
| API Router | Expose retrieval endpoints to the frontend | backend/open_webui/routers/retrieval.py |
Document Loaders
The document loader subsystem handles ingestion of various file formats into the retrieval pipeline.
Supported File Types
The system supports the following file types for upload and processing:
| Category | MIME Types |
|---|---|
| Documents | PDF, EPUB, DOCX, TXT, CSV, XML, HTML, Markdown |
| Code | Python, JavaScript, CSS |
| Audio | MP3, WAV |
| Images | PNG, JPG (with OCR) |
Sources: src/lib/constants.ts
OCR Processing
For scanned documents and images, Open WebUI supports multiple OCR engines:
PaddleOCR VL is one of the supported OCR backends. It processes documents page-by-page, extracting text and returning structured Document objects with metadata.
# Processing flow in paddleocr_vl.py
for i, page in enumerate(doc):
markdown_text = run_paddle_ocr(page)
cleaned_content = clean_markdown(markdown_text)
documents.append(
Document(
page_content=cleaned_content,
metadata={
'page': i,
'page_label': i + 1,
'total_pages': total_pages,
'file_name': self.file_name,
'processing_engine': 'paddleocr-vl',
}
)
)
Sources: backend/open_webui/retrieval/loaders/paddleocr_vl.py
Configuration Options
The retrieval loaders are configured through the following environment variables:
| Variable | Description |
|---|---|
EXTERNAL_DOCUMENT_LOADER_URL | URL for external document loader service |
EXTERNAL_DOCUMENT_LOADER_API_KEY | API key for external loader |
TIKA_SERVER_URL | Apache Tika server endpoint |
DOCLING_SERVER_URL | Docling OCR server endpoint |
DOCLING_API_KEY | API key for Docling service |
DOCLING_PARAMS | Additional Docling parameters |
PDF_EXTRACT_IMAGES | Enable image extraction from PDFs |
PDF_LOADER_MODE | PDF loading mode configuration |
DOCUMENT_INTELLIGENCE_ENDPOINT | Azure Document Intelligence endpoint |
DOCUMENT_INTELLIGENCE_KEY | Azure Document Intelligence API key |
DOCUMENT_INTELLIGENCE_MODEL | Model identifier for document processing |
MISTRAL_OCR_API_BASE_URL | Mistral OCR API base URL |
MISTRAL_OCR_API_KEY | Mistral OCR API key |
PADDLEOCR_VL_BASE_URL | PaddleOCR VL server URL |
PADDLEOCR_VL_TOKEN | Authentication token for PaddleOCR VL |
MINERU_API_MODE | MinerU API mode |
MINERU_API_URL | MinerU API endpoint |
MINERU_API_KEY | MinerU API key |
MINERU_API_TIMEOUT | MinerU API timeout in seconds |
Sources: backend/open_webui/retrieval/utils.py
Web Search
The web search subsystem provides integration with multiple search providers for retrieving up-to-date information from the internet.
Supported Providers
| Provider | Implementation | Features |
|---|---|---|
| SearXNG | Self-hosted meta-search engine | Privacy-focused, aggregated results |
| DuckDuckGo | Public search API | No API key required |
| Google PSE | Google Programmable Search | Requires API key |
| Brave Search | Privacy-focused search | API-based |
| You.com | AI-enhanced search | Rich snippets and descriptions |
| Tavily | AI-optimized search | Structured outputs |
| Perplexity | LLM-optimized search | Citations included |
Search Result Structure
Search results are normalized into a common SearchResult format:
from dataclasses import dataclass

@dataclass
class SearchResult:
link: str # URL of the result
title: str # Title of the page
snippet: str # Text snippet/summary
#### You.com Implementation
The You.com provider demonstrates the search result normalization:
def _build_snippet(result: dict) -> str:
"""Combine the description and snippets list into a single string."""
parts: list[str] = []
description = result.get('description')
if description:
parts.append(description)
snippets = result.get('snippets')
if snippets and isinstance(snippets, list):
parts.extend(snippets)
return '\n\n'.join(parts)
Sources: backend/open_webui/retrieval/web/ydc.py
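For example, with a hypothetical result payload:

result = {
    "description": "Python is a programming language.",
    "snippets": ["Official site", "Downloads and docs"],
}
print(_build_snippet(result))
# Python is a programming language.
#
# Official site
#
# Downloads and docs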
Web Loader Configuration
The web loader for content extraction supports the following configuration:
| Setting | Description |
|---|---|
ENABLE_WEB_LOADER_SSL_VERIFICATION | Enable SSL certificate verification |
WEB_LOADER_CONCURRENT_REQUESTS | Rate limiting for concurrent requests |
WEB_SEARCH_TRUST_ENV | Trust environment variables for requests |
BYPASS_WEB_SEARCH_WEB_LOADER | Skip content extraction, use snippets only |
BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL | Skip embedding and retrieval stages |
Sources: backend/open_webui/routers/retrieval.py
Web Search Flow
sequenceDiagram
participant Client
participant API as /api/v1/retrieval/web/search
participant SearchProvider as Search Provider
participant WebLoader as Web Loader
participant VectorDB as Vector Store
Client->>API: POST /search {query, urls}
API->>SearchProvider: Execute search queries
SearchProvider-->>API: Raw search results
API->>WebLoader: Extract content from URLs
WebLoader-->>API: Document objects
API->>VectorDB: Store documents
VectorDB-->>API: Collection confirmation
API-->>Client: {status, collection_name, files}
Vector Database Integration
The vector database subsystem handles storage and retrieval of document embeddings for semantic search.
Supported Vector Stores
| Database | Implementation | Use Case |
|---|---|---|
| Chroma | chromadb | Lightweight, local-first |
| FAISS | faiss-cpu/faiss-gpu | Large-scale similarity search |
| Milvus | pymilvus | Cloud-native, scalable |
| Qdrant | qdrant-client | High-performance, hybrid search |
| pgvector | psycopg2 | PostgreSQL extension for vectors |
Vector Factory Pattern
The system uses a factory pattern to instantiate vector databases:
class VectorStoreFactory:
@staticmethod
def get_vector_store(config: Config) -> VectorStore:
provider = config.VECTOR_DB
if provider == "chromadb":
return ChromaDBStore()
elif provider == "pgvector":
return PGVectorStore()
# ... other providers
Sources: backend/open_webui/retrieval/vector/factory.py
pgvector Implementation
For PostgreSQL-based vector storage:
class PGVectorStore:
    def __init__(self, connection_string: str, embedding_dim: int = 1536):
        self.conn = psycopg2.connect(connection_string)
        self.embedding_dim = embedding_dim

    def insert(self, collection: str, documents: list[Document]):
        # Insert vectors with pgvector extension
        ...
Sources: backend/open_webui/retrieval/vector/dbs/pgvector.py
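To make the elided insert concrete, here is a hedged sketch of what a pgvector write can look like. The table and column names are assumptions for illustration, not the project's actual schema:

import json

def insert_chunks(conn, collection: str, texts: list[str], embeddings: list[list[float]]):
    """Store each text chunk with its embedding via the pgvector extension.

    Assumes a table created roughly as:
      CREATE TABLE document_chunk (
          collection_name TEXT, text TEXT, vector VECTOR(1536)
      );
    """
    with conn.cursor() as cur:
        for text, emb in zip(texts, embeddings):
            cur.execute(
                'INSERT INTO document_chunk (collection_name, text, vector) '
                'VALUES (%s, %s, %s::vector)',
                (collection, text, json.dumps(emb)),  # '[0.1, ...]' casts to vector
            )
    conn.commit()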
API Endpoints
The retrieval system exposes REST API endpoints through the router.
Web Search Endpoint
POST /api/v1/retrieval/web/search
Request Body:
{
"query": "search query string",
"collection_name": "optional_collection",
"retrieval_enabled": true,
"k": 5
}
Response:
{
"status": true,
"collection_name": "web_20240115_abc123",
"filenames": ["python.org", "wikipedia.org"],
"content": "extracted content...",
"sources": [
{"url": "https://python.org", "content": "..."}
]
}
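A client-side sketch of the request/response pair above; the base URL and token are placeholders:

import requests

resp = requests.post(
    'http://localhost:8080/api/v1/retrieval/web/search',
    headers={'Authorization': 'Bearer <token>'},
    json={'query': 'python 3.13 release notes', 'k': 5},
    timeout=60,
)
data = resp.json()
# Fields follow the response shape documented above.
print(data['collection_name'], data['filenames'])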
File Upload and Processing
POST /api/v1/retrieval/upload
Handles file uploads, runs document loaders, and stores in the configured vector database.
Sources: backend/open_webui/routers/retrieval.py
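A hedged client sketch for the upload endpoint; the multipart field name `file` is an assumption:

import requests

with open('report.pdf', 'rb') as fh:
    resp = requests.post(
        'http://localhost:8080/api/v1/retrieval/upload',
        headers={'Authorization': 'Bearer <token>'},
        files={'file': ('report.pdf', fh, 'application/pdf')},  # assumed field name
        timeout=120,
    )
print(resp.status_code)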
Configuration Reference
Environment Variables
| Variable | Default | Description |
|---|---|---|
VECTOR_DB | chroma | Vector database provider |
RAG_TOP_K | 5 | Number of top results to retrieve |
RAG_RELEVANCE_THRESHOLD | 0.0 | Minimum relevance score threshold |
WEB_SEARCH_ENABLED | True | Enable web search functionality |
Frontend API URLs
The frontend communicates with these API base URLs:
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
Sources: src/lib/constants.ts
Data Flow
graph LR
subgraph Input["Input Sources"]
Files[Uploaded Files]
WebSearch[Web Search]
URLs[Direct URLs]
end
subgraph Processing["Processing Pipeline"]
Loaders[Document Loaders]
Chunks[Text Chunking]
Embed[Embedding Model]
end
subgraph Storage["Storage"]
Vector[Vector Store]
Meta[Metadata Store]
end
subgraph Query["Query Processing"]
QueryEmb[Query Embedding]
Similarity[Similarity Search]
Rerank[Reranking]
end
Files --> Loaders
WebSearch --> Loaders
URLs --> Loaders
Loaders --> Chunks
Chunks --> Embed
Embed --> Vector
Query --> QueryEmb
QueryEmb --> Similarity
Similarity --> Rerank
Rerank --> Context[LLM Context]
Error Handling
The retrieval system implements comprehensive error handling:
| Error Type | HTTP Code | Message |
|---|---|---|
| Web search failure | 400 | WEB_SEARCH_ERROR with exception details |
| No results found | 404 | No results found from web search |
| Loader failure | 500 | Loader-specific error message |
| Vector store error | 500 | Database connection or query errors |
Sources: backend/open_webui/routers/retrieval.py:1-50
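A minimal sketch of how the table above maps to FastAPI exceptions; the wrapper function is hypothetical, not the router's actual code:

from fastapi import HTTPException

def run_web_search(search_provider, query: str):
    """search_provider is any object with a .search(query) method."""
    try:
        results = search_provider.search(query)
    except Exception as e:
        # Provider failures surface as 400 with the exception details
        raise HTTPException(status_code=400, detail=f'WEB_SEARCH_ERROR: {e}')
    if not results:
        raise HTTPException(status_code=404, detail='No results found from web search')
    return results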
Extension Points
The retrieval system is designed for extensibility:
- Custom Document Loaders: Implement the DocumentLoader interface in loaders/
- New Search Providers: Add a provider class in web/ following the SearchProvider protocol
- Vector Store Adapters: Implement the VectorStore abstract class in vector/dbs/
- Embedding Models: Configure through the EMBEDDING_MODEL setting
Sources: [src/lib/constants.ts](https://github.com/open-webui/open-webui/blob/main/src/lib/constants.ts)
Frontend Structure
Related topics: Chat Interface, Architecture Overview
Overview
Open WebUI uses a modern SvelteKit-based frontend architecture built with TypeScript. The frontend is responsible for the user interface, real-time chat interactions, multimedia handling, and communication with the backend API. The application runs as a Single Page Application (SPA) with server-side rendering capabilities provided by SvelteKit.
Technology Stack
| Layer | Technology |
|---|---|
| Framework | SvelteKit |
| Language | TypeScript |
| Styling | CSS (with custom properties) |
| State Management | Svelte Stores |
| API Communication | Fetch API |
| Internationalization | i18n module |
| Code Highlighting | Shiki |
| Build Tool | Vite (via SvelteKit) |
Sources: src/lib/constants.ts:1
Chat Interface
Related topics: Ollama Integration, Frontend Structure
Overview
The Chat Interface is the core user-facing component of Open WebUI, providing an interactive environment for conversations with AI models. It handles message composition, response rendering, conversation state management, and integration with various backend services including Ollama, OpenAI-compatible APIs, and code execution engines.
The interface is built with SvelteKit on the frontend and Python/FastAPI on the backend, enabling real-time streaming responses, multi-model conversations, and rich content rendering including markdown, code blocks, and embedded media.
Architecture Overview
graph TD
subgraph Frontend["Frontend (Svelte)"]
Chat[Chat.svelte]
MessageInput[MessageInput.svelte]
Message[Message.svelte]
Markdown[Markdown.svelte]
ModelSelector[ModelSelector.svelte]
Navbar[Navbar.svelte]
end
subgraph StateManagement["State Management"]
Stores[index.ts - Svelte Stores]
end
subgraph Backend["Backend (Python/FastAPI)"]
ChatModel[models/chats.py]
Config[config.py]
Middleware[middleware.py]
end
Chat --> Stores
MessageInput --> Stores
Message --> Stores
Chat --> MessageInput
Chat --> Message
Message --> Markdown
Stores --> ChatModel
Stores --> Middleware
State Management
The chat interface relies heavily on Svelte stores for reactive state management. These stores maintain the current conversation state, UI visibility flags, and application-wide settings.
Core Chat Stores
All chat-related state is managed through Svelte writable stores defined in src/lib/stores/index.ts:
| Store | Type | Purpose |
|---|---|---|
chatId | Writable<string> | Current active chat identifier |
chatTitle | Writable<string> | Title of the current chat |
chats | Writable<null> | Cached chat objects |
pinnedChats | Writable<Chat[]> | Pinned conversations |
models | Writable<Model[]> | Available AI models |
chatRequestQueues | Writable<Record<string, QueueItem[]>> | Request queue management |
Sources: src/lib/stores/index.ts:53-58
UI Visibility Stores
The interface uses boolean stores to control component visibility:
| Store | Type | Purpose |
|---|---|---|
showSidebar | Writable<boolean> | Sidebar visibility |
showSettings | Writable<boolean> | Settings panel visibility |
showShortcuts | Writable<boolean> | Keyboard shortcuts overlay |
showControls | Writable<boolean> | Chat controls visibility |
showEmbeds | Writable<boolean> | Embedded content display |
showArtifacts | Writable<boolean> | Code artifacts panel |
Sources: src/lib/stores/index.ts:22-30
Audio and Transcription Stores
| Store | Type | Purpose |
|---|---|---|
audioQueue | Writable<AudioQueue \| null> | TTS audio queue |
TTSWorker | Writable<Worker \| null> | Text-to-speech web worker |
Message Processing Pipeline
Content Sanitization
Before rendering, message content undergoes sanitization to prevent XSS attacks and normalize special tokens:
export const sanitizeResponseContent = (content: string) => {
  return content
    .replace(/<\|[a-z]*$/, '')
    .replace(/<\|[a-z]+\|$/, '')
    .replace(/<$/, '')
    .replaceAll(/<\|[a-z]+\|>/g, ' ')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .trim();
};
Sources: src/lib/utils/index.ts:180-189
Content Processing for Chinese Text
The system includes special handling for Chinese content to address markdown and LaTeX formatting issues:
function processChineseContent(content: string): string {
  if (!/[\u4e00-\u9fa5]/.test(content)) return content;
  const lines = content.split('\n');
  const processedLines = lines.map((line) => {
    // Chinese-specific processing logic (elided in the source excerpt)
    return line;
  });
  return processedLines.join('\n');
}
Sources: src/lib/utils/index.ts:195-208
Sentence and Paragraph Extraction
For audio processing (text-to-speech), messages are split into appropriate segments:
export const extractSentencesForAudio = (text: string) => {
  return extractSentences(text).reduce((mergedTexts, currentText) => {
    const lastIndex = mergedTexts.length - 1;
    if (lastIndex >= 0) {
      const previousText = mergedTexts[lastIndex];
      const wordCount = previousText.split(/\s+/).length;
      const charCount = previousText.length;
      if (wordCount < 4 || charCount < 50) {
        // Merge short segments so TTS does not receive tiny fragments
        mergedTexts[lastIndex] = previousText + ' ' + currentText;
      } else {
        mergedTexts.push(currentText);
      }
    } else {
      // First sentence seeds the accumulator
      mergedTexts.push(currentText);
    }
    return mergedTexts;
  }, []);
};
Sources: src/lib/utils/index.ts:300-319
Chat Data Models
Backend Chat Model
The backend defines chat structures in backend/open_webui/models/chats.py:
class ChatModel:
    async def get_message_list(self, id: str) -> Optional[dict]:
        """Message map for walking history.

        Prefer chat_message rows to avoid loading the large chat
        JSON blob; fall back to embedded history when no rows exist
        (legacy chats).
        """
        messages_map = await ChatMessages.get_messages_map_by_chat_id(id)
        if messages_map is not None:
            return messages_map

        # Fall back to embedded JSON blob for legacy chats
        chat = await self.get_chat_by_id(id)
        if chat is None:
            return None
        return chat.chat.get('history', {}).get('messages', {}) or {}
Sources: backend/open_webui/models/chats.py:1-25
Message Structure
Messages support both normalized storage (via chat_message rows) and legacy embedded JSON format:
| Field | Type | Description |
|---|---|---|
id | string | Unique message identifier |
parentId | string \| null | Parent message ID for threading |
childrenIds | string[] | Child message IDs |
role | user \| assistant | Message author role |
content | string | Message content |
model | string | Model used for assistant responses |
timestamp | number | Unix timestamp of creation |
done | boolean | Whether response is complete |
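Because parentId and childrenIds form a tree, a linear conversation is reconstructed by walking parent links from a leaf. The helper below is a hypothetical sketch over the message map returned by get_message_list:

def build_thread(messages: dict, leaf_id: str) -> list[dict]:
    """Walk parentId links from a leaf message back to the root."""
    thread = []
    current = messages.get(leaf_id)
    while current is not None:
        thread.append(current)
        parent_id = current.get('parentId')
        current = messages.get(parent_id) if parent_id else None
    return list(reversed(thread))  # root-to-leaf order for rendering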
Configuration and Prompt Templates
Voice Mode Configuration
Voice mode settings are configurable via environment variables:
| Config Key | Environment Variable | Default | Description |
|---|---|---|---|
ENABLE_VOICE_MODE_PROMPT | ENABLE_VOICE_MODE_PROMPT | True | Enable voice mode prompt |
VOICE_MODE_PROMPT_TEMPLATE | VOICE_MODE_PROMPT_TEMPLATE | '' | Custom voice prompt template |
Sources: backend/open_webui/config.py:1-20
Code Interpreter Configuration
The chat interface integrates code execution capabilities:
| Config Key | Environment Variable | Default | Description |
|---|---|---|---|
ENABLE_CODE_EXECUTION | ENABLE_CODE_EXECUTION | True | Enable code execution |
CODE_EXECUTION_ENGINE | CODE_EXECUTION_ENGINE | pyodide | Execution engine (pyodide/jupyter) |
CODE_EXECUTION_JUPYTER_URL | CODE_EXECUTION_JUPYTER_URL | '' | Jupyter server URL |
CODE_EXECUTION_JUPYTER_AUTH | CODE_EXECUTION_JUPYTER_AUTH | '' | Jupyter authentication |
Sources: backend/open_webui/config.py:35-60
Prompt Generation Templates
The system uses configurable prompt templates for various tasks:
| Template | Purpose |
|---|---|
DEFAULT_MOA_GENERATION_PROMPT_TEMPLATE | Multi-model answer synthesis |
IMAGE_PROMPT_GENERATION_PROMPT_TEMPLATE | Image generation prompt creation |
FOLLOW_UP_GENERATION_PROMPT_TEMPLATE | Suggesting follow-up questions |
Code Interpreter Integration
Backend Middleware Rendering
The backend middleware handles code interpreter rendering in the streaming response pipeline:
elif item_type == 'open_webui:code_interpreter':
    # Code interpreter needs to inspect/mutate prior accumulated content
    # to strip trailing unclosed code fences
    content = '\n'.join(parts)
    content_stripped, original_whitespace = split_content_and_whitespace(content)
    if is_opening_code_block(content_stripped):
        content = content_stripped.rstrip('`').rstrip() + original_whitespace
    else:
        content = content_stripped + original_whitespace
    # Render as ...
# Ollama Integration
## Overview
The Ollama Integration is a core component of Open WebUI that enables seamless communication between the frontend application and local Ollama instances. This integration provides a unified interface for managing, accessing, and interacting with LLM models hosted locally through Ollama, supporting both native Ollama API calls and OpenAI-compatible endpoints.
Ollama serves as the primary backend inference engine for Open WebUI, allowing users to run large language models entirely on their local hardware without relying on cloud-based services.
## Architecture Overview
The Ollama Integration follows a proxy pattern where the backend server acts as an intermediary, forwarding requests from the frontend to Ollama instances while applying access controls, model routing, and API transformations.
graph TD
subgraph Frontend["Frontend (Svelte)"]
UI[User Interface]
API_CLIENT[API Client<br/>src/lib/apis/ollama/index.ts]
end
subgraph Backend["Backend Server (Python/FastAPI)"]
OLLAMA_ROUTER[Ollama Router<br/>routers/ollama.py]
OPENAI_ROUTER[OpenAI Router<br/>routers/openai.py]
MODEL_UTILS[Model Utilities<br/>utils/models.py]
CONFIG[Configuration<br/>config.py]
end
subgraph OllamaInstances["Ollama Instances"]
OLLAMA_LOCAL[Local Ollama<br/>localhost:11434]
OLLAMA_CUSTOM[Custom Ollama<br/>Configured URLs]
end
UI --> API_CLIENT
API_CLIENT -->|HTTP Requests| OLLAMA_ROUTER
API_CLIENT -->|OpenAI-compatible| OPENAI_ROUTER
OLLAMA_ROUTER --> MODEL_UTILS
OLLAMA_ROUTER --> CONFIG
OLLAMA_ROUTER -->|Native API| OLLAMA_LOCAL
OLLAMA_ROUTER -->|Native API| OLLAMA_CUSTOM
OPENAI_ROUTER -->|v1/chat/completions| OLLAMA_LOCAL
OPENAI_ROUTER -->|v1/chat/completions| OLLAMA_CUSTOM
style Frontend fill:#e1f5fe
style Backend fill:#f3e5f5
style OllamaInstances fill:#fff3e0
## Core Components
### Backend Router (routers/ollama.py)
The Ollama router (`backend/open_webui/routers/ollama.py`) handles all native Ollama API operations. It provides endpoints for model management, chat completions, and model operations.
**Primary Endpoints:**
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/chat` | POST | Send chat completion requests |
| `/api/generate` | POST | Generate text with model |
| `/api/tags` | GET | List available models |
| `/api/pull` | POST | Pull a new model |
| `/api/push` | POST | Push a model to registry |
| `/api/delete` | DELETE | Delete a model |
| `/api/create` | POST | Create a new model |
| `/config` | GET/POST | Get/update Ollama configuration |
| `/verify` | POST | Verify connection to Ollama |
| `/v1/chat/completions` | POST | OpenAI-compatible chat endpoint |
| `/v1/models` | GET | OpenAI-compatible models list |
| `/v1/messages` | POST | Anthropic-compatible messages endpoint |
| `/v1/responses` | POST | Ollama Responses API endpoint |
Sources: [backend/open_webui/routers/ollama.py:1-500](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
### Model Resolution and URL Selection
The system supports multiple Ollama instances through a URL index system. When a request is made, the router resolves the appropriate Ollama instance based on model configuration.
sequenceDiagram
participant Client
participant Router
participant Config
participant Ollama
Client->>Router: POST /api/chat {model: "llama2"}
Router->>Config: get_ollama_url(model, url_idx)
Config->>Config: Check model-to-URL mapping
Config->>Config: Check url_idx or default
Config-->>Router: (url, url_idx)
Router->>Ollama: Forward request to url
Ollama-->>Router: Response
Router-->>Client: Forwarded response
The `get_ollama_url` function performs the following resolution logic:
1. If `url_idx` is provided, use the corresponding URL from `OLLAMA_BASE_URLS`
2. Check model-specific URL mappings stored in `OLLAMA_MODELS`
3. Fall back to the primary `OLLAMA_BASE_URL`
Sources: [backend/open_webui/routers/ollama.py:100-200](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
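A simplified restatement of that resolution order; the function name, state layout, and mapping shape here are assumptions, not the router's exact code:

def resolve_ollama_url(state, model: str, url_idx: int | None = None):
    base_urls = state.OLLAMA_BASE_URLS          # configured instance URLs
    if url_idx is not None:                     # 1. explicit index wins
        return base_urls[url_idx], url_idx
    entry = state.OLLAMA_MODELS.get(model)      # 2. model-specific mapping
    if entry and entry.get('urls'):
        url = entry['urls'][0]
        return url, base_urls.index(url)
    return base_urls[0], 0                      # 3. primary fallback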
### Prefix ID Handling
For multi-tenant deployments, the system supports `prefix_id` configuration. When a prefix is configured, model names are automatically transformed:
prefix_id = api_config.get('prefix_id', None)
if prefix_id:
    payload['model'] = payload['model'].replace(f'{prefix_id}.', '')
This lets models be exposed under prefixed names in Open WebUI (e.g., `tenant1.llama2`) while the backend strips the prefix before forwarding, so the Ollama API receives the short model name (e.g., `llama2`).
Sources: [backend/open_webui/routers/ollama.py:200-220](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Configuration
### Environment Variables
The Ollama integration is configured through environment variables in `backend/open_webui/config.py`:
| Variable | Default | Description |
|----------|---------|-------------|
| `ENABLE_OLLAMA_API` | `True` | Enable/disable Ollama API |
| `OLLAMA_API_BASE_URL` | `http://localhost:11434/api` | Primary Ollama API URL |
| `OLLAMA_BASE_URL` | Auto-derived | Base URL for Ollama connections |
| `USE_OLLAMA_DOCKER` | `false` | Use all-in-one Docker container |
| `K8S_FLAG` | Empty | Kubernetes deployment flag |
ENABLE_OLLAMA_API = PersistentConfig(
    'ENABLE_OLLAMA_API',
    'ollama.enable',
    os.environ.get('ENABLE_OLLAMA_API', 'True').lower() == 'true',
)

OLLAMA_API_BASE_URL = os.environ.get('OLLAMA_API_BASE_URL', 'http://localhost:11434/api')
Sources: [backend/open_webui/config.py:1-100](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
### Port Fallback Resolution
The configuration includes automatic port fallback logic for environments where the default Ollama port (11434) might be blocked:
def _resolve_ollama_base_url(url: str) -> str:
    """If the default Ollama port (11434) is unreachable, try the fallback port (12434)."""
    # Checks port 11434 first, then falls back to 12434 if unreachable
    ...
This enables seamless operation in environments like certain corporate networks or containerized setups where only specific ports are accessible.
Sources: [backend/open_webui/config.py:50-80](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
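A hedged sketch of what such a probe can look like; the helper names and the exact fallback behaviour are assumptions, not the project's implementation:

import socket
from urllib.parse import urlparse

def _port_reachable(url: str, timeout: float = 1.0) -> bool:
    parsed = urlparse(url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port or 11434), timeout):
            return True
    except OSError:
        return False

def resolve_base_url(url: str) -> str:
    # Assumed behaviour: keep the URL if 11434 answers, otherwise swap in 12434.
    if ':11434' in url and not _port_reachable(url):
        return url.replace(':11434', ':12434')
    return url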
### Docker and Kubernetes Handling
The configuration adapts to different deployment scenarios:
if OLLAMA_BASE_URL == '/ollama' and not K8S_FLAG:
    if USE_OLLAMA_DOCKER.lower() == 'true':
        OLLAMA_BASE_URL = 'http://localhost:11434'
    else:
        OLLAMA_BASE_URL = 'http://host.docker.internal:11434'
elif K8S_FLAG:
    OLLAMA_BASE_URL = 'http://ollama-service.open-webui.svc.cluster.local:11434'
Sources: [backend/open_webui/config.py:40-50](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
## API Compatibility Layers
### OpenAI-Compatible API
The Ollama router provides OpenAI-compatible endpoints that translate requests to Ollama's API format:
**Endpoint:** `POST /ollama/v1/chat/completions`
The system transforms OpenAI-format requests into Ollama-native format:
payload = apply_model_params_to_body_openai(params, payload)
payload = await apply_system_prompt_to_body(system, payload, metadata, user)
This transformation includes:
- Converting OpenAI parameter names to Ollama format
- Applying model-specific parameter modifications
- Injecting system prompts from user metadata
Sources: [backend/open_webui/routers/ollama.py:150-180](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
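As an illustration of the parameter translation, a reduced mapping might look like the sketch below; the real function covers many more keys, and the helper here is hypothetical. Note that most sampler names coincide, while OpenAI's `max_tokens` corresponds to Ollama's `num_predict`:

def openai_params_to_ollama_options(params: dict) -> dict:
    """Illustrative subset of the OpenAI-to-Ollama parameter translation."""
    mapping = {
        'temperature': 'temperature',
        'top_p': 'top_p',
        'stop': 'stop',
        'seed': 'seed',
        'max_tokens': 'num_predict',  # Ollama's name for the completion limit
    }
    return {
        ollama_key: params[openai_key]
        for openai_key, ollama_key in mapping.items()
        if openai_key in params
    }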
### Anthropic-Compatible API
Support for Anthropic's `/v1/messages` endpoint is provided through the Responses API:
**Endpoint:** `POST /ollama/v1/messages`
@router.post('/v1/messages')
async def generate_anthropic_messages(
    request: Request,
    form_data: dict,
    url_idx: Optional[int] = None,
    user=Depends(get_verified_user),
):
    """
    Proxy for Ollama's Anthropic-compatible /v1/messages endpoint.
    Forwards the request as-is to the Ollama backend.
    """
The request is forwarded to Ollama's `/v1/responses` endpoint with appropriate streaming headers.
Sources: [backend/open_webui/routers/ollama.py:250-280](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Frontend Integration
### API Client (src/lib/apis/ollama/index.ts)
The frontend provides a TypeScript API client for communicating with the backend Ollama proxy:
**Key Functions:**
| Function | Purpose |
|----------|---------|
| `deleteModel()` | Delete a model from Ollama |
| `pullModel()` | Pull a new model with progress tracking |
| `verifyOllamaConnection()` | Test connectivity to Ollama instance |
| `getOllamaConfig()` | Retrieve current Ollama configuration |
export const pullModel = async (token: string, tagName: string, urlIdx: number | null = null) => {
  const controller = new AbortController();
  const res = await fetch(
    `${OLLAMA_API_BASE_URL}/api/pull${urlIdx !== null ? `/${urlIdx}` : ''}`,
    {
      signal: controller.signal,
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`
      },
      body: JSON.stringify({ name: tagName })
    }
  );
  return res;
};
Sources: [src/lib/apis/ollama/index.ts:1-150](https://github.com/open-webui/open-webui/blob/main/src/lib/apis/ollama/index.ts)
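The same proxy can be exercised from any HTTP client. A hedged Python counterpart, assuming the pull endpoint streams NDJSON progress lines as Ollama does natively (base URL and token are placeholders):

import json
import requests

resp = requests.post(
    'http://localhost:8080/ollama/api/pull',     # placeholder base URL
    headers={'Authorization': 'Bearer <token>'},
    json={'name': 'llama2'},
    stream=True,
    timeout=600,
)
for line in resp.iter_lines():
    if line:
        progress = json.loads(line)
        # Typical Ollama pull fields: status, completed, total
        print(progress.get('status'), progress.get('completed'), progress.get('total'))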
### API Base URL Configuration
Frontend constants define the base URLs for API communication:
export const OLLAMA_API_BASE_URL = `${WEBUI_BASE_URL}/ollama`;
export const WEBUI_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1`;
The system automatically configures the base URL based on environment:
- **Development:** `http://hostname:8080`
- **Production:** Uses the configured domain
Sources: [src/lib/constants.ts:1-30](https://github.com/open-webui/open-webui/blob/main/src/lib/constants.ts)
## Request Flow
graph LR
A[User Request] --> B[Frontend API Client]
B --> C[Backend Router]
C --> D{Request Type?}
D -->|Native Ollama| E[Native API Handler]
D -->|OpenAI Format| F[OpenAI-Compatible Handler]
D -->|Anthropic Format| G[Anthropic-Compatible Handler]
E --> H[Model Resolution]
F --> H
G --> H
H --> I[Access Control Check]
I --> J{Model Access Allowed?}
J -->|Yes| K[Forward to Ollama]
J -->|No| L[HTTP 403 Forbidden]
K --> M[Ollama Instance]
M --> N[Response]
N --> O[Stream/Return to Client]
## Model Management
### Model Registration
Models discovered from Ollama instances are registered in the application state:
app.state.OLLAMA_MODELS = {}
Each model entry contains:
- `urls`: Array of Ollama instance URLs where the model is available
- `details`: Model metadata (size, capabilities, etc.)
Sources: [backend/open_webui/main.py:100-120](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/main.py)
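An illustrative entry shape, following the two-field description above; the exact metadata keys vary by Ollama version and are assumptions here:

# Hypothetical registry entry for a model available on one instance.
app.state.OLLAMA_MODELS = {
    'llama2:latest': {
        'urls': ['http://localhost:11434'],
        'details': {'parameter_size': '7B', 'quantization_level': 'Q4_0'},
    }
}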
### Model Access Control
Before forwarding requests, the system checks user access permissions:
model_info = await Models.get_model_by_id(model_id)
if model_info:
    if model_info.base_model_id:
        payload['model'] = model_info.base_model_id
    await check_model_access(user, model_info)
else:
    await check_model_access(user, None)
Sources: [backend/open_webui/routers/ollama.py:130-150](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Error Handling
### Connection Verification
The system provides a connection verification endpoint for testing Ollama connectivity:
export const verifyOllamaConnection = async (token: string = '', connection: object = {}) => {
  const res = await fetch(`${OLLAMA_API_BASE_URL}/verify`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ ...connection })
  });
  return res;
};
Sources: [src/lib/apis/ollama/index.ts:150-180](https://github.com/open-webui/open-webui/blob/main/src/lib/apis/ollama/index.ts)
### Error Messages
Common error scenarios include:
| Scenario | HTTP Status | Error Message |
|----------|-------------|---------------|
| Ollama API Disabled | 503 | `OLLAMA_API_DISABLED` |
| Model Not Found | 400 | `MODEL_NOT_FOUND` |
| Network Problem | Various | `Ollama: Network Problem` |
| Invalid Config | 500 | `DEFAULT(e)` |
Sources: [backend/open_webui/routers/ollama.py:50-80](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Summary
The Ollama Integration provides a robust, flexible bridge between Open WebUI and local Ollama instances. Key features include:
- **Multi-instance support** through URL indexing
- **API compatibility layers** for OpenAI and Anthropic formats
- **Automatic port fallback** for network flexibility
- **Access control** integration for model permissions
- **Prefix-based multi-tenancy** support
- **Streaming support** for real-time responses
- **Docker and Kubernetes** deployment optimizations
This integration enables users to run powerful LLM models entirely locally while maintaining a modern, feature-rich web interface for interaction.
RAG Pipeline
Related topics: Ollama Integration, Retrieval System
Retrieval-Augmented Generation (RAG) Pipeline in Open WebUI enables users to upload documents, process them into searchable vector embeddings, and augment LLM responses with relevant context from user knowledge bases.
Architecture Overview
The RAG Pipeline consists of multiple integrated components that work together to provide document retrieval and context augmentation capabilities.
graph TD
A[User Upload] --> B[Document Processing]
B --> C[Text Extraction]
C --> D[Chunking]
D --> E[Embedding Generation]
E --> F[Vector Storage]
F --> G[Retrieval]
G --> H[Context Injection]
H --> I[LLM Response]
J[Knowledge Management] --> K[Access Control]
K --> F
Core Components
| Component | Location | Purpose |
|---|---|---|
| Knowledge Router | backend/open_webui/routers/knowledge.py | REST API endpoints for knowledge management |
| Retrieval Utils | backend/open_webui/retrieval/utils.py | Document loading and text extraction |
| API Router | backend/open_webui/main.py | Registers retrieval endpoints |
| Frontend | src/lib/components/workspace/Knowledge.svelte | UI for knowledge management |
Sources: backend/open_webui/main.py:17-30
Supported Document Types
Open WebUI supports a wide range of document formats through configurable document loaders.
// src/lib/constants.ts
export const SUPPORTED_FILE_TYPE = [
'application/epub+zip',
'application/pdf',
'text/plain',
'text/csv',
'text/xml',
'text/html',
'text/x-python',
'text/css',
'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
'application/octet-stream',
'application/x-javascript',
'text/markdown',
'audio/mpeg',
'audio/wav',
'video/mp4',
'video/mpeg'
];
Sources: src/lib/constants.ts:16-30
Document Processing Pipeline
Text Extraction
The retrieval utility module handles document parsing through multiple backends:
# backend/open_webui/retrieval/utils.py
def _extract_text_from_binary_response(request, response, url):
    """Download response body to a temp file and extract text using the Loader pipeline."""
    import mimetypes
    import tempfile
    import urllib.parse
Supported Document Loaders
| Loader | Purpose | Configuration |
|---|---|---|
| TIKA Server | Apache Tika for generic document parsing | TIKA_SERVER_URL |
| DOCLING | Advanced PDF and document processing | DOCLING_SERVER_URL, DOCLING_API_KEY |
| PDF Loader | Configurable PDF extraction | PDF_LOADER_MODE, PDF_EXTRACT_IMAGES |
| Document Intelligence | Azure AI document analysis | DOCUMENT_INTELLIGENCE_ENDPOINT |
| Mistral OCR | OCR for scanned documents | MISTRAL_OCR_API_BASE_URL |
| PaddleOCR VL | Visual language OCR | PADDLEOCR_VL_BASE_URL |
| MinerU | Chinese document processing | MINERU_API_MODE, MINERU_API_URL |
Sources: backend/open_webui/retrieval/utils.py:1-25
Knowledge Management API
Endpoints Overview
The knowledge router provides CRUD operations for managing user knowledge bases.
/api/v1/knowledge - List and create knowledge bases
/api/v1/knowledge/{id} - Get, update, delete specific knowledge
/api/v1/knowledge/{id}/file/add - Add file to knowledge base
/api/v1/knowledge/{id}/search - Search within knowledge base
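A hedged round trip over these endpoints from a Python client; the HTTP verbs shown for creation and search, and the payload field names, are assumptions for illustration:

import requests

BASE = 'http://localhost:8080/api/v1/knowledge'   # placeholder base URL
headers = {'Authorization': 'Bearer <token>'}

# Create a knowledge base, attach a previously uploaded file, then search it.
kb = requests.post(BASE, headers=headers,
                   json={'name': 'docs', 'description': 'Internal docs'}).json()

requests.post(f"{BASE}/{kb['id']}/file/add", headers=headers,
              json={'file_id': '<uploaded-file-id>'})

hits = requests.get(f"{BASE}/{kb['id']}/search", headers=headers,
                    params={'query': 'deployment', 'page': 1}).json()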
Access Control
Knowledge resources are protected by role-based access control:
# backend/open_webui/routers/knowledge.py
if not (
    user.role == 'admin'
    or knowledge.user_id == user.id
    or await AccessGrants.has_access(
        user_id=user.id,
        resource_type='knowledge',
        resource_id=knowledge.id,
        permission='read',
        db=db,
    )
):
    raise HTTPException(
        status_code=status.HTTP_400_BAD_REQUEST,
        detail=ERROR_MESSAGES.ACCESS_PROHIBITED,
    )
Sources: backend/open_webui/routers/knowledge.py:40-55
Search Functionality
The search endpoint supports pagination and filtering:
| Parameter | Type | Default | Description |
|---|---|---|---|
page | int | 1 | Page number (minimum 1) |
query | string | - | Search query text |
view_option | string | - | Filter option |
order_by | string | - | Sort field |
direction | string | - | Sort direction |
page = max(page, 1)
limit = 30
skip = (page - 1) * limit
filter = {}
if query:
    filter['query'] = query
Sources: backend/open_webui/routers/knowledge.py:57-70
Configuration Options
Environment Variables
| Variable | Default | Description |
|---|---|---|
EXTERNAL_DOCUMENT_LOADER_URL | - | External document loader endpoint |
EXTERNAL_DOCUMENT_LOADER_API_KEY | - | API key for external loader |
TIKA_SERVER_URL | - | Apache Tika server URL |
DOCLING_SERVER_URL | - | Docling server endpoint |
DOCLING_API_KEY | - | Docling API authentication |
PDF_LOADER_MODE | - | PDF extraction mode |
PDF_EXTRACT_IMAGES | - | Enable image extraction from PDFs |
DOCUMENT_INTELLIGENCE_ENDPOINT | - | Azure AI endpoint |
DOCUMENT_INTELLIGENCE_KEY | - | Azure AI API key |
MISTRAL_OCR_API_BASE_URL | - | Mistral OCR service URL |
MISTRAL_OCR_API_KEY | - | Mistral OCR authentication |
PADDLEOCR_VL_BASE_URL | - | PaddleOCR endpoint |
PADDLEOCR_VL_TOKEN | - | PaddleOCR token |
MINERU_API_MODE | - | MinerU processing mode |
MINERU_API_URL | - | MinerU API endpoint |
MINERU_API_KEY | - | MinerU API key |
MINERU_API_TIMEOUT | - | MinerU request timeout |
Sources: backend/open_webui/config.py:1-25
Data Flow
sequenceDiagram
participant U as User
participant F as Frontend
participant API as Knowledge API
participant DL as Document Loader
participant VC as Vector Cache
participant LLM as LLM
U->>F: Upload Document
F->>API: POST /api/v1/knowledge/{id}/file/add
API->>DL: Extract Text
DL-->>API: Raw Text Content
API->>VC: Generate Embeddings
VC-->>API: Vector Embeddings
API-->>F: Success Response
U->>F: Query with RAG
F->>API: POST /api/v1/retrieval
API->>VC: Search Vectors
VC-->>API: Relevant Chunks
API-->>F: Augmented Context
F->>LLM: Prompt + Context
LLM-->>U: Generated Response
Frontend Integration
The knowledge management interface is implemented as a Svelte component:
- Location: src/lib/components/workspace/Knowledge.svelte
- Provides file upload, management, and search UI
- Communicates with backend via REST API
API Base URLs
// src/lib/constants.ts
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
export const AUDIO_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/audio`;
export const IMAGES_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/images`;
Sources: src/lib/constants.ts:9-13
Dependencies
Key Python packages for RAG functionality:
| Package | Version | Purpose |
|---|---|---|
sqlalchemy | 2.0.48 | Database ORM |
requests | 2.33.1 | HTTP client |
httpx | 0.28.1 | Async HTTP with HTTP/2 support |
aiofiles | - | Async file operations |
redis | - | Vector caching |
pycrdt | 0.12.47 | CRDT operations |
Sources: backend/requirements-min.txt:1-35
Error Handling
The system uses centralized error messages:
ERROR_MESSAGES.NOT_FOUND = "Knowledge base not found"
ERROR_MESSAGES.ACCESS_PROHIBITED = "Access denied to this knowledge base"
Best Practices
- Document Preparation: Use supported formats for optimal extraction quality
- Chunking Strategy: Configure appropriate chunk sizes based on use case
- Access Control: Leverage RBAC to protect sensitive knowledge bases
- Loader Selection: Choose appropriate document loader based on document complexity
- Resource Management: Monitor vector storage size for large knowledge bases
Sources: [backend/open_webui/main.py:17-30]()
Doramagic Pitfall Log
Source-linked risks stay visible on this manual page so the preview does not read as an unqualified recommendation.
Doramagic extracted 6 source-linked risk signals. Review them before installing or handing real data to the project.
1. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:701547123 | https://github.com/open-webui/open-webui | README/documentation is current enough for a first validation pass.
2. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | last_activity_observed missing
3. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:701547123 | https://github.com/open-webui/open-webui | no_demo; severity=medium
4. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:701547123 | https://github.com/open-webui/open-webui | no_demo; severity=medium
5. Maintenance risk: issue_or_pr_quality=unknown
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | issue_or_pr_quality=unknown
6. Maintenance risk: release_recency=unknown
- Severity: low
- Finding: release_recency=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | release_recency=unknown
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using open-webui with real data or production workflows.
- [[BUG] v0.9.3 - Notes completely broken: cannot open or create notes (Typ](https://github.com/open-webui/open-webui/issues/24484) - github / github_issue
- issue: llamacpp load/unload indicator doesn't work - github / github_issue
- issue: When continuing a conversation in the new version using a chat cr - github / github_issue
- issue: image_gen is exposed to the model even when image generation is d - github / github_issue
- feat: Add file types per MCP Integration - github / github_issue
- feat: apply filter in tool call loop - github / github_issue
- issue: Cmd+r on Mac (refresh page) causes chat to generate a new respons - github / github_issue
- v0.9.5 - github / github_release
- v0.9.4 - github / github_release
- v0.9.3 - github / github_release
- v0.9.2 - github / github_release
- v0.9.1 - github / github_release
Source: Project Pack community evidence and pitfall evidence