Doramagic Project Pack · Human Manual
open-webui
Project Introduction
Related topics: Installation Guide, Architecture Overview
Open WebUI is an extensible, self-hosted AI interface designed to provide a powerful and user-friendly chat experience for Large Language Models (LLMs). It serves as a comprehensive web-based frontend that seamlessly integrates with various LLM backends, enabling users to interact with AI models through a modern, feature-rich interface.
Overview
Open WebUI is an open-source project that prioritizes offline functionality and user privacy. The platform is built with extensibility in mind, allowing users to customize and extend its capabilities through a modular architecture. The project supports multiple installation methods and integrates with popular LLM providers like Ollama, OpenAI, and various other AI services.
The system operates as a full-stack application with a Svelte-based frontend and a Python FastAPI backend, communicating through RESTful APIs and WebSocket connections for real-time interactions.
Architecture
Open WebUI follows a client-server architecture with clear separation between the frontend presentation layer and the backend API layer.
graph TD
subgraph Frontend["Frontend (Svelte)"]
UI[User Interface]
State[State Management]
API[API Client]
end
subgraph Backend["Backend (Python/FastAPI)"]
Routes[API Routes]
Services[Business Logic]
DB[(Database)]
Auth[Authentication]
end
subgraph External["External Services"]
Ollama[Ollama]
OpenAI[OpenAI API]
RAG[RAG Providers]
end
UI --> State
State --> API
API --> Routes
Routes --> Services
Services --> DB
Routes --> Auth
Services --> Ollama
Services --> OpenAI
Services --> RAG
Frontend Layer
The frontend is built using Svelte and SvelteKit, providing a reactive and performant user interface. Key components include:
| Component | Location | Purpose |
|---|---|---|
| Constants | src/lib/constants.ts | Application-wide configuration values |
| Utilities | src/lib/utils/index.ts | Content processing and sanitization |
| API Clients | src/lib/apis/ | Communication with backend services |
Sources: src/lib/constants.ts:1-20
The frontend defines API base URLs for various services:
export const WEBUI_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1`;
export const OLLAMA_API_BASE_URL = `${WEBUI_BASE_URL}/ollama`;
export const OPENAI_API_BASE_URL = `${WEBUI_BASE_URL}/openai`;
export const AUDIO_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/audio`;
export const IMAGES_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/images`;
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
Sources: src/lib/constants.ts:8-15
Backend Layer
The backend is built with Python using FastAPI, providing a robust and scalable API layer. The backend handles authentication, data management, and communication with external AI services.
#### Core Dependencies
| Package | Version | Purpose |
|---|---|---|
| fastapi | 0.135.1 | Web framework |
| uvicorn | 0.41.0 | ASGI server |
| pydantic | 2.12.5 | Data validation |
| sqlalchemy | 2.0.48 | ORM framework |
| python-socketio | 5.16.1 | WebSocket support |
| pycrdt | 0.12.47 | CRDT for real-time collaboration |
Sources: backend/requirements-min.txt:1-35
Features
Open WebUI provides a comprehensive set of features designed to enhance the AI chat experience:
Supported File Types
The system supports various document formats for upload and processing:
| Category | File Types |
|---|---|
| Documents | PDF, EPUB, DOCX, TXT |
| Code | Python, JavaScript, CSS, XML |
| Data | CSV, Markdown |
| Media | MP3, WAV (audio) |
| Other | HTML, Octet-stream |
Sources: src/lib/constants.ts:18-32
Key Capabilities
- Multi-Model Support: Engage with multiple AI models simultaneously through the MOA (Mixture of Agents) architecture
- Code Interpreter: Execute Python code in sandboxed environments using Pyodide or Jupyter
- Voice Mode: Voice-activated interactions with customizable prompts
- RAG Integration: Retrieval-augmented generation with support for 15+ search providers
- Web Browsing: Extract and integrate web content directly into conversations
- Image Generation: Integration with DALL-E, Gemini, ComfyUI, and AUTOMATIC1111
- Role-Based Access Control (RBAC): Granular permission management
Configuration System
Open WebUI uses a persistent configuration system to manage application settings. Configuration values are stored in the database and can be overridden by environment variables.
Code Execution Configuration
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| ENABLE_CODE_EXECUTION | ENABLE_CODE_EXECUTION | True | Enable code execution feature |
| CODE_EXECUTION_ENGINE | CODE_EXECUTION_ENGINE | pyodide | Execution engine (pyodide/jupyter) |
| JUPYTER_URL | CODE_EXECUTION_JUPYTER_URL | - | Jupyter server URL |
| JUPYTER_AUTH | CODE_EXECUTION_JUPYTER_AUTH | - | Jupyter authentication |
Sources: backend/open_webui/config.py:1-50
Voice Mode Configuration
| Parameter | Description |
|---|---|
| VOICE_MODE_PROMPT_TEMPLATE | Template for voice interaction prompts |
| ENABLE_VOICE_MODE_PROMPT | Enable voice-specific prompt handling |
Security Features
Authentication System
The backend implements comprehensive authentication using:
- JWT tokens via PyJWT
- Argon2 password hashing
- Session management with Redis support
- Role-based access control (RBAC)
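A minimal sketch of how these pieces typically fit together, using PyJWT and argon2-cffi as named above (the secret key, payload fields, and function names here are illustrative assumptions, not the project's exact schema):

import time
import jwt                      # PyJWT
from argon2 import PasswordHasher

SECRET_KEY = "change-me"        # assumption: stands in for WEBUI_SECRET_KEY

ph = PasswordHasher()

def register(password: str) -> str:
    # Store only the Argon2 hash, never the plaintext password
    return ph.hash(password)

def login(stored_hash: str, password: str, user_id: str) -> str:
    # Raises VerifyMismatchError on a bad password, so a failed
    # login never reaches token issuance
    ph.verify(stored_hash, password)
    # Issue a signed JWT carrying the user id
    return jwt.encode({"id": user_id, "iat": int(time.time())}, SECRET_KEY, algorithm="HS256")

def authenticate(token: str) -> dict:
    # Returns the decoded claims, or raises if the signature is invalid
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])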
Content Processing
The system includes middleware for processing and sanitizing AI responses:
graph LR
Response[AI Response] --> Middleware[Middleware Layer]
Middleware --> Sanitize[Content Sanitization]
Middleware --> CodeBlock[Code Block Processing]
Middleware --> Reasoning[Reasoning Display]
Sanitize --> Render[Rendered Response]
CodeBlock --> Render
Reasoning --> Render
The middleware handles special content types including:
- Code interpreter blocks
- Reasoning/thinking blocks
- HTML content rendering
Sources: backend/open_webui/utils/middleware.py:1-40
Installation Methods
Python pip Installation
pip install open-webui
open-webui serve
The server runs on http://localhost:8080 by default.
Docker Installation
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:latest
[!IMPORTANT]
The volume mount -v open-webui:/app/backend/data is crucial for database persistence.
Development Branch
For testing unstable features:
docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui --add-host=host.docker.internal:host-gateway --restart always ghcr.io/open-webui/open-webui:dev
Sources: README.md:1-80
Technology Stack Summary
| Layer | Technology | Key Libraries |
|---|---|---|
| Frontend Framework | Svelte/SvelteKit | - |
| Backend Framework | Python/FastAPI | Pydantic, SQLAlchemy |
| Database | SQLite/PostgreSQL | aiosqlite, psycopg |
| Real-time | WebSocket | python-socketio, pycrdt |
| Caching | Redis | starsessions |
| Authentication | JWT/Argon2 | PyJWT, argon2-cffi |
| HTTP Client | httpx | With SOCKS, HTTP/2 support |
| Task Scheduling | APScheduler | - |
System Requirements
- Python Version: 3.11+ (required for compatibility)
- Node.js: For frontend development
- Database: SQLite (default), PostgreSQL (production)
- Memory: Minimum 4GB RAM recommended
- Storage: Depends on models and data usage
Related Documentation
Sources: [src/lib/constants.ts:1-20]()
Installation Guide
Open WebUI provides multiple installation methods to accommodate different use cases, from simple Docker deployments to development environments. This guide covers all supported installation approaches, configuration options, and environment variables required for a successful setup.
Prerequisites
System Requirements
| Component | Minimum | Recommended |
|---|---|---|
| Python | 3.11 | 3.11+ |
| RAM | 4 GB | 8 GB+ |
| Disk | 10 GB | 20 GB+ |
| Docker | 20.10+ | Latest |
| GPU | Optional | NVIDIA GPU with CUDA |
Required Dependencies
The backend requires the following core packages for basic operation:
fastapi==0.135.1
uvicorn[standard]==0.41.0
pydantic==2.12.5
python-multipart==0.0.22
itsdangerous==2.2.0
python-socketio==5.16.1
python-jose==3.5.0
cryptography
sqlalchemy==2.0.48
aiosqlite==0.21.0
Sources: backend/requirements-min.txt:1-15
Installation Methods
Docker Installation (Recommended)
Docker is the recommended installation method for production use. Open WebUI provides multiple official images with different configurations.
#### Docker Image Variants
| Tag | Description | Use Case |
|---|---|---|
main | Base Open WebUI | Standard deployment |
cuda | With CUDA support | NVIDIA GPU acceleration |
ollama | Bundled with Ollama | Local model inference |
dev | Development build | Testing latest features |
#### Basic Docker Installation
For connecting to Ollama on localhost:
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:42-47
#### NVIDIA GPU Support
To enable GPU acceleration:
docker run -d -p 3000:8080 \
--gpus all \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:cuda
Sources: README.md:53-59
#### Bundled Ollama Installation
For a streamlined setup with both Open WebUI and Ollama in a single container:
With GPU Support:
docker run -d -p 3000:8080 --gpus=all \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
CPU Only:
docker run -d -p 3000:8080 \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
Sources: README.md:64-79
#### OpenAI API Only
For environments using only the OpenAI API:
docker run -d -p 3000:8080 \
-e OPENAI_API_KEY=your_secret_key \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:50-56
#### Remote Ollama Server
To connect to Ollama on a different server:
docker run -d -p 3000:8080 \
-e OLLAMA_BASE_URL=https://example.com \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Sources: README.md:40-46
Python pip Installation
Open WebUI can be installed directly via pip for environments without Docker.
#### Requirements
- Python 3.11 or higher
- pip package manager
#### Installation Steps
- Install Open WebUI package:
pip install open-webui
- Start the server:
open-webui serve
The server will be accessible at http://localhost:8080.
Sources: README.md:12-25
Development Installation
#### Using the Dev Branch
[!WARNING]
The :dev branch contains unstable features. Use at your own risk.
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:dev
Sources: README.md:27-34
Environment Configuration
Core Environment Variables
| Variable | Description | Default |
|---|---|---|
OLLAMA_BASE_URL | Ollama server URL | http://localhost:11434 |
OPENAI_API_KEY | OpenAI API key | - |
WEBUI_SECRET_KEY | Session encryption key | Auto-generated |
WEBUI_SESSION_COOKIE_SECURE | Secure cookie flag | True |
WEBUI_SESSION_COOKIE_SAME_SITE | Cookie SameSite policy | Lax |
Sources: backend/open_webui/main.py:18-35
Database Configuration
| Variable | Description | Default |
|---|---|---|
DATABASE_URL | Database connection string | SQLite |
ENABLE_DATABASE_ENCRYPTION | Enable SQLite encryption | False |
#### Supported Databases
- SQLite: Default, requires no configuration
- PostgreSQL: Set DATABASE_URL to a PostgreSQL connection string
- Redis: For session management and caching
Sources: backend/open_webui/env.py
Redis Configuration
REDIS_URL=redis://localhost:6379
REDIS_KEY_PREFIX=open-webui
REDIS_SENTINEL_HOSTS=host1:26379,host2:26379
REDIS_SENTINEL_PORT=26379
Sources: backend/open_webui/main.py:15-18
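A sketch of how a client might consume these values with redis-py (only REDIS_URL and REDIS_KEY_PREFIX come from the table above; the health-check key is a hypothetical example):

import os
import redis

# Connect using the REDIS_URL from the table above
r = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379"))

# Namespace keys with the configured prefix
prefix = os.environ.get("REDIS_KEY_PREFIX", "open-webui")
r.set(f"{prefix}:health", "ok")
print(r.get(f"{prefix}:health"))  # b'ok'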
Security Configuration
| Variable | Description | Default |
|---|---|---|
ENABLE_SIGNUP_PASSWORD_CONFIRMATION | Require password confirmation | True |
WEBUI_AUTH_TRUSTED_EMAIL_HEADER | Trusted email header for SSO | - |
WEBUI_AUTH_SIGNOUT_REDIRECT_URL | Signout redirect URL | - |
Sources: backend/open_webui/main.py:36-38
Audit Logging
| Variable | Description | Default |
|---|---|---|
ENABLE_AUDIT_GET_REQUESTS | Log GET requests | False |
AUDIT_INCLUDED_PATHS | Paths to include | - |
AUDIT_EXCLUDED_PATHS | Paths to exclude | - |
AUDIT_LOG_LEVEL | Logging verbosity | INFO |
Sources: backend/open_webui/env.py:12-15
Observability
| Variable | Description | Default |
|---|---|---|
ENABLE_OTEL | Enable OpenTelemetry | False |
ENABLE_VERSION_UPDATE_CHECK | Check for updates | True |
Sources: backend/open_webui/main.py:48-51
Data Persistence
[!IMPORTANT]
Always mount the volume -v open-webui:/app/backend/data to prevent database loss.
The data directory contains:
- SQLite database file
- Uploaded files
- Configuration cache
- User sessions (if Redis not used)
-v open-webui:/app/backend/data
Sources: README.md:19-22
Offline Installation
For air-gapped environments, set the Hugging Face offline mode:
export HF_HUB_OFFLINE=1
Sources: README.md:36-38
Installation Architecture
graph TD
A[User Request] --> B{Installation Method}
B -->|Docker| C[Official Docker Image]
B -->|pip| D[PyPI Package]
C --> E{Configuration}
D --> E
E -->|OLLAMA_BASE_URL| F[Ollama Server]
E -->|OPENAI_API_KEY| G[OpenAI API]
E -->|Database Config| H[(Database)]
F --> I[Model Inference]
G --> J[API Processing]
H --> K[Application State]
I --> L[Response]
J --> L
K --> L
Docker Compose Installation
For production deployments, use Docker Compose with persistent storage:
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "3000:8080"
volumes:
- open-webui:/app/backend/data
environment:
- OLLAMA_BASE_URL=http://host.docker.internal:11434
extra_hosts:
- "host.docker.internal:host-gateway"
restart: unless-stopped
volumes:
open-webui:
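Bring the stack up from the directory containing the compose file:
docker compose up -d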
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Connection refused to Ollama | Check OLLAMA_BASE_URL and ensure Ollama is running |
| Database errors | Verify volume mount is correct |
| GPU not detected | Ensure NVIDIA Container Toolkit is installed |
| Port conflicts | Change host port mapping |
Verification
After installation, verify the service is running:
curl http://localhost:3000/api/v1/models
The server should respond with available models from the configured backend.
Sources: README.md:40-60
Next Steps
After successful installation:
- Access the web interface at http://localhost:3000
- Configure additional models and backends
- Set up user authentication and RBAC
- Configure retrieval and RAG pipelines
- Integrate additional tools and extensions
Sources: [backend/requirements-min.txt:1-15](https://github.com/open-webui/open-webui/blob/main/backend/requirements-min.txt)
Architecture Overview
Related topics: Data Models, API Routers, Frontend Structure
Open WebUI is a self-hosted, extensible AI interface designed to provide a unified chat experience with various LLM backends. The architecture follows a modern full-stack pattern with a Python-based backend and a Svelte-based frontend, communicating via REST APIs and WebSocket connections.
System Architecture
Open WebUI employs a layered architecture that separates concerns between presentation, business logic, and data access:
graph TD
subgraph Frontend["Frontend (Svelte/SvelteKit)"]
UI["UI Components<br/>(+layout.svelte)"]
Utils["Utilities<br/>(src/lib/utils)"]
APIs["API Client<br/>(src/lib/apis)"]
Const["Constants<br/>(src/lib/constants)"]
end
subgraph Backend["Backend (Python/FastAPI)"]
Main["Main Application<br/>(main.py)"]
Socket["WebSocket Server<br/>(socket/main.py)"]
Config["Configuration<br/>(config.py)"]
Env["Environment<br/>(env.py)"]
Middleware["Middleware<br/>(middleware.py)"]
Retrieval["Retrieval System<br/>(retrieval/)"]
end
subgraph External["External Services"]
Ollama["Ollama API"]
OpenAI["OpenAI API"]
VectorDB["Vector Databases"]
Redis["Redis Session Store"]
DB["SQLite/PostgreSQL"]
end
UI --> Utils
UI --> APIs
Utils --> Const
APIs --> Const
APIs --> Main
UI --> Socket
Main --> Config
Main --> Env
Main --> Middleware
Main --> Retrieval
Main --> Socket
Main --> Ollama
Main --> OpenAI
Main --> VectorDB
Main --> Redis
Main --> DB
Directory Structure
The repository is organized into two main components:
| Directory | Purpose |
|---|---|
backend/ | Python/FastAPI backend application |
src/ | Svelte/SvelteKit frontend application |
Backend Structure
| Path | Description |
|---|---|
backend/open_webui/ | Main application package |
backend/open_webui/main.py | FastAPI application entry point |
backend/open_webui/socket/main.py | Socket.IO WebSocket handler |
backend/open_webui/config.py | Persistent configuration system |
backend/open_webui/env.py | Environment variable loading |
backend/open_webui/utils/middleware.py | Response processing middleware |
backend/open_webui/retrieval/ | RAG and document retrieval system |
Frontend Structure
| Path | Description |
|---|---|
src/routes/ | SvelteKit routes and page components |
src/lib/ | Shared libraries and utilities |
src/lib/apis/ | API client implementations |
src/lib/utils/ | Utility functions |
src/lib/constants.ts | Application constants and configuration |
API Architecture
API Endpoint Structure
Open WebUI exposes multiple API bases for different services:
graph LR
subgraph Gateway["API Gateway"]
Base["/"]
end
subgraph Services["Service Endpoints"]
API["/api/v1<br/>REST API"]
Ollama["/ollama<br/>Ollama Proxy"]
OpenAI["/openai<br/>OpenAI Proxy"]
Audio["/api/v1/audio<br/>Audio Processing"]
Images["/api/v1/images<br/>Image Processing"]
Retrieval["/api/v1/retrieval<br/>RAG Retrieval"]
end
Base --> API
Base --> Ollama
Base --> OpenAI
Base --> Audio
Base --> Images
Base --> Retrieval
API Constants Configuration
API base URLs are defined in src/lib/constants.ts:
| Constant | Default Value | Purpose |
|---|---|---|
WEBUI_BASE_URL | Dynamic (dev/prod) | Base application URL |
WEBUI_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1 | Main REST API |
OLLAMA_API_BASE_URL | ${WEBUI_BASE_URL}/ollama | Ollama API proxy |
OPENAI_API_BASE_URL | ${WEBUI_BASE_URL}/openai | OpenAI API proxy |
AUDIO_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/audio | Audio processing |
IMAGES_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/images | Image generation |
RETRIEVAL_API_BASE_URL | ${WEBUI_BASE_URL}/api/v1/retrieval | RAG retrieval |
Sources: src/lib/constants.ts:1-15
API Client Pattern
The frontend uses a consistent API client pattern implemented in src/lib/apis/:
// Pattern used across all API clients
const res = await fetch(`${WEBUI_API_BASE_URL}/endpoint`, {
method: 'METHOD',
headers: {
Accept: 'application/json',
'Content-Type': 'application/json',
authorization: `Bearer ${token}`
},
body: JSON.stringify({ /* payload */ })
})
.then(async (res) => {
if (!res.ok) throw await res.json();
return res.json();
});
Sources: src/lib/apis/knowledge/index.ts:1-35
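The same request shape from Python, for comparison (a sketch: the base URL, token, endpoint path, and payload are placeholders, not actual endpoints):

import requests

WEBUI_API_BASE_URL = "http://localhost:8080/api/v1"  # assumption: default pip-install port
token = "YOUR_API_TOKEN"                             # placeholder

res = requests.post(
    f"{WEBUI_API_BASE_URL}/endpoint",                # placeholder endpoint
    headers={
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    },
    json={},                                         # request payload
)
res.raise_for_status()                               # mirrors the `if (!res.ok) throw` check
data = res.json()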
Configuration System
Environment Setup
The backend loads configuration from environment variables and .env files using the following hierarchy defined in backend/open_webui/env.py:
| Variable | Description |
|---|---|
OPEN_WEBUI_DIR | Application directory (location of env.py) |
BACKEND_DIR | Parent of open_webui/ |
BASE_DIR | Repository root |
DOCKER | Docker environment flag |
USE_CUDA_DOCKER | CUDA/GPU acceleration flag |
Sources: backend/open_webui/env.py:1-45
Persistent Configuration
Configuration values are stored persistently using the PersistentConfig system:
ENABLE_CODE_EXECUTION = PersistentConfig(
'ENABLE_CODE_EXECUTION',
'code_execution.enable',
os.environ.get('ENABLE_CODE_EXECUTION', 'True').lower() == 'true',
)
CODE_EXECUTION_ENGINE = PersistentConfig(
'CODE_EXECUTION_ENGINE',
'code_execution.engine',
os.environ.get('CODE_EXECUTION_ENGINE', 'pyodide'),
)
Sources: backend/open_webui/config.py:1-50
Supported File Types
The application supports various file upload types:
| Category | MIME Types |
|---|---|
| Documents | application/pdf, application/epub+zip, application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| Text | text/plain, text/csv, text/xml, text/html, text/x-python, text/css, text/markdown |
| Code | text/x-python, text/css, application/x-javascript |
| Media | audio/mpeg, audio/wav |
| Other | application/octet-stream |
Sources: src/lib/constants.ts:20-35
WebSocket Communication
Real-time communication uses Socket.IO for bidirectional messaging:
sequenceDiagram
participant Client as Frontend
participant Socket as Socket.IO Server
participant Main as Main Application
Client->>Socket: Connect with auth token
Socket->>Main: Validate session
Main->>Socket: Session valid
Socket->>Client: Connection established
Client->>Socket: Send message event
Socket->>Main: Forward message
Main->>Main: Process with LLM
Main->>Socket: Stream response
Socket->>Client: Stream chunks
Client->>Socket: Disconnect
Socket->>Client: Connection closed
Sources: backend/open_webui/socket/main.py
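A client-side sketch of this handshake using python-socketio (the event name 'chat-events' and the auth payload shape are assumptions for illustration):

import socketio

sio = socketio.Client()

@sio.event
def connect():
    print("Connection established")

@sio.on("chat-events")          # assumed event name
def on_chat_event(data):
    # Streamed response chunks arrive as individual events
    print("chunk:", data)

# python-socketio passes `auth` to the server's connect handler,
# mirroring the "Connect with auth token" step above
sio.connect("http://localhost:8080", auth={"token": "YOUR_JWT"})
sio.wait()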
Middleware Pipeline
The middleware system processes responses and transforms content for the frontend. The build_output() function in backend/open_webui/utils/middleware.py handles special content types:
Content Type Processing
| Content Type | Rendering | Description |
|---|---|---|
| Code interpreter blocks | Code block processing | Executable code segments handled by the code interpreter |
| Reasoning/thinking blocks | Reasoning display | Model reasoning content rendered as a distinct block |
| HTML content | Sanitized rendering | HTML passed through content sanitization before display |
Sources: [backend/open_webui/utils/middleware.py:1-80]()
### Deep Merge Utility
The middleware also provides a `deep_merge()` function for combining configuration:
| Behavior | Description |
|----------|-------------|
| Dicts | Recursive merge |
| Strings | Concatenation |
| Others | Overwrite |
Sources: [backend/open_webui/utils/middleware.py:75-85]()
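A minimal sketch matching the semantics in the table (an assumed implementation for illustration, not the project's exact code):

def deep_merge(base: dict, update: dict) -> dict:
    """Merge `update` into `base` following the rules above."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)          # dicts: recursive merge
        elif isinstance(value, str) and isinstance(base.get(key), str):
            base[key] = base[key] + value         # strings: concatenation
        else:
            base[key] = value                     # others: overwrite
    return base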
## Frontend Application Structure
### Layout System
The main layout is defined in `src/routes/+layout.svelte` which serves as the root component:
graph TD
Layout["+layout.svelte<br/>Root Layout"]
Splash["Splash Screen<br/>(#splash-screen)"]
Progress["Progress Bar<br/>(#progress-bar)"]
Logo["Logo Elements<br/>(#logo, #logo-her)"]
Theme["Theme Detection<br/>(.dark, .her)"]
Layout --> Splash
Layout --> Progress
Layout --> Logo
Layout --> Theme
Sources: [src/app.html:1-60]()
### Utility Libraries
| Library | Purpose |
|---------|---------|
| `src/lib/utils/index.ts` | Content processing, sanitization, Chinese language handling |
| `src/lib/utils/codeHighlight.ts` | Code syntax highlighting with Shiki |
| `src/lib/apis/index.ts` | API client exports |
### Content Processing Pipeline
The `processResponseContent()` function handles special content transformations:
export const processResponseContent = (content: string) => {
    content = processChineseContent(content);
    return content.trim();
};

export const sanitizeResponseContent = (content: string) => {
    return content
        .replace(/<\|[a-z]*$/, '')
        .replace(/<\|[a-z]+\|$/, '')
        .replace(/<$/, '')
        .replaceAll(/<\|[a-z]+\|>/g, ' ')
        .replaceAll('<', '&lt;')
        .replaceAll('>', '&gt;')
        .trim();
};
Sources: [src/lib/utils/index.ts:1-50]()
## Retrieval System
The RAG (Retrieval-Augmented Generation) system supports multiple document loaders and search engines:
### Supported Document Sources
| Source | Configuration |
|--------|--------------|
| External Document Loader | `EXTERNAL_DOCUMENT_LOADER_URL`, `EXTERNAL_DOCUMENT_LOADER_API_KEY` |
| Apache TIKA | `TIKA_SERVER_URL` |
| Docling | `DOCLING_SERVER_URL`, `DOCLING_API_KEY`, `DOCLING_PARAMS` |
| Mistral OCR | `MISTRAL_OCR_API_BASE_URL`, `MISTRAL_OCR_API_KEY` |
| PaddleOCR VL | `PADDLEOCR_VL_BASE_URL`, `PADDLEOCR_VL_TOKEN` |
| MinerU | `MINERU_API_URL`, `MINERU_API_KEY`, `MINERU_PARAMS` |
### Supported Search Providers
| Provider | Notes |
|----------|-------|
| SearXNG | Self-hosted metasearch |
| Google PSE | Programmable Search Engine |
| Brave Search | Privacy-focused search |
| Ollama Cloud | LLM provider search |
| Azure AI Search | Enterprise search |
Sources: [backend/open_webui/retrieval/utils.py:1-60]()
## Code Execution Engine
Open WebUI supports code execution with configurable backends:
### Configuration Options
| Setting | Default | Description |
|---------|---------|-------------|
| `ENABLE_CODE_EXECUTION` | `True` | Enable/disable code execution |
| `CODE_EXECUTION_ENGINE` | `pyodide` | Execution engine (pyodide/jupyter) |
| `CODE_EXECUTION_JUPYTER_URL` | `''` | Jupyter server URL |
| `CODE_EXECUTION_JUPYTER_AUTH` | `''` | Jupyter authentication |
| `CODE_EXECUTION_JUPYTER_AUTH_TOKEN` | `''` | Jupyter auth token |
### Execution Environments
| Engine | Environment | Constraints |
|--------|-------------|-------------|
| Pyodide | Browser-based | Cannot install packages, `pip install` unavailable |
| Jupyter | External server | Requires URL and optional authentication |
Sources: [backend/open_webui/config.py:50-100]()
## Technology Stack
### Backend Dependencies
Key packages from `backend/requirements-min.txt`:
| Package | Version | Purpose |
|---------|---------|---------|
| `fastapi` | 0.135.1 | Web framework |
| `uvicorn[standard]` | 0.41.0 | ASGI server |
| `pydantic` | 2.12.5 | Data validation |
| `python-multipart` | 0.0.22 | Form parsing |
| `python-socketio` | 5.16.1 | WebSocket support |
| `sqlalchemy` | 2.0.48 | ORM |
| `aiosqlite` | 0.21.0 | Async SQLite |
| `psycopg[binary]` | 3.2.9 | PostgreSQL driver |
| `httpx[socks,http2,zstd,cli,brotli]` | 0.28.1 | HTTP client |
| `redis` | latest | Session storage |
| `pycrdt` | 0.12.47 | CRDT for collaboration |
| `RestrictedPython` | 8.1 | Safe Python execution |
Sources: [backend/requirements-min.txt:1-40]()
### Frontend Architecture
| Technology | Purpose |
|------------|---------|
| SvelteKit | Frontend framework |
| TypeScript | Type safety |
| Shiki | Code syntax highlighting |
## Security Considerations
### Authentication Flow
The system uses Bearer token authentication for API requests:
headers: { authorization: `Bearer ${token}` }
### Role-Based Access Control (RBAC)
Open WebUI implements RBAC for:
- Ollama endpoint access
- Model creation/pulling rights
- Knowledge base permissions
Sources: [README.md]()
## Deployment Modes
### Docker Deployment
docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--add-host=host.docker.internal:host-gateway \
--restart always \
ghcr.io/open-webui/open-webui:latest
### Python pip Installation
pip install open-webui
open-webui serve
### Environment Variables
| Variable | Values | Description |
|----------|--------|-------------|
| `DOCKER` | `True`/`False` | Docker environment detection |
| `USE_CUDA_DOCKER` | `true`/`false` | GPU acceleration |
| `HF_HUB_OFFLINE` | `1` | Offline mode (prevent downloads) |
Sources: [README.md](), [backend/open_webui/env.py:30-40](), [src/lib/constants.ts:1-15]()
Data Models
Related topics: Architecture Overview, API Routers
Overview
The Open WebUI project implements a comprehensive data modeling layer that manages persistent storage for all core application entities. The data models are built using SQLAlchemy ORM and follow a structured approach to storing user interactions, configurations, and content within the application.
The data model architecture serves as the foundation for:
- User Management: Authentication, authorization, and user preferences
- Chat Persistence: Message history and conversation state
- Knowledge Bases: RAG (Retrieval-Augmented Generation) document storage
- File Management: Document uploads and attachments
- Access Control: Permission management through groups and grants
Sources: backend/open_webui/internal/db.py:1-50
Architecture Overview
Open WebUI uses a layered data access architecture where models are defined as SQLAlchemy ORM classes and accessed through service layers.
graph TD
A[API Routers] --> B[Service Layer]
B --> C[Data Models]
C --> D[SQLAlchemy ORM]
D --> E[(SQLite Database)]
F[ChatMessages Table] --> C
G[Chats Table] --> C
H[Users Table] --> C
I[Knowledge Table] --> C
J[Files Table] --> C
Core Data Models
User Model
The User model manages user accounts, authentication, and preferences.
class UserModel(BaseModel):
id: str
name: str
email: Optional[str]
role: str # admin, user, guest
email_verified: bool
created_at: datetime
updated_at: datetime
settings: dict
keys: list
| Field | Type | Description |
|---|---|---|
id | String | Unique user identifier (UUID) |
name | String | Display name |
email | String (nullable) | User email address |
role | Enum | User role: admin, user, guest |
email_verified | Boolean | Email verification status |
created_at | DateTime | Account creation timestamp |
updated_at | DateTime | Last modification timestamp |
settings | JSON | User preferences and configurations |
Sources: backend/open_webui/models/users.py:1-100
Chat Model
The Chat model stores conversation sessions and their associated metadata.
graph LR
A[User] -->|has many| B[Chats]
B -->|contains| C[Messages]
B -->|references| D[ChatMessages Table]
D -->|links to| E[Messages JSON]
The Chat model structure:
class ChatModel(BaseModel):
id: str
user_id: str
title: str
chat: dict # Contains history, messages, metadata
created_at: datetime
updated_at: datetime
share_id: Optional[str]
archived: bool
| Field | Type | Description |
|---|---|---|
id | String | Unique chat identifier |
user_id | String | Owner user ID |
title | String | Chat title |
chat | JSON | Full chat history and state |
share_id | String (nullable) | Public sharing identifier |
archived | Boolean | Archive status |
The chat field contains a nested JSON structure:
{
"history": {
"messages": {
"message_id": {
"id": "...",
"type": "human|ai|system",
"content": "...",
"created_at": "..."
}
}
},
"metadata": {}
}
Sources: backend/open_webui/models/chats.py:1-150
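Walking this nested structure in Python (a sketch using the field names shown above):

chat = {
    "history": {
        "messages": {
            "msg-1": {"id": "msg-1", "type": "human", "content": "Hello"},
        }
    },
    "metadata": {},
}

# Messages are keyed by id inside history.messages
for message_id, message in chat["history"]["messages"].items():
    print(message_id, message["type"], message["content"])
# msg-1 human Hello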
Message Model
The Message model represents individual messages within a chat conversation.
graph TD
A[Message] --> B[type]
A --> C[content]
A --> D[role]
A --> E[timestamp]
B --> F[human|ai|system|tool]
C --> G[text|images|files]
class MessageModel(BaseModel):
id: str
chat_id: str
message_id: str
type: str # human, ai, system, tool
role: str
content: str
files: list
images: list
created_at: datetime
| Field | Type | Description |
|---|---|---|
id | String | Unique message ID |
chat_id | String | Parent chat ID |
message_id | String | Message identifier within chat |
type | Enum | Message type |
role | String | Role: user, assistant, system, tool |
content | String | Message content |
files | List | Attached file references |
images | List | Embedded image data |
Sources: backend/open_webui/models/messages.py:1-100
Knowledge Model
The Knowledge model manages RAG knowledge bases for document retrieval.
class KnowledgeModel(BaseModel):
id: str
user_id: str
name: str
description: str
created_at: datetime
updated_at: datetime
data: dict # Contains documents and vectors
| Field | Type | Description |
|---|---|---|
id | String | Knowledge base ID |
user_id | String | Owner user ID |
name | String | Knowledge base name |
description | String | Knowledge base description |
data | JSON | Documents and vector embeddings |
Sources: backend/open_webui/models/knowledge.py:1-100
File Model
The File model handles file uploads and attachments.
class FileModel(BaseModel):
id: str
user_id: str
filename: str
path: str
type: str
size: int
created_at: datetime
data: dict # Metadata
| Field | Type | Description |
|---|---|---|
id | String | File identifier |
user_id | String | Owner user ID |
filename | String | Original filename |
path | String | Storage path |
type | String | MIME type |
size | Integer | File size in bytes |
data | JSON | Additional metadata |
Sources: backend/open_webui/models/files.py:1-100
Database Schema
Entity Relationship Diagram
erDiagram
USERS ||--o{ CHATS : "owns"
USERS ||--o{ FILES : "owns"
USERS ||--o{ KNOWLEDGE : "owns"
USERS ||--o{ MESSAGES : "sends"
CHATS ||--o{ CHAT_MESSAGES : "contains"
CHAT_MESSAGES ||--|| MESSAGES : "references"
KNOWLEDGE ||--o{ DOCUMENTS : "contains"
USERS ||--o{ GROUPS : "belongs to"
GROUPS ||--o{ ACCESS_GRANTS : "grants"
CHATS ||--o| SHARES : "can be shared"
Database Tables
| Table Name | Primary Key | Description |
|---|---|---|
users | id | User accounts and settings |
chats | id | Chat session storage |
chat_messages | id, chat_id, message_id | Normalized message storage |
messages | id | Message content (embedded in chats) |
knowledge | id | Knowledge base definitions |
documents | id | Knowledge base documents |
files | id | File metadata |
folders | id | Folder organization |
groups | id | User groups |
access_grants | id | Permission grants |
memories | id | User memory storage |
channels | id | Communication channels |
notes | id | User notes |
Sources: backend/open_webui/migrations/versions/7e5b5dc7342b_init.py:1-500
Access Control Models
User Groups
class GroupModel(BaseModel):
id: str
name: str
description: str
created_at: datetime
user_id: str # Creator/owner
Access Grants
graph TD
A[User] -->|belongs to| B[Groups]
B -->|grants| C[Access Grants]
C -->|applies to| D[Resource]
D --> E[Model]
D --> F[Knowledge]
D --> G[Tool]
D --> H[Function]
| Field | Type | Description |
|---|---|---|
id | String | Grant identifier |
user_id | String | User receiving access |
group_id | String | Group granting access |
resource_type | Enum | Type: model, knowledge, tool, function |
resource_id | String | Target resource ID |
permission | String | Permission level: read, write, admin |
Sources: backend/open_webui/utils/access_control/__init__.py:1-50
Service Layer Integration
Chat Service Pattern
The Chat model provides methods for message management:
async def get_messages_map_by_chat_id(id: str) -> dict:
"""Get message map for walking history."""
async def get_message_by_id_and_message_id(
id: str,
message_id: str
) -> Optional[dict]:
"""Retrieve specific message from chat."""
async def upsert_message_to_chat_by_id_and_message_id(
id: str,
message_id: str,
message: dict
) -> Optional[ChatModel]:
"""Update or insert message in chat."""
Message Sanitization
Before database operations, message content is sanitized to prevent issues:
def sanitize_text_for_db(text: str) -> str:
"""Remove null characters and invalid sequences."""
This ensures database compatibility and prevents JSON parsing errors when loading chat history.
Sources: backend/open_webui/models/chats.py:100-180
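A sketch of what such sanitization might look like (an assumption; the project's implementation may strip additional sequences):

def sanitize_text_for_db(text: str) -> str:
    """Remove null characters, which text columns cannot store."""
    # Assumed minimal behavior: strip NUL bytes before persisting
    return text.replace("\x00", "")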
Model Operations
CRUD Operations
| Operation | Method | Description |
|---|---|---|
| Create | Model.create() | Insert new record |
| Read | Model.get() | Retrieve by ID |
| Update | Model.update() | Modify existing record |
| Delete | Model.delete() | Remove record |
| List | Model.get_all() | Retrieve all records |
| Filter | Model.filter_by() | Query with conditions |
Async Database Access
Open WebUI uses async database operations for improved performance:
async def get_chat_by_id(id: str) -> Optional[ChatModel]:
"""Async retrieval of chat by ID."""
async def upsert_message_to_chat_by_id_and_message_id(
id: str,
message_id: str,
message: dict
) -> Optional[ChatModel]:
"""Async upsert operation."""
Data Storage Locations
Database File
By default, Open WebUI uses SQLite stored at:
backend/data/webui.db
File Storage
Uploaded files are stored in:
backend/data/uploads/
Configuration
Database and storage paths are configured via environment variables:
| Variable | Default | Description |
|---|---|---|
DATA_DIR | backend/data | Base data directory |
DATABASE_URL | sqlite:///data/webui.db | Database connection string |
Sources: backend/open_webui/env.py:1-80
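A sketch of how such a connection string is typically consumed with SQLAlchemy (an illustration under the defaults above, not the project's exact engine setup):

import os
from sqlalchemy import create_engine

# Falls back to the default SQLite path from the table above
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///data/webui.db")
engine = create_engine(DATABASE_URL)

with engine.connect() as conn:
    print(conn.closed)  # False while the connection is open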
Migration System
Open WebUI uses Alembic for database migrations:
graph LR
A[Migration Scripts] --> B[Alembic]
B --> C[Database Schema]
C --> D[Model Definitions]
D --> E[Application]
Migration files are located in:
backend/open_webui/migrations/versions/
Sources: backend/open_webui/migrations/versions/7e5b5dc7342b_init.py:1-500
Summary
The Open WebUI data model layer provides a robust foundation for:
- User Management: Complete user lifecycle including authentication and authorization
- Chat Persistence: Flexible JSON-based chat storage with normalized message tables
- Knowledge Management: RAG-capable knowledge bases for document retrieval
- File Handling: Secure file upload and storage with metadata tracking
- Access Control: Fine-grained permissions through groups and resource grants
The architecture prioritizes:
- Performance: Async database operations and message normalization
- Flexibility: JSON-based storage for variable content structures
- Security: Text sanitization and access control enforcement
- Extensibility: Modular model design for future features
Sources: [backend/open_webui/internal/db.py:1-50]()
API Routers
Related topics: Architecture Overview, Data Models
Overview
The Open WebUI project implements a comprehensive API routing architecture built on FastAPI. API Routers serve as the primary mechanism for organizing and exposing RESTful endpoints across the application. Each router encapsulates a specific functional domain (e.g., authentication, chat management, file handling, knowledge bases) and is mounted at a defined prefix under the /api/v1/ base path.
The router architecture follows a modular design pattern where related endpoints are grouped into dedicated router modules located in backend/open_webui/routers/. This separation of concerns enables maintainability, testability, and clear API boundaries.
Sources: backend/open_webui/main.py:1-60
Router Registration Architecture
Central Router Assembly
All routers are registered in backend/open_webui/main.py using FastAPI's include_router() method. Each router receives a unique URL prefix and OpenAPI tag for documentation and routing purposes.
app.include_router(auths.router, prefix='/api/v1/auths', tags=['auths'])
app.include_router(users.router, prefix='/api/v1/users', tags=['users'])
app.include_router(chats.router, prefix='/api/v1/chats', tags=['chats'])
app.include_router(models.router, prefix='/api/v1/models', tags=['models'])
app.include_router(knowledge.router, prefix='/api/v1/knowledge', tags=['knowledge'])
app.include_router(files.router, prefix='/api/v1/files', tags=['files'])
Sources: backend/open_webui/main.py:35-55
Router Prefix Mapping
| Functional Domain | Router Module | API Prefix | OpenAPI Tag |
|---|---|---|---|
| Authentication | auths | /api/v1/auths | auths |
| User Management | users | /api/v1/users | users |
| Chat Operations | chats | /api/v1/chats | chats |
| Model Management | models | /api/v1/models | models |
| Knowledge Bases | knowledge | /api/v1/knowledge | knowledge |
| File Handling | files | /api/v1/files | files |
| Prompts | prompts | /api/v1/prompts | prompts |
| Tools | tools | /api/v1/tools | tools |
| Skills | skills | /api/v1/skills | skills |
| Memories | memories | /api/v1/memories | memories |
| Folders | folders | /api/v1/folders | folders |
| Groups | groups | /api/v1/groups | groups |
| Functions | functions | /api/v1/functions | functions |
| Evaluations | evaluations | /api/v1/evaluations | evaluations |
| Audio Processing | audio | /api/v1/audio | audio |
| Image Processing | images | /api/v1/images | images |
| Retrieval | retrieval | /api/v1/retrieval | retrieval |
| Configurations | configs | /api/v1/configs | configs |
| Channels | channels | /api/v1/channels | channels |
| Notes | notes | /api/v1/notes | notes |
| Tasks | tasks | /api/v1/tasks | tasks |
| Utils | utils | /api/v1/utils | utils |
| Terminals | terminals | /api/v1/terminals | terminals |
| Automations | automations | /api/v1/automations | automations |
| Calendars | calendar | /api/v1/calendars | calendars |
| SCIM Identity | scim | /api/v1/scim/v2 | scim |
| Analytics | analytics | /api/v1/analytics | analytics |
Sources: backend/open_webui/main.py:35-65
Request Flow and Middleware Pipeline
Middleware Stack
The API request lifecycle involves multiple middleware layers that process requests before they reach individual route handlers.
graph TD
A[HTTP Request] --> B[ASGI Middleware]
B --> C[Authentication Middleware]
C --> D[Token Extraction<br/>API Key/Cookie/Bearer]
D --> E[Audit Logging Middleware<br/>Conditional]
E --> F[Pipeline Inlet Filter]
F --> G[Route Handler]
G --> H[Pipeline Outlet Filter]
H --> I[Response]
Sources: backend/open_webui/utils/asgi_middleware.py:1-30
Authentication Middleware
The ASGI middleware (asgi_middleware.py) handles credential extraction from multiple sources:
- Bearer Token: Extracted from the Authorization header
- Cookie Token: Retrieved from the token cookie
- API Key: Retrieved from the custom header specified by the CUSTOM_API_KEY_HEADER environment variable
The extracted credentials are stored in request.state.token for downstream route handlers.
Sources: backend/open_webui/utils/asgi_middleware.py:12-40
Pipeline Filter System
The pipelines.py module implements a filter system that allows middleware-like processing at the inlet and outlet of request handling. This enables transformation and validation of payloads through user-defined pipeline stages.
def get_sorted_filters(model_id, models):
filters = [
model
for model in models.values()
if 'pipeline' in model
and 'type' in model['pipeline']
and model['pipeline']['type'] == 'filter'
and (
model['pipeline']['pipelines'] == ['*']
or any(model_id == target_model_id for target_model_id in model['pipeline']['pipelines'])
)
]
sorted_filters = sorted(filters, key=lambda x: x['pipeline']['priority'])
return sorted_filters
Sources: backend/open_webui/routers/pipelines.py:30-45
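Usage of the function above with two hypothetical wildcard filters (the model entries and the "llama3" id are made up for illustration) shows the ordering guarantee:

models = {
    "guard": {"pipeline": {"type": "filter", "pipelines": ["*"], "priority": 5}},
    "logger": {"pipeline": {"type": "filter", "pipelines": ["*"], "priority": 0}},
}
print([m["pipeline"]["priority"] for m in get_sorted_filters("llama3", models)])
# [0, 5] -- filters run in ascending priority order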
Router Module Structure
Standard Router Pattern
Each router module follows a consistent pattern:
from fastapi import APIRouter, Depends, HTTPException, Request, status
from pydantic import BaseModel
from typing import Optional
from open_webui.utils.auth import get_verified_user, get_admin_user
router = APIRouter()
class EndpointForm(BaseModel):
    # Request payload schema (fields omitted in this excerpt)
    ...
@router.post('/endpoint')
async def endpoint_handler(
request: Request,
form_data: EndpointForm,
user=Depends(get_verified_user)
):
    # Handler implementation (omitted in this excerpt)
    ...
Sources: backend/open_webui/routers/prompts.py:1-30
Authentication Dependencies
| Dependency | Purpose | Access Level |
|---|---|---|
get_verified_user | Validates authenticated user | Authenticated users |
get_admin_user | Validates admin privileges | Admin only |
Sources: backend/open_webui/routers/prompts.py:25-30
Core Router Modules
Tasks Router
The tasks router (tasks.py) handles asynchronous operations for chat-related tasks including title generation, follow-up generation, query generation, and image prompt generation.
Task Types Available:
| Task | Purpose | Template Function |
|---|---|---|
| Title Generation | Create chat titles | title_generation_template() |
| Follow-up Generation | Generate follow-up questions | follow_up_generation_template() |
| Query Generation | Create search queries | query_generation_template() |
| Image Prompt Generation | Generate image prompts | image_prompt_generation_template() |
| Autocomplete | Autocomplete suggestions | autocomplete_generation_template() |
| Tags Generation | Generate content tags | tags_generation_template() |
| Emoji Generation | Generate emoji suggestions | emoji_generation_template() |
| MoA Response | Mixture of Agents response | moa_response_generation_template() |
Sources: backend/open_webui/routers/tasks.py:1-40
Prompts Router
The prompts router manages user-defined prompt templates with command-based activation. It implements access control based on user roles and resource grants.
Access Control Logic:
write_access=(
(user.role == 'admin' and BYPASS_ADMIN_ACCESS_CONTROL)
or user.id == prompt.user_id
or await AccessGrants.has_access(
user_id=user.id,
resource_type='prompt',
resource_id=prompt.id,
permission='write',
db=db,
)
)
Sources: backend/open_webui/routers/prompts.py:50-70
Conditional Router Loading
Some routers are conditionally loaded based on configuration flags:
SCIM Router
The SCIM 2.0 router for identity management is enabled via the ENABLE_SCIM environment variable:
if ENABLE_SCIM:
app.include_router(scim.router, prefix='/api/v1/scim/v2', tags=['scim'])
Analytics Router
The analytics router is loaded when admin analytics are enabled:
if ENABLE_ADMIN_ANALYTICS:
app.include_router(analytics.router, prefix='/api/v1/analytics', tags=['analytics'])
Audit Logging Middleware
Audit logging is conditionally applied based on the AUDIT_LOG_LEVEL configuration:
try:
audit_level = AuditLevel(AUDIT_LOG_LEVEL)
except ValueError as e:
logger.error(f'Invalid audit level: {AUDIT_LOG_LEVEL}. Error: {e}')
audit_level = AuditLevel.NONE
if audit_level != AuditLevel.NONE:
app.add_middleware(
AuditLoggingMiddleware,
audit_level=audit_level,
excluded_paths=AUDIT_EXCLUDED_PATHS,
)
Sources: backend/open_webui/main.py:55-70
Utility Functions and Helpers
Middleware Utility Imports
The middleware.py module aggregates utility functions from multiple sources for use by route handlers:
from open_webui.utils.chat import generate_chat_completion
from open_webui.utils.task import get_task_model_id, rag_template
from open_webui.utils.tools import get_tools, get_terminal_tools
from open_webui.utils.misc import (
deep_update, extract_urls, get_message_list,
add_or_update_system_message, merge_system_messages
)
from open_webui.utils.files import (
convert_markdown_base64_images,
get_file_url_from_base64,
get_image_base64_from_url,
)
Sources: backend/open_webui/utils/middleware.py:1-35
Security Architecture
Token-Based Authentication
sequenceDiagram
participant C as Client
participant M as ASGI Middleware
participant R as Route Handler
C->>M: Request + Credentials
M->>M: Extract Bearer/Cookie/API-Key
M->>R: Set request.state.token
R->>R: Verify with get_verified_user
alt Invalid Token
R-->>C: 401 Unauthorized
else Valid Token
R->>R: Process Request
R-->>C: Response
end
Sources: backend/open_webui/utils/asgi_middleware.py:20-50
Frontend API Integration
The frontend TypeScript codebase in src/lib/apis/ provides typed interfaces for all major routers:
| Router Domain | Frontend Module |
|---|---|
| Knowledge Bases | src/lib/apis/knowledge/index.ts |
| Skills | src/lib/apis/skills/index.ts |
| OpenAI Config | src/lib/apis/openai/index.ts |
| Tool Servers | src/lib/apis/index.ts |
The frontend uses WEBUI_API_BASE_URL constant (${WEBUI_BASE_URL}/api/v1) as the base for all API calls.
Sources: src/lib/constants.ts:1-20
Summary
The API Routers system in Open WebUI implements a well-organized, FastAPI-based architecture with:
- Modular Design: 26+ functional router modules organized by domain
- Consistent Patterns: Standardized router structure with Pydantic models and authentication dependencies
- Middleware Pipeline: Request processing through ASGI middleware, authentication, audit logging, and pipeline filters
- Conditional Loading: Feature flags for SCIM, analytics, and audit logging
- Access Control: Role-based and grant-based authorization at the router and endpoint levels
- Frontend Integration: TypeScript API clients aligned with backend router structure
Sources: [backend/open_webui/main.py:1-60]()
Retrieval System
Related topics: Ollama Integration, RAG Pipeline
The Retrieval System in Open WebUI is a comprehensive framework for document loading, web searching, and vector-based information retrieval. It enables users to ingest documents, perform web searches, and leverage retrieval-augmented generation (RAG) capabilities to enhance LLM responses with contextual information.
Architecture Overview
The retrieval system is composed of three primary subsystems:
graph TD
subgraph Retrieval["Retrieval System"]
subgraph Loaders["Document Loaders"]
PDF[PDF Loader]
OCR[OCR Loaders]
WebLoader[Web Loader]
end
subgraph WebSearch["Web Search Providers"]
SearXNG[SearXNG]
DuckDuckGo[DuckDuckGo]
GooglePSE[Google PSE]
Brave[Brave Search]
YouDC[You.com]
end
subgraph VectorDB["Vector Stores"]
Chroma[Chroma]
FAISS[FAISS]
Milvus[Milvus]
Qdrant[Qdrant]
PGVector[pgvector]
end
end
API[API Router] --> Loaders
API --> WebSearch
API --> VectorDB
Core Components
| Component | Purpose | Location |
|---|---|---|
| Document Loaders | Ingest various file formats into the system | backend/open_webui/retrieval/loaders/ |
| Web Search | Query external search engines for information | backend/open_webui/retrieval/web/ |
| Vector Database | Store and query embeddings for semantic search | backend/open_webui/retrieval/vector/ |
| API Router | Expose retrieval endpoints to the frontend | backend/open_webui/routers/retrieval.py |
Document Loaders
The document loader subsystem handles ingestion of various file formats into the retrieval pipeline.
Supported File Types
The system supports the following file types for upload and processing:
| Category | MIME Types |
|---|---|
| Documents | PDF, EPUB, DOCX, TXT, CSV, XML, HTML, Markdown |
| Code | Python, JavaScript, CSS |
| Audio | MP3, WAV |
| Images | PNG, JPG (with OCR) |
Sources: src/lib/constants.ts
OCR Processing
For scanned documents and images, Open WebUI supports multiple OCR engines:
PaddleOCR VL is one of the supported OCR backends. It processes documents page-by-page, extracting text and returning structured Document objects with metadata.
# Processing flow in paddleocr_vl.py
for i, page in enumerate(doc):
markdown_text = run_paddle_ocr(page)
cleaned_content = clean_markdown(markdown_text)
documents.append(
Document(
page_content=cleaned_content,
metadata={
'page': i,
'page_label': i + 1,
'total_pages': total_pages,
'file_name': self.file_name,
'processing_engine': 'paddleocr-vl',
}
)
)
Sources: backend/open_webui/retrieval/loaders/paddleocr_vl.py
Configuration Options
The retrieval loaders are configured through the following environment variables:
| Variable | Description |
|---|---|
EXTERNAL_DOCUMENT_LOADER_URL | URL for external document loader service |
EXTERNAL_DOCUMENT_LOADER_API_KEY | API key for external loader |
TIKA_SERVER_URL | Apache Tika server endpoint |
DOCLING_SERVER_URL | Docling OCR server endpoint |
DOCLING_API_KEY | API key for Docling service |
DOCLING_PARAMS | Additional Docling parameters |
PDF_EXTRACT_IMAGES | Enable image extraction from PDFs |
PDF_LOADER_MODE | PDF loading mode configuration |
DOCUMENT_INTELLIGENCE_ENDPOINT | Azure Document Intelligence endpoint |
DOCUMENT_INTELLIGENCE_KEY | Azure Document Intelligence API key |
DOCUMENT_INTELLIGENCE_MODEL | Model identifier for document processing |
MISTRAL_OCR_API_BASE_URL | Mistral OCR API base URL |
MISTRAL_OCR_API_KEY | Mistral OCR API key |
PADDLEOCR_VL_BASE_URL | PaddleOCR VL server URL |
PADDLEOCR_VL_TOKEN | Authentication token for PaddleOCR VL |
MINERU_API_MODE | MinerU API mode |
MINERU_API_URL | MinerU API endpoint |
MINERU_API_KEY | MinerU API key |
MINERU_API_TIMEOUT | MinerU API timeout in seconds |
Sources: backend/open_webui/retrieval/utils.py
Web Search
The web search subsystem provides integration with multiple search providers for retrieving up-to-date information from the internet.
Supported Providers
| Provider | Implementation | Features |
|---|---|---|
| SearXNG | Self-hosted meta-search engine | Privacy-focused, aggregated results |
| DuckDuckGo | Public search API | No API key required |
| Google PSE | Google Programmable Search | Requires API key |
| Brave Search | Privacy-focused search | API-based |
| You.com | AI-enhanced search | Rich snippets and descriptions |
| Tavily | AI-optimized search | Structured outputs |
| Perplexity | LLM-optimized search | Citations included |
Search Result Structure
Search results are normalized into a common SearchResult format:
from dataclasses import dataclass

@dataclass
class SearchResult:
link: str # URL of the result
title: str # Title of the page
snippet: str # Text snippet/summary
#### You.com Implementation
The You.com provider demonstrates the search result normalization:
def _build_snippet(result: dict) -> str:
"""Combine the description and snippets list into a single string."""
parts: list[str] = []
description = result.get('description')
if description:
parts.append(description)
snippets = result.get('snippets')
if snippets and isinstance(snippets, list):
parts.extend(snippets)
return '\n\n'.join(parts)
Sources: backend/open_webui/retrieval/web/ydc.py
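For example, with a hypothetical result payload:

result = {
    "description": "Python is a programming language.",
    "snippets": ["Official site", "Downloads and docs"],
}
print(_build_snippet(result))
# Python is a programming language.
#
# Official site
#
# Downloads and docs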
Web Loader Configuration
The web loader for content extraction supports the following configuration:
| Setting | Description |
|---|---|
ENABLE_WEB_LOADER_SSL_VERIFICATION | Enable SSL certificate verification |
WEB_LOADER_CONCURRENT_REQUESTS | Rate limiting for concurrent requests |
WEB_SEARCH_TRUST_ENV | Trust environment variables for requests |
BYPASS_WEB_SEARCH_WEB_LOADER | Skip content extraction, use snippets only |
BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL | Skip embedding and retrieval stages |
Sources: backend/open_webui/routers/retrieval.py
Web Search Flow
sequenceDiagram
participant Client
participant API as /api/v1/retrieval/web/search
participant SearchProvider as Search Provider
participant WebLoader as Web Loader
participant VectorDB as Vector Store
Client->>API: POST /search {query, urls}
API->>SearchProvider: Execute search queries
SearchProvider-->>API: Raw search results
API->>WebLoader: Extract content from URLs
WebLoader-->>API: Document objects
API->>VectorDB: Store documents
VectorDB-->>API: Collection confirmation
API-->>Client: {status, collection_name, files}
Vector Database Integration
The vector database subsystem handles storage and retrieval of document embeddings for semantic search.
Supported Vector Stores
| Database | Implementation | Use Case |
|---|---|---|
| Chroma | chromadb | Lightweight, local-first |
| FAISS | faiss-cpu/faiss-gpu | Large-scale similarity search |
| Milvus | pymilvus | Cloud-native, scalable |
| Qdrant | qdrant-client | High-performance, hybrid search |
| pgvector | psycopg2 | PostgreSQL extension for vectors |
Vector Factory Pattern
The system uses a factory pattern to instantiate vector databases:
class VectorStoreFactory:
@staticmethod
def get_vector_store(config: Config) -> VectorStore:
provider = config.VECTOR_DB
if provider == "chromadb":
return ChromaDBStore()
elif provider == "pgvector":
return PGVectorStore()
# ... other providers
Sources: backend/open_webui/retrieval/vector/factory.py
pgvector Implementation
For PostgreSQL-based vector storage:
class PGVectorStore:
    def __init__(self, connection_string: str, embedding_dim: int = 1536):
        self.conn = psycopg2.connect(connection_string)
        self.embedding_dim = embedding_dim

    def insert(self, collection: str, documents: list[Document]):
        # Insert vectors with pgvector extension
        ...
Sources: backend/open_webui/retrieval/vector/dbs/pgvector.py
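To make the elided insert concrete, here is a hedged sketch of what a pgvector write can look like. The table and column names are assumptions for illustration, not the project's actual schema:

import json

def insert_chunks(conn, collection: str, texts: list[str], embeddings: list[list[float]]):
    """Store each text chunk with its embedding via the pgvector extension.

    Assumes a table created roughly as:
      CREATE TABLE document_chunk (
          collection_name TEXT, text TEXT, vector VECTOR(1536)
      );
    """
    with conn.cursor() as cur:
        for text, emb in zip(texts, embeddings):
            cur.execute(
                'INSERT INTO document_chunk (collection_name, text, vector) '
                'VALUES (%s, %s, %s::vector)',
                (collection, text, json.dumps(emb)),  # '[0.1, ...]' casts to vector
            )
    conn.commit()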
API Endpoints
The retrieval system exposes REST API endpoints through the router.
Web Search Endpoint
POST /api/v1/retrieval/web/search
Request Body:
{
"query": "search query string",
"collection_name": "optional_collection",
"retrieval_enabled": true,
"k": 5
}
Response:
{
"status": true,
"collection_name": "web_20240115_abc123",
"filenames": ["python.org", "wikipedia.org"],
"content": "extracted content...",
"sources": [
{"url": "https://python.org", "content": "..."}
]
}
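A client-side sketch of the request/response pair above; the base URL and token are placeholders:

import requests

resp = requests.post(
    'http://localhost:8080/api/v1/retrieval/web/search',
    headers={'Authorization': 'Bearer <token>'},
    json={'query': 'python 3.13 release notes', 'k': 5},
    timeout=60,
)
data = resp.json()
# Fields follow the response shape documented above.
print(data['collection_name'], data['filenames'])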
File Upload and Processing
POST /api/v1/retrieval/upload
Handles file uploads, runs document loaders, and stores in the configured vector database.
Sources: backend/open_webui/routers/retrieval.py
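A hedged client sketch for the upload endpoint; the multipart field name `file` is an assumption:

import requests

with open('report.pdf', 'rb') as fh:
    resp = requests.post(
        'http://localhost:8080/api/v1/retrieval/upload',
        headers={'Authorization': 'Bearer <token>'},
        files={'file': ('report.pdf', fh, 'application/pdf')},  # assumed field name
        timeout=120,
    )
print(resp.status_code)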
Configuration Reference
Environment Variables
| Variable | Default | Description |
|---|---|---|
VECTOR_DB | chroma | Vector database provider |
RAG_TOP_K | 5 | Number of top results to retrieve |
RAG_RELEVANCE_THRESHOLD | 0.0 | Minimum relevance score threshold |
WEB_SEARCH_ENABLED | True | Enable web search functionality |
Frontend API URLs
The frontend communicates with these API base URLs:
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
Sources: src/lib/constants.ts
Data Flow
graph LR
subgraph Input["Input Sources"]
Files[Uploaded Files]
WebSearch[Web Search]
URLs[Direct URLs]
end
subgraph Processing["Processing Pipeline"]
Loaders[Document Loaders]
Chunks[Text Chunking]
Embed[Embedding Model]
end
subgraph Storage["Storage"]
Vector[Vector Store]
Meta[Metadata Store]
end
subgraph Query["Query Processing"]
QueryEmb[Query Embedding]
Similarity[Similarity Search]
Rerank[Reranking]
end
Files --> Loaders
WebSearch --> Loaders
URLs --> Loaders
Loaders --> Chunks
Chunks --> Embed
Embed --> Vector
Query --> QueryEmb
QueryEmb --> Similarity
Similarity --> Rerank
Rerank --> Context[LLM Context]
Error Handling
The retrieval system implements comprehensive error handling:
| Error Type | HTTP Code | Message |
|---|---|---|
| Web search failure | 400 | WEB_SEARCH_ERROR with exception details |
| No results found | 404 | No results found from web search |
| Loader failure | 500 | Loader-specific error message |
| Vector store error | 500 | Database connection or query errors |
Sources: backend/open_webui/routers/retrieval.py:1-50
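A minimal sketch of how the table above maps to FastAPI exceptions; the wrapper function is hypothetical, not the router's actual code:

from fastapi import HTTPException

def run_web_search(search_provider, query: str):
    """search_provider is any object with a .search(query) method."""
    try:
        results = search_provider.search(query)
    except Exception as e:
        # Provider failures surface as 400 with the exception details
        raise HTTPException(status_code=400, detail=f'WEB_SEARCH_ERROR: {e}')
    if not results:
        raise HTTPException(status_code=404, detail='No results found from web search')
    return results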
Extension Points
The retrieval system is designed for extensibility:
- Custom Document Loaders: Implement the DocumentLoader interface in loaders/
- New Search Providers: Add a provider class in web/ following the SearchProvider protocol
- Vector Store Adapters: Implement the VectorStore abstract class in vector/dbs/
- Embedding Models: Configure through the EMBEDDING_MODEL setting
Sources: [src/lib/constants.ts](https://github.com/open-webui/open-webui/blob/main/src/lib/constants.ts)
Frontend Structure
Related topics: Chat Interface, Architecture Overview
Overview
Open WebUI uses a modern SvelteKit-based frontend architecture built with TypeScript. The frontend is responsible for the user interface, real-time chat interactions, multimedia handling, and communication with the backend API. The application runs as a Single Page Application (SPA) with server-side rendering capabilities provided by SvelteKit.
Technology Stack
| Layer | Technology |
|---|---|
| Framework | SvelteKit |
| Language | TypeScript |
| Styling | CSS (with custom properties) |
| State Management | Svelte Stores |
| API Communication | Fetch API |
| Internationalization | i18n module |
| Code Highlighting | Shiki |
| Build Tool | Vite (via SvelteKit) |
Sources: src/lib/constants.ts:1
Chat Interface
Related topics: Ollama Integration, Frontend Structure
Overview
The Chat Interface is the core user-facing component of Open WebUI, providing an interactive environment for conversations with AI models. It handles message composition, response rendering, conversation state management, and integration with various backend services including Ollama, OpenAI-compatible APIs, and code execution engines.
The interface is built with SvelteKit on the frontend and Python/FastAPI on the backend, enabling real-time streaming responses, multi-model conversations, and rich content rendering including markdown, code blocks, and embedded media.
Architecture Overview
graph TD
subgraph Frontend["Frontend (Svelte)"]
Chat[Chat.svelte]
MessageInput[MessageInput.svelte]
Message[Message.svelte]
Markdown[Markdown.svelte]
ModelSelector[ModelSelector.svelte]
Navbar[Navbar.svelte]
end
subgraph StateManagement["State Management"]
Stores[index.ts - Svelte Stores]
end
subgraph Backend["Backend (Python/FastAPI)"]
ChatModel[models/chats.py]
Config[config.py]
Middleware[middleware.py]
end
Chat --> Stores
MessageInput --> Stores
Message --> Stores
Chat --> MessageInput
Chat --> Message
Message --> Markdown
Stores --> ChatModel
Stores --> Middleware
State Management
The chat interface relies heavily on Svelte stores for reactive state management. These stores maintain the current conversation state, UI visibility flags, and application-wide settings.
Core Chat Stores
All chat-related state is managed through Svelte writable stores defined in src/lib/stores/index.ts:
| Store | Type | Purpose |
|---|---|---|
chatId | Writable<string> | Current active chat identifier |
chatTitle | Writable<string> | Title of the current chat |
chats | Writable<null> | Cached chat objects |
pinnedChats | Writable<Chat[]> | Pinned conversations |
models | Writable<Model[]> | Available AI models |
chatRequestQueues | Writable<Record<string, QueueItem[]>> | Request queue management |
Sources: src/lib/stores/index.ts:53-58
UI Visibility Stores
The interface uses boolean stores to control component visibility:
| Store | Type | Purpose |
|---|---|---|
showSidebar | Writable<boolean> | Sidebar visibility |
showSettings | Writable<boolean> | Settings panel visibility |
showShortcuts | Writable<boolean> | Keyboard shortcuts overlay |
showControls | Writable<boolean> | Chat controls visibility |
showEmbeds | Writable<boolean> | Embedded content display |
showArtifacts | Writable<boolean> | Code artifacts panel |
Sources: src/lib/stores/index.ts:22-30
Audio and Transcription Stores
| Store | Type | Purpose |
|---|---|---|
audioQueue | Writable<AudioQueue \| null> | TTS audio queue |
TTSWorker | Writable<Worker \| null> | Text-to-speech web worker |
Message Processing Pipeline
Content Sanitization
Before rendering, message content undergoes sanitization to prevent XSS attacks and normalize special tokens:
export const sanitizeResponseContent = (content: string) => {
  return content
    .replace(/<\|[a-z]*$/, '')
    .replace(/<\|[a-z]+\|$/, '')
    .replace(/<$/, '')
    .replaceAll(/<\|[a-z]+\|>/g, ' ')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .trim();
};
Sources: src/lib/utils/index.ts:180-189
Content Processing for Chinese Text
The system includes special handling for Chinese content to address markdown and LaTeX formatting issues:
function processChineseContent(content: string): string {
  if (!/[\u4e00-\u9fa5]/.test(content)) return content;
  const lines = content.split('\n');
  const processedLines = lines.map((line) => {
    // Chinese-specific processing logic (elided in the source excerpt)
    return line;
  });
  return processedLines.join('\n');
}
Sources: src/lib/utils/index.ts:195-208
Sentence and Paragraph Extraction
For audio processing (text-to-speech), messages are split into appropriate segments:
export const extractSentencesForAudio = (text: string) => {
  return extractSentences(text).reduce((mergedTexts, currentText) => {
    const lastIndex = mergedTexts.length - 1;
    if (lastIndex >= 0) {
      const previousText = mergedTexts[lastIndex];
      const wordCount = previousText.split(/\s+/).length;
      const charCount = previousText.length;
      if (wordCount < 4 || charCount < 50) {
        // Merge short segments so TTS does not receive tiny fragments
        mergedTexts[lastIndex] = previousText + ' ' + currentText;
      } else {
        mergedTexts.push(currentText);
      }
    } else {
      // First sentence seeds the accumulator
      mergedTexts.push(currentText);
    }
    return mergedTexts;
  }, []);
};
Sources: src/lib/utils/index.ts:300-319
Chat Data Models
Backend Chat Model
The backend defines chat structures in backend/open_webui/models/chats.py:
class ChatModel:
    async def get_message_list(self, id: str) -> Optional[dict]:
        """Message map for walking history.

        Prefer chat_message rows to avoid loading the large chat
        JSON blob; fall back to embedded history when no rows exist
        (legacy chats).
        """
        messages_map = await ChatMessages.get_messages_map_by_chat_id(id)
        if messages_map is not None:
            return messages_map

        # Fall back to embedded JSON blob for legacy chats
        chat = await self.get_chat_by_id(id)
        if chat is None:
            return None
        return chat.chat.get('history', {}).get('messages', {}) or {}
Sources: backend/open_webui/models/chats.py:1-25
Message Structure
Messages support both normalized storage (via chat_message rows) and legacy embedded JSON format:
| Field | Type | Description |
|---|---|---|
id | string | Unique message identifier |
parentId | string \| null | Parent message ID for threading |
childrenIds | string[] | Child message IDs |
role | user \| assistant | Message author role |
content | string | Message content |
model | string | Model used for assistant responses |
timestamp | number | Unix timestamp of creation |
done | boolean | Whether response is complete |
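Because parentId and childrenIds form a tree, a linear conversation is reconstructed by walking parent links from a leaf. The helper below is a hypothetical sketch over the message map returned by get_message_list:

def build_thread(messages: dict, leaf_id: str) -> list[dict]:
    """Walk parentId links from a leaf message back to the root."""
    thread = []
    current = messages.get(leaf_id)
    while current is not None:
        thread.append(current)
        parent_id = current.get('parentId')
        current = messages.get(parent_id) if parent_id else None
    return list(reversed(thread))  # root-to-leaf order for rendering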
Configuration and Prompt Templates
Voice Mode Configuration
Voice mode settings are configurable via environment variables:
| Config Key | Environment Variable | Default | Description |
|---|---|---|---|
ENABLE_VOICE_MODE_PROMPT | ENABLE_VOICE_MODE_PROMPT | True | Enable voice mode prompt |
VOICE_MODE_PROMPT_TEMPLATE | VOICE_MODE_PROMPT_TEMPLATE | '' | Custom voice prompt template |
Sources: backend/open_webui/config.py:1-20
Code Interpreter Configuration
The chat interface integrates code execution capabilities:
| Config Key | Environment Variable | Default | Description |
|---|---|---|---|
ENABLE_CODE_EXECUTION | ENABLE_CODE_EXECUTION | True | Enable code execution |
CODE_EXECUTION_ENGINE | CODE_EXECUTION_ENGINE | pyodide | Execution engine (pyodide/jupyter) |
CODE_EXECUTION_JUPYTER_URL | CODE_EXECUTION_JUPYTER_URL | '' | Jupyter server URL |
CODE_EXECUTION_JUPYTER_AUTH | CODE_EXECUTION_JUPYTER_AUTH | '' | Jupyter authentication |
Sources: backend/open_webui/config.py:35-60
Prompt Generation Templates
The system uses configurable prompt templates for various tasks:
| Template | Purpose |
|---|---|
DEFAULT_MOA_GENERATION_PROMPT_TEMPLATE | Multi-model answer synthesis |
IMAGE_PROMPT_GENERATION_PROMPT_TEMPLATE | Image generation prompt creation |
FOLLOW_UP_GENERATION_PROMPT_TEMPLATE | Suggesting follow-up questions |
Code Interpreter Integration
Backend Middleware Rendering
The backend middleware handles code interpreter rendering in the streaming response pipeline:
elif item_type == 'open_webui:code_interpreter':
    # Code interpreter needs to inspect/mutate prior accumulated content
    # to strip trailing unclosed code fences
    content = '\n'.join(parts)
    content_stripped, original_whitespace = split_content_and_whitespace(content)
    if is_opening_code_block(content_stripped):
        content = content_stripped.rstrip('`').rstrip() + original_whitespace
    else:
        content = content_stripped + original_whitespace
    # Render as ...
# Ollama Integration
## Overview
The Ollama Integration is a core component of Open WebUI that enables seamless communication between the frontend application and local Ollama instances. This integration provides a unified interface for managing, accessing, and interacting with LLM models hosted locally through Ollama, supporting both native Ollama API calls and OpenAI-compatible endpoints.
Ollama serves as the primary backend inference engine for Open WebUI, allowing users to run large language models entirely on their local hardware without relying on cloud-based services.
## Architecture Overview
The Ollama Integration follows a proxy pattern where the backend server acts as an intermediary, forwarding requests from the frontend to Ollama instances while applying access controls, model routing, and API transformations.
graph TD
subgraph Frontend["Frontend (Svelte)"]
UI[User Interface]
API_CLIENT[API Client<br/>src/lib/apis/ollama/index.ts]
end
subgraph Backend["Backend Server (Python/FastAPI)"]
OLLAMA_ROUTER[Ollama Router<br/>routers/ollama.py]
OPENAI_ROUTER[OpenAI Router<br/>routers/openai.py]
MODEL_UTILS[Model Utilities<br/>utils/models.py]
CONFIG[Configuration<br/>config.py]
end
subgraph OllamaInstances["Ollama Instances"]
OLLAMA_LOCAL[Local Ollama<br/>localhost:11434]
OLLAMA_CUSTOM[Custom Ollama<br/>Configured URLs]
end
UI --> API_CLIENT
API_CLIENT -->|HTTP Requests| OLLAMA_ROUTER
API_CLIENT -->|OpenAI-compatible| OPENAI_ROUTER
OLLAMA_ROUTER --> MODEL_UTILS
OLLAMA_ROUTER --> CONFIG
OLLAMA_ROUTER -->|Native API| OLLAMA_LOCAL
OLLAMA_ROUTER -->|Native API| OLLAMA_CUSTOM
OPENAI_ROUTER -->|v1/chat/completions| OLLAMA_LOCAL
OPENAI_ROUTER -->|v1/chat/completions| OLLAMA_CUSTOM
style Frontend fill:#e1f5fe
style Backend fill:#f3e5f5
style OllamaInstances fill:#fff3e0
## Core Components
### Backend Router (routers/ollama.py)
The Ollama router (`backend/open_webui/routers/ollama.py`) handles all native Ollama API operations. It provides endpoints for model management, chat completions, and model operations.
**Primary Endpoints:**
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/chat` | POST | Send chat completion requests |
| `/api/generate` | POST | Generate text with model |
| `/api/tags` | GET | List available models |
| `/api/pull` | POST | Pull a new model |
| `/api/push` | POST | Push a model to registry |
| `/api/delete` | DELETE | Delete a model |
| `/api/create` | POST | Create a new model |
| `/config` | GET/POST | Get/update Ollama configuration |
| `/verify` | POST | Verify connection to Ollama |
| `/v1/chat/completions` | POST | OpenAI-compatible chat endpoint |
| `/v1/models` | GET | OpenAI-compatible models list |
| `/v1/messages` | POST | Anthropic-compatible messages endpoint |
| `/v1/responses` | POST | Ollama Responses API endpoint |
Sources: [backend/open_webui/routers/ollama.py:1-500](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
### Model Resolution and URL Selection
The system supports multiple Ollama instances through a URL index system. When a request is made, the router resolves the appropriate Ollama instance based on model configuration.
sequenceDiagram
participant Client
participant Router
participant Config
participant Ollama
Client->>Router: POST /api/chat {model: "llama2"}
Router->>Config: get_ollama_url(model, url_idx)
Config->>Config: Check model-to-URL mapping
Config->>Config: Check url_idx or default
Config-->>Router: (url, url_idx)
Router->>Ollama: Forward request to url
Ollama-->>Router: Response
Router-->>Client: Forwarded response
The `get_ollama_url` function performs the following resolution logic:
1. If `url_idx` is provided, use the corresponding URL from `OLLAMA_BASE_URLS`
2. Check model-specific URL mappings stored in `OLLAMA_MODELS`
3. Fall back to the primary `OLLAMA_BASE_URL`
Sources: [backend/open_webui/routers/ollama.py:100-200](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
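A simplified restatement of that resolution order; the function name, state layout, and mapping shape here are assumptions, not the router's exact code:

def resolve_ollama_url(state, model: str, url_idx: int | None = None):
    base_urls = state.OLLAMA_BASE_URLS          # configured instance URLs
    if url_idx is not None:                     # 1. explicit index wins
        return base_urls[url_idx], url_idx
    entry = state.OLLAMA_MODELS.get(model)      # 2. model-specific mapping
    if entry and entry.get('urls'):
        url = entry['urls'][0]
        return url, base_urls.index(url)
    return base_urls[0], 0                      # 3. primary fallback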
### Prefix ID Handling
For multi-tenant deployments, the system supports `prefix_id` configuration. When a prefix is configured, model names are automatically transformed:
prefix_id = api_config.get('prefix_id', None)
if prefix_id:
    payload['model'] = payload['model'].replace(f'{prefix_id}.', '')
This lets models be exposed under prefixed names in Open WebUI (e.g., `tenant1.llama2`) while the backend strips the prefix before forwarding, so the Ollama API receives the short model name (e.g., `llama2`).
Sources: [backend/open_webui/routers/ollama.py:200-220](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Configuration
### Environment Variables
The Ollama integration is configured through environment variables in `backend/open_webui/config.py`:
| Variable | Default | Description |
|----------|---------|-------------|
| `ENABLE_OLLAMA_API` | `True` | Enable/disable Ollama API |
| `OLLAMA_API_BASE_URL` | `http://localhost:11434/api` | Primary Ollama API URL |
| `OLLAMA_BASE_URL` | Auto-derived | Base URL for Ollama connections |
| `USE_OLLAMA_DOCKER` | `false` | Use all-in-one Docker container |
| `K8S_FLAG` | Empty | Kubernetes deployment flag |
ENABLE_OLLAMA_API = PersistentConfig(
    'ENABLE_OLLAMA_API',
    'ollama.enable',
    os.environ.get('ENABLE_OLLAMA_API', 'True').lower() == 'true',
)

OLLAMA_API_BASE_URL = os.environ.get('OLLAMA_API_BASE_URL', 'http://localhost:11434/api')
Sources: [backend/open_webui/config.py:1-100](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
### Port Fallback Resolution
The configuration includes automatic port fallback logic for environments where the default Ollama port (11434) might be blocked:
def _resolve_ollama_base_url(url: str) -> str:
    """If the default Ollama port (11434) is unreachable, try the fallback port (12434)."""
    # Checks port 11434 first, then falls back to 12434 if unreachable
    ...
This enables seamless operation in environments like certain corporate networks or containerized setups where only specific ports are accessible.
Sources: [backend/open_webui/config.py:50-80](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
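A hedged sketch of what such a probe can look like; the helper names and the exact fallback behaviour are assumptions, not the project's implementation:

import socket
from urllib.parse import urlparse

def _port_reachable(url: str, timeout: float = 1.0) -> bool:
    parsed = urlparse(url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port or 11434), timeout):
            return True
    except OSError:
        return False

def resolve_base_url(url: str) -> str:
    # Assumed behaviour: keep the URL if 11434 answers, otherwise swap in 12434.
    if ':11434' in url and not _port_reachable(url):
        return url.replace(':11434', ':12434')
    return url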
### Docker and Kubernetes Handling
The configuration adapts to different deployment scenarios:
if OLLAMA_BASE_URL == '/ollama' and not K8S_FLAG:
    if USE_OLLAMA_DOCKER.lower() == 'true':
        OLLAMA_BASE_URL = 'http://localhost:11434'
    else:
        OLLAMA_BASE_URL = 'http://host.docker.internal:11434'
elif K8S_FLAG:
    OLLAMA_BASE_URL = 'http://ollama-service.open-webui.svc.cluster.local:11434'
Sources: [backend/open_webui/config.py:40-50](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/config.py)
## API Compatibility Layers
### OpenAI-Compatible API
The Ollama router provides OpenAI-compatible endpoints that translate requests to Ollama's API format:
**Endpoint:** `POST /ollama/v1/chat/completions`
The system transforms OpenAI-format requests into Ollama-native format:
payload = apply_model_params_to_body_openai(params, payload)
payload = await apply_system_prompt_to_body(system, payload, metadata, user)
This transformation includes:
- Converting OpenAI parameter names to Ollama format
- Applying model-specific parameter modifications
- Injecting system prompts from user metadata
Sources: [backend/open_webui/routers/ollama.py:150-180](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
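As an illustration of the parameter translation, a reduced mapping might look like the sketch below; the real function covers many more keys, and the helper here is hypothetical. Note that most sampler names coincide, while OpenAI's `max_tokens` corresponds to Ollama's `num_predict`:

def openai_params_to_ollama_options(params: dict) -> dict:
    """Illustrative subset of the OpenAI-to-Ollama parameter translation."""
    mapping = {
        'temperature': 'temperature',
        'top_p': 'top_p',
        'stop': 'stop',
        'seed': 'seed',
        'max_tokens': 'num_predict',  # Ollama's name for the completion limit
    }
    return {
        ollama_key: params[openai_key]
        for openai_key, ollama_key in mapping.items()
        if openai_key in params
    }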
### Anthropic-Compatible API
Support for Anthropic's `/v1/messages` endpoint is provided through the Responses API:
**Endpoint:** `POST /ollama/v1/messages`
@router.post('/v1/messages')
async def generate_anthropic_messages(
    request: Request,
    form_data: dict,
    url_idx: Optional[int] = None,
    user=Depends(get_verified_user),
):
    """
    Proxy for Ollama's Anthropic-compatible /v1/messages endpoint.
    Forwards the request as-is to the Ollama backend.
    """
The request is forwarded to Ollama's `/v1/responses` endpoint with appropriate streaming headers.
Sources: [backend/open_webui/routers/ollama.py:250-280](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Frontend Integration
### API Client (src/lib/apis/ollama/index.ts)
The frontend provides a TypeScript API client for communicating with the backend Ollama proxy:
**Key Functions:**
| Function | Purpose |
|----------|---------|
| `deleteModel()` | Delete a model from Ollama |
| `pullModel()` | Pull a new model with progress tracking |
| `verifyOllamaConnection()` | Test connectivity to Ollama instance |
| `getOllamaConfig()` | Retrieve current Ollama configuration |
export const pullModel = async (token: string, tagName: string, urlIdx: number | null = null) => {
  const controller = new AbortController();
  const res = await fetch(
    `${OLLAMA_API_BASE_URL}/api/pull${urlIdx !== null ? `/${urlIdx}` : ''}`,
    {
      signal: controller.signal,
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`
      },
      body: JSON.stringify({ name: tagName })
    }
  );
  return res;
};
Sources: [src/lib/apis/ollama/index.ts:1-150](https://github.com/open-webui/open-webui/blob/main/src/lib/apis/ollama/index.ts)
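The same proxy can be exercised from any HTTP client. A hedged Python counterpart, assuming the pull endpoint streams NDJSON progress lines as Ollama does natively (base URL and token are placeholders):

import json
import requests

resp = requests.post(
    'http://localhost:8080/ollama/api/pull',     # placeholder base URL
    headers={'Authorization': 'Bearer <token>'},
    json={'name': 'llama2'},
    stream=True,
    timeout=600,
)
for line in resp.iter_lines():
    if line:
        progress = json.loads(line)
        # Typical Ollama pull fields: status, completed, total
        print(progress.get('status'), progress.get('completed'), progress.get('total'))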
### API Base URL Configuration
Frontend constants define the base URLs for API communication:
export const OLLAMA_API_BASE_URL = `${WEBUI_BASE_URL}/ollama`;
export const WEBUI_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1`;
The system automatically configures the base URL based on environment:
- **Development:** `http://hostname:8080`
- **Production:** Uses the configured domain
Sources: [src/lib/constants.ts:1-30](https://github.com/open-webui/open-webui/blob/main/src/lib/constants.ts)
## Request Flow
graph LR
A[User Request] --> B[Frontend API Client]
B --> C[Backend Router]
C --> D{Request Type?}
D -->|Native Ollama| E[Native API Handler]
D -->|OpenAI Format| F[OpenAI-Compatible Handler]
D -->|Anthropic Format| G[Anthropic-Compatible Handler]
E --> H[Model Resolution]
F --> H
G --> H
H --> I[Access Control Check]
I --> J{Model Access Allowed?}
J -->|Yes| K[Forward to Ollama]
J -->|No| L[HTTP 403 Forbidden]
K --> M[Ollama Instance]
M --> N[Response]
N --> O[Stream/Return to Client]
## Model Management
### Model Registration
Models discovered from Ollama instances are registered in the application state:
app.state.OLLAMA_MODELS = {}
Each model entry contains:
- `urls`: Array of Ollama instance URLs where the model is available
- `details`: Model metadata (size, capabilities, etc.)
Sources: [backend/open_webui/main.py:100-120](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/main.py)
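An illustrative entry shape, following the two-field description above; the exact metadata keys vary by Ollama version and are assumptions here:

# Hypothetical registry entry for a model available on one instance.
app.state.OLLAMA_MODELS = {
    'llama2:latest': {
        'urls': ['http://localhost:11434'],
        'details': {'parameter_size': '7B', 'quantization_level': 'Q4_0'},
    }
}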
### Model Access Control
Before forwarding requests, the system checks user access permissions:
model_info = await Models.get_model_by_id(model_id)
if model_info:
    if model_info.base_model_id:
        payload['model'] = model_info.base_model_id
    await check_model_access(user, model_info)
else:
    await check_model_access(user, None)
Sources: [backend/open_webui/routers/ollama.py:130-150](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Error Handling
### Connection Verification
The system provides a connection verification endpoint for testing Ollama connectivity:
export const verifyOllamaConnection = async (token: string = '', connection: object = {}) => {
  const res = await fetch(`${OLLAMA_API_BASE_URL}/verify`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ ...connection })
  });
  return res;
};
Sources: [src/lib/apis/ollama/index.ts:150-180](https://github.com/open-webui/open-webui/blob/main/src/lib/apis/ollama/index.ts)
### Error Messages
Common error scenarios include:
| Scenario | HTTP Status | Error Message |
|----------|-------------|---------------|
| Ollama API Disabled | 503 | `OLLAMA_API_DISABLED` |
| Model Not Found | 400 | `MODEL_NOT_FOUND` |
| Network Problem | Various | `Ollama: Network Problem` |
| Invalid Config | 500 | `DEFAULT(e)` |
Sources: [backend/open_webui/routers/ollama.py:50-80](https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/ollama.py)
## Summary
The Ollama Integration provides a robust, flexible bridge between Open WebUI and local Ollama instances. Key features include:
- **Multi-instance support** through URL indexing
- **API compatibility layers** for OpenAI and Anthropic formats
- **Automatic port fallback** for network flexibility
- **Access control** integration for model permissions
- **Prefix-based multi-tenancy** support
- **Streaming support** for real-time responses
- **Docker and Kubernetes** deployment optimizations
This integration enables users to run powerful LLM models entirely locally while maintaining a modern, feature-rich web interface for interaction.
RAG Pipeline
Related topics: Ollama Integration, Retrieval System
Retrieval-Augmented Generation (RAG) Pipeline in Open WebUI enables users to upload documents, process them into searchable vector embeddings, and augment LLM responses with relevant context from user knowledge bases.
Architecture Overview
The RAG Pipeline consists of multiple integrated components that work together to provide document retrieval and context augmentation capabilities.
graph TD
A[User Upload] --> B[Document Processing]
B --> C[Text Extraction]
C --> D[Chunking]
D --> E[Embedding Generation]
E --> F[Vector Storage]
F --> G[Retrieval]
G --> H[Context Injection]
H --> I[LLM Response]
J[Knowledge Management] --> K[Access Control]
K --> F
Core Components
| Component | Location | Purpose |
|---|---|---|
| Knowledge Router | backend/open_webui/routers/knowledge.py | REST API endpoints for knowledge management |
| Retrieval Utils | backend/open_webui/retrieval/utils.py | Document loading and text extraction |
| API Router | backend/open_webui/main.py | Registers retrieval endpoints |
| Frontend | src/lib/components/workspace/Knowledge.svelte | UI for knowledge management |
Sources: backend/open_webui/main.py:17-30
Supported Document Types
Open WebUI supports a wide range of document formats through configurable document loaders.
// src/lib/constants.ts
export const SUPPORTED_FILE_TYPE = [
'application/epub+zip',
'application/pdf',
'text/plain',
'text/csv',
'text/xml',
'text/html',
'text/x-python',
'text/css',
'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
'application/octet-stream',
'application/x-javascript',
'text/markdown',
'audio/mpeg',
'audio/wav',
'video/mp4',
'video/mpeg'
];
Sources: src/lib/constants.ts:16-30
Document Processing Pipeline
Text Extraction
The retrieval utility module handles document parsing through multiple backends:
# backend/open_webui/retrieval/utils.py
def _extract_text_from_binary_response(request, response, url):
    """Download response body to a temp file and extract text using the Loader pipeline."""
    import mimetypes
    import tempfile
    import urllib.parse
Supported Document Loaders
| Loader | Purpose | Configuration |
|---|---|---|
| TIKA Server | Apache Tika for generic document parsing | TIKA_SERVER_URL |
| DOCLING | Advanced PDF and document processing | DOCLING_SERVER_URL, DOCLING_API_KEY |
| PDF Loader | Configurable PDF extraction | PDF_LOADER_MODE, PDF_EXTRACT_IMAGES |
| Document Intelligence | Azure AI document analysis | DOCUMENT_INTELLIGENCE_ENDPOINT |
| Mistral OCR | OCR for scanned documents | MISTRAL_OCR_API_BASE_URL |
| PaddleOCR VL | Visual language OCR | PADDLEOCR_VL_BASE_URL |
| MinerU | Chinese document processing | MINERU_API_MODE, MINERU_API_URL |
Sources: backend/open_webui/retrieval/utils.py:1-25
Knowledge Management API
Endpoints Overview
The knowledge router provides CRUD operations for managing user knowledge bases.
/api/v1/knowledge - List and create knowledge bases
/api/v1/knowledge/{id} - Get, update, delete specific knowledge
/api/v1/knowledge/{id}/file/add - Add file to knowledge base
/api/v1/knowledge/{id}/search - Search within knowledge base
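A hedged round trip over these endpoints from a Python client; the HTTP verbs shown for creation and search, and the payload field names, are assumptions for illustration:

import requests

BASE = 'http://localhost:8080/api/v1/knowledge'   # placeholder base URL
headers = {'Authorization': 'Bearer <token>'}

# Create a knowledge base, attach a previously uploaded file, then search it.
kb = requests.post(BASE, headers=headers,
                   json={'name': 'docs', 'description': 'Internal docs'}).json()

requests.post(f"{BASE}/{kb['id']}/file/add", headers=headers,
              json={'file_id': '<uploaded-file-id>'})

hits = requests.get(f"{BASE}/{kb['id']}/search", headers=headers,
                    params={'query': 'deployment', 'page': 1}).json()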
Access Control
Knowledge resources are protected by role-based access control:
# backend/open_webui/routers/knowledge.py
if not (
    user.role == 'admin'
    or knowledge.user_id == user.id
    or await AccessGrants.has_access(
        user_id=user.id,
        resource_type='knowledge',
        resource_id=knowledge.id,
        permission='read',
        db=db,
    )
):
    raise HTTPException(
        status_code=status.HTTP_400_BAD_REQUEST,
        detail=ERROR_MESSAGES.ACCESS_PROHIBITED,
    )
Sources: backend/open_webui/routers/knowledge.py:40-55
Search Functionality
The search endpoint supports pagination and filtering:
| Parameter | Type | Default | Description |
|---|---|---|---|
page | int | 1 | Page number (minimum 1) |
query | string | - | Search query text |
view_option | string | - | Filter option |
order_by | string | - | Sort field |
direction | string | - | Sort direction |
page = max(page, 1)
limit = 30
skip = (page - 1) * limit
filter = {}
if query:
    filter['query'] = query
Sources: backend/open_webui/routers/knowledge.py:57-70
Configuration Options
Environment Variables
| Variable | Default | Description |
|---|---|---|
EXTERNAL_DOCUMENT_LOADER_URL | - | External document loader endpoint |
EXTERNAL_DOCUMENT_LOADER_API_KEY | - | API key for external loader |
TIKA_SERVER_URL | - | Apache Tika server URL |
DOCLING_SERVER_URL | - | Docling server endpoint |
DOCLING_API_KEY | - | Docling API authentication |
PDF_LOADER_MODE | - | PDF extraction mode |
PDF_EXTRACT_IMAGES | - | Enable image extraction from PDFs |
DOCUMENT_INTELLIGENCE_ENDPOINT | - | Azure AI endpoint |
DOCUMENT_INTELLIGENCE_KEY | - | Azure AI API key |
MISTRAL_OCR_API_BASE_URL | - | Mistral OCR service URL |
MISTRAL_OCR_API_KEY | - | Mistral OCR authentication |
PADDLEOCR_VL_BASE_URL | - | PaddleOCR endpoint |
PADDLEOCR_VL_TOKEN | - | PaddleOCR token |
MINERU_API_MODE | - | MinerU processing mode |
MINERU_API_URL | - | MinerU API endpoint |
MINERU_API_KEY | - | MinerU API key |
MINERU_API_TIMEOUT | - | MinerU request timeout |
Sources: backend/open_webui/config.py:1-25
Data Flow
sequenceDiagram
participant U as User
participant F as Frontend
participant API as Knowledge API
participant DL as Document Loader
participant VC as Vector Cache
participant LLM as LLM
U->>F: Upload Document
F->>API: POST /api/v1/knowledge/{id}/file/add
API->>DL: Extract Text
DL-->>API: Raw Text Content
API->>VC: Generate Embeddings
VC-->>API: Vector Embeddings
API-->>F: Success Response
U->>F: Query with RAG
F->>API: POST /api/v1/retrieval
API->>VC: Search Vectors
VC-->>API: Relevant Chunks
API-->>F: Augmented Context
F->>LLM: Prompt + Context
LLM-->>U: Generated Response
Frontend Integration
The knowledge management interface is implemented as a Svelte component:
- Location: src/lib/components/workspace/Knowledge.svelte
- Provides file upload, management, and search UI
- Communicates with backend via REST API
API Base URLs
// src/lib/constants.ts
export const RETRIEVAL_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/retrieval`;
export const AUDIO_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/audio`;
export const IMAGES_API_BASE_URL = `${WEBUI_BASE_URL}/api/v1/images`;
Sources: src/lib/constants.ts:9-13
Dependencies
Key Python packages for RAG functionality:
| Package | Version | Purpose |
|---|---|---|
sqlalchemy | 2.0.48 | Database ORM |
requests | 2.33.1 | HTTP client |
httpx | 0.28.1 | Async HTTP with HTTP/2 support |
aiofiles | - | Async file operations |
redis | - | Vector caching |
pycrdt | 0.12.47 | CRDT operations |
Sources: backend/requirements-min.txt:1-35
Error Handling
The system uses centralized error messages:
ERROR_MESSAGES.NOT_FOUND = "Knowledge base not found"
ERROR_MESSAGES.ACCESS_PROHIBITED = "Access denied to this knowledge base"
Best Practices
- Document Preparation: Use supported formats for optimal extraction quality
- Chunking Strategy: Configure appropriate chunk sizes based on use case
- Access Control: Leverage RBAC to protect sensitive knowledge bases
- Loader Selection: Choose appropriate document loader based on document complexity
- Resource Management: Monitor vector storage size for large knowledge bases
Sources: [backend/open_webui/main.py:17-30]()
Doramagic Pitfall Log
Source-linked risks stay visible on this manual page so the preview does not read as an unqualified recommendation.
Doramagic extracted 6 source-linked risk signals. Review them before installing or handing real data to the project.
1. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:701547123 | https://github.com/open-webui/open-webui | README/documentation is current enough for a first validation pass.
2. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | last_activity_observed missing
3. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:701547123 | https://github.com/open-webui/open-webui | no_demo; severity=medium
4. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:701547123 | https://github.com/open-webui/open-webui | no_demo; severity=medium
5. Maintenance risk: issue_or_pr_quality=unknown
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | issue_or_pr_quality=unknown
6. Maintenance risk: release_recency=unknown
- Severity: low
- Finding: release_recency=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:701547123 | https://github.com/open-webui/open-webui | release_recency=unknown
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using open-webui with real data or production workflows.
- [[BUG] v0.9.3 - Notes completely broken: cannot open or create notes (Typ](https://github.com/open-webui/open-webui/issues/24484) - github / github_issue
- issue: llamacpp load/unload indicator doesn't work - github / github_issue
- issue: When continuing a conversation in the new version using a chat cr - github / github_issue
- issue: image_gen is exposed to the model even when image generation is d - github / github_issue
- feat: Add file types per MCP Integration - github / github_issue
- feat: apply filter in tool call loop - github / github_issue
- issue: Cmd+r on Mac (refresh page) causes chat to generate a new respons - github / github_issue
- v0.9.5 - github / github_release
- v0.9.4 - github / github_release
- v0.9.3 - github / github_release
- v0.9.2 - github / github_release
- v0.9.1 - github / github_release
Source: Project Pack community evidence and pitfall evidence