# https://github.com/infiniflow/ragflow Project Manual

Generated at: 2026-06-08 05:17:13 UTC

## Table of Contents

- [Project Overview and System Architecture](#page-1)
- [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2)
- [Agent System, Tools, and Workflow Orchestration](#page-3)
- [Deployment, Configuration, Administration, and Model Integration](#page-4)

<a id='page-1'></a>

## Project Overview and System Architecture

### Related Pages

Related topics: [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Agent System, Tools, and Workflow Orchestration](#page-3), [Deployment, Configuration, Administration, and Model Integration](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [agent/sandbox/README.md](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/README.md)
- [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)
- [tools/firecrawl/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)
- [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)
</details>

# Project Overview and System Architecture

## 1. Purpose and Scope

RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that fuses RAG with Agent capabilities to create a context layer for large language models. According to the project README, RAGFlow is "powered by a converged context engine and pre-built agent templates" and is intended for "enterprises of any scale" — ranging from individual developers running it locally to large organizations operating multi-tenant deployments. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md).

The project's stated design goals — paraphrased from the README — include:

- **Quality in, quality out**: Deep document understanding for extracting knowledge from unstructured data, including formats such as Word, slides, Excel, txt, images, scanned copies, structured data, and web pages. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md).
- **Template-based chunking** that is explainable and configurable per use case.
- **Grounded citations** with visualization of text chunking to support traceable answers and reduce hallucinations.
- **Heterogeneous data-source compatibility** through an extensible connector layer.
- **Configurable LLMs and embedding models** with multiple recall fused with re-ranking.

The community roadmap (tracking issues #4214 and #162) shows the project has progressed through v0.9.0, v0.10.0, v0.21.0, v0.22.0, v0.23.0, v0.24.0, and the v0.25.x line, with active demand for features such as Text2SQL, TTS, Kubernetes deployment (issue #864), Ollama rerank integration (issue #4406), and the Docling parser (issue #3443).

## 2. Architectural Layers

RAGFlow follows a layered architecture in which the Python API, Go services, and a Vite-based web frontend collaborate through clearly defined interfaces. The diagram below summarizes the high-level data flow from ingestion to retrieval and agent execution.

```mermaid
flowchart LR
    UI["Web Frontend<br/>(Vite + React)"]
    CLI["CLI / Virtual FS<br/>(internal/cli)"]
    MCP["MCP Server<br/>(mcp/server)"]
    PYAPI["Python API<br/>(api/ragflow_server)"]
    GOAPI["Go Service<br/>(cmd/server_main)"]
    ENGINE["Doc Engine<br/>(internal/engine)"]
    DEEPDOC["DeepDoc<br/>(parser + vision)"]
    SANDBOX["Agent Sandbox<br/>(agent/sandbox)"]
    DS["Data Sources<br/>(firecrawl, S3, RSS, etc.)"]
    LLM["LLM / Embedding / Rerank"]

    UI --> PYAPI
    CLI --> PYAPI
    MCP --> GOAPI
    PYAPI --> DEEPDOC
    PYAPI --> ENGINE
    PYAPI --> SANDBOX
    PYAPI --> LLM
    GOAPI --> ENGINE
    DS --> PYAPI
    SANDBOX --> LLM
```

### 2.1 Web Frontend

The web UI lives under the `web/` directory and is a Vite-based single-page application. [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json) lists scripts such as `dev` (Vite dev server), `build` (production), `lint` (ESLint), and `test` (Jest), and declares dependencies on `@ant-design/icons`, `@antv/g2`, `@antv/g6`, and form-handling libraries. UI components for building structured JSON schemas, which the Agent designer uses, live in `web/src/components/jsonjoy-builder/lib/schema-editor.ts`; that module exports helpers such as `createFieldSchema`, `validateFieldName`, and `getSchemaProperties`. Source: [web/src/components/jsonjoy-builder/lib/schema-editor.ts](https://github.com/infiniflow/ragflow/blob/main/web/src/components/jsonjoy-builder/lib/schema-editor.ts).

### 2.2 API and Service Tier

The Python Flask API (`api/ragflow_server.py`) serves as the public HTTP surface for the web UI and external integrations. It delegates heavy lifting — embedding, retrieval, agent execution — to background workers, while delegating document storage and search to the Go-side **Doc Engine**.

The Go service (`cmd/server_main.go`) initializes the Doc Engine on startup using the `engine.Init(&cfg...)` pattern documented in [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md). The Go side exposes retrieval-test, search, and admin RPCs that the Python layer consumes.

### 2.3 Model Context Protocol (MCP) Server

RAGFlow ships an MCP server at [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) that exposes RAGFlow datasets and retrieval operations as Model Context Protocol tools. The `RAGFlowConnector` class implements `_fetch_datasets_page`, `list_datasets`, `resolve_dataset_ids`, and a `call_tool` dispatcher that routes the `ragflow_retrieval` tool. The retrieval tool accepts `dataset_ids`, `document_ids`, `question`, `page`, `page_size`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, `rerank_id`, and `force_refresh` parameters, allowing any MCP-compatible client (e.g., Claude Desktop) to query RAGFlow datasets directly.
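
As a rough illustration of the parameter surface described above, the following Python sketch assembles an argument payload for the `ragflow_retrieval` tool. The `build_retrieval_args` helper is hypothetical and is not part of `mcp/server/server.py`. The defaults for `similarity_threshold`, `vector_similarity_weight`, and `top_k` follow the values quoted in the Retrieval section of this manual; the `page` and `page_size` defaults and the range checks are assumptions.

```python
from typing import Any, Optional


def build_retrieval_args(
    question: str,
    dataset_ids: Optional[list[str]] = None,
    document_ids: Optional[list[str]] = None,
    page: int = 1,                          # assumed default
    page_size: int = 10,                    # assumed default
    similarity_threshold: float = 0.2,
    vector_similarity_weight: float = 0.3,
    keyword: bool = False,
    top_k: int = 1024,
    rerank_id: Optional[str] = None,
    force_refresh: bool = False,
) -> dict[str, Any]:
    """Assemble a payload keyed by the ragflow_retrieval parameter names."""
    if not question.strip():
        raise ValueError("question must not be empty")
    # Range checks are an assumption; the server may validate differently.
    for label, value in (
        ("similarity_threshold", similarity_threshold),
        ("vector_similarity_weight", vector_similarity_weight),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be in [0, 1]")

    args: dict[str, Any] = {
        "question": question,
        "dataset_ids": dataset_ids or [],
        "document_ids": document_ids or [],
        "page": page,
        "page_size": page_size,
        "similarity_threshold": similarity_threshold,
        "vector_similarity_weight": vector_similarity_weight,
        "keyword": keyword,
        "top_k": top_k,
        "force_refresh": force_refresh,
    }
    # Only include the rerank model when the caller actually selected one.
    if rerank_id is not None:
        args["rerank_id"] = rerank_id
    return args
```

An MCP client would pass a dictionary of this shape as the tool arguments; how the client transports it depends on the MCP SDK in use.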

## 3. Key Subsystems

### 3.1 DeepDoc — Document Understanding

`deepdoc/` contains the document parsing pipeline and the vision subsystem. According to [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), DeepDoc provides OCR, layout recognition (with 10 components — text, title, figure, figure caption, table, table caption, header, footer, reference, equation), and Table Structure Recognition (TSR). The CLI test scripts `deepdoc/vision/t_ocr.py` and `deepdoc/vision/t_recognizer.py` accept `--inputs` and `--output_dir` arguments so developers can verify model behavior on local PDFs and images.

### 3.2 Doc Engine — Pluggable Storage and Retrieval

The Doc Engine in [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md) abstracts over Elasticsearch and Infinity (an in-house database). The engine is configured via `conf/service_conf.yaml` under the `doc_engine` key with sub-keys `es` (hosts, username, password) and `infinity` (uri, postgres_port, db_name). The Go package layout separates `client.go`, `search.go`, `index.go`, and `document.go` per backend so that switching engines only requires changing `doc_engine.type`.

The schema used by the engine — exposed in [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py) — shows the underlying document model with fields such as `content_with_weight`, `content_ltks`, `content_sm_ltks`, `important_kwd`, `question_kwd`, `tag_kwd`, and `available_int`. This schema documents the fields an embedder/retriever must populate.

### 3.3 CLI and Virtual Filesystem

The CLI in [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md) exposes a unified, path-based interface over RAGFlow REST APIs. Paths include `/datasets`, `/datasets/{name}` (documents), and `/datasets/{name}/{doc}` (document info). The implementation uses a provider pattern (`parser/`, `filesystem/`, `engine.go`, `base.go`, `dataset.go`, `file.go`, `utils.go`).
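
To make the path scheme concrete, here is a minimal Python sketch that classifies the three path shapes listed above. The real CLI is written in Go; the `ResolvedPath` type, the `resolve` function, and the `kind` labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedPath:
    kind: str                       # "dataset_list", "document_list", or "document_info"
    dataset: Optional[str] = None
    document: Optional[str] = None


def resolve(path: str) -> ResolvedPath:
    """Map a virtual path onto the resource it addresses."""
    parts = [segment for segment in path.strip("/").split("/") if segment]
    if not parts or parts[0] != "datasets":
        raise ValueError(f"unsupported path: {path!r}")
    if len(parts) == 1:
        return ResolvedPath("dataset_list")                   # /datasets
    if len(parts) == 2:
        return ResolvedPath("document_list", dataset=parts[1])  # /datasets/{name}
    if len(parts) == 3:
        # /datasets/{name}/{doc}
        return ResolvedPath("document_info", dataset=parts[1], document=parts[2])
    raise ValueError(f"path too deep: {path!r}")
```

A provider layer could then dispatch on `kind` to decide which REST endpoint to call.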

A notable subsystem — documented in [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md) — is the **skill management** layer. It supports `install-skill <space> <source>` from local paths, GitHub repos (`github.com/owner/repo/path`), ClawHub (`clawhub://owner/skill-name`), and skills.sh (`skill://skill-name`). The system enforces a defense-in-depth security model: HTTPS source validation, quarantine of downloaded artifacts, regex-based static analysis across 100+ threat patterns in six categories (exfiltration, injection, destructive operations, persistence, network, obfuscation), trust tiers based on source reputation, mandatory `--force` for high-risk installs, and audit logging. Skills must be ≤ 50 MB total, ≤ 5 MB per file, text-only, with lowercase alphanumeric names.
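
The size and naming limits above can be sketched as a small pre-install check. This is a Python illustration under stated assumptions, not the Go implementation: `check_skill` is a hypothetical helper, megabytes are taken as binary (1024 × 1024 bytes), "text-only" is approximated by a UTF-8 decode, and "lowercase alphanumeric" is read strictly as `[a-z0-9]+`.

```python
import re

MAX_TOTAL_BYTES = 50 * 1024 * 1024    # 50 MB per skill (binary MB is an assumption)
MAX_FILE_BYTES = 5 * 1024 * 1024      # 5 MB per file
NAME_RE = re.compile(r"[a-z0-9]+")    # assumed reading of "lowercase alphanumeric"


def check_skill(name: str, files: dict[str, bytes]) -> list[str]:
    """Return a list of violations; an empty list means these checks pass."""
    problems: list[str] = []
    if not NAME_RE.fullmatch(name):
        problems.append(f"invalid skill name: {name!r}")

    total = 0
    for path, data in files.items():
        total += len(data)
        if len(data) > MAX_FILE_BYTES:
            problems.append(f"{path}: exceeds per-file limit")
        try:
            data.decode("utf-8")      # crude text-only check (assumption)
        except UnicodeDecodeError:
            problems.append(f"{path}: not UTF-8 text")

    if total > MAX_TOTAL_BYTES:
        problems.append("skill exceeds total size limit")
    return problems
```

Returning every violation rather than failing on the first one lets an installer report all problems in a single audit-log entry.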

### 3.4 Agent Sandbox

The Agent subsystem in [agent/sandbox/README.md](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/README.md) runs agent code inside containers that are isolated with gVisor. Sandboxed agents execute Python and Node.js workloads via the base images `sandbox-base-python` and `sandbox-base-nodejs`, orchestrated by `sandbox-executor-manager`. The README warns that older executor-manager images shipped Docker CLI 24.x, which cannot talk to newer Docker daemons; rebuilding with Docker CLI 29.1.0+ is required.

### 3.5 Data-Source Connectors

The `tools/` directory hosts pluggable connectors. The Firecrawl integration ([tools/firecrawl/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)) implements single-URL scraping, website crawling, batch processing, multiple output formats, rate limiting, and language detection — surfacing as a selectable data source in the RAGFlow UI.

## 4. Deployment, Operations, and Community Context

### 4.1 Self-Hosting

Per the README, RAGFlow is deployed via Docker Compose with minimum requirements of 4 CPU cores and 16 GB RAM. The project roadmap and community issue #864 ("How to deploy based on kubernetes?") show that Kubernetes deployment through Helm charts or raw manifests is a long-standing user request; Docker Compose remains the primary supported path.

### 4.2 Release Cadence and Roadmap

Releases follow a numbered scheme from v0.9.0 through v0.25.x, with a rolling `nightly` build. Recent milestones include:

| Release | Notable change | Source |
|---|---|---|
| v0.24.0 | Memory APIs/SDK for agents; metadata batch management; ToC renamed to PageIndex; Chat-like Agent management | Community release notes |
| v0.25.0 | Seven ingestion-pipeline templates; new data sources (Seafile, RSS, DingTalk AI Sheet); deletion sync | Community release notes |
| v0.25.4 | Generic RESTful API data-source connector; gpt-5.4-mini/nano support | Community release notes |
| v0.25.5 | Local & SSH providers in admin panel; dataset search accelerated by roughly 50–100% | Community release notes |
| v0.25.6 | Browser component for autonomous web navigation; Ψ-RAG (AHC) mode for RAPTOR | Community release notes |

### 4.3 Known Type and API Issues

Community issue #15714 reports a Go-side `tenant_rerank_id` type mismatch (`*string` vs. `*int`) in `service.RetrievalTestRequest` and `SearchBotRetrievalTestRequest`, illustrating that the Go ↔ Python API contract remains a focus area for engineering work.

## See Also

- DeepDoc — Document Understanding
- Doc Engine — Storage and Retrieval
- Agent Sandbox — Secure Execution
- MCP Server — Tool Integration
- Data Sources and Connectors

---

<a id='page-2'></a>

## Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Agent System, Tools, and Workflow Orchestration](#page-3), [Deployment, Configuration, Administration, and Model Integration](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [tools/es-to-oceanbase-migration/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)
</details>

# Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge

## Overview

The Core RAG Engine is the heart of RAGFlow, an open-source Retrieval-Augmented Generation engine described in [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md). It implements the full pipeline from raw unstructured documents to grounded, citation-backed LLM responses. The engine is split into four collaborating subsystems:

1. **Parsing (DeepDoc)** — turns raw bytes (PDF, DOCX, images, slides) into structured layout-aware text plus tables, figures, and equations.
2. **Chunking & Knowledge Structuring** — segments parsed content into explainable chunks and indexes them into a multi-field schema.
3. **Retrieval** — performs hybrid (vector + keyword) recall with optional rerank across one or more datasets.
4. **Document Engine / Storage** — persists chunks, vectors, and metadata in pluggable backends (Elasticsearch or Infinity).

The following diagram illustrates how a document flows through these subsystems, from ingestion to a grounded, cited answer.

```mermaid
flowchart LR
    A[Unstructured Document] --> B[DeepDoc Parser<br/>OCR / Layout / TSR]
    B --> C[Template-based Chunker]
    C --> D[Doc Engine<br/>Elasticsearch or Infinity]
    D --> E[Hybrid Retrieval<br/>vector + keyword]
    E --> F[Rerank Model]
    F --> G[LLM with Citations]
```

Source: [README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/README.md), [deepdoc/README.md:1-60](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), [internal/engine/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)

---

## Parsing (DeepDoc)

RAGFlow's parsing layer is implemented as the `DeepDoc` module, which has two cooperating sub-modules: **Vision** and **Parser**.

### Vision

The vision subsystem performs OCR, layout recognition, and Table Structure Recognition (TSR) on images and PDFs. As described in [deepdoc/README.md:1-40](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), it recognizes 10 layout components — Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation — and decides whether successive text parts should be merged or whether a region must be handed off to TSR.

Vision is exposed through two test entry points:

```bash
python deepdoc/vision/t_ocr.py --inputs path/to/docs --output_dir ./ocr_outputs
python deepdoc/vision/t_recognizer.py --inputs path/to/docs --threshold 0.2 --mode layout --output_dir ./out
```

Source: [deepdoc/README.md:1-80](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)

### Parser

The parser consumes vision output and produces a clean, layout-aware representation. According to the release notes, **v0.25.1** added the [OpenDataLoader](https://github.com/opendataloader-project/opendataloader-pdf) PDF backend and **v0.22.0** integrated MinerU as an additional document parser. These plug-ins share the same parser interface, so users can swap backends without changing downstream pipelines.

---

## Chunking and Knowledge Structure

After parsing, RAGFlow applies **template-based chunking** — described in [README.md:30-40](https://github.com/infiniflow/ragflow/blob/main/README.md) as "Intelligent and explainable … plenty of template options to choose from." Each chunk is materialized as a multi-field document that supports vector recall, keyword recall, and metadata filtering.

The schema captured in the OceanBase migration tool mirrors the production schema used by the Doc Engine (see [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)):

| Field | Type | Purpose |
|---|---|---|
| `title_tks` | TEXT | Tokenized title for keyword search |
| `content_with_weight` | LONGTEXT | Original chunk content |
| `content_ltks` | LONGTEXT | Long-text token stream for keyword recall |
| `content_sm_ltks` | LONGTEXT | Fine-grained token stream |
| `important_kwd` | ARRAY(String) | Extracted keywords |
| `important_tks` | TEXT | Tokenized keywords |
| `question_kwd` | ARRAY(String(1024)) | Synthesized questions per chunk (used for QA-style recall) |
| `tag_kwd` | ARRAY(String(256)) | User/system tags |
| `available_int` | Integer | Soft-delete flag (0 = unavailable) |
| `create_time` | TIMESTAMP | Ingestion timestamp |

The presence of both long-text and fine-grained token streams, plus question-style keywords, is what enables RAGFlow's "find a needle in a haystack" behavior advertised in [README.md:20-30](https://github.com/infiniflow/ragflow/blob/main/README.md).

Source: [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)
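
For orientation, the table above can be read as the following record shape. This Python sketch is illustrative only: `ChunkRecord` and `make_chunk` are hypothetical names, whitespace splitting stands in for the real tokenizers (which are not shown in this manual), and the value `1` for an available chunk is an assumption based on the "0 = unavailable" note.

```python
import time
from typing import TypedDict


class ChunkRecord(TypedDict):
    title_tks: str
    content_with_weight: str
    content_ltks: str
    content_sm_ltks: str
    important_kwd: list[str]
    important_tks: str
    question_kwd: list[str]
    tag_kwd: list[str]
    available_int: int
    create_time: float


def make_chunk(title: str, content: str, keywords: list[str],
               questions: list[str], tags: list[str]) -> ChunkRecord:
    """Populate every field a retriever is expected to find on a chunk."""
    # Placeholder tokenization: lowercase and split on whitespace.
    tokens = " ".join(content.lower().split())
    return ChunkRecord(
        title_tks=" ".join(title.lower().split()),
        content_with_weight=content,     # original text, kept verbatim
        content_ltks=tokens,
        content_sm_ltks=tokens,          # a real pipeline would tokenize more finely
        important_kwd=keywords,
        important_tks=" ".join(k.lower() for k in keywords),
        question_kwd=questions,
        tag_kwd=tags,
        available_int=1,                 # assumed "available" value
        create_time=time.time(),
    )
```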

---

## Retrieval

The retrieval layer exposes a unified interface across REST APIs, the MCP server, and the CLI. All three surfaces accept the same conceptual parameters.

The MCP tool `ragflow_retrieval` defined in [mcp/server/server.py:1-60](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) accepts:

- `dataset_ids` and `document_ids` — scope of search.
- `question` — natural-language query.
- `page`, `page_size` — pagination.
- `similarity_threshold` (default `0.2`) — minimum vector similarity.
- `vector_similarity_weight` (default `0.3`) — balance between vector and keyword scores.
- `keyword` (bool) — toggle pure keyword recall.
- `top_k` (default `1024`) — recall depth before rerank.
- `rerank_id` — optional rerank model identifier.
- `force_refresh` — bypass caches.
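
One plausible reading of how `vector_similarity_weight`, `similarity_threshold`, and `top_k` interact is a weighted sum followed by a cut-off and truncation. The Python sketch below works under that assumption; the `fuse` function is hypothetical, and the choice to apply the threshold to the fused score (rather than to the vector score alone) is also an assumption, so the actual scoring in RAGFlow may differ.

```python
def fuse(
    candidates: list[tuple[str, float, float]],
    vector_similarity_weight: float = 0.3,
    similarity_threshold: float = 0.2,
    top_k: int = 1024,
) -> list[tuple[str, float]]:
    """Combine per-chunk scores.

    candidates holds (chunk_id, vector_similarity, keyword_similarity),
    with both similarities expected in [0, 1].
    """
    w = vector_similarity_weight
    # Weighted sum: w goes to the vector score, the remainder to the keyword score.
    scored = [(cid, w * vec + (1.0 - w) * kw) for cid, vec, kw in candidates]
    # Drop weak matches, then keep the best top_k for any later rerank step.
    kept = [(cid, score) for cid, score in scored if score >= similarity_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]
```

With the default weight of `0.3`, a chunk with strong keyword overlap can outrank one with a higher vector similarity, which is the trade-off the weight is meant to expose.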

The CLI mirrors the same knobs in [internal/cli/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md):

```bash
search "RAG" datasets/kb1 --output plain -n 20
SEARCH 'AI' ON DATASETS 'kb1' WITH top_k 1024 similarity_threshold 0.0 vector_similarity_weight 0.3 keyword true
```

The v0.25.5 release notes state that the dataset search path was accelerated by 50–100% by removing an expensive vector-fetch + rerank-similarity step, which directly affects how these parameters behave under load.

> **Community note:** [Issue #15714](https://github.com/infiniflow/ragflow/issues/15714) reports a Go-side type mismatch on `TenantRerankID` (declared `*string`, expected `*int`) inside the retrieval test path. Operators wiring custom rerank IDs into the retrieval test pipeline should validate their schema version. Separately, [Issue #4406](https://github.com/infiniflow/ragflow/issues/4406) requests rerank support for Ollama, which is currently not exposed as a `rerank_id` provider.

Source: [mcp/server/server.py:1-80](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py), [internal/cli/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)

---

## Document Engine & Storage

The retrieval layer is decoupled from storage by a `DocEngine` interface defined in [internal/engine/README.md:1-40](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md). Two implementations ship today:

- **`elasticsearch`** — fully functional. Configured under `doc_engine.es` in `conf/service_conf.yaml` with hosts and basic-auth credentials.
- **`infinity`** — placeholder awaiting the official Infinity Go SDK; only the directory skeleton exists.

The factory in `engine_factory.go` selects an engine by `doc_engine.type`, and `cmd/server_main.go` calls `engine.Init` once at startup so every retrieval path shares one global instance (`global.go`).
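
The factory-plus-global pattern described above can be sketched in Python for illustration; the real implementation is Go. All names here (`DocEngine`, `ElasticsearchEngine`, `create_engine`, `init`, `get`) are hypothetical, the `search` method is a stub, and the configuration keys follow the `doc_engine` layout described in this manual.

```python
from typing import Optional, Protocol


class DocEngine(Protocol):
    def search(self, query: str) -> list[str]: ...


class ElasticsearchEngine:
    def __init__(self, hosts: str, username: str, password: str) -> None:
        self.hosts = hosts
        self.username = username
        self.password = password

    def search(self, query: str) -> list[str]:
        # Stub: a real backend would query Elasticsearch here.
        return [f"es:{query}"]


def create_engine(conf: dict) -> DocEngine:
    """Select a backend from doc_engine.type."""
    engine_conf = conf["doc_engine"]
    kind = engine_conf["type"]
    if kind == "elasticsearch":
        return ElasticsearchEngine(**engine_conf["es"])
    if kind == "infinity":
        raise NotImplementedError("infinity backend is a placeholder")
    raise ValueError(f"unknown doc_engine.type: {kind!r}")


_engine: Optional[DocEngine] = None


def init(conf: dict) -> None:
    """Create the shared engine once; later calls are no-ops."""
    global _engine
    if _engine is None:
        _engine = create_engine(conf)


def get() -> DocEngine:
    if _engine is None:
        raise RuntimeError("engine not initialised; call init() first")
    return _engine
```

Because callers only see the `DocEngine` protocol, swapping backends changes the configuration value rather than any retrieval code.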

This design is what enables the OceanBase migration tooling ([tools/es-to-oceanbase-migration/README.md:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)) to re-host the same RAGFlow schema on a third backend without touching parsers or retrieval code.

Source: [internal/engine/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md), [tools/es-to-oceanbase-migration/README.md:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)

---

## See Also

- [DeepDoc Vision & Parser](./deepdoc-vision-and-parser.md)
- [Document Engine Backends (Elasticsearch / Infinity)](./doc-engine-backends.md)
- [Retrieval APIs (REST, MCP, CLI)](./retrieval-apis.md)
- [Knowledge Schema & Indexing](./knowledge-schema.md)

---

<a id='page-3'></a>

## Agent System, Tools, and Workflow Orchestration

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Deployment, Configuration, Administration, and Model Integration](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agent/canvas.py](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py)
- [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)
- [agent/component/begin.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/begin.py)
- [agent/component/llm.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/llm.py)
- `agent/component/retrieval` (via `tools/retrieval.py`)
- `agent/component/code_exec` (referenced via `tools/code_exec.py`)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [admin/client/README.md](https://github.com/infiniflow/ragflow/blob/main/admin/client/README.md)
</details>

# Agent System, Tools, and Workflow Orchestration

## Overview and Purpose

RAGFlow's agent system fuses retrieval-augmented generation with agentic capabilities to deliver a configurable context layer for LLM applications. The runtime is assembled from modular components that can be composed into workflows for both personal and enterprise deployments (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)). Pre-built agent templates and a converged context engine are intended to help developers turn complex data into production-ready AI systems (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)).

Recent releases have progressively expanded the agent surface area:

- **Memory for AI agents** (added 2025-12-26)
- **Agentic workflow and MCP integration** (added 2025-08-01)
- **Python/JavaScript code executor component** (added 2025-05-23)
- **Browser component for autonomous web navigation** (added in v0.25.6, May 2026)
- **Chat-like Agent conversation management** (v0.24.0)

## Component-Based Architecture

The agent runtime follows a component-based design in which each node in a workflow is implemented as a self-contained class inheriting from a common base. The canvas is responsible for assembling components, routing data between them, and orchestrating execution order (Source: [agent/canvas.py](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py)). All components share a unified interface defined in the base class, covering parameter validation, execution lifecycle, and canvas serialization (Source: [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)).

### Core Component Types

- **`Begin`**: Defines initial input and conversation start parameters; the entry point of every workflow (Source: [agent/component/begin.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/begin.py)).
- **`LLM`**: Performs language model inference with configurable prompts, model parameters, and tool bindings (Source: [agent/component/llm.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/llm.py)).
- **`Retrieval`**: Performs RAG retrieval against datasets and documents; the implementation bridges to the shared `tools/retrieval.py` logic (Source: `agent/component/retrieval`).
- **`Code Exec`**: Executes Python or JavaScript snippets in a sandboxed environment to support computational reasoning (Source: `tools/code_exec.py`).
- **`Base`**: Foundation class providing the common contract (input/output schema, `invoke` lifecycle, canvas representation) that all other components extend (Source: [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)).
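
The shared-contract idea can be illustrated with a toy Python sketch. Every class and function below (`SketchComponent`, `SketchBegin`, `SketchUppercase`, `run_linear`) is hypothetical and deliberately simplified; it does not reproduce the classes in `agent/component/`, and only the notion of a common `invoke` lifecycle with validation is taken from the description above.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchComponent(ABC):
    """Hypothetical stand-in for the shared component contract."""

    name: str = "component"

    def invoke(self, **inputs: Any) -> dict[str, Any]:
        # Common lifecycle: validate first, then execute.
        self.validate(inputs)
        return self._run(inputs)

    def validate(self, inputs: dict[str, Any]) -> None:
        """Override to reject bad parameters before execution."""

    @abstractmethod
    def _run(self, inputs: dict[str, Any]) -> dict[str, Any]: ...


class SketchBegin(SketchComponent):
    name = "begin"

    def validate(self, inputs: dict[str, Any]) -> None:
        if "query" not in inputs:
            raise ValueError("begin requires a 'query' input")

    def _run(self, inputs: dict[str, Any]) -> dict[str, Any]:
        return {"query": inputs["query"].strip()}


class SketchUppercase(SketchComponent):
    """Toy downstream node standing in for an LLM or retrieval step."""

    name = "uppercase"

    def _run(self, inputs: dict[str, Any]) -> dict[str, Any]:
        return {"answer": inputs["query"].upper()}


def run_linear(components: list[SketchComponent], **initial: Any) -> dict[str, Any]:
    """Feed each component's output into the next along a linear path."""
    state = dict(initial)
    for component in components:
        state.update(component.invoke(**state))
    return state
```

A real canvas also handles branching, iteration, and serialization, which this linear runner leaves out.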

### Workflow Execution Flow

```mermaid
flowchart LR
    A[User Input] --> B[Begin Component]
    B --> C{Route / Branch}
    C -->|Retrieval needed| D[Retrieval Component]
    C -->|Compute needed| E[Code Exec Component]
    C -->|Reasoning needed| F[LLM Component]
    D --> F
    E --> F
    F --> G[Output / Tool Call]
    G -.iterates.-> C
```

## MCP (Model Context Protocol) Integration

RAGFlow exposes its retrieval layer as MCP tools so that external agent clients can invoke retrieval against managed datasets. The MCP server registers a `ragflow_retrieval` tool that accepts `document_ids`, `dataset_ids`, `question`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, `rerank_id`, `force_refresh`, and pagination parameters (`page`, `page_size`) (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)).

The server runs as a Starlette ASGI application in either `HOST` mode or standalone mode, gated by an `AuthMiddleware` that validates the API key on every request (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)). It fetches accessible datasets via the `/datasets` REST endpoint and paginates through all results when resolving the full set of dataset IDs for MCP retrieval fallback (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)).
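
The pagination step can be sketched as a loop that keeps requesting pages until a short page arrives. This Python example is a sketch under assumptions: `collect_dataset_ids` is a hypothetical helper, the page fetcher is injected so no HTTP client is implied, and the `id` key on each dataset record is assumed.

```python
from typing import Callable


def collect_dataset_ids(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 100,
) -> list[str]:
    """Gather IDs across pages; a page shorter than page_size ends the loop."""
    ids: list[str] = []
    page = 1
    while True:
        items = fetch_page(page, page_size)
        ids.extend(item["id"] for item in items)
        if len(items) < page_size:
            return ids
        page += 1
```

Injecting `fetch_page` keeps the loop testable without a running server; in practice the callable would wrap the `/datasets` request.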

## Tools, Skills, and Memory Management

Beyond built-in components, RAGFlow supports a pluggable skills and memory system exposed through the CLI filesystem (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)). The CLI parses commands using a recursive descent parser (`parser/parser.go`) over a lexer, and routes them to a virtual filesystem backed by providers (`dataset.go`, `file.go`) that wrap RAGFlow's RESTful APIs (Source: [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)).

### Skill Sources

The `install-skill` command accepts skills from local paths, GitHub URLs, ClawHub references, or skills.sh identifiers, then validates and stores them in an isolated space (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)).
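
A minimal Python sketch of how a source string might be routed to one of these four origins follows. The `clawhub://`, `skill://`, and `github.com/` prefixes come from the CLI filesystem documentation summarized in this manual; the `classify_source` helper, the acceptance of an `https://github.com/` prefix, and the fallback to "local" are assumptions.

```python
def classify_source(source: str) -> str:
    """Return which installer branch a skill source string would take."""
    if source.startswith("clawhub://"):
        return "clawhub"
    if source.startswith("skill://"):
        return "skills.sh"
    # Accepting a full https URL for GitHub is an assumption.
    if source.startswith(("github.com/", "https://github.com/")):
        return "github"
    # Anything else is treated as a local filesystem path (assumption).
    return "local"
```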

### Security Validation

The skill manager applies defense-in-depth checks before installation:

- HTTPS source validation with SSL certificate verification
- Quarantine isolation of downloaded skills prior to install
- Static analysis scanning 100+ threat patterns across six categories: Exfiltration, Injection, Destructive, Persistence, Network, and Obfuscation
- Trust tiers based on source reputation
- Explicit `--force` user confirmation for high-risk installs
- Audit logging of every installation with its scan results (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md))
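
The static-analysis step can be pictured as a line-by-line regex scan grouped by category. The Python sketch below is illustrative only: the three patterns are invented examples, not entries from the actual pattern set, and `THREAT_PATTERNS` and `scan` are hypothetical names.

```python
import re

# Invented sample patterns; the real scanner's pattern set is not shown here.
THREAT_PATTERNS: dict[str, list[re.Pattern[str]]] = {
    "exfiltration": [re.compile(r"\bcurl\b.*\s(-d|--data)\b")],
    "destructive": [re.compile(r"\brm\s+-rf\s+/")],
    "obfuscation": [re.compile(r"\bbase64\s+(-d|--decode)\b")],
}


def scan(text: str) -> list[tuple[int, str]]:
    """Return (line_number, category) for every line that matches a pattern."""
    findings: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, patterns in THREAT_PATTERNS.items():
            if any(pattern.search(line) for pattern in patterns):
                findings.append((lineno, category))
    return findings
```

An installer could refuse to proceed without `--force` whenever `scan` returns findings, and write the findings to the audit log either way.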

### Memory System

Memory is organized hierarchically into category folders (e.g., `memory/categories/category1`, `category2`) and per-agent memory files for tool and skill usage patterns, supporting retrieval augmentation across long-lived agent sessions (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)).

## Common Failure Modes and Community Notes

- **TenantRerankID type mismatch in Go SDK**: `service.RetrievalTestRequest.TenantRerankID` and `SearchBotRetrievalTestRequest.TenantRerankID` are declared as `*string` but are consumed as `*int` in some retrieval-test code paths, which can surface as runtime errors when invoking the retrieval benchmark (Source: [issue #15714](https://github.com/infiniflow/ragflow/issues/15714)).
- **Empty memory object on startup**: The RAGFlow server previously failed to start when an empty memory object existed; this was fixed in v0.23.1.
- **Memory extraction stability**: When all memory types are selected simultaneously, extraction stability was hardened in v0.23.1.
- **Browser component**: Newly added in v0.25.6, the Browser component enables autonomous web navigation; expect evolving behavior and config knobs (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)).
- **Kubernetes deployment**: Helm charts / raw Kubernetes manifests are not first-class supported; production deployment remains primarily via `docker-compose`.

## See Also

- [Project README](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [MCP Server Source](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [Admin CLI Documentation](https://github.com/infiniflow/ragflow/blob/main/admin/client/README.md)
- [Internal CLI Filesystem](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [DeepDoc Module](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [Firecrawl Integration](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)

---

<a id='page-4'></a>

## Deployment, Configuration, Administration, and Model Integration

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Agent System, Tools, and Workflow Orchestration](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)
- [docker/.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env)
- [docker/docker-compose.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose.yml)
- [docker/docker-compose-base.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose-base.yml)
- [docker/service_conf.yaml.template](https://github.com/infiniflow/ragflow/blob/main/docker/service_conf.yaml.template)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)
</details>

# Deployment, Configuration, Administration, and Model Integration

RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that fuses RAG with Agent capabilities. Operating the system at production scale requires mastering four interrelated concerns: deploying the runtime stack, configuring infrastructure and models, administering tenants and resources, and integrating third-party model providers. This page covers all four areas, drawing on the project's official deployment manifests, engine abstractions, and operator interfaces. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)

## Deployment

### Prerequisites and Runtime Stack

The official deployment path is Docker Compose. The system requires at minimum 4 CPU cores, 16 GB RAM, 50 GB disk, Docker >= 24.0.0 with Docker Compose >= v2.26.1, Python >= 3.13, and (optionally) [gVisor](https://gvisor.dev/docs/user_guide/install/) when the Agent's code executor sandbox is enabled. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)

Before starting, the host kernel parameter `vm.max_map_count` must be set to at least 262144 (Elasticsearch requirement). The README documents how to check and set it via `sysctl -w vm.max_map_count=262144`. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
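
The check described above can be sketched in a few lines. This is a minimal sketch, assuming a Linux host that exposes the setting at `/proc/sys/vm/max_map_count`; the helper names `meets_minimum` and `check_host` are hypothetical and not part of RAGFlow.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # minimum stated in the README for Elasticsearch


def meets_minimum(raw_value: str, minimum: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the kernel setting (as read from procfs) is high enough."""
    return int(raw_value.strip()) >= minimum


def check_host(proc_path: str = "/proc/sys/vm/max_map_count") -> bool:
    """Read the live value on a Linux host; return False if the file is absent."""
    path = Path(proc_path)
    if not path.exists():
        return False
    return meets_minimum(path.read_text())


if __name__ == "__main__":
    if check_host():
        print("vm.max_map_count is sufficient")
    else:
        print("run: sudo sysctl -w vm.max_map_count=262144")
```

Note that `sysctl -w` changes the value only until the next reboot, so a persistent setting has to be added separately.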

### Compose Topology

Two compose files are maintained:

- `docker/docker-compose.yml` brings up the RAGFlow application service on top of a dependency stack.
- `docker/docker-compose-base.yml` provides the dependencies: Elasticsearch (or [Infinity](https://github.com/infiniflow/infinity)), MySQL, MinIO, and Redis.

A legacy `docker-compose-CN-oc9.yml` and a `docker-compose-macos.yml` exist but are not actively maintained. Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)

The high-level deployment topology is:

```mermaid
flowchart LR
  User[User / MCP Client] --> Web[Web Frontend<br/>Vite + React]
  User --> API[RAGFlow API Server<br/>Python + Go]
  API --> MySQL[(MySQL)]
  API --> Redis[(Redis)]
  API --> MinIO[(MinIO)]
  API --> Engine{Doc Engine}
  Engine -->|type=elasticsearch| ES[(Elasticsearch)]
  Engine -->|type=infinity| INF[(Infinity)]
  Web -.->|serves| User
```

### Kubernetes and Cloud

A community-requested Kubernetes deployment path (Helm charts or raw manifests) is tracked in issue [#864](https://github.com/infiniflow/ragflow/issues/864). As of the most recent releases, official Helm support is not yet shipped; the supported path remains Docker Compose on a single host, optionally scaled by externalizing the dependency services. Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)

## Configuration

### Docker Environment Variables

The [`docker/.env`](https://github.com/infiniflow/ragflow/blob/main/docker/.env) file is the primary configuration surface for the container stack. The following variables are documented:

| Variable | Default | Purpose |
|----------|---------|---------|
| `STACK_VERSION` | `8.11.3` | Elasticsearch image version |
| `ES_PORT` | `1200` | Host port exposed for Elasticsearch |
| `ELASTIC_PASSWORD` | — | Elasticsearch bootstrap password |
| `KIBANA_PORT` | — | Host port for the Kibana UI |

Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)
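
To inspect these values before bringing the stack up, a small parser is enough. The sketch below assumes plain `KEY=VALUE` lines with `#` comments and does not handle quoting or inline comments; `parse_env` is a hypothetical helper, and the sample text simply mirrors the defaults listed in the table.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


SAMPLE = """
# Elasticsearch settings (sample values taken from the table)
STACK_VERSION=8.11.3
ES_PORT=1200
"""

config = parse_env(SAMPLE)
print(config["STACK_VERSION"], config["ES_PORT"])
```

In practice the text would come from `Path("docker/.env").read_text()` rather than an inline sample.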

### Service Configuration

`docker/service_conf.yaml.template` is rendered at startup and configures the RAGFlow service. The internal Go engine selects a document store via a `doc_engine.type` key. Two backend values are supported:

- `elasticsearch` — fully implemented, configured with `doc_engine.es.hosts`, `username`, `password`.
- `infinity` — a placeholder backend waiting for the official Infinity Go SDK; configuration keys include `uri`, `postgres_port`, `db_name`.

Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)

Engine selection happens once at process startup. The Go factory in `internal/engine/engine_factory.go` returns a `DocEngine` interface implementation that the rest of the service consumes uniformly for indexing, search, and document operations. Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
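
The factory idea can be illustrated outside Go as well. The following is a Python analogue written for illustration only; it is not the code in `internal/engine/engine_factory.go`, and the class names, the `search` method, and the shape of the config dictionary are assumptions.

```python
from abc import ABC, abstractmethod


class DocEngine(ABC):
    """Hypothetical Python stand-in for the Go DocEngine interface."""

    @abstractmethod
    def search(self, query: str) -> list[str]:
        ...


class ElasticsearchEngine(DocEngine):
    def __init__(self, hosts: str) -> None:
        self.hosts = hosts

    def search(self, query: str) -> list[str]:
        return []  # a real backend would query Elasticsearch here


class InfinityEngine(DocEngine):
    def search(self, query: str) -> list[str]:
        # Mirrors the placeholder status described above.
        raise NotImplementedError("placeholder backend")


def create_engine(config: dict) -> DocEngine:
    """Pick a backend once, based on doc_engine.type."""
    doc_engine = config["doc_engine"]
    engine_type = doc_engine["type"]
    if engine_type == "elasticsearch":
        return ElasticsearchEngine(hosts=doc_engine["es"]["hosts"])
    if engine_type == "infinity":
        return InfinityEngine()
    raise ValueError(f"unsupported doc_engine.type: {engine_type!r}")
```

Callers hold only a `DocEngine`, so switching backends is a configuration change rather than a code change. Rejecting unknown types at startup surfaces a misconfiguration before any request is served.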

### Parsers and OCR

[`deepdoc/README.md`](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md) introduces *Deep*Doc, RAGFlow's vision and parser subsystem. The vision pipeline provides OCR, layout recognition (10 base layout components: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation), and Table Structure Recognition (TSR). Operators can smoke-test these on local files using `python deepdoc/vision/t_ocr.py` and `python deepdoc/vision/t_recognizer.py`. Source: [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)

## Administration

### Admin Panel

Release v0.25.5 introduced local and SSH providers in the admin panel (PR #15039), allowing administrators to manage users, datasets, and storage backends directly from the web console. Source: [GitHub release v0.25.5](https://github.com/infiniflow/ragflow/releases/tag/v0.25.5)

### CLI and Virtual Filesystem

The Go CLI under `internal/cli` exposes a virtual filesystem layered over RAGFlow's RESTful APIs. The design follows three principles: (1) no server-side changes, (2) a provider pattern with a common `Provider` interface in `filesystem/base.go`, and (3) unified commands (`ls`, `search`, `cat`, `mkdir`) over virtual paths. Supported paths include `/datasets`, `/datasets/{name}` (lists documents), and `/datasets/{name}/{doc}` (fetches a single document). Source: [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
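
How the three supported path shapes map to operations can be sketched as follows. This is an illustrative Python sketch rather than the Go implementation; `ResolvedPath` and `resolve` are hypothetical names, and rejecting every path outside `/datasets` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedPath:
    kind: str                       # "dataset_list", "dataset", or "document"
    dataset: Optional[str] = None
    document: Optional[str] = None


def resolve(path: str) -> ResolvedPath:
    """Classify a virtual path into one of the three supported shapes."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "datasets" or len(parts) > 3:
        raise ValueError(f"unsupported virtual path: {path!r}")
    if len(parts) == 1:
        return ResolvedPath("dataset_list")                   # /datasets
    if len(parts) == 2:
        return ResolvedPath("dataset", dataset=parts[1])      # /datasets/{name}
    return ResolvedPath("document", dataset=parts[1], document=parts[2])
```

A command such as `ls` would then dispatch on `kind` and call the matching REST endpoint, which is what keeps the server side unchanged.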

### Skill Management

The CLI's `install-skill` command supports four source types: local paths, `github.com/owner/repo/path`, `clawhub://owner/skill-name` (ClawHub), and `skill://skill-name` (skills.sh). A defense-in-depth security architecture validates sources over HTTPS with SSL verification, quarantines downloads, runs regex-based static analysis against 100+ threat patterns (exfiltration, injection, destructive operations, persistence, network, obfuscation), and applies trust tiers. Limits: total skill size <= 50 MB, individual file <= 5 MB, text files only, lowercase alphanumeric names with hyphens/underscores. Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
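
The stated limits translate into a short pre-flight check. The sketch below is an assumption-laden illustration, not the CLI's validator: it reads MB as MiB, guesses a name pattern from the prose, and leaves out the text-only and static-analysis checks. `check_skill` is a hypothetical helper.

```python
import re

MAX_TOTAL_BYTES = 50 * 1024 * 1024  # total skill size limit (MB read as MiB: an assumption)
MAX_FILE_BYTES = 5 * 1024 * 1024    # per-file limit, same assumption
# Assumed shape for "lowercase alphanumeric names with hyphens/underscores".
NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9_-]*")


def check_skill(name: str, file_sizes: dict[str, int]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the skill passes."""
    problems: list[str] = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"invalid name: {name!r}")
    for path, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"file too large: {path}")
    if sum(file_sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("skill exceeds total size limit")
    return problems
```

Collecting all problems instead of stopping at the first one gives the operator a complete report in a single run.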

### Frontend Build

The web UI is a Vite + React project. Build and development entry points live in [`web/package.json`](https://github.com/infiniflow/ragflow/blob/main/web/package.json): `npm run dev` starts the dev server, `npm run build` produces a production bundle, and `npm run type-check` validates TypeScript. The schema editor component in `web/src/components/jsonjoy-builder/lib/schema-editor.ts` enforces JSONSchema field-name validation against the pattern `^[a-zA-Z_$][a-zA-Z0-9_$]*$`. Source: [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)
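
To see which names the field-name pattern accepts, the same expression can be exercised in Python. This is a sketch for illustration; `is_valid_field_name` is a hypothetical helper, and `fullmatch` is used so that a trailing newline is rejected, which Python's `$` alone would allow.

```python
import re

# Same pattern as quoted above: a letter, '_' or '$' first, then letters, digits, '_' or '$'.
FIELD_NAME = re.compile(r"^[a-zA-Z_$][a-zA-Z0-9_$]*$")


def is_valid_field_name(name: str) -> bool:
    """Return True when the whole string is an identifier-style field name."""
    return FIELD_NAME.fullmatch(name) is not None


for candidate in ["user_id", "$ref", "_x1", "1abc", "first-name", ""]:
    print(repr(candidate), is_valid_field_name(candidate))
```

Names that start with a digit or contain hyphens, spaces, or dots are rejected, so a property such as `first-name` would need to be renamed, for example to `first_name`.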

## Model Integration

### LLM and Embedding Providers

RAGFlow ships with a model registry supporting OpenAI-compatible APIs. Release v0.25.4 added `gpt-5.4-mini` and `gpt-5.4-nano` to the OpenAI model list, and release v0.25.6 extended the Agent with a new Browser component that lets models navigate and interact with web pages autonomously (PR #14888). DeepSeek v4 support was added on 2026-04-24, and Gemini 3 Pro support on 2025-11-19. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md) and [GitHub release v0.25.4](https://github.com/infiniflow/ragflow/releases/tag/v0.25.4)

### Rerank and Retrieval

A community feature request to add Ollama rerank integration is tracked in issue [#4406](https://github.com/infiniflow/ragflow/issues/4406). Rerank models are referenced by ID through the `rerank_id` parameter on retrieval calls. The MCP server's `ragflow_retrieval` tool accepts `rerank_id`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, and `force_refresh` arguments that all flow into the unified retrieval service. Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
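
The interplay of `vector_similarity_weight` and `similarity_threshold` can be made concrete with a small calculation. The sketch below assumes a weighted-sum interpretation, in which the keyword side receives `1 - vector_similarity_weight`; that formula, the function names, and the tuple layout are assumptions for illustration and are not taken from `server.py`.

```python
def hybrid_score(vector_sim: float, keyword_sim: float, vector_similarity_weight: float) -> float:
    """Blend the two similarities; the keyword side gets the remaining weight (assumed)."""
    if not 0.0 <= vector_similarity_weight <= 1.0:
        raise ValueError("vector_similarity_weight must be within [0, 1]")
    return vector_similarity_weight * vector_sim + (1.0 - vector_similarity_weight) * keyword_sim


def keep_above_threshold(
    chunks: list[tuple[str, float, float]],
    similarity_threshold: float,
    vector_similarity_weight: float,
) -> list[str]:
    """Return chunk ids whose blended score meets the threshold, best first."""
    scored = [
        (hybrid_score(vec, kw, vector_similarity_weight), chunk_id)
        for chunk_id, vec, kw in chunks
    ]
    kept = [(score, chunk_id) for score, chunk_id in scored if score >= similarity_threshold]
    return [chunk_id for score, chunk_id in sorted(kept, reverse=True)]


# (chunk id, vector similarity, keyword similarity): made-up numbers
chunks = [("a", 0.9, 0.5), ("b", 0.2, 0.1), ("c", 0.6, 0.9)]
print(keep_above_threshold(chunks, similarity_threshold=0.2, vector_similarity_weight=0.3))
```

Under this interpretation, a low weight favours exact keyword overlap, while a weight near 1 makes ranking depend almost entirely on embedding similarity.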

Release v0.25.5 accelerated the dataset search path by a reported 50–100% by removing an expensive vector fetch and rerank similarity computation from the hot path (PR #14970). Source: [GitHub release v0.25.5](https://github.com/infiniflow/ragflow/releases/tag/v0.25.5)

### MCP Server

[`mcp/server/server.py`](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) exposes RAGFlow as a Model Context Protocol server. Two tool entry points are registered: `list_datasets` (paginates `/datasets` and returns newline-delimited JSON) and `ragflow_retrieval` (performs cross-dataset retrieval with the parameters above). When `MODE == HOST`, the server installs an `AuthMiddleware` to enforce API key authentication. Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
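
Newline-delimited JSON, the format attributed to `list_datasets`, is straightforward to produce and consume. The sketch below shows the format only; the `id` and `name` fields and both helper names are assumptions, not the server's actual payload.

```python
import json


def to_ndjson(datasets: list[dict]) -> str:
    """Serialize one JSON object per line."""
    return "\n".join(json.dumps(d, ensure_ascii=False) for d in datasets)


def from_ndjson(text: str) -> list[dict]:
    """Parse newline-delimited JSON, ignoring empty lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical dataset records for illustration.
datasets = [{"id": "1", "name": "manuals"}, {"id": "2", "name": "contracts"}]
payload = to_ndjson(datasets)
print(payload)
```

A client can process each line as it arrives, without parsing one large JSON array, which suits paginated listings.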

### Document Parsers

The v0.25.0 release added 7 built-in ingestion pipeline templates aligned with RAGFlow's native parsers, and v0.25.1 added the [OpenDataLoader](https://github.com/opendataloader-project/opendataloader-pdf) PDF backend. A community request to integrate [Docling](https://github.com/DS4SD/docling) as an additional parser is tracked in issue [#3443](https://github.com/infiniflow/ragflow/issues/3443). For users migrating from Elasticsearch to OceanBase, the schema mapping in `tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py` documents how chunk fields (content, tokens, keywords, tags, PageRank) translate to OceanBase column types. Source: [GitHub release v0.25.1](https://github.com/infiniflow/ragflow/releases/tag/v0.25.1)

## Common Failure Modes

1. **`vm.max_map_count` too low** — Elasticsearch container fails to start. Mitigate with `sudo sysctl -w vm.max_map_count=262144`. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
2. **Infinity backend selected without SDK** — the Infinity implementation is a placeholder; only Elasticsearch is fully functional. Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
3. **Type mismatch on rerank IDs** — the Go `service.RetrievalTestRequest.TenantRerankID` field has a known `*string` vs `*int` mismatch with retrieval tests (issue [#15714](https://github.com/infiniflow/ragflow/issues/15714)). Source: [issue #15714](https://github.com/infiniflow/ragflow/issues/15714)
4. **Skill installation blocked** — over-size archives, binary files, or suspicious patterns are rejected by the static analyzer. Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)

## See Also

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md) — Project overview and quickstart
- [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md) — Full Docker deployment reference
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md) — Vision and parser subsystem
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md) — Doc engine abstraction
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md) — CLI and virtual filesystem
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) — MCP server reference

---


## Pitfall Log

Project: infiniflow/ragflow

Summary: 20 structured pitfall items were found, 2 of them high severity. Top priority: the configuration risk in item 1, which requires verification.

## 1. Configuration risk - Configuration risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_7154f1df73d9467aa3d747477287e392 | https://github.com/infiniflow/ragflow/issues/15714

## 2. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_8d8565f17f754fe3a6f7ad1f3b4be33d | https://github.com/infiniflow/ragflow/issues/15525

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_408303dfb4fb43a781b7dc14724082b9 | https://github.com/infiniflow/ragflow/issues/15751

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.23.1
- User impact: Upgrade or migration may change expected behavior: v0.23.1
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.23.1. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_38f958bf7c9ad232f6049339e1321be7 | https://github.com/infiniflow/ragflow/releases/tag/v0.23.1

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.24.0
- User impact: Upgrade or migration may change expected behavior: v0.24.0
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.24.0. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_0ca2840fc49d848176cce456864aafa3 | https://github.com/infiniflow/ragflow/releases/tag/v0.24.0

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.0
- User impact: Upgrade or migration may change expected behavior: v0.25.0
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.0. Context: Observed when using python, docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_7154c897fed0437e0ca58d1f443b8d97 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.0

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.1
- User impact: Upgrade or migration may change expected behavior: v0.25.1
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.1. Context: Observed during version upgrade or migration.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_12ff69cd8f090474bcc8768ed255e16a | https://github.com/infiniflow/ragflow/releases/tag/v0.25.1

## 8. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.2
- User impact: Upgrade or migration may change expected behavior: v0.25.2
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.2. Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_7f58552889f29288945720d487e8fbb7 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.2

## 9. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.3
- User impact: Upgrade or migration may change expected behavior: v0.25.3
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.3. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_14af37b03860695c40160c241d23e5b1 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.3

## 10. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.4
- User impact: Upgrade or migration may change expected behavior: v0.25.4
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.4. Context: Source discussion did not expose a precise runtime context.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_026d052ebdc28ef87ab4152d11b96502 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.4

## 11. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.5
- User impact: Upgrade or migration may change expected behavior: v0.25.5
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.5. Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_57690c932d554b7b2b477b7e4564f3f5 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.5

## 12. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.6
- User impact: Upgrade or migration may change expected behavior: v0.25.6
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.6. Context: Observed when using python, cuda
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_e1befbd52e751833a5dab041663c4bf0 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.6

## 13. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 15. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: `no_demo`
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: `no_demo`
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 17. Capability evidence risk - Capability evidence risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this capability risk before relying on the project: [Go] tenant_rerank_id type mismatch: *string should be *int — retrieval_test
- User impact: Developers may hit a documented source-backed failure mode: [Go] tenant_rerank_id type mismatch: *string should be *int — retrieval_test
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: [Go] tenant_rerank_id type mismatch: *string should be *int — retrieval_test. Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_a84cfda4f8786aaff3acbf0072fb4c08 | https://github.com/infiniflow/ragflow/issues/15714

## 18. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `issue_or_pr_quality=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `release_recency=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 20. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this maintenance risk before relying on the project: nightly
- User impact: Upgrade or migration may change expected behavior: nightly
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: nightly. Context: Source discussion did not expose a precise runtime context.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_57bc13a92eaec92fbf9f0b315ce0baec | https://github.com/infiniflow/ragflow/releases/tag/nightly

<!-- canonical_name: infiniflow/ragflow; human_manual_source: deepwiki_human_wiki -->
