# https://github.com/infiniflow/ragflow Project Manual

Generated at: 2026-06-08 05:17:13 UTC

## Table of Contents

- [Project Overview and System Architecture](#page-1)
- [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2)
- [Agent System, Tools, and Workflow Orchestration](#page-3)
- [Deployment, Configuration, Administration, and Model Integration](#page-4)

## Project Overview and System Architecture

### Related Pages

Related topics: [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Agent System, Tools, and Workflow Orchestration](#page-3), [Deployment, Configuration, Administration, and Model Integration](#page-4)

Related Source Files

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [agent/sandbox/README.md](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/README.md)
- [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)
- [tools/firecrawl/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)
- [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)

# Project Overview and System Architecture

## 1. Purpose and Scope

RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that fuses RAG with Agent capabilities to create a context layer for large language models. According to the project README, RAGFlow is "powered by a converged context engine and pre-built agent templates" and is intended for "enterprises of any scale", ranging from individual developers running it locally to large organizations operating multi-tenant deployments. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md).

The project's stated design goals, paraphrased from the README, include:

- **Quality in, quality out**: Deep document understanding for extracting knowledge from unstructured data, including formats such as Word, slides, Excel, txt, images, scanned copies, structured data, and web pages. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md).
- **Template-based chunking** that is explainable and configurable per use case.
- **Grounded citations** with visualization of text chunking to support traceable answers and reduce hallucinations.
- **Heterogeneous data-source compatibility** through an extensible connector layer.
- **Configurable LLMs and embedding models** with multiple recall fused with re-ranking.

The community roadmap (tracking issues #4214 and #162) shows the project has progressed through v0.9.0, v0.10.0, v0.21.0, v0.22.0, v0.23.0, v0.24.0, and the v0.25.x line, with active demand for features such as Text2SQL, TTS, Kubernetes deployment (issue #864), Ollama rerank integration (issue #4406), and the Docling parser (issue #3443).

## 2. Architectural Layers

RAGFlow follows a layered architecture in which the Python API, Go services, and a Vite-based web frontend collaborate through clearly defined interfaces. The diagram below summarizes the high-level data flow from ingestion to retrieval and agent execution.

```mermaid
flowchart LR
    UI["Web Frontend<br/>(Vite + React)"]
    CLI["CLI / Virtual FS<br/>(internal/cli)"]
    MCP["MCP Server<br/>(mcp/server)"]
    PYAPI["Python API<br/>(api/ragflow_server)"]
    GOAPI["Go Service<br/>(cmd/server_main)"]
    ENGINE["Doc Engine<br/>(internal/engine)"]
    DEEPDOC["DeepDoc<br/>(parser + vision)"]
    SANDBOX["Agent Sandbox<br/>(agent/sandbox)"]
    DS["Data Sources<br/>(firecrawl, S3, RSS, etc.)"]
    LLM["LLM / Embedding / Rerank"]
    UI --> PYAPI
    CLI --> PYAPI
    MCP --> GOAPI
    PYAPI --> DEEPDOC
    PYAPI --> ENGINE
    PYAPI --> SANDBOX
    PYAPI --> LLM
    GOAPI --> ENGINE
    DS --> PYAPI
    SANDBOX --> LLM
```

### 2.1 Web Frontend

The web UI lives under the `web/` directory and is a Vite-based single-page application. [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json) lists scripts such as `dev` (Vite dev server), `build` (production), `lint` (ESLint), and `test` (Jest), and declares dependencies on `@ant-design/icons`, `@antv/g2`, `@antv/g6`, and form-handling libraries.

UI components for building structured JSON schemas (used by the Agent designer) live in `web/src/components/jsonjoy-builder/lib/schema-editor.ts`, which exports helpers such as `createFieldSchema`, `validateFieldName`, and `getSchemaProperties`. Source: [web/src/components/jsonjoy-builder/lib/schema-editor.ts](https://github.com/infiniflow/ragflow/blob/main/web/src/components/jsonjoy-builder/lib/schema-editor.ts).

### 2.2 API and Service Tier

The Python Flask API (`api/ragflow_server.py`) serves as the public HTTP surface for the web UI and external integrations. It hands heavy work (embedding, retrieval, agent execution) to background workers and delegates document storage and search to the Go-side **Doc Engine**.

The Go service (`cmd/server_main.go`) initializes the Doc Engine on startup using the `engine.Init(&cfg...)` pattern documented in [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md). The Go side exposes retrieval-test, search, and admin RPCs that the Python layer consumes.

### 2.3 Model Context Protocol (MCP) Server

RAGFlow ships an MCP server at [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) that exposes RAGFlow datasets and retrieval operations as Model Context Protocol tools.
The `RAGFlowConnector` class implements `_fetch_datasets_page`, `list_datasets`, `resolve_dataset_ids`, and a `call_tool` dispatcher that routes the `ragflow_retrieval` tool. The retrieval tool accepts `dataset_ids`, `document_ids`, `question`, `page`, `page_size`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, `rerank_id`, and `force_refresh` parameters, allowing any MCP-compatible client (e.g., Claude Desktop) to query RAGFlow datasets directly.

## 3. Key Subsystems

### 3.1 DeepDoc: Document Understanding

`deepdoc/` contains the document parsing pipeline and the vision subsystem. According to [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), DeepDoc provides OCR, layout recognition (10 components: text, title, figure, figure caption, table, table caption, header, footer, reference, equation), and Table Structure Recognition (TSR). The CLI test scripts `deepdoc/vision/t_ocr.py` and `deepdoc/vision/t_recognizer.py` accept `--inputs` and `--output_dir` arguments so developers can verify model behavior on local PDFs and images.

### 3.2 Doc Engine: Pluggable Storage and Retrieval

The Doc Engine, documented in [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md), abstracts over Elasticsearch and Infinity (an in-house database). The engine is configured via `conf/service_conf.yaml` under the `doc_engine` key with sub-keys `es` (hosts, username, password) and `infinity` (uri, postgres_port, db_name). The Go package layout separates `client.go`, `search.go`, `index.go`, and `document.go` per backend so that switching engines only requires changing `doc_engine.type`.
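
The configuration layout just described can be pictured as a small lookup. The sketch below is illustrative only: it models the `doc_engine` section as a Python dictionary, all values are placeholders, and `SERVICE_CONF`, `BACKEND_KEYS`, and `active_backend` are hypothetical names rather than anything in RAGFlow.

```python
from typing import Any

# Hypothetical in-memory form of the `doc_engine` section of
# conf/service_conf.yaml. Key names follow the text; values are placeholders.
SERVICE_CONF: dict[str, Any] = {
    "doc_engine": {
        "type": "elasticsearch",
        "es": {
            "hosts": "http://localhost:1200",
            "username": "elastic",
            "password": "changeme",
        },
        "infinity": {
            "uri": "localhost:23817",
            "postgres_port": 5432,
            "db_name": "default_db",
        },
    }
}

# Maps each `doc_engine.type` value to the sub-key holding its settings.
BACKEND_KEYS = {"elasticsearch": "es", "infinity": "infinity"}


def active_backend(conf: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return the selected engine type and the settings block for it."""
    section = conf["doc_engine"]
    engine_type = section["type"]
    if engine_type not in BACKEND_KEYS:
        raise ValueError(f"unsupported doc_engine.type: {engine_type!r}")
    return engine_type, section[BACKEND_KEYS[engine_type]]
```

Under this assumption, switching backends means editing only the `type` value; the unused sub-key can stay in the file.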
The schema used by the engine, exposed in [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py), shows the underlying document model with fields such as `content_with_weight`, `content_ltks`, `content_sm_ltks`, `important_kwd`, `question_kwd`, `tag_kwd`, and `available_int`. This schema documents the fields an embedder/retriever must populate.

### 3.3 CLI and Virtual Filesystem

The CLI in [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md) exposes a unified, path-based interface over RAGFlow REST APIs. Paths include `/datasets`, `/datasets/{name}` (documents), and `/datasets/{name}/{doc}` (document info). The implementation uses a provider pattern (`parser/`, `filesystem/`, `engine.go`, `base.go`, `dataset.go`, `file.go`, `utils.go`).

A notable subsystem, documented in [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md), is the **skill management** layer. It supports `install-skill` from local paths, GitHub repos (`github.com/owner/repo/path`), ClawHub (`clawhub://owner/skill-name`), and skills.sh (`skill://skill-name`). The system enforces a defense-in-depth security model:

- HTTPS source validation
- Quarantine of downloaded artifacts
- Regex-based static analysis across 100+ threat patterns in six categories (exfiltration, injection, destructive operations, persistence, network, obfuscation)
- Trust tiers based on source reputation
- Mandatory `--force` for high-risk installs
- Audit logging

Skills must be ≤ 50 MB total, ≤ 5 MB per file, text-only, with lowercase alphanumeric names.

### 3.4 Agent Sandbox

The Agent subsystem in [agent/sandbox/README.md](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/README.md) runs agent code inside isolated containers managed by gVisor.
Sandboxed agents execute Python and Node.js workloads via base images `sandbox-base-python` and `sandbox-base-nodejs`, orchestrated by `sandbox-executor-manager`. The README warns that older executor-manager images shipped Docker CLI 24.x, which cannot talk to newer Docker daemons; rebuilding with Docker CLI 29.1.0+ is required.

### 3.5 Data-Source Connectors

The `tools/` directory hosts pluggable connectors. The Firecrawl integration ([tools/firecrawl/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)) implements single-URL scraping, website crawling, batch processing, multiple output formats, rate limiting, and language detection, and it surfaces as a selectable data source in the RAGFlow UI.

## 4. Deployment, Operations, and Community Context

### 4.1 Self-Hosting

Per the README, RAGFlow is deployed via Docker Compose with minimum requirements of 4 CPU cores and 16 GB RAM. The project roadmap and community issue #864 ("How to deploy based on kubernetes?") show that Helm/YAML deployment is a long-standing user demand; the documented path today is Docker Compose only.

### 4.2 Release Cadence and Roadmap

Releases follow a numbered scheme from v0.9.0 through v0.25.x, with a rolling `nightly` build.
Recent milestones include:

| Release | Notable change | Source |
|---|---|---|
| v0.24.0 | Memory APIs/SDK for agents; metadata batch management; ToC renamed to PageIndex; Chat-like Agent management | Community release notes |
| v0.25.0 | Seven ingestion-pipeline templates; new data sources (Seafile, RSS, DingTalk AI Sheet); deletion sync | Community release notes |
| v0.25.4 | Generic RESTful API data-source connector; gpt-5.4-mini/nano support | Community release notes |
| v0.25.5 | Local & SSH providers in admin panel; dataset search reported as 50–100% faster | Community release notes |
| v0.25.6 | Browser component for autonomous web navigation; Ψ-RAG (AHC) mode for RAPTOR | Community release notes |

### 4.3 Known Type and API Issues

Community issue #15714 reports a Go-side `tenant_rerank_id` type mismatch (`*string` vs. `*int`) in `service.RetrievalTestRequest` and `SearchBotRetrievalTestRequest`, illustrating that the Go ↔ Python API contract remains a focus area for engineering work.

## See Also

- DeepDoc: Document Understanding
- Doc Engine: Storage and Retrieval
- Agent Sandbox: Secure Execution
- MCP Server: Tool Integration
- Data Sources and Connectors

---

## Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Agent System, Tools, and Workflow Orchestration](#page-3), [Deployment, Configuration, Administration, and Model Integration](#page-4)

Related Source Files

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [tools/es-to-oceanbase-migration/README.md](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)

# Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge

## Overview

The Core RAG Engine is the heart of RAGFlow, an open-source Retrieval-Augmented Generation engine described in [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md). It implements the full pipeline from raw unstructured documents to grounded, citation-backed LLM responses. The engine is split into four collaborating subsystems:

1. **Parsing (DeepDoc)**: turns raw bytes (PDF, DOCX, images, slides) into structured layout-aware text plus tables, figures, and equations.
2. **Chunking & Knowledge Structuring**: segments parsed content into explainable chunks and indexes them into a multi-field schema.
3. **Retrieval**: performs hybrid (vector + keyword) recall with optional rerank across one or more datasets.
4. **Document Engine / Storage**: persists chunks, vectors, and metadata in pluggable backends (Elasticsearch or Infinity).

The following diagram illustrates how a document flows through these subsystems, from ingestion to a grounded answer.

```mermaid
flowchart LR
    A[Unstructured Document] --> B["DeepDoc Parser<br/>OCR / Layout / TSR"]
    B --> C[Template-based Chunker]
    C --> D["Doc Engine<br/>Elasticsearch or Infinity"]
    D --> E["Hybrid Retrieval<br/>vector + keyword"]
    E --> F[Rerank Model]
    F --> G[LLM with Citations]
```

Source: [README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/README.md), [deepdoc/README.md:1-60](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), [internal/engine/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)

---

## Parsing (DeepDoc)

RAGFlow's parsing layer is implemented as the `DeepDoc` module, which has two cooperating sub-modules: **Vision** and **Parser**.

### Vision

The vision subsystem performs OCR, layout recognition, and Table Structure Recognition (TSR) on images and PDFs. As described in [deepdoc/README.md:1-40](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md), it recognizes 10 layout components (Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation) and decides whether successive text parts should be merged or whether a region must be handed off to TSR.

Vision is exposed through two test entry points:

```bash
python deepdoc/vision/t_ocr.py --inputs path/to/docs --output_dir ./ocr_outputs
python deepdoc/vision/t_recognizer.py --inputs path/to/docs --threshold 0.2 --mode layout --output_dir ./out
```

Source: [deepdoc/README.md:1-80](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)

### Parser

The parser consumes vision output and produces a clean, layout-aware representation. According to the release notes, **v0.25.1** added the [OpenDataLoader](https://github.com/opendataloader-project/opendataloader-pdf) PDF backend, and **v0.22.0** integrated MinerU as an additional document parser. These plug-ins share the same parser interface so users can swap backends without changing downstream pipelines.
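
To make the idea of a shared parser interface concrete, here is a minimal sketch. It is not RAGFlow's actual parser contract: `PdfParserBackend`, the two stand-in backends, and `ingest` are hypothetical names invented for illustration, and the stand-ins do no real parsing.

```python
from typing import Protocol


class PdfParserBackend(Protocol):
    """Hypothetical contract; not RAGFlow's actual parser interface."""

    def parse(self, data: bytes) -> list[str]:
        """Return the document as an ordered list of text blocks."""
        ...


class StubBackendA:
    """Stand-in for one backend; it only reports the input size."""

    def parse(self, data: bytes) -> list[str]:
        return [f"backend-a:{len(data)} bytes"]


class StubBackendB:
    """Stand-in for a second backend with the same method signature."""

    def parse(self, data: bytes) -> list[str]:
        return [f"backend-b:{len(data)} bytes"]


def ingest(backend: PdfParserBackend, data: bytes) -> int:
    """Downstream step that depends only on the shared contract."""
    return len(backend.parse(data))
```

Because `ingest` sees only the protocol, either stub can be passed in without touching the downstream code, which is the property the paragraph above describes.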

---

## Chunking and Knowledge Structure

After parsing, RAGFlow applies **template-based chunking**, described in [README.md:30-40](https://github.com/infiniflow/ragflow/blob/main/README.md) as "Intelligent and explainable … plenty of template options to choose from." Each chunk is materialized as a multi-field document that supports vector recall, keyword recall, and metadata filtering.

The schema captured in the OceanBase migration tool mirrors the production schema used by the Doc Engine (see [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)):

| Field | Type | Purpose |
|---|---|---|
| `title_tks` | TEXT | Tokenized title for keyword search |
| `content_with_weight` | LONGTEXT | Original chunk content |
| `content_ltks` | LONGTEXT | Long-text token stream for keyword recall |
| `content_sm_ltks` | LONGTEXT | Fine-grained token stream |
| `important_kwd` | ARRAY(String) | Extracted keywords |
| `important_tks` | TEXT | Tokenized keywords |
| `question_kwd` | ARRAY(String(1024)) | Synthesized questions per chunk (used for QA-style recall) |
| `tag_kwd` | ARRAY(String(256)) | User/system tags |
| `available_int` | Integer | Soft-delete flag (0 = unavailable) |
| `create_time` | TIMESTAMP | Ingestion timestamp |

The presence of both long-text and fine-grained token streams, plus question-style keywords, is what enables RAGFlow's "find a needle in a haystack" behavior advertised in [README.md:20-30](https://github.com/infiniflow/ragflow/blob/main/README.md).

Source: [tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py)

---

## Retrieval

The retrieval layer exposes a unified interface across REST APIs, the MCP server, and the CLI. All three surfaces accept the same conceptual parameters.
The MCP tool `ragflow_retrieval` defined in [mcp/server/server.py:1-60](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) accepts:

- `dataset_ids` and `document_ids`: scope of search.
- `question`: natural-language query.
- `page`, `page_size`: pagination.
- `similarity_threshold` (default `0.2`): minimum vector similarity.
- `vector_similarity_weight` (default `0.3`): balance between vector and keyword scores.
- `keyword` (bool): toggle pure keyword recall.
- `top_k` (default `1024`): recall depth before rerank.
- `rerank_id`: optional rerank model identifier.
- `force_refresh`: bypass caches.

The CLI mirrors the same knobs in [internal/cli/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md):

```bash
search "RAG" datasets/kb1 --output plain -n 20
SEARCH 'AI' ON DATASETS 'kb1' WITH top_k 1024 similarity_threshold 0.0 vector_similarity_weight 0.3 keyword true
```

The v0.25.5 release notes state that the dataset search path was accelerated by 50–100% by removing an expensive vector-fetch + rerank-similarity step, which directly affects how these parameters behave under load.

> **Community note:** [Issue #15714](https://github.com/infiniflow/ragflow/issues/15714) reports a Go-side type mismatch on `TenantRerankID` (declared `*string`, expected `*int`) inside the retrieval test path. Operators wiring custom rerank IDs into the retrieval test pipeline should validate their schema version. Separately, [Issue #4406](https://github.com/infiniflow/ragflow/issues/4406) requests rerank support for Ollama, which is currently not exposed as a `rerank_id` provider.
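
A small sketch can show how `vector_similarity_weight`, `similarity_threshold`, and `top_k` might interact. It rests on assumptions and does not reproduce RAGFlow's scoring code: the weighted-sum fusion, the choice to apply the threshold to the fused score, and the names `Candidate` and `hybrid_rank` are all illustrative. Only the parameter names and defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    chunk_id: str
    vector_sim: float   # similarity from the embedding index, in [0, 1]
    keyword_sim: float  # similarity from keyword/token matching, in [0, 1]


def hybrid_rank(
    candidates: list[Candidate],
    similarity_threshold: float = 0.2,
    vector_similarity_weight: float = 0.3,
    top_k: int = 1024,
) -> list[tuple[str, float]]:
    """Fuse both scores, drop weak hits, keep the best top_k.

    Assumed formula: fused = w * vector_sim + (1 - w) * keyword_sim.
    """
    w = vector_similarity_weight
    scored = [
        (c.chunk_id, w * c.vector_sim + (1.0 - w) * c.keyword_sim)
        for c in candidates
    ]
    # Assumption: the threshold filters on the fused score.
    kept = [(cid, s) for cid, s in scored if s >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

With the default weight of `0.3`, keyword evidence dominates the ranking in this sketch, so a chunk with a strong keyword match outranks one with only a strong vector match.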

Source: [mcp/server/server.py:1-80](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py), [internal/cli/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)

---

## Document Engine & Storage

The retrieval layer is decoupled from storage by a `DocEngine` interface defined in [internal/engine/README.md:1-40](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md). Two implementations ship today:

- **`elasticsearch`**: fully functional. Configured under `doc_engine.es` in `conf/service_conf.yaml` with hosts and basic-auth credentials.
- **`infinity`**: placeholder awaiting the official Infinity Go SDK; only the directory skeleton exists.

The factory in `engine_factory.go` selects an engine by `doc_engine.type`, and `cmd/server_main.go` calls `engine.Init` once at startup so every retrieval path shares one global instance (`global.go`). This design is what enables the OceanBase migration tooling ([tools/es-to-oceanbase-migration/README.md:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)) to re-host the same RAGFlow schema on a third backend without touching parsers or retrieval code.
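
The factory-plus-global-instance pattern described above is implemented in Go; the Python sketch below only illustrates the shape of the idea. Every name in it (`DocEngine`, `ElasticsearchEngine`, `_FACTORIES`, `init`, `get`) is a hypothetical stand-in, and the engine classes carry no real behavior.

```python
from typing import Callable, Optional


class DocEngine:
    """Minimal stand-in for the engine interface; only `name` is modeled."""

    name = "base"


class ElasticsearchEngine(DocEngine):
    name = "elasticsearch"


# Registry playing the role the text assigns to engine_factory.go.
_FACTORIES: dict[str, Callable[[], DocEngine]] = {
    "elasticsearch": ElasticsearchEngine,
}

# Process-wide instance, analogous to the shared value in global.go.
_engine: Optional[DocEngine] = None


def init(engine_type: str) -> DocEngine:
    """Create the engine once; later calls return the same instance."""
    global _engine
    if _engine is None:
        try:
            factory = _FACTORIES[engine_type]
        except KeyError:
            raise ValueError(f"no engine registered for {engine_type!r}") from None
        _engine = factory()
    return _engine


def get() -> DocEngine:
    """Return the shared engine, failing loudly if init() never ran."""
    if _engine is None:
        raise RuntimeError("doc engine not initialized")
    return _engine
```

In this sketch a new backend is added by registering one more factory entry, and callers that use `get()` need no changes.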

Source: [internal/engine/README.md:1-50](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md), [tools/es-to-oceanbase-migration/README.md:1-30](https://github.com/infiniflow/ragflow/blob/main/tools/es-to-oceanbase-migration/README.md)

---

## See Also

- [DeepDoc Vision & Parser](./deepdoc-vision-and-parser.md)
- [Document Engine Backends (Elasticsearch / Infinity)](./doc-engine-backends.md)
- [Retrieval APIs (REST, MCP, CLI)](./retrieval-apis.md)
- [Knowledge Schema & Indexing](./knowledge-schema.md)

---

## Agent System, Tools, and Workflow Orchestration

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Deployment, Configuration, Administration, and Model Integration](#page-4)

Related Source Files

The following source files were used to generate this page:

- [agent/canvas.py](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py)
- [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)
- [agent/component/begin.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/begin.py)
- [agent/component/llm.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/llm.py)
- agent/component/retrieval (via tools/retrieval.py)
- agent/component/code_exec (referenced via tools/code_exec.py)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [admin/client/README.md](https://github.com/infiniflow/ragflow/blob/main/admin/client/README.md)

# Agent System, Tools, and Workflow Orchestration

## Overview and Purpose

RAGFlow's agent system fuses retrieval-augmented generation with agentic capabilities to deliver a configurable context layer for LLM applications. The runtime is assembled from modular components that can be composed into workflows for both personal and enterprise deployments (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)). Pre-built agent templates and a converged context engine are intended to help developers turn complex data into production-ready AI systems (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)).

Recent releases have progressively expanded the agent surface area:

- **Memory for AI agents** (added 2025-12-26)
- **Agentic workflow and MCP integration** (added 2025-08-01)
- **Python/JavaScript code executor component** (added 2025-05-23)
- **Browser component for autonomous web navigation** (added in v0.25.6, May 2026)
- **Chat-like Agent conversation management** (v0.24.0)

## Component-Based Architecture

The agent runtime follows a component-based design in which each node in a workflow is implemented as a self-contained class inheriting from a common base. The canvas is responsible for assembling components, routing data between them, and orchestrating execution order (Source: [agent/canvas.py](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py)). All components share a unified interface defined in the base class, covering parameter validation, execution lifecycle, and canvas serialization (Source: [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)).

### Core Component Types

- **`Begin`**: Defines initial input and conversation start parameters; the entry point of every workflow (Source: [agent/component/begin.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/begin.py)).
- **`LLM`**: Performs language model inference with configurable prompts, model parameters, and tool bindings (Source: [agent/component/llm.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/llm.py)).
- **`Retrieval`**: Performs RAG retrieval against datasets and documents; the implementation bridges to the shared `tools/retrieval.py` logic (Source: `agent/component/retrieval`).
- **`Code Exec`**: Executes Python or JavaScript snippets in a sandboxed environment to support computational reasoning (Source: `tools/code_exec.py`).
- **`Base`**: Foundation class providing the common contract (input/output schema, `invoke` lifecycle, canvas representation) that all other components extend (Source: [agent/component/base.py](https://github.com/infiniflow/ragflow/blob/main/agent/component/base.py)).

### Workflow Execution Flow

```mermaid
flowchart LR
    A[User Input] --> B[Begin Component]
    B --> C{Route / Branch}
    C -->|Retrieval needed| D[Retrieval Component]
    C -->|Compute needed| E[Code Exec Component]
    C -->|Reasoning needed| F[LLM Component]
    D --> F
    E --> F
    F --> G[Output / Tool Call]
    G -.iterates.-> C
```

## MCP (Model Context Protocol) Integration

RAGFlow exposes its retrieval layer as MCP tools so that external agent clients can invoke retrieval against managed datasets. The MCP server registers a `ragflow_retrieval` tool that accepts `document_ids`, `dataset_ids`, `question`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, `rerank_id`, `force_refresh`, and pagination parameters (`page`, `page_size`) (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)).

The server runs as a Starlette ASGI application in either `HOST` mode or standalone mode, gated by an `AuthMiddleware` that validates the API key on every request (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)). It fetches accessible datasets via the `/datasets` REST endpoint and paginates through all results when resolving the full set of dataset IDs for MCP retrieval fallback (Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)).

## Tools, Skills, and Memory Management

Beyond built-in components, RAGFlow supports a pluggable skills and memory system exposed through the CLI filesystem (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)). The CLI parses commands using a recursive descent parser (`parser/parser.go`) over a lexer, and routes them to a virtual filesystem backed by providers (`dataset.go`, `file.go`) that wrap RAGFlow's RESTful APIs (Source: [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)).
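
The routing step can be pictured as a path classifier. The sketch below assumes the `/datasets`, `/datasets/{name}`, and `/datasets/{name}/{doc}` layout that the CLI README describes; the real implementation is in Go, and `Target` and `resolve` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    kind: str                      # "dataset_list", "dataset", or "document"
    dataset: Optional[str] = None
    document: Optional[str] = None


def resolve(path: str) -> Target:
    """Classify a CLI path under /datasets by its number of segments."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "datasets" or len(parts) > 3:
        raise ValueError(f"unsupported path: {path!r}")
    if len(parts) == 1:
        return Target("dataset_list")
    if len(parts) == 2:
        return Target("dataset", dataset=parts[1])
    return Target("document", dataset=parts[1], document=parts[2])
```

A provider layer could then dispatch on `Target.kind` to decide which REST call to make, keeping path parsing separate from network access.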

### Skill Sources

The `install-skill` command accepts skills from local paths, GitHub URLs, ClawHub references, or skills.sh identifiers, then validates and stores them in an isolated space (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)).

### Security Validation

The skill manager applies defense-in-depth checks before installation:

- HTTPS source validation with SSL certificate verification
- Quarantine isolation of downloaded skills prior to install
- Static analysis scanning 100+ threat patterns across six categories: Exfiltration, Injection, Destructive, Persistence, Network, and Obfuscation
- Trust tiers based on source reputation
- Explicit `--force` user confirmation for high-risk installs
- Audit logging of every installation with its scan results

(Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md))

### Memory System

Memory is organized hierarchically into category folders (e.g., `memory/categories/category1`, `category2`) and per-agent memory files for tool and skill usage patterns, supporting retrieval augmentation across long-lived agent sessions (Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)).

## Common Failure Modes and Community Notes

- **TenantRerankID type mismatch in Go SDK**: `service.RetrievalTestRequest.TenantRerankID` and `SearchBotRetrievalTestRequest.TenantRerankID` are declared as `*string` but are consumed as `*int` in some retrieval-test code paths, which can surface as runtime errors when invoking the retrieval benchmark (Source: [issue #15714](https://github.com/infiniflow/ragflow/issues/15714)).
- **Empty memory object on startup**: The RAGFlow server previously failed to start when an empty memory object existed; this was fixed in v0.23.1.
- **Memory extraction stability**: Extraction with all memory types selected simultaneously was unstable; this was hardened in v0.23.1.
- **Browser component**: Newly added in v0.25.6, the Browser component enables autonomous web navigation; expect evolving behavior and config knobs (Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)).
- **Kubernetes deployment**: Helm charts / raw Kubernetes manifests are not first-class supported; production deployment remains primarily via `docker-compose`.

## See Also

- [Project README](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [MCP Server Source](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [Admin CLI Documentation](https://github.com/infiniflow/ragflow/blob/main/admin/client/README.md)
- [Internal CLI Filesystem](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [DeepDoc Module](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [Firecrawl Integration](https://github.com/infiniflow/ragflow/blob/main/tools/firecrawl/README.md)

---

## Deployment, Configuration, Administration, and Model Integration

### Related Pages

Related topics: [Project Overview and System Architecture](#page-1), [Core RAG Engine: Parsing, Chunking, Retrieval, and Knowledge](#page-2), [Agent System, Tools, and Workflow Orchestration](#page-3)

Related Source Files

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
- [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)
- [docker/.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env)
- [docker/docker-compose.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose.yml)
- [docker/docker-compose-base.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose-base.yml)
- [docker/service_conf.yaml.template](https://github.com/infiniflow/ragflow/blob/main/docker/service_conf.yaml.template)
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)
- [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)
- [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)

# Deployment, Configuration, Administration, and Model Integration

RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that fuses RAG with Agent capabilities. Operating the system at production scale requires mastering four interrelated concerns: deploying the runtime stack, configuring infrastructure and models, administering tenants and resources, and integrating third-party model providers. This page covers all four areas, drawing on the project's official deployment manifests, engine abstractions, and operator interfaces. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)

## Deployment

### Prerequisites and Runtime Stack

The official deployment path is Docker Compose. The system requires at minimum 4 CPU cores, 16 GB RAM, 50 GB disk, Docker >= 24.0.0 with Docker Compose >= v2.26.1, Python >= 3.13, and (optionally) [gVisor](https://gvisor.dev/docs/user_guide/install/) when the Agent's code executor sandbox is enabled. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)

Before starting, the host kernel parameter `vm.max_map_count` must be set to at least 262144 (Elasticsearch requirement). The README documents how to check and set it via `sysctl -w vm.max_map_count=262144`. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)

### Compose Topology

Two compose files are maintained:

- `docker/docker-compose.yml` brings up the RAGFlow application service on top of a dependency stack.
- `docker/docker-compose-base.yml` provides the dependencies: Elasticsearch (or [Infinity](https://github.com/infiniflow/infinity)), MySQL, MinIO, and Redis.

A legacy `docker-compose-CN-oc9.yml` and a `docker-compose-macos.yml` exist but are not actively maintained. Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)

The high-level deployment topology is:

```mermaid
flowchart LR
    User[User / MCP Client] --> Web["Web Frontend<br/>Vite + React"]
    User --> API["RAGFlow API Server<br/>Python + Go"]
    API --> MySQL[(MySQL)]
    API --> Redis[(Redis)]
    API --> MinIO[(MinIO)]
    API --> Engine{Doc Engine}
    Engine -->|type=elasticsearch| ES[(Elasticsearch)]
    Engine -->|type=infinity| INF[(Infinity)]
    Web -.->|serves| User
```

### Kubernetes and Cloud

A community-requested Kubernetes deployment path (Helm charts or raw manifests) is tracked in issue [#864](https://github.com/infiniflow/ragflow/issues/864). As of the most recent releases, official Helm support is not yet shipped; the supported path remains Docker Compose on a single host, optionally scaled by externalizing the dependency services. Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)

## Configuration

### Docker Environment Variables

The [`docker/.env`](https://github.com/infiniflow/ragflow/blob/main/docker/.env) file is the primary configuration surface for the container stack. The following variables are documented:

| Variable | Default | Purpose |
|----------|---------|---------|
| `STACK_VERSION` | `8.11.3` | Elasticsearch image version |
| `ES_PORT` | `1200` | Host port exposed for Elasticsearch |
| `ELASTIC_PASSWORD` | - | Elasticsearch bootstrap password |
| `KIBANA_PORT` | - | Host port for the Kibana UI |

Source: [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md)

### Service Configuration

`docker/service_conf.yaml.template` is rendered at startup and configures the RAGFlow service. The internal Go engine selects a document store via a `doc_engine.type` key. Two backend values are supported:

- `elasticsearch`: fully implemented, configured with `doc_engine.es.hosts`, `username`, `password`.
- `infinity`: a placeholder backend waiting for the official Infinity Go SDK; configuration keys include `uri`, `postgres_port`, `db_name`.

Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)

Engine selection happens once at process startup.
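The selection on `doc_engine.type` described above can be illustrated with a small factory. This is a minimal Python sketch, not the project's Go code: the class names, the `new_doc_engine` function, and the nesting of the Infinity keys under `doc_engine.infinity` are assumptions made for illustration; only the key names `type`, `es.hosts`, `username`, `password`, `uri`, `postgres_port`, and `db_name` come from the text.

```python
from dataclasses import dataclass
from typing import Protocol


class DocEngine(Protocol):
    """Hypothetical stand-in for the Go `DocEngine` interface."""

    def name(self) -> str: ...


@dataclass
class ElasticsearchEngine:
    hosts: str
    username: str
    password: str

    def name(self) -> str:
        return "elasticsearch"


@dataclass
class InfinityEngine:
    uri: str
    postgres_port: int
    db_name: str

    def name(self) -> str:
        return "infinity"


def new_doc_engine(conf: dict) -> DocEngine:
    """Pick a backend once, based on `doc_engine.type` (hypothetical helper)."""
    doc = conf["doc_engine"]
    engine_type = doc["type"]
    if engine_type == "elasticsearch":
        es = doc["es"]
        return ElasticsearchEngine(es["hosts"], es["username"], es["password"])
    if engine_type == "infinity":
        # Assumed layout: Infinity keys grouped under `doc_engine.infinity`.
        inf = doc["infinity"]
        return InfinityEngine(inf["uri"], inf["postgres_port"], inf["db_name"])
    raise ValueError(f"unsupported doc_engine.type: {engine_type!r}")
```

Callers hold only the `DocEngine` value returned by the factory, so the rest of the service does not need to know which backend was chosen.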
The Go factory in `internal/engine/engine_factory.go` returns a `DocEngine` interface implementation that the rest of the service consumes uniformly for indexing, search, and document operations. Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)

### Parsers and OCR

The [`deepdoc/README.md`](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md) introduces *Deep*Doc, RAGFlow's vision and parser subsystem. The vision pipeline provides OCR, layout recognition (10 base layout components: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation), and Table Structure Recognition (TSR). Operators can smoke-test these on local files using `python deepdoc/vision/t_ocr.py` and `python deepdoc/vision/t_recognizer.py`. Source: [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md)

## Administration

### Admin Panel

Release v0.25.5 introduced local and SSH providers in the admin panel (PR #15039), allowing administrators to manage users, datasets, and storage backends directly from the web console. Source: [GitHub release v0.25.5](https://github.com/infiniflow/ragflow/releases/tag/v0.25.5)

### CLI and Virtual Filesystem

The Go CLI under `internal/cli` exposes a virtual filesystem layered over RAGFlow's RESTful APIs. The design follows three principles: (1) no server-side changes, (2) a provider pattern with a common `Provider` interface in `filesystem/base.go`, and (3) unified commands (`ls`, `search`, `cat`, `mkdir`) over virtual paths. Supported paths include `/datasets`, `/datasets/{name}` (lists documents), and `/datasets/{name}/{doc}` (fetches a single document).
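To make the three virtual path shapes concrete, the sketch below classifies a path into "dataset list", "dataset", or "document". It is a Python illustration only: `VirtualPath` and `parse_virtual_path` are hypothetical names, and the real CLI is written in Go behind its `Provider` interface, whose methods are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VirtualPath:
    kind: str                       # "dataset_list", "dataset", or "document"
    dataset: Optional[str] = None
    document: Optional[str] = None


def parse_virtual_path(path: str) -> VirtualPath:
    """Classify a virtual path under /datasets (hypothetical helper)."""
    parts = [p for p in path.strip("/").split("/") if p]
    if not parts or parts[0] != "datasets":
        raise ValueError(f"unsupported virtual path: {path!r}")
    if len(parts) == 1:
        return VirtualPath("dataset_list")                  # /datasets
    if len(parts) == 2:
        return VirtualPath("dataset", dataset=parts[1])     # /datasets/{name}
    if len(parts) == 3:
        return VirtualPath("document", parts[1], parts[2])  # /datasets/{name}/{doc}
    raise ValueError(f"path too deep: {path!r}")
```

A command such as `ls` or `cat` could then dispatch on `kind` and issue the matching REST call, which is consistent with the stated goal of requiring no server-side changes.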
Source: [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md)

### Skill Management

The CLI's `install-skill` command supports four source types: local paths, `github.com/owner/repo/path`, `clawhub://owner/skill-name` (ClawHub), and `skill://skill-name` (skills.sh). A defense-in-depth security architecture validates sources over HTTPS with SSL verification, quarantines downloads, runs regex-based static analysis against 100+ threat patterns (exfiltration, injection, destructive operations, persistence, network, obfuscation), and applies trust tiers. Limits: total skill size <= 50 MB, individual file <= 5 MB, text files only, lowercase alphanumeric names with hyphens/underscores. Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)

### Frontend Build

The web UI is a Vite + React project. Build and development entry points live in [`web/package.json`](https://github.com/infiniflow/ragflow/blob/main/web/package.json): `npm run dev` starts the dev server, `npm run build` produces a production bundle, and `npm run type-check` validates TypeScript. The schema editor component in `web/src/components/jsonjoy-builder/lib/schema-editor.ts` enforces JSONSchema field-name validation against the pattern `^[a-zA-Z_$][a-zA-Z0-9_$]*$`. Source: [web/package.json](https://github.com/infiniflow/ragflow/blob/main/web/package.json)

## Model Integration

### LLM and Embedding Providers

RAGFlow ships with a model registry supporting OpenAI-compatible APIs. Release v0.25.4 added `gpt-5.4-mini` and `gpt-5.4-nano` to the OpenAI model list, and release v0.25.6 extended the Agent with a new Browser component that lets models navigate and interact with web pages autonomously (PR #14888). DeepSeek v4 support was added on 2026-04-24, and Gemini 3 Pro support on 2025-11-19.
Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md) and [GitHub release v0.25.4](https://github.com/infiniflow/ragflow/releases/tag/v0.25.4)

### Rerank and Retrieval

A community feature request to add Ollama rerank integration is tracked in issue [#4406](https://github.com/infiniflow/ragflow/issues/4406). Rerank models are referenced by ID through the `rerank_id` parameter on retrieval calls. The MCP server's `ragflow_retrieval` tool accepts `rerank_id`, `similarity_threshold`, `vector_similarity_weight`, `keyword`, `top_k`, and `force_refresh` arguments that all flow into the unified retrieval service. Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)

Release v0.25.5 accelerated the dataset search path, reducing latency by 50–100% by removing an expensive vector fetch and rerank similarity computation from the hot path (PR #14970). Source: [GitHub release v0.25.5](https://github.com/infiniflow/ragflow/releases/tag/v0.25.5)

### MCP Server

The [`mcp/server/server.py`](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py) module exposes RAGFlow as a Model Context Protocol server. Two tool entry points are registered: `list_datasets` (paginates `/datasets` and returns newline-delimited JSON) and `ragflow_retrieval` (performs cross-dataset retrieval with the parameters listed under Rerank and Retrieval). When `MODE == HOST`, the server installs an `AuthMiddleware` to enforce API key authentication. Source: [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py)

### Document Parsers

The v0.25.0 release added 7 built-in ingestion pipeline templates aligned with RAGFlow's native parsers, and v0.25.1 added the [OpenDataLoader](https://github.com/opendataloader-project/opendataloader-pdf) PDF backend. A community request to integrate [Docling](https://github.com/DS4SD/docling) as an additional parser is tracked in issue [#3443](https://github.com/infiniflow/ragflow/issues/3443).
For users migrating from Elasticsearch to OceanBase, the schema mapping in `tools/es-to-oceanbase-migration/src/es_ob_migration/schema.py` documents how chunk fields (content, tokens, keywords, tags, PageRank) translate to OceanBase column types. Source: [GitHub release v0.25.1](https://github.com/infiniflow/ragflow/releases/tag/v0.25.1)

## Common Failure Modes

1. **`vm.max_map_count` too low**: Elasticsearch container fails to start. Mitigate with `sudo sysctl -w vm.max_map_count=262144`. Source: [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md)
2. **Infinity backend selected without SDK**: the Infinity implementation is a placeholder; only Elasticsearch is fully functional. Source: [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md)
3. **Type mismatch on rerank IDs**: the Go `service.RetrievalTestRequest.TenantRerankID` field has a known `*string` vs `*int` mismatch with retrieval tests (issue [#15714](https://github.com/infiniflow/ragflow/issues/15714)). Source: [issue #15714](https://github.com/infiniflow/ragflow/issues/15714)
4. **Skill installation blocked**: over-size archives, binary files, or suspicious patterns are rejected by the static analyzer.
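The size and naming limits behind the skill-installation failure mode can be checked before an install is attempted. The following is a rough Python sketch under stated assumptions: `check_skill` is a hypothetical helper, "MB" is taken to mean 1024 × 1024 bytes, and the name pattern is one plausible reading of "lowercase alphanumeric names with hyphens/underscores". The CLI's actual checks may differ.

```python
import re

MAX_TOTAL_BYTES = 50 * 1024 * 1024  # total skill size <= 50 MB (MB size assumed)
MAX_FILE_BYTES = 5 * 1024 * 1024    # individual file <= 5 MB
NAME_PATTERN = re.compile(r"[a-z0-9_-]+")  # assumed reading of the naming rule


def check_skill(name: str, file_sizes: dict[str, int]) -> list[str]:
    """Return a list of human-readable problems; empty means the checks pass."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"invalid skill name: {name!r}")
    for path, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"file too large: {path}")
    if sum(file_sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("skill exceeds total size limit")
    return problems
```

This covers only the size and naming limits; the text-only rule and the pattern-based static analysis are separate checks and are not modeled here.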
Source: [internal/cli/filesystem/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/filesystem/README.md)

## See Also

- [README.md](https://github.com/infiniflow/ragflow/blob/main/README.md): Project overview and quickstart
- [docker/README.md](https://github.com/infiniflow/ragflow/blob/main/docker/README.md): Full Docker deployment reference
- [deepdoc/README.md](https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md): Vision and parser subsystem
- [internal/engine/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/engine/README.md): Doc engine abstraction
- [internal/cli/README.md](https://github.com/infiniflow/ragflow/blob/main/internal/cli/README.md): CLI and virtual filesystem
- [mcp/server/server.py](https://github.com/infiniflow/ragflow/blob/main/mcp/server/server.py): MCP server reference

---

## Pitfall Log

Project: infiniflow/ragflow

Summary: Found 20 structured pitfall item(s), including 2 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

## 1. Configuration risk - Configuration risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_7154f1df73d9467aa3d747477287e392 | https://github.com/infiniflow/ragflow/issues/15714

## 2. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_8d8565f17f754fe3a6f7ad1f3b4be33d | https://github.com/infiniflow/ragflow/issues/15525

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_408303dfb4fb43a781b7dc14724082b9 | https://github.com/infiniflow/ragflow/issues/15751

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.23.1
- User impact: Upgrade or migration may change expected behavior: v0.23.1
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.23.1. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_38f958bf7c9ad232f6049339e1321be7 | https://github.com/infiniflow/ragflow/releases/tag/v0.23.1

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.24.0
- User impact: Upgrade or migration may change expected behavior: v0.24.0
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.24.0. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_0ca2840fc49d848176cce456864aafa3 | https://github.com/infiniflow/ragflow/releases/tag/v0.24.0

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.0
- User impact: Upgrade or migration may change expected behavior: v0.25.0
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.0. Context: Observed when using python, docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_7154c897fed0437e0ca58d1f443b8d97 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.0

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.1
- User impact: Upgrade or migration may change expected behavior: v0.25.1
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.1. Context: Observed during version upgrade or migration.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_12ff69cd8f090474bcc8768ed255e16a | https://github.com/infiniflow/ragflow/releases/tag/v0.25.1

## 8. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.2
- User impact: Upgrade or migration may change expected behavior: v0.25.2
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.2.
Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_7f58552889f29288945720d487e8fbb7 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.2

## 9. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.3
- User impact: Upgrade or migration may change expected behavior: v0.25.3
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.3. Context: Observed when using docker
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_14af37b03860695c40160c241d23e5b1 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.3

## 10. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.4
- User impact: Upgrade or migration may change expected behavior: v0.25.4
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.4. Context: Source discussion did not expose a precise runtime context.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_026d052ebdc28ef87ab4152d11b96502 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.4

## 11. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.5
- User impact: Upgrade or migration may change expected behavior: v0.25.5
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.5. Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_57690c932d554b7b2b477b7e4564f3f5 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.5

## 12. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.25.6
- User impact: Upgrade or migration may change expected behavior: v0.25.6
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: v0.25.6. Context: Observed when using python, cuda
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_e1befbd52e751833a5dab041663c4bf0 | https://github.com/infiniflow/ragflow/releases/tag/v0.25.6

## 13. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 15. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 17. Capability evidence risk - Capability evidence risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this capability risk before relying on the project: [Go] tenant_rerank_id type mismatch: `*string` should be `*int` - retrieval_test
- User impact: Developers may hit a documented source-backed failure mode: [Go] tenant_rerank_id type mismatch: `*string` should be `*int` - retrieval_test
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: [Go] tenant_rerank_id type mismatch: `*string` should be `*int` - retrieval_test. Context: Observed when using python
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_issue | fmev_a84cfda4f8786aaff3acbf0072fb4c08 | https://github.com/infiniflow/ragflow/issues/15714

## 18. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Suggested check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:730534580 | https://github.com/infiniflow/ragflow

## 20. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this maintenance risk before relying on the project: nightly
- User impact: Upgrade or migration may change expected behavior: nightly
- Suggested check: Before packaging this project, run the relevant install/config/quickstart check for: nightly. Context: Source discussion did not expose a precise runtime context.
- Guardrail: State this as source-backed community evidence, not as Doramagic reproduction.
- Evidence: failure_mode_cluster:github_release | fmev_57bc13a92eaec92fbf9f0b315ce0baec | https://github.com/infiniflow/ragflow/releases/tag/nightly