Doramagic Project Pack · Human Manual

UltraRAG

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

Overview and Core Architecture

Related topics: MCP Servers and Core Components, Pipelines, Workflows and Examples

1. Purpose and Scope

UltraRAG is a lightweight RAG (Retrieval-Augmented Generation) development framework built on the Model Context Protocol (MCP) architecture. It is jointly developed by THUNLP at Tsinghua University, NEUIR at Northeastern University, OpenBMB, and AI9stars, and is positioned for both research exploration and industrial prototyping. Source: README.md:43-49.

The framework standardizes core RAG components — such as retrievers, generators, corpus processors, and evaluators — as independent MCP Servers, while a centralized MCP Client handles workflow orchestration. Developers express control flow (sequential, loop, and conditional branches) declaratively in YAML, letting them implement complex iterative RAG logic in a few dozen lines of configuration. Source: README.md:9-15.

The project targets two distinct user audiences:

  • Researchers who need standardized evaluation workflows, ready-to-use benchmarks, and reproducible metric management.
  • Developers / end users who need a fast path from a pipeline definition to a working conversational Web UI.

This dual-purpose design is reflected in the repository layout, which separates the orchestration layer (YAML pipelines, MCP client) from the component layer (atomic MCP servers such as corpus, retriever, generation, evaluation, and custom). Source: README.md:9-19, README.md:75-83.

2. Core Architecture

At a high level, UltraRAG is organized as a thin orchestration shell that talks to many small, pluggable MCP servers. Each server is a fastmcp application that registers its functionality through the @app.tool decorator, exposing one or more typed functions to the client.
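
The registration idea can be illustrated without any MCP dependency. The following is a minimal sketch, assuming a hypothetical ToyServer class and a hypothetical word_count tool; it is not the fastmcp or UltraRAG API, and it only shows how a decorator can collect functions into a named tool table that a client later calls by name.

```python
from typing import Any, Callable, Dict


class ToyServer:
    """Hypothetical stand-in for an MCP server; NOT the fastmcp API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, func: Callable[..., Any]) -> Callable[..., Any]:
        # Register the function under its own name and return it unchanged.
        self.tools[func.__name__] = func
        return func

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        # A real client would send this request over a transport;
        # here it is a direct in-process call.
        if tool_name not in self.tools:
            raise KeyError(f"{self.name}: unknown tool {tool_name!r}")
        return self.tools[tool_name](**kwargs)


app = ToyServer("custom")


@app.tool
def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())
```

Calling app.call("word_count", text="a b c") dispatches by name, which is the property that lets an orchestrator refer to tools from configuration rather than from imports.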

flowchart LR
    YAML[YAML Pipeline Config] --> Client[MCP Client / Orchestrator]
    UI[UltraRAG UI / Canvas] --> Client
    Client -->|tool call| S1[corpus Server]
    Client -->|tool call| S2[retriever Server]
    Client -->|tool call| S3[generation Server]
    Client -->|tool call| S4[evaluation Server]
    Client -->|tool call| S5[custom Server]
    S2 -->|backend| W1[FAISS / Milvus]
    S2 -->|backend| W2[Exa / Tavily / ZhipuAI]
    S1 --> Corpus[JSONL Chunks]
    S4 --> Results[Metrics + Reports]

The MCP architecture is the defining feature: every functional unit (chunking, embedding, retrieval, web search, generation, evaluation, prompt transformation) is decoupled into an independent server. New features only need to be registered as function-level tools to integrate into existing workflows, giving very high reusability. Source: README.md:13-19.

The UI is a separate React/TypeScript application that consumes the same pipelines. It uses @xyflow/react for the visual canvas, @tanstack/react-query for server state, and renders chat output through a custom Markdown pipeline that supports KaTeX math, tables, and citation link rewriting. Source: ui/frontend/package.json:11-29, ui/frontend/src/shared/lib/chatMarkdown.ts:6-22.

3. Pluggable Backend Pattern

A defining implementation detail of the framework is the backend registry pattern used by the retriever and corpus servers. Each category of capability (index storage, web search provider) is encapsulated behind an abstract base class, and concrete implementations are registered in a dictionary that maps a short name to a (module, class) pair. The factory function dynamically imports the module and instantiates the class.

For example, the index backend registry maps "faiss" and "milvus" to their respective backend classes, and a create_index_backend() factory is the single entry point used by callers. Source: servers/retriever/src/index_backends/__init__.py:5-26. The same pattern is used for web search, where the registry maps "exa", "tavily", and "zhipuai" to their backend classes, all inheriting from a common BaseWebSearchBackend. Source: servers/retriever/src/websearch_backends/__init__.py:8-32.

This pattern delivers three concrete benefits:

  1. Uniform configuration surface — callers only need to know the backend name and a config dict, not the underlying SDK.
  2. Optional dependency isolation — if a backend's SDK (e.g., exa_py, tavily, pymilvus) is missing, only that backend fails to load; the rest of the framework still runs. Source: servers/retriever/src/index_backends/milvus_backend.py:18-25, servers/retriever/src/websearch_backends/exa_backend.py:17-23.
  3. Swappable providers — switching from local FAISS to a managed Milvus cluster, or from Tavily to ZhipuAI web search, is purely a configuration change.
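
The registry-plus-factory pattern described above can be sketched in a few lines. This is an illustrative sketch rather than the repository's code: the names _REGISTRY and create_backend are hypothetical, and standard-library classes stand in for real backends so that the example runs without any optional SDK installed.

```python
import importlib
from typing import Any, Dict, Tuple

# Short name -> (module, class). Standard-library classes stand in for real
# backends here; a real registry would point at backend modules instead.
_REGISTRY: Dict[str, Tuple[str, str]] = {
    "ordered": ("collections", "OrderedDict"),
    "counter": ("collections", "Counter"),
}


def create_backend(name: str, *args: Any, **kwargs: Any) -> Any:
    """Hypothetical factory: import the module lazily, then instantiate the class."""
    try:
        module_name, class_name = _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown backend {name!r}; expected one of: {known}") from None
    # The import happens only when the backend is requested, so a missing
    # optional dependency affects just that backend.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)(*args, **kwargs)
```

Because the import is deferred until create_backend is called, a backend whose dependency is absent fails only when it is actually selected.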

The base web search class also implements _parallel_search, a generic concurrency-controlled async dispatcher with a configurable retrieve_thread_num, so every concrete backend automatically gets rate-limited parallel execution. Source: servers/retriever/src/websearch_backends/base.py:24-46.
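
A concurrency-limited dispatcher of this kind is commonly built on asyncio.Semaphore. The sketch below is an assumption about the general shape, not a copy of _parallel_search: parallel_search and fake_search are hypothetical names, and only the retrieve_thread_num parameter name comes from the text above.

```python
import asyncio
from typing import Awaitable, Callable, List


async def parallel_search(
    queries: List[str],
    search_one: Callable[[str], Awaitable[str]],
    retrieve_thread_num: int = 4,
) -> List[str]:
    """Run search_one over all queries with at most retrieve_thread_num in flight."""
    semaphore = asyncio.Semaphore(retrieve_thread_num)

    async def worker(query: str) -> str:
        # Each worker waits for a free slot before calling the provider.
        async with semaphore:
            return await search_one(query)

    # gather returns results in the same order as the input queries.
    return await asyncio.gather(*(worker(q) for q in queries))


async def fake_search(query: str) -> str:
    # Stand-in for a provider call; a real backend would hit a web search API.
    await asyncio.sleep(0.01)
    return f"result for {query}"
```

Running asyncio.run(parallel_search(["a", "b", "c"], fake_search, retrieve_thread_num=2)) keeps at most two calls active at once while preserving input order in the results.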

4. Component Inventory

The repository ships the following atomic servers, each addressing a single RAG concern:

| Server | Module Path | Responsibility |
| --- | --- | --- |
| corpus | servers/corpus/src/corpus.py | Token/sentence/recursive chunking via chonkie + tiktoken into JSONL. Source: servers/corpus/src/corpus.py:80-90 |
| retriever | servers/retriever/src/index_backends/ | Vector indexing + similarity search over FAISS or Milvus. Source: servers/retriever/src/index_backends/faiss_backend.py:18-26, servers/retriever/src/index_backends/milvus_backend.py:34-50 |
| retriever (web) | servers/retriever/src/websearch_backends/ | Pluggable web search across Exa, Tavily, ZhipuAI. Source: servers/retriever/src/websearch_backends/exa_backend.py:11-30, servers/retriever/src/websearch_backends/tavily_backend.py:12-30, servers/retriever/src/websearch_backends/zhipuai_backend.py:14-32 |
| evaluation | servers/evaluation/src/evaluation.py | Standardized metric collection, JSON + Markdown reporting with timestamped output. Source: servers/evaluation/src/evaluation.py:6-29 |
| custom | servers/custom/src/custom.py | RAG-specific prompt transforms: Search-o1 information extraction, IterRetGen query building, \boxed{} answer extraction. Source: servers/custom/src/custom.py:6-9, servers/custom/src/custom.py:80-95 |

Each tool is registered with an explicit output="a,b->c" mapping that tells the orchestrator how to feed the tool's return value into downstream step inputs. This is the contract that makes the YAML pipeline declarative. Source: servers/custom/src/custom.py:6-9, servers/custom/src/custom.py:80-95.
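
To make the mapping concrete, here is a small sketch of how a string such as "a,b->c" could be interpreted. It assumes that the names left of the arrow are the values a tool reads and the names on the right are where its result is stored; that reading, and the helper names parse_output_spec and apply_tool, are assumptions for illustration rather than the orchestrator's actual implementation.

```python
from typing import Any, Callable, Dict, List, Tuple


def parse_output_spec(spec: str) -> Tuple[List[str], List[str]]:
    """Split "a,b->c" into (["a", "b"], ["c"]). Assumed semantics: reads -> writes."""
    left, sep, right = spec.partition("->")
    if not sep:
        raise ValueError(f"Missing '->' in spec: {spec!r}")
    reads = [name.strip() for name in left.split(",") if name.strip()]
    writes = [name.strip() for name in right.split(",") if name.strip()]
    return reads, writes


def apply_tool(spec: str, func: Callable[..., Any], state: Dict[str, Any]) -> Dict[str, Any]:
    """Call func with the named inputs and store its result(s) under the output names."""
    reads, writes = parse_output_spec(spec)
    result = func(*(state[name] for name in reads))
    # A single output name receives the whole return value; several names
    # receive the elements of a returned tuple in order.
    values = result if len(writes) > 1 else (result,)
    return {**state, **dict(zip(writes, values))}
```

With this reading, apply_tool("a,b->c", lambda a, b: a + b, {"a": 1, "b": 2}) returns a new state that also contains c, which a later step can consume by name.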

5. Deployment and the "Pipeline-as-API" Question

The framework deliberately separates authoring from serving. A pipeline is first authored as a YAML file that the orchestrator runs locally, and the same YAML can be loaded by the UI's Pipeline Builder (canvas + code, with bidirectional sync) for visual debugging. Source: README.md:51-60, README.md:9-15.

A common community question (GitHub issue #95) is whether a finished pipeline can be exposed as a callable API, similar to Dify. The architecture already supports this direction: every server runs as a standalone fastmcp process over stdio, so only the MCP client, which drives the YAML execution loop, needs to be wrapped in an HTTP service. In practice, the typical patterns are:

  • Wrap the MCP Client runner behind a FastAPI/Flask handler that accepts a query, dispatches to the registered MCP servers, and returns the final answer.
  • For production, deploy each MCP server as a separate container and have the client connect over the network transport instead of stdio.
  • Use the UI as a frontend that already calls the same pipeline over HTTP, so the same backend can serve both the canvas and external API consumers.

The release notes for v0.3.0.2 (2026-04-09) further strengthen this serving story by adding SQLite-backed authentication, persistent chat sessions, nickname and model settings, and a memory-aware RAG demo — all of which assume a stateful HTTP service in front of the orchestrator. Source: README.md:107-117.

See Also

  • Research Experiments — datasets, evaluation workflows, and case-study debugging
  • UI Quick Start — launching the Pipeline Builder and admin mode
  • Deployment Guide — production setup for retrievers, LLMs, and Milvus
  • Code Integration — calling UltraRAG components directly from Python

Source: https://github.com/OpenBMB/UltraRAG / Human Manual

MCP Servers and Core Components

Related topics: Overview and Core Architecture, Pipelines, Workflows and Examples, UI, Memory System and API Deployment

Overview

UltraRAG is a lightweight RAG (Retrieval-Augmented Generation) development framework built on top of the Model Context Protocol (MCP). Its core design philosophy is to decouple every RAG capability into a standalone MCP Server that exposes fine-grained Tools over a standardized interface. A separate MCP Client orchestrates these servers through YAML pipelines, supporting sequential execution, loops, and conditional branches without writing glue code.

Source: README.md:14-18

The framework is jointly maintained by THUNLP at Tsinghua University, NEUIR at Northeastern University, OpenBMB, and AI9stars, and targets both research exploration and industrial prototyping. Because each server is a normal MCP process, the same tool can be reused across pipelines, swapped in benchmarks, or wrapped behind custom UIs.

Core MCP Server Inventory

UltraRAG ships a curated set of servers, each packaged as a separate module that runs as its own process over stdio. The following table summarizes the canonical servers found in the repository tree:

| Server | Module Path | Primary Responsibility |
| --- | --- | --- |
| corpus | servers/corpus/src/corpus.py | Document loading and chunking (token, sentence, recursive strategies via chonkie) |
| retriever | servers/retriever/src/retriever.py | Embedding-based and web-search-based retrieval with pluggable backends |
| generation | servers/generation/src/generation.py | LLM inference, including vLLM, multimodal, and multi-turn generation |
| evaluation | servers/evaluation/src/evaluation.py | Metric computation and result persistence (JSON + Markdown) |
| memory | servers/memory/src/memory.py | Persistent per-user and per-project memory with filesystem isolation |
| custom | servers/custom/src/custom.py | Project-specific utility tools (e.g., Search-o1 reason/final information extraction) |

Source: servers/corpus/src/corpus.py:1-50, servers/retriever/src/retriever.py:1-80, servers/memory/src/memory.py:1-30

Each server is instantiated through a shared helper, UltraRAG_MCP_Server, which is imported from the ultrarag.server package. For example, the memory server starts with app = UltraRAG_MCP_Server("memory") and then registers tools via decorators, while the generation server binds methods through mcp_inst.tool(...) with explicit output signatures that double as contract definitions for the client. Source: servers/memory/src/memory.py:14-19, servers/generation/src/generation.py:30-60

Retriever Internals: Pluggable Backends

The retriever is the most backend-rich server in the framework. It separates concerns into two sub-packages, each registered through its own factory:

Index Backends

The index layer is responsible for vector storage and nearest-neighbor search. Backends are dynamically loaded by name:

_INDEX_BACKENDS = {
    "faiss": ".faiss_backend.FaissIndexBackend",
    "milvus": ".milvus_backend.MilvusIndexBackend",
}

Source: servers/retriever/src/index_backends/__init__.py:10-14

When is_demo=True is set on the retriever, both the embedding backend and the index backend are forced to openai and milvus respectively, and the server raises a ValidationError if those keys are missing from configuration. Source: servers/retriever/src/retriever.py:42-58

Web-Search Backends

For open-domain retrieval, UltraRAG wraps three commercial search providers behind a common async interface:

_WEBSEARCH_BACKENDS = {
    "exa":     ".exa_backend.ExaWebSearchBackend",
    "tavily":  ".tavily_backend.TavilyWebSearchBackend",
    "zhipuai": ".zhipuai_backend.ZhipuaiWebSearchBackend",
}

Source: servers/retriever/src/websearch_backends/__init__.py:12-16

The abstract base class BaseWebSearchBackend implements a concurrency-controlled asyncio.Semaphore worker pool that processes queries in parallel, with a tqdm progress bar that integrates with the server's logging. Source: servers/retriever/src/websearch_backends/base.py:18-46

Each concrete backend reads its API key from configuration or environment variables (EXA_API_KEY, TAVILY_API_KEY, ZHIPUAI_API_KEY), and raises explicit ToolError/ImportError exceptions when dependencies or credentials are missing. This is the project's idiomatic way of surfacing misconfiguration to the orchestration layer. Source: servers/retriever/src/websearch_backends/exa_backend.py:18-30, servers/retriever/src/websearch_backends/tavily_backend.py:18-50, servers/retriever/src/websearch_backends/zhipuai_backend.py:15-30
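
The configuration-then-environment lookup can be sketched as follows. This is an illustrative sketch under stated assumptions: the "api_key" config key, the resolve_api_key helper, and the MissingCredentialError class are hypothetical, and the source describes ToolError being raised rather than this local exception type.

```python
import os
from typing import Mapping, Optional


class MissingCredentialError(RuntimeError):
    """Hypothetical error type; the source describes ToolError being raised."""


def resolve_api_key(config: Mapping[str, str], env_var: str) -> str:
    """Prefer an explicit config value, then fall back to the environment."""
    # "api_key" is an assumed config key used only for this illustration.
    key: Optional[str] = config.get("api_key") or os.environ.get(env_var)
    if not key:
        raise MissingCredentialError(
            f"No API key found: set 'api_key' in config or export {env_var}"
        )
    return key
```

Failing early with a message that names the expected environment variable keeps a misconfiguration visible at the point where the backend is created instead of at the first search call.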

Memory, Evaluation, and the Web UI

The memory server, introduced prominently in v0.3.0.2, stores persistent state under <ui>/storage/memory/<user_id>/, with a MEMORY.md file and a per-project subdirectory. The user_id is validated against ^[A-Za-z0-9_-]+$ to prevent path traversal, and the storage root can be relocated via the ULTRARAG_UI_STORAGE_ROOT environment variable. Source: servers/memory/src/memory.py:14-40
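
The validation and path resolution described here can be sketched as below. The regular expression and the ULTRARAG_UI_STORAGE_ROOT variable come from the text; the memory_dir helper, its default root, and the assumption that the environment variable replaces the whole storage root are illustrative guesses, not the server's actual code.

```python
import os
import re
from pathlib import Path

_USER_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def memory_dir(user_id: str, default_root: str = "ui/storage") -> Path:
    """Return <root>/memory/<user_id>, rejecting ids that could escape the root."""
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate.
    if not _USER_ID_RE.fullmatch(user_id):
        raise ValueError(f"Invalid user_id: {user_id!r}")
    # Assumption: the environment variable, when set, replaces the storage root.
    root = Path(os.environ.get("ULTRARAG_UI_STORAGE_ROOT", default_root))
    return root / "memory" / user_id
```

Because the pattern admits only letters, digits, underscores, and hyphens, inputs such as "../etc" or "a/b" are rejected before any path is built.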

The evaluation server is intentionally lightweight: it accepts a metric dictionary, writes a timestamped JSON file under the configured save_path, and optionally renders the result as a Markdown table for human inspection. Source: servers/evaluation/src/evaluation.py:30-60
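
A minimal sketch of this behavior, assuming hypothetical helper names save_metrics and to_markdown and an illustrative file-name pattern, might look like the following. It shows the two outputs the text mentions, a timestamped JSON file and a Markdown table, without claiming to match the server's exact format.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Dict


def save_metrics(metrics: Dict[str, float], save_dir: str) -> Path:
    """Write metrics to <save_dir>/eval_<timestamp>.json and return the path."""
    out_dir = Path(save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The file-name pattern is an assumption made for this illustration.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = out_dir / f"eval_{stamp}.json"
    path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")
    return path


def to_markdown(metrics: Dict[str, float]) -> str:
    """Render metrics as a two-column Markdown table."""
    rows = ["| Metric | Value |", "| --- | --- |"]
    rows += [f"| {name} | {value:.4f} |" for name, value in metrics.items()]
    return "\n".join(rows)
```

Keeping the timestamp in the file name means repeated runs never overwrite each other, which is what makes a series of benchmark runs comparable afterwards.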

The companion Web IDE is built on React 19, Vite, TanStack Query, and @xyflow/react for the pipeline canvas, with highlight.js, marked, katex, and dompurify powering the rich-text rendering of prompts and responses. Source: ui/frontend/package.json:6-28

Community Context

A recurring community question is whether a finished pipeline can be deployed as a callable API, similar to Dify. Because each server speaks MCP natively, the same YAML pipeline that runs locally can be driven by any MCP client; the practical pattern is to keep the orchestration running as a service and expose the client over HTTP, rather than wrapping the YAML itself. The v0.3.0.2 release also adds SQLite-backed authentication and persistent chat sessions in the Web UI, which already expose the pipelines as interactive endpoints that can be reverse-proxied behind a public API gateway.

See Also

  • UltraRAG Pipeline Authoring (YAML control structures)
  • Retriever Index Backend Configuration
  • Memory Server and UI Storage Layout
  • UltraRAG Web UI Overview

Pipelines, Workflows and Examples

Related topics: Overview and Core Architecture, MCP Servers and Core Components, UI, Memory System and API Deployment

Overview

A pipeline in UltraRAG is a YAML-declared workflow that orchestrates one or more MCP Servers into an end-to-end RAG or agent procedure. Instead of writing Python glue code, developers describe a graph of node invocations, control structures, and parameter bindings in a single configuration file. The MCP Client (the framework's orchestrator) consumes this YAML and dispatches calls to atomic MCP Servers such as corpus, retriever, generation, evaluation, and custom (Source: README.md).

UltraRAG natively supports the three control structures that RAG research and prototyping actually need: sequential execution, loop (e.g., iterative retrieval-generation), and conditional branch (Source: README.md). Each node in a pipeline is a function-level Tool exposed by an MCP Server, and new tools can be added by registering them — the pipeline layer remains unchanged. The example workflows under examples/experiments/ illustrate every pattern.

flowchart LR
    A[Query] --> B[Retriever Server]
    B --> C{Loop or Branch?}
    C -->|loop| B
    C -->|sequential| D[Generation Server]
    D --> E[Evaluation Server]
    E --> F[Result]
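
The three control structures can be illustrated with a toy interpreter. The sketch below is an assumption-laden illustration: the step schema (plain strings for tool calls, "loop" with "times" and "steps", "branch" with "on" and "cases") is hypothetical and does not reproduce UltraRAG's YAML syntax, and the tools are in-process lambdas rather than MCP server calls.

```python
from typing import Any, Callable, Dict, List, Union

State = Dict[str, Any]
Tool = Callable[[State], State]
Step = Union[str, Dict[str, Any]]


def run(steps: List[Step], tools: Dict[str, Tool], state: State) -> State:
    """Interpret a hypothetical step list: tool names, loops, and branches."""
    for step in steps:
        if isinstance(step, str):
            # Sequential execution: call the named tool on the current state.
            state = tools[step](state)
        elif "loop" in step:
            # Loop: repeat a sub-pipeline a fixed number of times.
            for _ in range(step["loop"]["times"]):
                state = run(step["loop"]["steps"], tools, state)
        elif "branch" in step:
            # Branch: pick a sub-pipeline based on a value already in the state.
            branch = step["branch"]
            state = run(branch["cases"][state[branch["on"]]], tools, state)
        else:
            raise ValueError(f"Unknown step: {step!r}")
    return state


tools: Dict[str, Tool] = {
    "retrieve": lambda s: {**s, "docs": s.get("docs", 0) + 1},
    "route": lambda s: {**s, "route": "enough" if s["docs"] >= 2 else "more"},
    "generate": lambda s: {**s, "answer": f"answer from {s['docs']} docs"},
}

pipeline: List[Step] = [
    {"loop": {"times": 2, "steps": ["retrieve"]}},
    "route",
    {"branch": {"on": "route", "cases": {"enough": ["generate"],
                                         "more": ["retrieve", "generate"]}}},
]
```

Calling run(pipeline, tools, {}) retrieves twice, routes on the accumulated state, and then generates. The point of the design is that the pipeline is plain data, so adding a tool changes the tools table but not the interpreter.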

Bundled Example Pipelines

The examples/experiments/ directory ships a small, ordered curriculum of pipeline YAMLs that doubles as both documentation and test fixtures.

| Pipeline | Purpose | Key Servers Touched |
| --- | --- | --- |
| sayhello.yaml | Minimal smoke test that wires one server call end-to-end | custom |
| rag_full.yaml | Canonical RAG: retrieve → generate → evaluate | retriever, generation, evaluation |
| rag_loop.yaml | Iterative retrieval/generation (e.g., IRCoT-style refinement) | retriever, generation, custom |
| rag_branch.yaml | Conditional routing (e.g., skip retrieval when confidence is high) | retriever, generation |
| rag_deploy.yaml | Production-shaped pipeline ready to be served as a demo | retriever, generation, custom |
| ircot.yaml | Interleaved Retrieval + Chain-of-Thought research recipe | retriever, generation, custom |

Sequential flow (rag_full.yaml)

The default RAG path chains three nodes: corpus indexing (optional), a retriever call, and a generation call. Retriever backends are pluggable — create_index_backend resolves names like faiss and milvus from the index_backends registry (Source: servers/retriever/src/index_backends/__init__.py), and create_websearch_backend resolves exa, tavily, and zhipuai (Source: servers/retriever/src/websearch_backends/__init__.py). This lets rag_full.yaml remain backend-agnostic — switching from FAISS to Milvus or Tavily to Exa is a one-line config change.

Loop flow (rag_loop.yaml and ircot.yaml)

Loop pipelines repeatedly invoke a sub-graph until a stop condition is met. The custom server supplies the glue: iterretgen_nextquery concatenates the previous query and answer to produce the next retrieval query (Source: servers/custom/src/custom.py), and search_o1_extract_query pulls <|begin_of_query|>-tagged queries out of LLM output for the next iteration. ircot.yaml builds on the same primitives to interleave chain-of-thought reasoning with retrieval steps.
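
The two glue operations can be sketched in plain Python. This is an illustration of the described behavior, not the code in custom.py: the function names next_queries and extract_query are hypothetical, and the closing tag <|end_of_query|> is an assumption, since only the opening tag appears in the text.

```python
import re
from typing import List

BEGIN = "<|begin_of_query|>"
END = "<|end_of_query|>"  # assumed closing tag; only the opening tag is documented


def next_queries(questions: List[str], answers: List[str]) -> List[str]:
    """IterRetGen-style step: append the latest answer to each question."""
    return [f"{q} {a}".strip() for q, a in zip(questions, answers)]


def extract_query(llm_output: str) -> str:
    """Return the last tagged query in llm_output, or "" if none is present."""
    pattern = re.escape(BEGIN) + r"(.*?)" + re.escape(END)
    matches = re.findall(pattern, llm_output, flags=re.DOTALL)
    return matches[-1].strip() if matches else ""
```

In a loop, the empty string returned by extract_query is a natural stop signal: when the model emits no new tagged query, there is nothing left to retrieve.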

Branch flow (rag_branch.yaml)

Conditional branches route execution based on a runtime predicate evaluated against the current state. Typical predicates include "retrieval confidence above threshold" or "answer already contains citation." The branch node is declared in YAML; the underlying evaluation logic is supplied by an MCP Server tool, keeping the orchestration declarative.

Evaluation and Debugging Hooks

Every research pipeline can attach the evaluation server as a terminal node. The save_eval_results tool writes timestamped JSON and optionally prints a Markdown table of averaged metrics, which makes benchmarking reproducible (Source: servers/evaluation/src/evaluation.py). The repo also provides a Structured Debugging Guide covering four layers — input & retrieval, reasoning & planning, state & context, and deployment & runtime — to attribute failures when answers look suspicious (Source: README.md).

From Pipeline to Service

A recurring community question is how to expose a finished pipeline as a callable HTTP API (Dify-style). The rag_deploy.yaml example and the One-Click Delivery workflow address this: a pipeline is converted into an interactive conversational Web UI with a single command (Source: README.md). For developers, the recommended path is to start from rag_deploy.yaml, then consult the Deployment Guide for production environment setup including Retriever, LLM, and Milvus configuration. The Deep Research demo (powered by the AgentCPM-Report model) demonstrates this end-to-end: a pipeline runs multi-step retrieval and integration to produce a long-form report (Source: README.md).

UI, Memory System and API Deployment

Related topics: Overview and Core Architecture, MCP Servers and Core Components, Pipelines, Workflows and Examples

1. Overview and Scope

UltraRAG ships a first-class visual RAG Integrated Development Environment (IDE) that goes beyond a conventional chat interface. The UI combines pipeline orchestration, debugging, and demonstration in a single surface, allowing users to design, run, and inspect MCP-based RAG pipelines without writing code by hand. According to the project README, "UltraRAG UI transcends the boundaries of traditional chat interfaces, evolving into a visual RAG Integrated Development Environment (IDE) that combines orchestration, debugging, and demonstration." Source: README.md

Six concerns are tightly coupled in the ui/backend module:

| Concern | Source File | Role |
| --- | --- | --- |
| HTTP entry point | ui/backend/app.py | Hosts REST endpoints consumed by the React/Vite frontend (ui/frontend/package.json) |
| Identity | ui/backend/auth.py | SQLite-backed authentication, nickname and model settings |
| State | ui/backend/chat_store.py | Persistent chat sessions |
| Knowledge base ACL | ui/backend/kb_visibility_store.py | Per-user knowledge base visibility |
| Pipeline execution | ui/backend/pipeline_manager.py | Bridges UI actions to MCP servers and pipelines |
| Path resolution | ui/backend/storage_paths.py | Centralizes where artefacts are written on disk |

The v0.3.0.2 release (2026-04-09) explicitly introduced a memory upgrade: persistent user memory, project memory retrieval, and a memory-aware RAG demo, alongside SQLite-backed authentication, persistent chat sessions, and nickname and model settings management. Source: GitHub Release v0.3.0.2

2. The UI: A Visual RAG IDE

The UI is implemented as a React 19 + Vite single-page application that talks to the FastAPI/Flask-style backend in ui/backend/app.py. The frontend stack (Radix UI primitives, @xyflow/react for the canvas, @tanstack/react-query for data fetching, marked + KaTeX for rendering, and js-yaml for editing) indicates a canvas-based pipeline builder with bidirectional YAML synchronization. Source: ui/frontend/package.json

Key user-facing capabilities, as documented in the README, include:

  • Pipeline Builder with bidirectional real-time synchronization between "Canvas Construction" and "Code Editing," allowing granular online adjustments of pipeline parameters and prompts.
  • Intelligent AI Assistant that assists the full development lifecycle.
  • One-Click Delivery — a Pipeline defined in YAML can be converted into an interactive conversational Web UI. Source: README.md

The flow between a user's click and a pipeline execution is mediated by pipeline_manager.py, which wraps the YAML-driven MCP client and the atomic MCP servers (Retriever, Generation, Corpus, Evaluation, Custom).

flowchart LR
    User[Browser UI] -->|HTTP| App[app.py]
    App --> Auth[auth.py]
    App --> PM[pipeline_manager.py]
    PM -->|YAML| MCPClient[MCP Client]
    MCPClient --> Srv1[Retriever Server]
    MCPClient --> Srv2[Generation Server]
    MCPClient --> Srv3[Corpus / Eval / Custom]
    PM --> Chat[chat_store.py]
    Chat --> SQLite[(SQLite)]
    Auth --> SQLite
    KBV[kb_visibility_store.py] --> SQLite
    SP[storage_paths.py] --> FS[(File System)]

3. Memory System (v0.3.0.2)

The v0.3.0.2 release introduced three layered memory capabilities, all routed through the ui/backend layer:

3.1 Persistent User Memory

auth.py and chat_store.py together provide SQLite-backed authentication, nicknames, model preferences, and persistent chat sessions. This means a returning user sees their prior conversations, selected model, and identity without reconfiguration. Source: GitHub Release v0.3.0.2
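
To show what SQLite-backed chat persistence can look like, here is a minimal sketch using the standard sqlite3 module. The table layout and the helpers open_store, add_message, and history are entirely hypothetical and are not taken from chat_store.py; the sketch only demonstrates per-user, per-session storage with parameterized queries.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a chat store. The schema is illustrative only."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id TEXT NOT NULL, session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def add_message(conn: sqlite3.Connection, user_id: str, session_id: str,
                role: str, content: str) -> None:
    # Parameterized query: values are never interpolated into the SQL string.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (user_id, session_id, role, content) "
            "VALUES (?, ?, ?, ?)",
            (user_id, session_id, role, content),
        )


def history(conn: sqlite3.Connection, user_id: str, session_id: str) -> List[Tuple[str, str]]:
    """Return (role, content) pairs for one user's session in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages "
        "WHERE user_id = ? AND session_id = ? ORDER BY id",
        (user_id, session_id),
    )
    return list(rows)
```

Passing a file path instead of ":memory:" makes the history survive restarts, which is the property a returning user relies on.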

3.2 Project Memory Retrieval

Project memory is shared, retrievable state that augments the RAG pipeline itself. The release notes describe "persistent user memory, project memory retrieval, and a dedicated memory-aware RAG demo." This is exposed as additional context fetched by the pipeline orchestrator (pipeline_manager.py) before generation. Source: GitHub Release v0.3.0.2

3.3 Memory-Aware RAG Demo

A dedicated demo showcases how the memory layer plugs into an existing pipeline. The demo is delivered as a configured YAML pipeline plus a UI mode, leveraging storage_paths.py to keep memory artefacts isolated per project. Source: ui/backend/storage_paths.py

Together, the three layers make the demo experience "significantly more stateful and personalized" — every chat turn is grounded in the user's identity, project context, and historical interactions. Source: GitHub Release v0.3.0.2

4. Pipeline Deployment as an API

A recurring community question is whether a tested pipeline can be exposed as a callable HTTP API, comparable to Dify. The top community issue (#95) asks: "Does it support wrapping a pipeline as an API that can be called, similar to Dify?" — confirming strong demand for productionization. Source: GitHub Issue #95

UltraRAG answers this through the same ui/backend/app.py layer used by the web IDE. The pipeline_manager.py module loads a YAML pipeline, instantiates the MCP client, and invokes the configured MCP servers. Because this invocation path is decoupled from the WebSocket/HTTP transport used by the SPA, the same manager can be exposed behind any HTTP route — effectively turning the pipeline into a callable service.

The typical deployment pattern is:

  1. Author the pipeline as YAML (using the visual builder or by hand).
  2. Configure a backend route in app.py that accepts a request body, hands it to pipeline_manager.py, and returns the pipeline's final output.
  3. Reuse the existing MCP servers under servers/* (Retriever, Generation, Corpus, Evaluation, Custom) — for example, the servers/retriever/src/retriever.py server already exposes FAISS and Milvus index backends plus Exa / Tavily / ZhipuAI web search backends, all addressable from the same API surface. Source: servers/retriever/src/retriever.py
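
Step 2 can be sketched with the standard library alone. The code below is a hedged illustration, not the contents of app.py: run_pipeline is a hypothetical placeholder for the call into the pipeline manager, the {"query": ...} request shape is an assumption, and http.server is used only so the example has no third-party dependencies.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Tuple


def run_pipeline(query: str) -> str:
    """Hypothetical placeholder for the real pipeline call; returns an echo."""
    return f"echo: {query}"


def handle_query(body: bytes) -> Tuple[int, bytes]:
    """Validate a JSON request body and return (status code, JSON response)."""
    try:
        payload = json.loads(body or b"{}")
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, json.dumps({"error": "body must be valid JSON"}).encode()
    query = payload.get("query") if isinstance(payload, dict) else None
    if not isinstance(query, str) or not query.strip():
        return 400, json.dumps({"error": "'query' must be a non-empty string"}).encode()
    return 200, json.dumps({"answer": run_pipeline(query)}).encode()


class PipelineHandler(BaseHTTPRequestHandler):
    """Thin HTTP adapter; all logic lives in handle_query so it is easy to test."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_query(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)
```

To try it locally, one could pass PipelineHandler to http.server.HTTPServer bound to 127.0.0.1 and call serve_forever(). A production deployment would more likely reuse the existing backend framework and its authentication layer.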

For knowledge base isolation between API consumers, kb_visibility_store.py provides per-user access control so that the same deployed API can serve multiple tenants without leaking corpora. Source: ui/backend/kb_visibility_store.py

5. Common Failure Modes

  • Authentication required: With SQLite-backed auth enabled in v0.3.0.2, API callers must provide valid credentials; anonymous calls return 401 from auth.py.
  • Missing index backend: If the pipeline selects Milvus but pymilvus is not installed, the retriever raises ImportError. Source: servers/retriever/src/index_backends/milvus_backend.py
  • Missing web search dependency: Each web search backend (Exa, Tavily, ZhipuAI) lazily imports its client and raises ImportError if the optional dependency is missing. Source: servers/retriever/src/websearch_backends/__init__.py
  • Chunking backend unavailable: The corpus server requires chonkie and tiktoken; absence raises ToolError. Source: servers/corpus/src/corpus.py
  • Stale project paths: Moving the project directory invalidates the resolved paths from storage_paths.py and causes write failures.
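
Several of these failures come from missing optional packages, so a preflight check can surface them before a pipeline starts. The sketch below assumes hypothetical helpers missing_modules and preflight; the module names in the usage note (pymilvus, exa_py, tavily, chonkie, tiktoken) are the ones mentioned in this manual.

```python
import importlib.util
from typing import Dict, Iterable, List


def missing_modules(required: Iterable[str]) -> List[str]:
    """Return the top-level module names from required that cannot be found."""
    # find_spec returns None for an absent top-level module without importing it.
    return [name for name in required if importlib.util.find_spec(name) is None]


def preflight(backends: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each backend name to its missing modules; an empty list means ready."""
    return {backend: missing_modules(mods) for backend, mods in backends.items()}
```

For example, preflight({"milvus": ["pymilvus"], "exa": ["exa_py"], "tavily": ["tavily"], "corpus": ["chonkie", "tiktoken"]}) reports which backends would fail to load in the current environment, without importing any of them.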

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 6 structured pitfall items, none of them high or blocking. Top priority: Capability evidence risk (requires verification).

1. Capability evidence risk: requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/OpenBMB/UltraRAG

2. Maintenance risk: requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG

3. Security or permission risk: requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/OpenBMB/UltraRAG

4. Security or permission risk: requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/OpenBMB/UltraRAG

5. Maintenance risk: requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG

6. Maintenance risk: requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 9 project-level external discussion links are exposed on this manual page.

Review before install: open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using UltraRAG with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence