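# https://github.com/OpenBMB/UltraRAG Project Manual

Generated: 2026-06-17 05:11:41 UTC

## Table of Contents

- [UltraRAG Framework Overview and Installation](#page-1)
- [MCP Server Architecture and Pipeline Orchestration (with API Wrapping)](#page-2)
- [UltraRAG UI: A Visual RAG IDE](#page-3)
- [RAG Workflows, Memory System, and Deployment](#page-4)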

<a id='page-1'></a>

## UltraRAG Framework Overview and Installation

### Related Pages

Related topics: [MCP Server Architecture and Pipeline Orchestration (with API Wrapping)](#page-2)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)
- [servers/corpus/src/corpus.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/corpus/src/corpus.py)
- [servers/retriever/src/websearch_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/__init__.py)
- [servers/retriever/src/websearch_backends/base.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/base.py)
- [servers/retriever/src/websearch_backends/exa_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/exa_backend.py)
- [servers/retriever/src/websearch_backends/tavily_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/tavily_backend.py)
- [servers/retriever/src/websearch_backends/zhipuai_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/zhipuai_backend.py)
- [servers/retriever/src/index_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/__init__.py)
- [servers/retriever/src/index_backends/faiss_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/faiss_backend.py)
- [servers/retriever/src/index_backends/milvus_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/milvus_backend.py)
- [servers/evaluation/src/evaluation.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/evaluation/src/evaluation.py)
- [servers/custom/src/custom.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/custom/src/custom.py)
</details>

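# UltraRAG Framework Overview and Installation

## Framework Positioning and Design Philosophy

UltraRAG is the first lightweight RAG (Retrieval-Augmented Generation) development framework built on the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) architecture. It was released jointly by THUNLP (Tsinghua University), NEUIR (Northeastern University), OpenBMB, and AI9stars, and is positioned as infrastructure for research exploration and industrial prototyping. The framework natively supports sequential, loop, and conditional-branch control structures, so developers can express complex iterative RAG logic in a few dozen lines of YAML configuration. Source: [README.md]()

The core design idea is "atomic servers + client-side orchestration": core RAG components (Retriever, Generation, and so on) are standardized as independent **MCP Servers**, and the **MCP Client** handles workflow orchestration. A new capability only needs to be registered as a function-level Tool to be reusable from any Pipeline. Source: [README.md]()

## MCP Architecture and Core Modules

UltraRAG's overall architecture uses the MCP Client as the orchestration hub, connected to several atomic MCP Servers. The diagram below shows the main service nodes and backend extension points.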

```mermaid
flowchart LR
  Client[MCP Client<br/>YAML orchestration] -->|invoke Tool| S1[corpus Server<br/>document chunking]
  Client -->|invoke Tool| S2[retriever Server<br/>retrieval and indexing]
  Client -->|invoke Tool| S3[evaluation Server<br/>metric computation]
  Client -->|invoke Tool| S4[custom Server<br/>task-specific tools]
  S2 --> WS[WebSearch Backends<br/>Exa / Tavily / ZhipuAI]
  S2 --> IDX[Index Backends<br/>FAISS / Milvus]
```

The responsibilities and extension points of each Server are as follows:

- **corpus**: Built on `chonkie` and `tiktoken`, it offers three chunking strategies (token, sentence, recursive), writes results to `output/corpus/chunks.jsonl` by default, and can inject the document title into each chunk's text. Source: [servers/corpus/src/corpus.py]()
- **retriever**: Loads backends through a factory pattern. `create_index_backend()` supports `faiss` and `milvus`; a new backend becomes discoverable once an entry is added to the `_INDEX_BACKENDS` mapping. Source: [servers/retriever/src/index_backends/__init__.py]() `create_websearch_backend()` supports three web search backends, `exa`, `tavily`, and `zhipuai`, and follows the same factory registration pattern. Source: [servers/retriever/src/websearch_backends/__init__.py]()
- **index backends**: `FaissIndexBackend` depends on `faiss-cpu` or `faiss-gpu-cu12` and exposes the `index_use_gpu` and `device_num` parameters to control GPU acceleration. Source: [servers/retriever/src/index_backends/faiss_backend.py]() `MilvusIndexBackend` connects to a remote service through `pymilvus`, authenticates with `uri` / `token`, and manages data by collection name. Source: [servers/retriever/src/index_backends/milvus_backend.py]()
- **websearch backends**: `TavilyWebSearchBackend` requires the `TAVILY_API_KEY` environment variable. Source: [servers/retriever/src/websearch_backends/tavily_backend.py]() `ZhipuaiWebSearchBackend` requires `ZHIPUAI_API_KEY` and supports parameters such as `search_engine` and `search_recency_filter`. Source: [servers/retriever/src/websearch_backends/zhipuai_backend.py]() `ExaWebSearchBackend` depends on the `exa_py` async client. Source: [servers/retriever/src/websearch_backends/exa_backend.py]() All search backends share the `_parallel_search` method of `BaseWebSearchBackend`, which limits concurrency with `asyncio.Semaphore`. Source: [servers/retriever/src/websearch_backends/base.py]()
- **evaluation**: Saves metric results as timestamped JSON files and can echo them immediately as a Markdown table, which makes it easier to compare multiple experiment runs. Source: [servers/evaluation/src/evaluation.py]()
- **custom**: Ships task-specific tools such as `search_o1_extract_reasoning` and `search_o1_combine_final_information`, which Search-o1 style Pipelines can call directly. Source: [servers/custom/src/custom.py]()
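The semaphore-bounded concurrency mentioned for the web search backends can be illustrated with a small standard-library sketch. This is not the code from `base.py`: the names `parallel_search`, `search_one`, `fake_search`, and `max_concurrency` are hypothetical, and the real `_parallel_search` may differ in signature and behavior.

```python
import asyncio
from typing import Awaitable, Callable, List


async def parallel_search(
    queries: List[str],
    search_one: Callable[[str], Awaitable[List[str]]],
    max_concurrency: int = 4,
) -> List[List[str]]:
    """Run search_one over every query with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query: str) -> List[str]:
        # Each task waits here until a semaphore slot is free.
        async with semaphore:
            return await search_one(query)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(q) for q in queries))


async def fake_search(query: str) -> List[str]:
    # Hypothetical stand-in for a real web search call.
    await asyncio.sleep(0.01)
    return [f"result for {query}"]


hits = asyncio.run(parallel_search(["a", "b", "c"], fake_search, max_concurrency=2))
print(hits)
```

Bounding concurrency this way keeps a large query batch from exceeding a provider's rate limit while still overlapping network waits.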

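## Installation and Deployment

UltraRAG offers two installation paths: local installation from source (using [uv](https://github.com/astral-sh/uv) for package management is recommended) and Docker container deployment. Source: [README.md]()

The first step of a source installation is to set up `uv`, using either of the following: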

```shell
# Option 1: install directly with pip
pip install uv
# Option 2: download and run the official install script
curl -LsSf https://astral.sh/uv/install.sh | sh
```

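The repository also keeps branches for earlier major versions such as `v1` and `v2`, which you can switch to when migrating or comparing. Source: [README.md]()

Community Issue #95 reflects a typical user request: publishing a tested Pipeline as a callable API in one step, the way Dify does. The repository is still positioned as a "research experiment framework + visual IDE", and the source does not yet include a built-in Pipeline-as-API gateway, so exposing a Pipeline as an external service still requires wrapping an HTTP interface around the MCP Client yourself. See [Issue #95](https://github.com/OpenBMB/UltraRAG/issues/95).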
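As a rough illustration of what such a self-built HTTP wrapper could look like, here is a minimal standard-library sketch. Everything in it is an assumption for illustration: `run_pipeline` is a hypothetical placeholder for whatever code invokes the MCP Client, and the `question` / `answer` JSON fields are invented, not part of UltraRAG.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def run_pipeline(question: str) -> str:
    # Hypothetical placeholder: a real wrapper would call the MCP Client here.
    return f"echo: {question}"


def handle_payload(raw: bytes) -> Tuple[int, bytes]:
    """Map a JSON request body to (HTTP status, JSON response body)."""
    try:
        question = json.loads(raw.decode("utf-8"))["question"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'question' field"}
        return 400, json.dumps(error).encode("utf-8")
    if not isinstance(question, str):
        return 400, json.dumps({"error": "'question' must be a string"}).encode("utf-8")
    return 200, json.dumps({"answer": run_pipeline(question)}).encode("utf-8")


class PipelineHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_payload(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Blocking call; start it explicitly when you want the server running."""
    HTTPServer((host, port), PipelineHandler).serve_forever()
```

Keeping the request logic in `handle_payload` separates it from the socket handling, so it can be tested without starting a server; calling `serve()` would then accept `POST` requests on the chosen port.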

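## Core Capabilities

UltraRAG covers the full path from research idea to demo delivery with four capabilities: workflow orchestration (YAML describing sequential, loop, and conditional-branch flows), feature extension (registering atomic Tools through MCP), evaluation and comparison (standardized metrics plus integrated baselines), and rapid demos (one-step conversion of a Pipeline into an interactive Web UI). Source: [README.md]()

The latest release, v0.3.0.2, adds an end-to-end memory upgrade: user-level persistent memory, project-level memory retrieval, and a dedicated memory-aware RAG Demo. It also provides SQLite-based authentication, persistent chat sessions, and nickname and model settings management, making the Demo experience more stateful and personalized. Source: [README.md]()

## See Also

- UltraRAG server protocol and MCP integration
- Guide to extending retriever server backends
- evaluation workflow and metric system
- custom tool set and the Search-o1 paradigm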

---

<a id='page-2'></a>

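## MCP Server Architecture and Pipeline Orchestration (with API Wrapping)

### Related Pages

Related topics: [UltraRAG Framework Overview and Installation](#page-1), [UltraRAG UI: A Visual RAG IDE](#page-3), [RAG Workflows, Memory System, and Deployment](#page-4)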

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)
- [servers/corpus/src/corpus.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/corpus/src/corpus.py)
- [servers/retriever/src/index_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/__init__.py)
- [servers/retriever/src/index_backends/faiss_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/faiss_backend.py)
- [servers/retriever/src/index_backends/milvus_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/milvus_backend.py)
- [servers/retriever/src/websearch_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/__init__.py)
- [servers/retriever/src/websearch_backends/base.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/base.py)
- [servers/retriever/src/websearch_backends/exa_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/exa_backend.py)
- [servers/retriever/src/websearch_backends/tavily_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/tavily_backend.py)
- [servers/retriever/src/websearch_backends/zhipuai_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/zhipuai_backend.py)
- [servers/evaluation/src/evaluation.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/evaluation/src/evaluation.py)
- [servers/custom/src/custom.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/custom/src/custom.py)
- [servers/memory/src/memory.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/memory/src/memory.py)
</details>

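# MCP Server Architecture and Pipeline Orchestration (with API Wrapping)

## 1. Overall Architecture and Design Goals

UltraRAG is the first lightweight RAG development framework built on the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro), jointly released by THUNLP, NEUIR, OpenBMB, and AI9stars. Its core idea is to decompose retrieval, generation, evaluation, memory, and other core capabilities into **atomic MCP Servers**, while the **MCP Client** orchestrates sequential, loop, and conditional-branch control structures through YAML workflows. Developers can implement complex iterative RAG logic in a few dozen lines of YAML without writing glue code. `Source: [README.md:30-50]()`

Each atomic Server is created through the `UltraRAG_MCP_Server` factory and exposes its tools at function level (`@app.tool`). For example, the `corpus` server provides chunking through the `chunk_documents` tool and internally dispatches requests to the chonkie library's `TokenChunker` / `SentenceChunker` / `RecursiveChunker` backends. Its process entry point starts the standard MCP transport with `app.run(transport="stdio")`. `Source: [servers/corpus/src/corpus.py:80-180]()` `Source: [servers/corpus/src/corpus.py:220-260]()`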
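To make the chunking step concrete, the sketch below shows token-window chunking with optional title injection and JSONL output. It deliberately uses whitespace splitting as a stand-in for the chonkie chunkers and `tiktoken`, and the record fields (`id`, `doc_id`, `title`, `contents`) and function names are assumptions rather than the schema used by `corpus.py`.

```python
import json
from typing import Dict, Iterable, Iterator, List


def chunk_by_tokens(text: str, chunk_size: int = 5) -> List[str]:
    """Split on whitespace and group the tokens into fixed-size windows."""
    tokens = text.split()
    return [" ".join(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def build_chunk_records(
    doc: Dict[str, str], chunk_size: int = 5, add_title: bool = True
) -> Iterator[Dict[str, object]]:
    """Yield one record per chunk, optionally prefixing the document title."""
    for idx, chunk in enumerate(chunk_by_tokens(doc["contents"], chunk_size)):
        body = f"Title: {doc['title']}\n{chunk}" if add_title else chunk
        yield {"id": idx, "doc_id": doc["id"], "contents": body}


def to_jsonl(records: Iterable[Dict[str, object]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


doc = {"id": "d1", "title": "T", "contents": "a b c d e f g"}
records = list(build_chunk_records(doc, chunk_size=3))
print(to_jsonl(records))
```

A sentence or recursive strategy would only replace `chunk_by_tokens`; the record building and JSONL serialization stay the same.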

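## 2. Pipeline Orchestration Mechanism

The core of a Pipeline is **declarative YAML + client-side control flow**: nodes represent Servers, the steps inside a node describe the specific Tool and its args, and the orchestration layer schedules execution through primitives such as `sequential` / `loop` / `branch`. `Source: [README.md:18-30]()` As a result, the same YAML can swap backend implementations without changing the business logic.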
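The idea of client-side control flow over declared steps can be sketched with a tiny interpreter. This is an illustrative toy, not UltraRAG's orchestrator: the step schema (`tool`, `loop`, `branch`, `times`, `on`, `cases`) and the three example tools are invented for this sketch, and real pipelines are written in YAML rather than as Python dictionaries.

```python
from typing import Any, Callable, Dict, List

Tool = Callable[[Dict[str, Any]], None]


def run_steps(steps: List[Dict[str, Any]], tools: Dict[str, Tool], state: Dict[str, Any]) -> None:
    """Execute steps in order against a shared state dict."""
    for step in steps:
        if "tool" in step:
            tools[step["tool"]](state)
        elif "loop" in step:
            for _ in range(step["loop"]["times"]):
                run_steps(step["loop"]["steps"], tools, state)
        elif "branch" in step:
            # Pick the sub-pipeline whose case label matches a state value.
            label = state[step["branch"]["on"]]
            run_steps(step["branch"]["cases"].get(label, []), tools, state)
        else:
            raise ValueError(f"unknown step: {step}")


def retrieve(state: Dict[str, Any]) -> None:
    state["docs"].append(f"doc{len(state['docs'])}")


def decide(state: Dict[str, Any]) -> None:
    state["enough"] = "yes" if len(state["docs"]) >= 2 else "no"


def answer(state: Dict[str, Any]) -> None:
    state["answer"] = f"answer from {len(state['docs'])} docs"


TOOLS: Dict[str, Tool] = {"retrieve": retrieve, "decide": decide, "answer": answer}
PIPELINE = [
    {"loop": {"times": 2, "steps": [{"tool": "retrieve"}]}},
    {"tool": "decide"},
    {"branch": {"on": "enough", "cases": {"yes": [{"tool": "answer"}], "no": [{"tool": "retrieve"}]}}},
]

STATE: Dict[str, Any] = {"docs": []}
run_steps(PIPELINE, TOOLS, STATE)
print(STATE)
```

Because the control flow lives in data, swapping `retrieve` for another implementation changes the `TOOLS` mapping only, which mirrors the backend-replacement property described above.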

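For capabilities with several implementations, such as retrieval and indexing, UltraRAG uses a **backend registry pattern**. The `create_index_backend("faiss", ...)` factory function looks up the concrete implementation by name in the `_INDEX_BACKENDS` dictionary, which currently contains two index types, `"faiss"` and `"milvus"`. `Source: [servers/retriever/src/index_backends/__init__.py:9-19]()` `Source: [servers/retriever/src/index_backends/__init__.py:21-50]()` In the Pipeline YAML you only switch the `backend` field, and the orchestration layer reuses the same tool signatures.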
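A registry of this kind can be sketched in a few lines. The names `_INDEX_BACKENDS`, `create_index_backend`, and `BaseIndexBackend` come from the text above, but the class bodies, the constructor signature, and the `FaissLikeBackend` / `MilvusLikeBackend` stand-ins are assumptions made for illustration; the actual module may look different.

```python
from typing import Any, Dict, Type


class BaseIndexBackend:
    """Hypothetical minimal base class: it only stores its configuration."""

    def __init__(self, **config: Any) -> None:
        self.config = config


class FaissLikeBackend(BaseIndexBackend):
    """Stand-in for a FAISS-backed index (no real FAISS calls)."""


class MilvusLikeBackend(BaseIndexBackend):
    """Stand-in for a Milvus-backed index (no real pymilvus calls)."""


# Adding a backend means adding one entry to this mapping.
_INDEX_BACKENDS: Dict[str, Type[BaseIndexBackend]] = {
    "faiss": FaissLikeBackend,
    "milvus": MilvusLikeBackend,
}


def create_index_backend(name: str, **config: Any) -> BaseIndexBackend:
    """Instantiate the backend registered under name, or fail with the valid names."""
    try:
        backend_cls = _INDEX_BACKENDS[name.lower()]
    except KeyError:
        available = ", ".join(sorted(_INDEX_BACKENDS))
        raise ValueError(f"Unknown index backend {name!r}; available: {available}") from None
    return backend_cls(**config)


backend = create_index_backend("faiss", index_use_gpu=False)
print(type(backend).__name__, backend.config)
```

Callers depend only on the name string and the base interface, which is what lets a configuration field select the implementation.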

The diagram below shows the relationship between Server registration, client-side orchestration, and the concrete backends:

```mermaid
flowchart LR
    YAML[YAML Pipeline] --> Client[MCP Client<br/>Orchestrator]
    Client -->|invoke tool| Corpus[corpus server]
    Client -->|invoke tool| Retriever[retriever server]
    Client -->|invoke tool| Memory[memory server]
    Client -->|invoke tool| Eval[evaluation server]
    Retriever -->|factory| IndexFaiss[FAISS backend]
    Retriever -->|factory| IndexMilvus[Milvus backend]
    Retriever -->|factory| WSExa[Exa websearch]
    Retriever -->|factory| WSTavily[Tavily websearch]
    Retriever -->|factory| WSZhipu[ZhipuAI websearch]
```

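## 3. API Wrapping: Turning a Pipeline into a Callable Service

Community [Issue #95](https://github.com/OpenBMB/UltraRAG/issues/95) asks how to wrap a Pipeline as a callable API similar to Dify. The repository does not ship a ready-made API gateway, but its design leaves three paths open:

1. **One-step export from the UI**: `UltraRAG UI` includes a Pipeline Builder with two-way sync between the canvas and the code editor, and can convert Pipeline logic into an interactive Web UI in one step. `Source: [README.md:55-72]()`
2. **Standard MCP transport**: Each atomic Server runs over the `stdio` transport by default, so exposing it as an HTTP API means placing it behind an MCP-compatible gateway or a thin HTTP wrapper (for example one written with FastAPI). `Source: [servers/corpus/src/corpus.py:255-262]()`
3. **Stateful sessions introduced in v0.3.0.2**: This release brings SQLite authentication, persistent sessions, and nickname and model settings management. `servers/memory` uses `_ensure_user_memory_paths` to persist the user-level `MEMORY.md` and supports project-level memory retrieval, so an API can keep personalized context across multi-turn calls. `Source: [servers/memory/src/memory.py:28-55]()` `Source: [servers/memory/src/memory.py:55-90]()`

> Suggested wrapping flow: validate locally with `ultrarag run pipeline.yaml` → register `servers/*` with an MCP-HTTP gateway, or use UltraRAG UI directly → expose an HTTP endpoint of your own (for example an OpenAI-compatible `/chat` route) and use the `memory` Server to maintain session state.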

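## 4. Extension Points and Common Failure Modes

To add a new capability, developers typically create `servers/<name>/src/<name>.py`, instantiate `UltraRAG_MCP_Server`, and declare the tool signature with `@app.tool(output="a,b->c")`. The return value is serialized and passed as input to the next node in the pipeline. For example, `search_o1_extract_final_information` in the `custom` server splits the answer text on the `**Final Information**` marker. `Source: [servers/custom/src/custom.py:1-60]()` `Source: [servers/custom/src/custom.py:60-110]()`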

When a dependency or environment variable is missing, the common backends in the registry print the following typical log messages, which can help with troubleshooting:

- **Index backends**: `faiss` missing → `faiss is not installed. Please install it with pip install faiss-cpu` `Source: [servers/retriever/src/index_backends/faiss_backend.py:18-35]()`; `milvus` missing → `pymilvus is not installed. Install it with pip install pymilvus` `Source: [servers/retriever/src/index_backends/milvus_backend.py:21-38]()`.
- **Web search backends**: `exa` missing → `exa_py is not installed` `Source: [servers/retriever/src/websearch_backends/exa_backend.py:12-22]()`; `tavily` missing → `tavily is not installed. Please install it with tavily-python` `Source: [servers/retriever/src/websearch_backends/tavily_backend.py:13-30]()`; `zhipuai` prints `ZHIPUAI_API_KEY environment variable is not set.` when `ZHIPUAI_API_KEY` is unset `Source: [servers/retriever/src/websearch_backends/zhipuai_backend.py:18-45]()`.

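For concurrency and scheduling, index and search backends inherit from `BaseIndexBackend` and `BaseWebSearchBackend` respectively. Parallel retrieval is controlled jointly by `asyncio.Semaphore` and `retrieve_thread_num`, and these settings are passed through transparently from the orchestration layer. `Source: [servers/retriever/src/websearch_backends/base.py:18-50]()` When saving results, the `evaluation` server names files with a timestamp (`*_YYYYMMDD_HHMMSS.json`) and renders the `avg_*` metrics as a Markdown table, which makes manual comparison of experiments easier. `Source: [servers/evaluation/src/evaluation.py:30-80]()` `Source: [servers/evaluation/src/evaluation.py:80-130]()`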
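The timestamped save and Markdown rendering described for the evaluation server could look roughly like the following. This is a sketch under assumptions: the function names, the `eval` file prefix, the two-column table layout, and the four-decimal formatting are invented here and may not match `evaluation.py`.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Dict


def save_results(results: Dict[str, float], out_dir: Path, prefix: str = "eval") -> Path:
    """Write results to <prefix>_YYYYMMDD_HHMMSS.json and return the path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = out_dir / f"{prefix}_{stamp}.json"
    path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return path


def to_markdown(results: Dict[str, float]) -> str:
    """Render only the avg_-prefixed metrics as a two-column Markdown table."""
    lines = ["| Metric | Value |", "| --- | --- |"]
    for name, value in results.items():
        if name.startswith("avg_"):
            lines.append(f"| {name} | {value:.4f} |")
    return "\n".join(lines)


metrics = {"avg_f1": 0.5, "avg_em": 0.25, "num_samples": 8}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_results(metrics, Path(tmp))
    saved_name = saved.name
    reloaded = json.loads(saved.read_text(encoding="utf-8"))
print(saved_name)
print(to_markdown(metrics))
```

Timestamped filenames keep every run on disk, so earlier results are not overwritten when an experiment is repeated.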

---

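**See Also**:

- Knowledge base construction and chunking strategies → `corpus` server documentation
- Multimodal MinerU parsing → MinerU-related tools in the `corpus` server
- Evaluation and benchmark comparison → `evaluation` server documentation
- Persistent memory and personalization → `memory` server documentation (v0.3.0.2)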

---

<a id='page-3'></a>

## UltraRAG UI: A Visual RAG IDE

### Related Pages

Related topics: [MCP Server Architecture and Pipeline Orchestration (with API Wrapping)](#page-2), [RAG Workflows, Memory System, and Deployment](#page-4)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [ui/frontend/README.md](https://github.com/OpenBMB/UltraRAG/blob/main/ui/frontend/README.md)
- [ui/backend/app.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/app.py)
- [ui/backend/auth.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/auth.py)
- [ui/backend/chat_store.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/chat_store.py)
- [ui/backend/kb_visibility_store.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/kb_visibility_store.py)
- [ui/backend/pipeline_manager.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/pipeline_manager.py)
- [ui/backend/storage_paths.py](https://github.com/OpenBMB/UltraRAG/blob/main/ui/backend/storage_paths.py)
- [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)
</details>

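# UltraRAG UI: A Visual RAG IDE

## Overview and Design Goals

UltraRAG UI is a **visual RAG integrated development environment (IDE)**, as described in the official [README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md). It goes beyond a conventional chat interface by combining **orchestration, debugging, and demonstration** in a single Web interface, and targets developers and end users who need to get a RAG Pipeline running quickly.

It has three core design goals:

- **Zero-code orchestration of complex RAG flows**: The built-in Pipeline Builder keeps the canvas view and the YAML code editor in real-time two-way sync, and developers can fine-tune parameters and prompts online.
- **Structured debugging**: A visual Case Study view traces the workflow's intermediate outputs step by step, which helps with result attribution and error localization.
- **One-step delivery**: A single command converts Pipeline logic into an interactive Web chat interface, shortening the path from algorithm to demo.

Version v0.3.0.2 further introduces an **end-to-end memory upgrade**, including persistent user memory, project-level memory retrieval, SQLite-based authentication, persistent chat sessions, and nickname and model settings management ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)).

## Architecture

The UI consists of a **frontend single-page application** and a **backend service process**, which communicate over a standard HTTP API. The backend is started with the `ultrarag show ui` command; it listens on a port and also serves the frontend's static assets.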

```mermaid
flowchart LR
    User[Developer / end user] -->|browser| FE[React 19 frontend SPA]
    FE -->|REST API| BE[ultrarag show ui backend]
    BE --> Auth[auth.py<br/>SQLite authentication]
    BE --> PM[pipeline_manager.py<br/>Pipeline lifecycle]
    BE --> CS[chat_store.py<br/>session persistence]
    BE --> KB[kb_visibility_store.py<br/>knowledge base visibility]
    BE --> SP[storage_paths.py<br/>storage path resolution]
    PM --> MCP[MCP Server process group<br/>Retriever / Generation / Corpus]
    MCP --> Milvus[(Milvus vector store)]
    MCP --> LLM[LLM generation model]
    FE -.static assets.-> Dist[ui/frontend/dist]
    BE -.serves.-> Dist
```

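The backend modules have clearly separated responsibilities:

| Module | Responsibility |
|------|------|
| `app.py` | Web service entry point; routing and request handling |
| `auth.py` | SQLite-based user registration, login, and session validation |
| `chat_store.py` | Persists chat history and Project Memory |
| `kb_visibility_store.py` | Manages access permissions and visibility of knowledge bases |
| `pipeline_manager.py` | Loads, runs, stops, and monitors Pipelines |
| `storage_paths.py` | Resolves storage paths for SQLite, indexes, uploaded files, and so on |

The frontend stack is **React 19 + TypeScript + Vite**, with **Zustand** for state management and **TanStack Query** for data fetching ([ui/frontend/README.md:1-15](https://github.com/OpenBMB/UltraRAG/blob/main/ui/frontend/README.md)).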

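## Core Features

### 1. Visual Pipeline Builder
Under the MCP architecture, core RAG components such as Retriever, Generation, and Corpus are standardized as **independent MCP Servers** ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)). In the UI, developers drag and drop nodes to compose **sequential, loop, and conditional-branch** control structures without maintaining Python code by hand. Every change on the canvas is synced back to the corresponding YAML configuration file ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)).

### 2. Intelligent AI Assistant
The UI embeds an AI assistant that covers the **whole development lifecycle** and offers context-aware suggestions during orchestration, parameter tuning, and troubleshooting.

### 3. Admin Mode
Admin Mode is used to configure infrastructure parameters such as the **Retriever, the generation model (LLM), and the Milvus vector store** ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)), which makes it easier to switch backend services across environments.

### 4. End-to-End Memory
Starting with v0.3.0.2, the UI supports **persistent user memory** and **project-level memory retrieval**, and provides a dedicated memory-aware RAG Demo. Sessions, nicknames, model preferences, and similar settings are all persisted to SQLite ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)).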

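## Deployment and Integration

### Starting the UI
The backend is started with the `ultrarag show ui` command and listens on the given host and port, for example: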

```bash
ultrarag show ui --host 127.0.0.1 --port 5050
```

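Once started, the command serves static assets from the resolved frontend directory ([ui/frontend/README.md:30-45](https://github.com/OpenBMB/UltraRAG/blob/main/ui/frontend/README.md)). The resolution order is:

1. The absolute path given by the `ULTRARAG_FRONTEND_DIR` environment variable (highest priority)
2. The repository default path `ui/frontend/dist`

By convention the `dist/` directory is committed to version control, so users can run the UI without setting up a local frontend build toolchain ([ui/frontend/README.md:30-45](https://github.com/OpenBMB/UltraRAG/blob/main/ui/frontend/README.md)).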
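The resolution order above amounts to "environment override first, repository default second". A minimal sketch of that logic follows; the function name `resolve_frontend_dir` and the `repo_root` parameter are assumptions, and the actual lookup in the UltraRAG backend may perform extra validation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_frontend_dir(repo_root: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    """Prefer ULTRARAG_FRONTEND_DIR when set, else fall back to ui/frontend/dist."""
    env = os.environ if env is None else env
    override = env.get("ULTRARAG_FRONTEND_DIR")
    if override:
        return Path(override)
    return repo_root / "ui" / "frontend" / "dist"


print(resolve_frontend_dir(Path("/repo"), env={}))
print(resolve_frontend_dir(Path("/repo"), env={"ULTRARAG_FRONTEND_DIR": "/opt/ultrarag/dist"}))
```

Passing `env` explicitly keeps the function easy to test; in normal use it reads from the process environment.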

### Frontend Development Mode
When you change the frontend source, rebuild before committing:

```bash
cd ui/frontend
npm install
npm run dev      # start the Vite dev server
npm run build    # write the production build to dist/
npm run check    # lint + typecheck + build
```

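The Vite dev server proxies `/api` requests to `http://127.0.0.1:5050` ([ui/frontend/README.md:20-30](https://github.com/OpenBMB/UltraRAG/blob/main/ui/frontend/README.md)).

### Deploying Deep Research and Other Demos
The official documentation presents a "Flagship Case": a Deep Research Pipeline based on AgentCPM-Report that performs multi-step retrieval automatically and generates survey reports on the order of ten thousand words ([README.md:1-40](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)). This mode relies on `pipeline_manager.py` to schedule long-running tasks.

## Community Interest and Possible Extensions

The community discussion about **wrapping a Pipeline as a callable API** (similar to how Dify is invoked), [#95](https://github.com/OpenBMB/UltraRAG/issues/95), shows strong demand for **production deployment**. The UI already provides a Web interaction layer, but using it as an API gateway that external systems can integrate with would require adding RESTful endpoints in `app.py` that reuse the run capabilities of `pipeline_manager.py`. This is one direction in which the project could evolve.

## See Also

- [UltraRAG MCP Architecture and Server Registration](mcp-architecture.md)
- [Retriever Server: Index and Web Search Backends](retriever-server.md)
- [Corpus Server: Document Chunking and Loading](corpus-server.md)
- [Custom Server: Search-o1 and Other Custom Tools](custom-server.md)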

---

<a id='page-4'></a>

## RAG Workflows, Memory System, and Deployment

### Related Pages

Related topics: [MCP Server Architecture and Pipeline Orchestration (with API Wrapping)](#page-2), [UltraRAG UI: A Visual RAG IDE](#page-3)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)
- [servers/corpus/src/corpus.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/corpus/src/corpus.py)
- [servers/retriever/src/index_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/__init__.py)
- [servers/retriever/src/index_backends/faiss_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/faiss_backend.py)
- [servers/retriever/src/index_backends/milvus_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/milvus_backend.py)
- [servers/retriever/src/websearch_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/__init__.py)
- [servers/retriever/src/websearch_backends/base.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/base.py)
- [servers/retriever/src/websearch_backends/exa_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/exa_backend.py)
- [servers/retriever/src/websearch_backends/tavily_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/tavily_backend.py)
- [servers/retriever/src/websearch_backends/zhipuai_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/zhipuai_backend.py)
- [servers/evaluation/src/evaluation.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/evaluation/src/evaluation.py)
- [servers/custom/src/custom.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/custom/src/custom.py)
- [examples/experiments/ircot.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/ircot.yaml)
- [examples/experiments/iterretgen.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/iterretgen.yaml)
- [examples/experiments/rankcot.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/rankcot.yaml)
- [examples/experiments/search_r1.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/search_r1.yaml)
- [examples/experiments/r1_searcher.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/r1_searcher.yaml)
- [examples/experiments/search_o1.yaml](https://github.com/OpenBMB/UltraRAG/blob/main/examples/experiments/search_o1.yaml)
</details>

# RAG Workflows, Memory System, and Deployment

## Overview
UltraRAG is a lightweight RAG development framework released jointly by Tsinghua's THUNLP, Northeastern University's NEUIR, OpenBMB, and AI9stars, and the first to decouple core RAG components along the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) architecture. It wraps modules such as Retriever, Corpus, Generation, Evaluation, and Custom as independent MCP Servers. Developers orchestrate sequential, loop, and conditional-branch control structures through YAML files, so that research experiments, UI demos, and code integration all share the same foundation (Source: [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)).

## Core MCP Servers and Pluggable Backends
- **Corpus Server**: Uses the chonkie library to provide three chunking strategies (token, sentence, recursive), can optionally prepend the `Title` to each chunk's content, and writes the result to disk as JSONL for later indexing (Source: [servers/corpus/src/corpus.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/corpus/src/corpus.py)).
- **Retriever Server**: Loads backends dynamically through a factory pattern. On the index side it supports FAISS and Milvus, both of which implement the `BaseIndexBackend` interface and can be switched through configuration options such as `collection_name` and `uri` (Source: [servers/retriever/src/index_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/__init__.py), [servers/retriever/src/index_backends/faiss_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/faiss_backend.py), [servers/retriever/src/index_backends/milvus_backend.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/index_backends/milvus_backend.py)). On the web search side, the same factory pattern integrates three providers, Exa, Tavily, and ZhipuAI (Source: [servers/retriever/src/websearch_backends/__init__.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/__init__.py)), and the abstract base class's `_parallel_search` method centrally handles the concurrency semaphore, the tqdm progress bar, and retry backoff (Source: [servers/retriever/src/websearch_backends/base.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/retriever/src/websearch_backends/base.py)).
- **Evaluation Server**: Writes metric results to disk as timestamped JSON files and can optionally print all mean metrics with the `avg_` prefix as a Markdown table (Source: [servers/evaluation/src/evaluation.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/evaluation/src/evaluation.py)).
- **Custom Server**: Collects paper-level tool functions in one place. For example, the trio `search_o1_extract_reason`, `search_o1_extract_final_information`, and `search_o1_combine_final_information` serves Search-o1; `iterretgen_nextquery` concatenates the previous round's query and answer into the next round's query; `output_extract_from_boxed` parses LaTeX `\boxed{}` answers; and `surveycpm_clean_title` together with several Markdown cleaning functions supports long-report generation in Deep Research scenarios (Source: [servers/custom/src/custom.py](https://github.com/OpenBMB/UltraRAG/blob/main/servers/custom/src/custom.py)).
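Two of the Custom Server behaviors listed above, parsing a `\boxed{}` answer and building the next query from the previous query and answer, are simple enough to sketch directly. The functions below are illustrative re-implementations under assumed semantics (last `\boxed{}` wins, space-joined query); they are not the code of `output_extract_from_boxed` or `iterretgen_nextquery`.

```python
from typing import Optional

_BOXED = "\\boxed{"


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...}, honoring nested braces."""
    start = text.rfind(_BOXED)
    if start == -1:
        return None
    begin = start + len(_BOXED)
    depth = 0
    for i in range(begin, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            if depth == 0:
                # This brace closes the \boxed{ itself.
                return text[begin:i]
            depth -= 1
    return None  # braces never balanced


def build_next_query(query: str, answer: str) -> str:
    """Join the previous query and answer with a space (assumed format)."""
    return f"{query} {answer}".strip()


print(extract_boxed(r"So the result is \boxed{\frac{1}{2}}."))
print(build_next_query("capital of France?", "Paris is the capital."))
```

Counting brace depth, rather than using a simple regular expression, is what allows answers such as fractions with nested braces to be extracted intact.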

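## YAML Workflow Orchestration and Experiment Pipelines
Each YAML file under `examples/experiments/` corresponds to a published RAG paradigm. Tools declare their input and output variables explicitly through `@app.tool(output="a,b->c")`, so that variables flow between Servers through named inputs and outputs. The table below maps the example workflows to the core tools they rely on:

| Pipeline YAML | Paradigm | Key Custom tools used |
| --- | --- | --- |
| `ircot.yaml` | IRCoT (Interleaved Retrieval with CoT) | Tools for interleaving retrieval and reasoning |
| `iterretgen.yaml` | Iter-Retgen | `iterretgen_nextquery` |
| `search_o1.yaml` | Search-o1 | `search_o1_extract_reason` / `search_o1_extract_final_information` / `search_o1_combine_final_information` |
| `search_r1.yaml`, `r1_searcher.yaml` | Search-R1 (reinforcement learning) | Retrieval and reranking tools |
| `rankcot.yaml` | RankCoT | Ranking and CoT tools |

With this "Server tools + YAML orchestration" approach, researchers can reproduce the core iterative logic of a paper in a few dozen lines of YAML instead of hard-coding the call stack in Python.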

```mermaid
graph TD
    YAML[YAML pipeline config] --> Client[MCP Client orchestrator]
    Client --> Corpus[Corpus Server]
    Client --> Retriever[Retriever Server]
    Client --> Generation[Generation Server]
    Client --> Custom[Custom Server]
    Client --> Eval[Evaluation Server]
    Retriever --> Index[FAISS / Milvus]
    Retriever --> Web[Exa / Tavily / ZhipuAI]
    Custom --> Tools[search_o1_* / iterretgen_nextquery / boxed parsing / surveycpm cleaning]
```

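## Memory System (v0.3.0.2 End-to-End Upgrade)
The latest release, v0.3.0.2, is an end-to-end upgrade centered on memory. It introduces persistent user memory and project-level memory retrieval and provides a dedicated memory-aware RAG Demo. At the UI layer it adds SQLite-based authentication, persistent chat sessions, and nickname and model settings management, making the demo experience more stateful and personalized. The source excerpts used for this page do not directly include this implementation; since it is the project's most significant recent direction, consult the repository's memory-related directories and release notes for details.

## Deployment Options and Exposing Pipelines as APIs
The README recommends [uv](https://github.com/astral-sh/uv) for managing the Python environment and dependencies, and also provides a Docker-based deployment option. UltraRAG UI is an all-in-one visual RAG IDE with a built-in Pipeline Builder that supports real-time two-way sync between canvas and code, online tuning of parameters and prompts, and an AI assistant to help with debugging (Source: [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)).

In community Issue #95, a user asks how to wrap a pipeline as a callable API similar to Dify. Under the current architecture, the UI itself is the way to turn a YAML pipeline into an interactive Web interface. Because the MCP Client and the Servers are decoupled, it is in principle possible to add an HTTP/JSON gateway on top of the MCP Client and expose any YAML pipeline as a backend service. For concrete approaches, see the [UI deployment guide](https://ultrarag.openbmb.cn/pages/en/ui/prepare) and follow-up community discussion.

## See Also
- Quick start and experiment pipeline directory: [examples/experiments/](https://github.com/OpenBMB/UltraRAG/tree/main/examples/experiments)
- UI deployment guide: [ultrarag.openbmb.cn/pages/en/ui/prepare](https://ultrarag.openbmb.cn/pages/en/ui/prepare)
- Papers and citation list: [README.md](https://github.com/OpenBMB/UltraRAG/blob/main/README.md)
- Community discussion: Issue #95, "Is wrapping a pipeline as a callable API supported, similar to Dify?"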


---

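## Doramagic Pitfall Log

Project: OpenBMB/UltraRAG

Summary: 6 potential pitfalls found, 0 of them high/blocking. Highest priority: capability pitfall, capability assessment relies on assumptions.

## 1. Capability pitfall · Capability assessment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Evidence: capability.assumptions | https://github.com/OpenBMB/UltraRAG | README/documentation is current enough for a first validation pass.

## 2. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed was not recorded.
- User impact: New, abandoned, and active projects get mixed together, which lowers trust in the recommendation.
- Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG | last_activity_observed missing

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Evidence: downstream_validation.risk_items | https://github.com/OpenBMB/UltraRAG | no_demo; severity=medium

## 4. Security/permission pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Evidence: risks.scoring_risks | https://github.com/OpenBMB/UltraRAG | no_demo; severity=medium

## 5. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will respond when they run into problems.
- Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG | issue_or_pr_quality=unknown

## 6. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and documentation may lag behind the code, making pitfalls more likely.
- Evidence: evidence.maintainer_signals | https://github.com/OpenBMB/UltraRAG | release_recency=unknown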
