Doramagic Project Pack · Human Manual
ragflow-plus
ragflow-plus is a Doramagic preview pack compiled from public project evidence and validation signals.
Project Overview & High-Level Architecture
Related topics: Backend Services: API & Management Server, Document Parsing, Knowledge Chunking & Retrieval (RAG / GraphRAG), Frontend Apps, Deployment & Troubleshooting
Purpose and Scope
ragflow-plus is an extended, plus-edition distribution of the open-source RAG (Retrieval-Augmented Generation) framework ragflow. The repository bundles three upstream building blocks:
- The `ragflow` core engine for ingestion, chunking, embedding, retrieval, and chat orchestration.
- The v3-admin-vite Vue 3 / Vite / Element Plus admin template that powers the backend management console (see management/web/package.json).
- The minerU document parser that augments the default RAG extractors with stronger OCR/PDF/layout analysis.
As stated in the project's README.md, ragflow-plus inherits its AGPLv3 license from the upstream ragflow engine. The project's mandate is therefore to add features and UI affordances on top of the upstream API surface — not to re-implement the retrieval pipeline.
The two shipped UIs serve distinct audiences:
| UI Module | Tech Stack | Audience | Source |
|---|---|---|---|
| web/ | Umi + React + Ant Design Pro | End-user chat / knowledge base | web/package.json |
| management/web/ | Vue 3 + Vite + Element Plus | Administrators / operators | management/web/package.json |
The API server lives under api/apps/sdk/ and exposes a Flask-style SDK documented inline via Swagger/OpenAPI docstrings (e.g. api/apps/sdk/doc.py).
High-Level Architecture
flowchart LR
subgraph Clients
A1[End-user Web<br/>Umi + React]
A2[Admin Console<br/>Vue 3 + Vite]
end
subgraph API Layer
B1[api/apps/sdk/doc.py<br/>chunks, retrieval]
B2[api/apps/sdk/dataset.py<br/>knowledge bases]
B3[api/apps/sdk/session.py<br/>chat completions]
end
subgraph Core Engine
C1[ragflow core<br/>embedding + retrieval]
C2[minerU<br/>document parsing]
end
D1[(Elasticsearch /<br/>docStoreConn)]
D2[(Object storage<br/>files)]
A1 -->|REST| B1
A1 -->|REST| B3
A2 -->|REST /api/v1/files| B2
B1 --> C1
B3 --> C1
B1 --> C2
C1 --> D1
C2 --> D2
B2 --> D1
B2 --> D2
The SDK layer is the only public contract: it delegates to ragflow services (KnowledgebaseService, DocumentService, FileService, File2DocumentService) and to the document store connection (settings.docStoreConn). For example, dataset listing in api/apps/sdk/dataset.py reads pagination parameters (page, page_size, orderby, desc) and renames kb_id → dataset_id and parser_id → chunk_method in the response to present a stable external vocabulary.
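The renaming step can be sketched as follows. This is a minimal illustration, not the code in api/apps/sdk/dataset.py: the helper `to_external` and the sample record are hypothetical, and only the two key pairs named in the text are taken from the source.

```python
# Hypothetical sketch: map internal field names to the external vocabulary.
# Only the two pairs below come from the manual; the rest is illustrative.
RENAMES = {"kb_id": "dataset_id", "parser_id": "chunk_method"}


def to_external(record: dict) -> dict:
    """Return a copy of `record` with internal keys renamed for API clients."""
    return {RENAMES.get(key, key): value for key, value in record.items()}


internal = {"kb_id": "kb-1", "parser_id": "naive", "name": "manuals"}
external = to_external(internal)
print(external)
```

Keeping the rename table in one place means clients only ever see the external names, even if the internal schema changes.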
Key Components and Module Layout
SDK / API surface
The Flask blueprint manager (registered via @manager.route(...)) defines every HTTP endpoint. Endpoints are grouped by resource:
- Datasets: CRUD and bulk-delete flows, with per-tenant access checks (`KnowledgebaseService.accessible`). See api/apps/sdk/dataset.py.
- Documents: upload, parse trigger, list, delete, status mapping (`run: 0=UNSTART, 1=RUNNING, 2=CANCEL, 3=DONE, 4=FAIL`). See api/apps/sdk/doc.py.
- Chunks: add, update, delete, list, and retrieval. Chunk update rebuilds `content_ltks` and `content_sm_ltks` via `rag_tokenizer` for hybrid search re-indexing.
- Sessions / Chat: completion endpoint plus a `/related_questions` helper that uses an LLM (`chat_mdl.chat`) to expand the user's keywords into 5–10 related search terms. See api/apps/sdk/session.py.
- Files: direct file API consumed by the admin console (see management/web/src/common/apis/files/index.ts).
Retrieval enforces that all queried datasets share a single embedding model, returning a DATA_ERROR if multiple embd_id values are detected — see api/apps/sdk/doc.py.
Admin console
The management/web module is largely stock v3-admin-vite, decorated with project-specific composables such as useWatermark (management/web/src/common/composables/useWatermark.ts), which attaches a defensive DOM-watched watermark to deter screenshot leakage. Knowledge-base TypeScript contracts (management/web/src/common/apis/kbs/type.ts) and file contracts (management/web/src/common/apis/files/type.ts) define the boundary between the admin UI and the SDK.
Configuration knobs
- `settings.docStoreConn`: pluggable document store (Elasticsearch is the canonical backend).
- `TenantLLMService.split_model_name_and_factory`: strips vendor suffixes before comparing embedding models.
- `rag_tokenizer.fine_grained_tokenize` vs `tokenize`: controls the fine-grained (`content_sm_ltks`) and full (`content_ltks`) token streams used in hybrid retrieval ranking.
Community-Reported Issues and Known Pitfalls
Several recurring issues surface from the issue tracker and align with how the architecture is wired today:
- Chunk coherence (#180): "A title and its answer are split into two chunks, and the chunks cannot be linked to each other." Chunk boundaries are produced by the parser/embedding chain, not the API. Operators can post-process via the chunk update endpoint in api/apps/sdk/doc.py, which re-emits both `content_ltks` and `content_sm_ltks` after editing `content_with_weight`. The v0.5.0 release notes further loosen embedding dimensionality constraints, reducing unrelated-chunk scoring artefacts.
- Login flood (#257): `MaxConnectionsExceeded` after creating many per-account assistants points at downstream Elasticsearch / database pool sizing, which is out of scope for the SDK but relevant for deployment.
- File parsing stuck at 40% (#255): Almost always a stalled minerU subprocess or stuck download task; check worker logs and the `run` status mapping in the documents endpoint.
- GPU host compatibility (#254): Affects minerU acceleration and is resolved by the host driver / CUDA image, not by application code.
- English UI gap (#256): `management/web` only ships Chinese locale strings today.
See Also
- Knowledge Base & Dataset API
- Document Ingestion & Chunking Pipeline
- Retrieval, Ranking & Hybrid Search
- Admin Console Operations
- Deployment & Runtime Requirements
Source: https://github.com/zstar1003/ragflow-plus / Human Manual
Backend Services: API & Management Server
Related topics: Project Overview & High-Level Architecture, Document Parsing, Knowledge Chunking & Retrieval (RAG / GraphRAG), Frontend Apps, Deployment & Troubleshooting
The ragflow-plus project is a Retrieval-Augmented Generation (RAG) platform built on top of upstream RAGFlow, and it exposes its capabilities through two cooperating backend layers: a Flask-based HTTP API (the "API server") under api/ and a Vue 3 administrative console (the "Management server") under management/web/. Together they form the control plane for tenants, datasets, documents, files, and chat assistants, while a third React-based client (the "user-facing web" under web/) consumes the same HTTP API to deliver the conversational experience.
1. Architecture Overview
The API server is the single source of truth for business logic. It registers Flask blueprints through the manager and user_app route prefixes defined in api/apps/sdk/doc.py and api/apps/user_app.py. These blueprints expose SDK-style endpoints such as /api/v1/datasets, /api/v1/documents, /api/v1/chunks, and /sessions/related_questions. The management frontend calls those endpoints through an axios-based client.
flowchart LR
Browser[Admin Browser] --> MW[Management Web<br/>Vue3 + Element Plus]
User[End User] --> UW[User Web<br/>Umi + React]
MW -->|HTTPS/JSON| API[Flask API Server<br/>api/apps/sdk]
UW -->|HTTPS/JSON| API
API --> SVC[Service Layer<br/>DocumentService / KnowledgebaseService]
SVC --> DB[(MySQL / Elasticsearch)]
API --> FS[(Object / File Storage)]
The api/apps/system_app.py module wires system-level routes (health checks, user/tenant administration) into the same Flask application, while api/apps/sdk/session.py supplies streaming chat-completion and related-question endpoints that both web clients reuse.
2. API Server: SDK Endpoints
The SDK layer under api/apps/sdk/ exposes the public REST surface. Three modules dominate it:
- Dataset management in api/apps/sdk/dataset.py handles CRUD for knowledge bases, including deletion cascading through `DocumentService.query` and `FileService.filter_delete` to clean up `file2document` mappings.
- Document & chunk management in api/apps/sdk/doc.py provides add/list/retrieve/delete endpoints for documents, plus chunk-level operations such as `add_chunk`, `update_chunk`, `rm_chunk`, and `list_chunks`. The chunk payload schema includes `content`, `important_keywords`, `available`, and `image_id`, which is what powers the new image-set preview feature shipped in v0.5.0.
- Chat & sessions in api/apps/sdk/session.py implements the `/chatbots/<dialog_id>/completions` streaming endpoint (Server-Sent Events) and the `/sessions/related_questions` helper that asks an LLM (`LLMBundle(tenant_id, LLMType.CHAT)`) to expand a user query into 5-10 related search terms.
Every protected route is wrapped in @token_required and resolves tenant_id before delegating to a service-layer method. Access checks follow the pattern KnowledgebaseService.accessible(kb_id=..., user_id=tenant_id), which is why cross-tenant operations always return You don't own the dataset <id> errors.
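The guard pattern described above (resolve the tenant from a token, then check ownership) can be sketched without Flask. Everything here is a stand-in: the `TOKENS` and `OWNERS` dictionaries, the handler signature, and the response shape are hypothetical; only the names `token_required` and `accessible`, the keyword arguments, and the error wording come from the text.

```python
from functools import wraps

# Hypothetical in-memory stand-ins for the token table and dataset ownership.
TOKENS = {"secret-token": "tenant-a"}
OWNERS = {"kb-1": "tenant-a", "kb-2": "tenant-b"}


def token_required(handler):
    """Resolve tenant_id from a bearer token before calling the handler."""
    @wraps(handler)
    def wrapper(authorization: str, *args, **kwargs):
        scheme, _, token = authorization.partition(" ")
        tenant_id = TOKENS.get(token) if scheme == "Bearer" else None
        if tenant_id is None:
            return {"ok": False, "message": "Invalid token"}
        return handler(tenant_id, *args, **kwargs)
    return wrapper


def accessible(kb_id: str, user_id: str) -> bool:
    """Stand-in for the ownership check performed by the service layer."""
    return OWNERS.get(kb_id) == user_id


@token_required
def delete_dataset(tenant_id: str, kb_id: str) -> dict:
    if not accessible(kb_id=kb_id, user_id=tenant_id):
        return {"ok": False, "message": f"You don't own the dataset {kb_id}"}
    return {"ok": True, "message": ""}


own = delete_dataset("Bearer secret-token", "kb-1")
foreign = delete_dataset("Bearer secret-token", "kb-2")
print(own, foreign)
```

Because the decorator runs first, a handler never sees a request without a resolved tenant, which is why the ownership error is the only cross-tenant outcome.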
| Endpoint group | File | Typical operations |
|---|---|---|
| /api/v1/datasets | api/apps/sdk/dataset.py | Create, list, delete knowledge bases |
| /api/v1/documents, /api/v1/chunks | api/apps/sdk/doc.py | Upload, list, parse, retrieve, update, delete chunks |
| /chatbots/.../completions, /sessions/related_questions | api/apps/sdk/session.py | SSE chat streaming, query expansion |
| /user_app/... | api/apps/user_app.py | Login, registration, profile |
A common source of confusion discussed in issue #180 — where text parsed with MinerU + bge-m3 yields disconnected chunks (a title split from its body) — can be mitigated by using the chunk update endpoint exposed here, since it accepts a content payload that bypasses re-parsing and re-indexes the chunk directly through docStoreConn.
3. Management Server (Admin Console)
The management server is a standalone SPA generated from the v3-admin-vite template, as declared in management/web/package.json. Its dependency stack — Vue 3, Vite, TypeScript, Element Plus 2.9, Pinia, and axios — is what gives administrators the user-management, file-management, and knowledge-base dashboards.
Login is restricted to two roles, modeled in management/web/src/pages/login/apis/type.ts: the LoginRequestData interface accepts only "admin" | "editor" usernames, returning a bearer token that the axios client injects into every subsequent request.
Two API client modules illustrate how the UI talks to the backend:
- management/web/src/common/apis/files/index.ts wraps `/api/v1/files` for listing and `/api/v1/files/{id}/download` for streaming downloads with progress callbacks and `CancelToken` support.
- management/web/src/common/apis/kbs/document.ts wires document lifecycle calls: `parse` (with a 60 000-second timeout for large files), `chunks`, and `status` updates.
Security features include a defensive watermark composable in management/web/src/common/composables/useWatermark.ts, which uses Mutation and Resize observers to detect when an attacker tries to remove or hide the overlay.
4. Cross-Cutting Concerns
Connection limits. Issue #257 reports MaxConnectionsExceeded after sequentially logging in roughly ten accounts from a single host. Because the management client keeps the bearer token in storage and reuses a shared axios instance, simultaneous tokens accumulate open connections to the API server. Operators should either reduce keep-alive timeouts on a reverse proxy, raise the API worker's pool size, or ensure each admin session ends with an explicit logout that drops the token.
Internationalization. Issue #256 asks for an English option on the user-management page. The login types are language-neutral, so the limitation lives entirely in the Element Plus locale provider used by the management SPA — adding an el-config-provider with an English locale bundle resolves it without any backend change.
Error handling. Both server sides share a common response envelope produced by get_error_data_result / get_result. Frontend code surfaces message strings directly, so localized copy changes need to be paired with consistent backend error wording.
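A shared envelope of this kind might look like the sketch below. The helper names `get_result` and `get_error_data_result` come from the text, but their signatures, the field names (`code`, `message`, `data`), and the numeric codes are assumptions made for illustration; `message_for_user` is a hypothetical stand-in for the frontend behaviour.

```python
# Assumed codes; the real values in the project may differ.
SUCCESS = 0
DATA_ERROR = 102


def get_result(code: int = SUCCESS, message: str = "", data=None) -> dict:
    """Build a success-style envelope (assumed shape)."""
    return {"code": code, "message": message, "data": data}


def get_error_data_result(message: str, code: int = DATA_ERROR) -> dict:
    """Build an error envelope that carries only a code and a message."""
    return {"code": code, "message": message}


def message_for_user(response: dict):
    """Mimic a frontend that shows the backend message verbatim on errors."""
    return None if response["code"] == SUCCESS else response["message"]


ok = get_result(data={"id": "kb-1"})
err = get_error_data_result("You don't own the dataset kb-2")
print(message_for_user(ok), message_for_user(err))
```

Since the message string reaches the user unchanged, any translation has to happen either in the backend wording or in a lookup keyed on `code`.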
See Also
- RAG Pipeline & Document Parsing — covers MinerU, bge-m3, and chunk quality (#180).
- Deployment notes for GPU workers referenced in issue #254.
- v0.5.0 release notes for the image-set preview and cross-language retrieval features.
Source: https://github.com/zstar1003/ragflow-plus / Human Manual
Document Parsing, Knowledge Chunking & Retrieval (RAG / GraphRAG)
Related topics: Project Overview & High-Level Architecture, Backend Services: API & Management Server, Frontend Apps, Deployment & Troubleshooting
1. Purpose and Scope
The Document Parsing, Knowledge Chunking & Retrieval subsystem is the core of the RAG pipeline in ragflow-plus. It transforms raw user-uploaded files (PDF, DOCX, images, etc.) into structured, searchable knowledge chunks stored in a document store, and exposes both synchronous and streaming retrieval endpoints for downstream chat assistants. The feature set is delivered across three surfaces:
- The HTTP OpenAPI defined in `api/apps/sdk/doc.py` and `api/apps/sdk/session.py`.
- The Python SDK shipped under `sdk/python/ragflow_sdk/` for programmatic access.
- The management web frontend that calls these APIs through the typed wrappers in `management/web/src/common/apis/`.
The release notes for v0.5.0 highlight a tightening of this pipeline: image-set preview per chunk, cross-language retrieval, the relaxation of the 1024-dim parser-model constraint, and per-document parsing progress reporting. Source: README.md:1-1
2. End-to-End Pipeline
A document passes through four stages before it can be retrieved. The web frontend orchestrates these calls.
2.1 Upload and Trigger Parsing
A user selects files in the management UI; the request lands on the upload route, which writes them to a Knowledgebase and dispatches a background parse job. The frontend wrapper triggers parsing with an extended timeout to accommodate large documents:
// management/web/src/common/apis/kbs/document.ts
export function runDocumentParseApi(id: string) {
  return request({
    url: `/api/v1/knowledgebases/documents/${id}/parse`,
    method: "post",
    timeout: 60000000 // document parse timeout
  })
}
Source: management/web/src/common/apis/kbs/document.ts:1-1
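To put the timeout value in perspective, the arithmetic below converts it to seconds and hours. It assumes the value is in milliseconds, which is how axios interprets its `timeout` option; the constant name is made up for the example.

```python
from datetime import timedelta

# Value passed to the request wrapper, assumed to be in milliseconds.
PARSE_TIMEOUT_MS = 60_000_000

timeout = timedelta(milliseconds=PARSE_TIMEOUT_MS)
timeout_seconds = timeout.total_seconds()
timeout_hours = timeout_seconds / 3600

print(f"{timeout_seconds:.0f} s, about {timeout_hours:.1f} h")
```

In other words, the client is configured to wait for well over half a day before giving up on a parse request.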
2.2 Document Listing and Status Mapping
The list endpoint returns documents with a numeric run field that the API renames into human-readable status strings (UNSTART, RUNNING, CANCEL, DONE, FAIL) before returning to the client. Source: api/apps/sdk/doc.py:1-1 The same mapping appears in both the dataset-document and single-document list paths, ensuring consistent state display in the UI.
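A minimal sketch of this status mapping is shown below. It assumes the numeric codes run from 0 to 4 in the order the strings are listed, and that `run` may arrive as either an integer or a string; the helper `with_status` and the fallback for unknown codes are hypothetical.

```python
# Assumed code-to-label table, in the order the labels are listed.
RUN_STATUS = {
    "0": "UNSTART",
    "1": "RUNNING",
    "2": "CANCEL",
    "3": "DONE",
    "4": "FAIL",
}


def with_status(doc: dict) -> dict:
    """Return a copy of `doc` whose `run` field is a readable label."""
    renamed = dict(doc)
    # Unknown codes are passed through unchanged rather than raising.
    renamed["run"] = RUN_STATUS.get(str(doc["run"]), doc["run"])
    return renamed


print(with_status({"id": "d1", "run": 3}))
print(with_status({"id": "d2", "run": "1"}))
```

Applying the same helper in every list path is one way to keep the state display consistent across endpoints.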
2.3 Chunk Browsing
The frontend retrieves a paginated list of chunks for a given document using the /api/v1/chunks route, forwarding currentPage, size, and optional content filter to the backend. Source: management/web/src/common/apis/kbs/document.ts:1-1
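How such a paginated, filtered listing behaves can be sketched in a few lines. The parameter names mirror the ones forwarded by the frontend (`currentPage`, `size`, `content`), but the function `list_chunks`, the case-insensitive substring filter, and the `total` / `list` result shape are assumptions for illustration only.

```python
def list_chunks(chunks, current_page=1, size=10, content=""):
    """Filter chunks by a content substring, then slice out one page."""
    needle = content.lower()
    matched = [c for c in chunks if needle in c["content"].lower()]
    start = (current_page - 1) * size
    return {"total": len(matched), "list": matched[start:start + size]}


# Seven sample chunks; only the odd-numbered ones mention GPUs.
chunks = [
    {"id": i, "content": f"section {i} about GPUs" if i % 2 else f"section {i}"}
    for i in range(1, 8)
]
page = list_chunks(chunks, current_page=2, size=2, content="gpu")
print(page["total"], [c["id"] for c in page["list"]])
```

Filtering before slicing keeps `total` consistent with what the user can page through.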
3. Chunk Management
Chunks are the atomic retrieval unit. The SDK exposes three mutation primitives through the HTTP layer and mirrors them in Python.
3.1 Add / Update / Delete Chunks
The HTTP handlers in api/apps/sdk/doc.py enforce ownership (KnowledgebaseService.accessible) and then persist changes through the configured docStoreConn. Updates re-tokenize both content_ltks and the fine-grained content_sm_ltks, and re-embed the chunk through the dataset's configured embedding model:
# api/apps/sdk/doc.py
embd_id = DocumentService.get_embd_id(document_id)
embd_mdl = TenantLLMService.model_instance(
    tenant_id, LLMType.EMBEDDING.value, embd_id
)
# Encode the document name and the chunk text (or its questions, if any).
v, c = embd_mdl.encode([
    doc.name,
    d["content_with_weight"]
    if not d.get("question_kwd")
    else "\n".join(d["question_kwd"]),
])
# Blend title and content vectors; Q&A documents use the content vector alone.
v = 0.1 * v[0] + 0.9 * v[1] if doc.parser_id != ParserType.QA else v[1]
Source: api/apps/sdk/doc.py:1-1
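The weighting on the last line of that excerpt can be illustrated with plain Python lists. This is only a sketch of the arithmetic: the function `combine` and the toy two-dimensional vectors are hypothetical, while the 0.1 / 0.9 weights and the Q&A exception are taken from the excerpt.

```python
TITLE_WEIGHT = 0.1    # weight of the document-name vector
CONTENT_WEIGHT = 0.9  # weight of the chunk-content vector


def combine(title_vec, content_vec, is_qa: bool):
    """Blend title and content embeddings; Q&A chunks keep content only."""
    if is_qa:
        return list(content_vec)
    return [
        TITLE_WEIGHT * t + CONTENT_WEIGHT * c
        for t, c in zip(title_vec, content_vec)
    ]


title = [1.0, 0.0]
content = [0.0, 1.0]
blended = combine(title, content, is_qa=False)
qa_only = combine(title, content, is_qa=True)
print(blended, qa_only)
```

The small title weight nudges each chunk toward its parent document without letting the document name dominate the similarity score.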
The Python SDK wraps the same flow:
# sdk/python/ragflow_sdk/modules/document.py
def add_chunk(self, content: str,
              important_keywords: list[str] = [],
              questions: list[str] = []):
    res = self.post(
        f'/datasets/{self.dataset_id}/documents/{self.id}/chunks',
        {"content": content,
         "important_keywords": important_keywords,
         "questions": questions}
    )
Source: sdk/python/ragflow_sdk/modules/document.py:1-1
3.2 Important Keywords and Questions
Each chunk carries optional important_keywords and questions arrays. The update route tokenizes them into important_tks and question_tks for full-text search, and the available boolean maps to available_int so disabled chunks can be filtered at query time. Source: api/apps/sdk/doc.py:1-1
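The field mapping can be sketched as below. The output names `important_tks`, `question_tks`, and `available_int` come from the text; the function `build_update` and the lowercase whitespace tokenizer are hypothetical stand-ins (the project uses its own `rag_tokenizer`, whose output will differ).

```python
def tokenize(text: str) -> str:
    """Hypothetical tokenizer: lowercase and normalise whitespace."""
    return " ".join(text.lower().split())


def build_update(payload: dict) -> dict:
    """Translate an update payload into document-store fields."""
    fields = {}
    if "important_keywords" in payload:
        fields["important_tks"] = tokenize(" ".join(payload["important_keywords"]))
    if "questions" in payload:
        fields["question_tks"] = tokenize("\n".join(payload["questions"]))
    if "available" in payload:
        # Booleans are stored as 0/1 so disabled chunks can be filtered out.
        fields["available_int"] = int(payload["available"])
    return fields


update = build_update({"important_keywords": ["GPU Driver"], "available": False})
print(update)
```

Only the keys present in the payload are written, so a partial update leaves the other stored fields untouched.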
4. Retrieval and Re-Ranking
4.1 Retrieval Endpoint
The /datasets/<dataset_id>/documents/<document_id>/chunks GET endpoint, in addition to listing chunks, supports a keywords argument that triggers the settings.retrievaler.search path with highlighting enabled. The response includes similarity, docnm_kwd, image_id, and positions for downstream rendering:
# api/apps/sdk/doc.py
sres = settings.retrievaler.search(query, search.index_name(tenant_id),
                                   [dataset_id], emb_mdl=None, highlight=True)
Source: api/apps/sdk/doc.py:1-1
4.2 Cross-Dataset Retrieval Test
The retrieval POST endpoint validates that all target datasets share a single embedding model (comparing the base name after stripping the vendor suffix) before issuing a query. This guard prevents dimension mismatches in the vector store. Source: api/apps/sdk/doc.py:1-1
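The guard can be sketched as follows. It assumes the vendor suffix is separated from the model name by `@` (for example `bge-m3@BAAI`); that separator, the helper names, and the use of `ValueError` in place of the API's error response are all assumptions for illustration.

```python
def base_model_name(embd_id: str) -> str:
    """Strip an assumed '@vendor' suffix from an embedding model id."""
    return embd_id.split("@", 1)[0]


def check_single_embedding_model(embd_ids) -> str:
    """Return the shared base model name, or raise if datasets disagree."""
    names = {base_model_name(embd_id) for embd_id in embd_ids}
    if len(names) != 1:
        raise ValueError("Datasets use different embedding models")
    return names.pop()


shared = check_single_embedding_model(["bge-m3@BAAI", "bge-m3"])
print(shared)

try:
    check_single_embedding_model(["bge-m3@BAAI", "other-model@Vendor"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```

Comparing base names rather than full ids lets the same model served by two vendors count as one, while genuinely different models are rejected before any query reaches the vector store.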
4.3 Related Question Expansion
The session module offers a related_questions endpoint that uses the tenant's chat LLM to expand a user query into 5–10 related search terms, helping retrieval discover adjacent content. Source: api/apps/sdk/session.py:1-1
5. Known Failure Modes (from Community)
- Disconnected chunks (issue #180): when MinerU + a text-embedding model (e.g., `bge-m3`) parse a document by paragraph, a heading and its answer may end up in separate chunks with no cross-reference. This causes the hybrid retriever to score irrelevant chunks highly (all three similarity signals saturating near 100). Operators can mitigate this by manually associating adjacent chunks via the `add_chunk` / update endpoints exposed in api/apps/sdk/doc.py:1-1 and by tuning `important_keywords` per chunk.
- Parsing stuck at ~40% with no logs (issue #255): the long `timeout: 60000000` set by the frontend is necessary, not optional; killing the request before completion is the usual cause of missing log output. Source: management/web/src/common/apis/kbs/document.ts:1-1
See Also
- Project README and contributor guide: README.md
- Python SDK entry point: sdk/python/ragflow_sdk/__init__.py
- File download / streaming helper: management/web/src/common/apis/files/index.ts
Source: https://github.com/zstar1003/ragflow-plus / Human Manual
Frontend Apps, Deployment & Troubleshooting
Related topics: Project Overview & High-Level Architecture, Backend Services: API & Management Server, Document Parsing, Knowledge Chunking & Retrieval (RAG / GraphRAG)
Overview
Ragflow-plus ships two independently developed frontends that share the same Python REST backend (the api/ service, derived from Ragflow). The first is the end-user web client located under web/ — built with UmiJS, React 18, TypeScript, Ant Design Pro Components and a Lexical-based rich text editor. The second is the administrative console under management/web/ — built with Vue 3, Vite, TypeScript and Element Plus, derived from v3-admin-vite (Source: README.md). Both clients talk to the documented SDK routes under api/apps/sdk/ for datasets, documents, chunks, sessions and assistant completions.
| Frontend | Stack | Path | Primary Role |
|---|---|---|---|
| User web app | UmiJS + React + Ant Design + Lexical | web/ | Knowledge base browsing, chat, document authoring |
| Admin console | Vue 3 + Vite + Element Plus | management/web/ | User management, file management, system configuration |
User-Facing Web Application (`web/`)
The web/ package declares React 18, @ant-design/pro-components, @antv/g6, @monaco-editor/react, @lexical/react, @radix-ui/* UI primitives, Tailwind and a rich text editor stack (Source: web/package.json). UmiJS scripts include dev, build, lint, test (jest) and a prepare hook that wires Husky pre-commit checks (Source: web/package.json).
The application exposes several major workflows via UmiJS routes (knowledge bases, datasets, chat, and the document-authoring "Write" mode). Each route consumes the SDK endpoints defined under api/apps/sdk/ rather than calling internal services directly. For example:
- Chunk CRUD & retrieval: `POST /chunks`, `PUT /chunks/{id}`, `DELETE /chunks`, `POST /chunk/retrieval` (Source: api/apps/sdk/doc.py).
- Dataset listing, creation and deletion with `page`, `page_size`, `orderby`, `desc` pagination (Source: api/apps/sdk/dataset.py).
- Chat assistant completions and related-search-term expansion through `/chatbots/<dialog_id>/completions` (Source: api/apps/sdk/session.py).
Management Admin Console (`management/web/`)
The admin console is a fully separate Vue 3 + Vite project (Source: management/web/package.json). Its dependency set is intentionally narrower than the user app's, focused on Element Plus, axios, dayjs, pinia, lodash-es, nprogress and screenfull, plus the mitt event bus (Source: management/web/package.json).
flowchart LR
UserWeb[web/ React App] -->|REST| SDK[api/apps/sdk]
MgmtWeb[management/web/ Vue Admin] -->|REST| SDK
SDK --> EsStore[(Elasticsearch / Doc Store)]
SDK --> DocSrv[Document Service]
SDK --> KbSrv[Knowledgebase Service]
SDK --> LlmSrv[LLM / MinerU Services]
Key client-side conventions observable in the admin code:
- Cache keys are namespaced under `v3-admin-vite-` to isolate the admin app from any other Vue projects on the same origin (Source: management/web/src/common/constants/cache-key.ts). These hold `TOKEN`, layout config, sidebar state, theme name and visited/cached route views.
- API typings for files and knowledge bases use the `FileData`, `PageQuery`, `PageResult<T>` and `ApiResponse<T>` interfaces (Source: management/web/src/common/apis/files/type.ts, Source: management/web/src/common/apis/kbs/type.ts).
- Streaming downloads use a dedicated axios cancel token, `responseType: "blob"`, a 300 s timeout and a permissive `validateStatus` so the UI can render download progress reliably (Source: management/web/src/common/apis/files/index.ts).
- Watermarking is implemented as a composable that defends against DOM removal via `MutationObserver` and `ResizeObserver` on both the watermark element and its parent (Source: management/web/src/common/composables/useWatermark.ts).
- Document admin actions map directly to SDK routes, e.g. `runDocumentParseApi(id)` calls `POST /api/v1/knowledgebases/documents/${id}/parse` with a 60 000 000 ms timeout to accommodate long parse jobs (Source: management/web/src/common/apis/kbs/document.ts).
SDK Surface That Binds the Two Frontends
Both frontends depend on a stable REST contract. The most relevant contract points are:
- `GET /api/v1/knowledgebases/documents` accepts `page`, `page_size`, `orderby`, `desc` and returns documents with `chunk_count`, `token_count`, `chunk_method` and `run` status (Source: api/apps/sdk/doc.py).
- `POST /api/v1/chunks` requires `content` and accepts `important_keywords`; ownership is checked via `KnowledgebaseService.accessible` and `DocumentService.query` (Source: api/apps/sdk/doc.py).
- Chunk retrieval validates that all requested datasets share a single embedding model, otherwise returns a `DATA_ERROR` (Source: api/apps/sdk/doc.py).
Because both clients speak the same contract, an API change must be coordinated in three places: api/apps/sdk/*.py, the user app's API code under web/src/, and the admin console's API directory under management/web/src/common/apis/.
Common Troubleshooting (Community-Reported)
The issues below map directly to repository modules and are the most frequently encountered deployment failures.
| Symptom | Likely Cause | Where to Look |
|---|---|---|
| File parsing stalls at ~40 %, no logs on disk | Long-running parse task exceeds default request timeout on the admin client; parsing service may have been killed mid-pipeline | runDocumentParseApi timeout (60 000 000 ms) — management/web/src/common/apis/kbs/document.ts:60-65; parse pipeline in api/apps/sdk/doc.py |
| MaxConnectionsExceeded('Exceeded maximum connections.') on login | Recycling sessions across many accounts exhausts DB connections; pooling needs adjustment | Backend connection-pool config; SDK session creation in api/apps/sdk/session.py |
| GPU startup error (e.g. RTX 5070 Ti on Windows / WSL2) | Driver / CUDA mismatch with bundled MinerU / embedding containers | Container image tags referenced in the release notes; see v0.5.0 changelog entry on parsing model dimension relaxation |
| Knowledge chunks split title from answer, no cross-chunk linkage | Parsing chunks by paragraph only; relevance ranking returns identical 100 % scores across hybrid/keyword/vector channels for unrelated chunks | Chunk creation & tokenization in api/apps/sdk/doc.py; related-search prompt in api/apps/sdk/session.py |
Additional observation: the v0.5.0 release un-pinned the parsing model from the previous 1024-dimension constraint, fixed a TypeError during PDF rendering in the knowledge-base viewer, and restored the file-upload entry point in the chat UI — confirming that several reported bugs across the two frontends converge on the parsing/chunking pipeline rather than on individual screens.
Source: https://github.com/zstar1003/ragflow-plus / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 23 structured pitfall item(s), including 14 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/182
2. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/185
3. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/256
4. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/233
5. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/239
6. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/257
7. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/183
8. Capability evidence risk: Capability evidence risk requires verification
- Severity: high
- Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/250
9. Runtime risk: Runtime risk requires verification
- Severity: high
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/240
10. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/238
11. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/245
12. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zstar1003/ragflow-plus/issues/249
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using ragflow-plus with real data or production workflows.
- Community source 1 - github / github_issue
- [[Question]: is there an English language option for the user management](https://github.com/zstar1003/ragflow-plus/issues/256) - github / github_issue
- Community source 3 - github / github_issue
- Community source 4 - github / github_issue
- Community source 5 - github / github_issue
- Community source 6 - github / github_issue
- Community source 7 - github / github_issue
- Community source 8 - github / github_issue
- Community source 9 - github / github_issue
- Community source 10 - github / github_issue
- Community source 11 - github / github_issue
- Community source 12 - github / github_issue
Source: Project Pack community evidence and pitfall evidence