dsRAG Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

dsRAG

High-performance retrieval engine for unstructured data

Overview & System Architecture

Related topics: Pluggable Retrieval Components, Core Retrieval Innovations

Purpose and Scope

dsRAG is a retrieval engine for unstructured data, optimized for challenging queries over dense text such as financial reports, legal documents, and academic papers. According to the project README, on the FinanceBench benchmark dsRAG reaches 96.6% accuracy versus 32% for a vanilla RAG baseline. Source: README.md.

The system is organized around a single high-level abstraction — the KnowledgeBase object — that takes in raw documents, performs chunking and embedding, persists state to disk, and at query time returns the most relevant segments of text. The KnowledgeBase is configured by composing six pluggable components: VectorDB, ChunkDB, Embedding, Reranker, LLM, and FileSystem. Source: README.md.

A dedicated sub-module called dsParse handles multimodal file parsing, semantic sectioning, and chunking. It can be used standalone via pip install dsparse or transparently inside a KnowledgeBase by enabling use_vlm=True in the file-parsing configuration. Source: dsrag/dsparse/README.md.

High-Level Architecture

The system separates ingestion-time concerns (parsing, sectioning, chunking, embedding, persistence) from query-time concerns (vector search, reranking, relevant segment extraction). The data flow below reflects the canonical pipeline described in the README.

flowchart LR
    A[Raw File or Text] --> B[dsParse<br/>VLM Parsing + Semantic Sectioning]
    B --> C[AutoContext<br/>Contextual Chunk Headers]
    C --> D[Embedding Model]
    D --> E[(VectorDB)]
    D --> F[(ChunkDB)]
    G[User Query] --> H[Vector Search]
    H --> E
    H --> I[Reranker]
    I --> J[Relevant Segment<br/>Extraction RSE]
    F --> J
    J --> K[LLM-generated Answer]

Three key methods drive the accuracy gains documented in the README benchmarks:

| Method | When it runs | What it does |
| --- | --- | --- |
| Semantic Sectioning | Ingestion | LLM identifies semantically cohesive sections and titles them |
| AutoContext | Ingestion | Prepends document + section context to each chunk header |
| Relevant Segment Extraction (RSE) | Query time | Combines adjacent relevant chunks into longer segments |

Source: README.md.

Core Components

The README enumerates the six configurable components of a KnowledgeBase. Each can be replaced by a custom subclass of its base class. Source: README.md.

VectorDB — stores embedding vectors and metadata. Available options include BasicVectorDB, WeaviateVectorDB, ChromaDB, QdrantVectorDB, MilvusDB, and PineconeDB. ChromaDB additionally supports metadata filtering at query time.
ChunkDB — stores chunk text keyed on (doc_id, chunk_index). Options: BasicChunkDB and SQLiteDB. RSE reads from here to reconstruct full segments.
Embedding — converts chunks (with AutoContext headers) to vectors. Options include OpenAIEmbedding, CohereEmbedding, VoyageAIEmbedding, and OllamaEmbedding. Community issue #6 requests broader local-model support such as sentence-transformers. Source: README.md.
Reranker — re-scores retrieved chunks before RSE. Options: CohereReranker, VoyageReranker, NoReranker.
LLM — used for AutoContext generation and (optionally) answer synthesis. Community issue #5 requests Llama 3-8B support for AutoContext.
FileSystem — controls where intermediate files (e.g., VLM page images, elements.json) are persisted.

The configuration surface is typed via TypedDict definitions in dsrag/dsparse/models/types.py. For example, FileParsingConfig accepts use_vlm, vlm_config, always_save_page_images, plus optional serialized vlm and vlm_fallback clients. Source: dsrag/dsparse/models/types.py.

dsParse Sub-Module

dsParse is the document ingestion engine. Its main entry point is parse_and_chunk(kb_id, doc_id, file_path, file_parsing_config, ...), which returns a tuple (sections, chunks). It supports two parsing modes:

Traditional text extraction — fast and inexpensive, suitable for PDFs with clean extractable text.
VLM (vision language model) parsing — uses a multimodal model such as gemini-2.0-flash to OCR pages, categorize elements into types (NarrativeText, Figure, Image, Table, etc.), and describe visual elements. Requires the external poppler dependency for PDF → image conversion. Source: dsrag/dsparse/README.md.

VLMs now expose a class-based client abstraction (GeminiVLM, etc.) analogous to LLM/Embedding/Reranker. Instances can be attached to a KnowledgeBase via the vlm_client constructor parameter, or supplied per-document inside file_parsing_config["vlm"]. A fallback client can be configured through vlm_fallback. Legacy dict-based vlm_config remains supported for backward compatibility. The system prefers the class-based client when both are present. Source: dsrag/dsparse/README.md.
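
The precedence rule described above can be sketched as a small resolution function. This is a minimal illustration under stated assumptions, not dsParse's actual code: `resolve_vlm` is a hypothetical helper, and the shape of the serialized client dict is made up for the example.

```python
from typing import Tuple


def resolve_vlm(file_parsing_config: dict) -> Tuple[str, dict]:
    """Pick the VLM settings for one document (hypothetical helper)."""
    client = file_parsing_config.get("vlm")
    if client:
        # The class-based (serialized) client wins when both are present.
        return "client", client
    # Otherwise fall back to the legacy provider/model dict, if any.
    return "legacy", file_parsing_config.get("vlm_config", {})


both = {
    "vlm": {"model": "gemini-2.0-flash"},  # illustrative serialized client
    "vlm_config": {"provider": "gemini", "model": "gemini-2.0-flash"},
}
legacy_only = {"vlm_config": {"provider": "gemini", "model": "gemini-2.0-flash"}}

print(resolve_vlm(both)[0])         # class-based path
print(resolve_vlm(legacy_only)[0])  # legacy path
```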

Community Considerations

Several open issues are directly relevant to the architecture:

Dependency isolation (#127) — dsrag/llm.py reportedly does an eager import google.generativeai instead of using dsrag.utils.imports.LazyLoader. This forces the dependency on every user even when Gemini is not used. Source: community context.
Platform compatibility (#61) — uvloop is unsupported on Windows, causing a RuntimeError for users on that OS. The runtime stack may need a platform-conditional fallback.
Local embedding models (#6) — users want sentence-transformers (and similar) as first-class Embedding components, which would make the system viable fully offline.
Local LLMs for AutoContext (#5) — Llama 3-8B is requested as an AutoContext LLM option. Currently the LLM component accepts provider/model pairs but does not enumerate local runners.
LangChain interop (#4) — proposed by subclassing LangChain's BaseRetriever and delegating _get_relevant_documents to kb.query.

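The local-model requests (#5, #6) map directly onto existing component slots (LLM and Embedding), which suggests they can be addressed by adding new subclasses rather than restructuring the core architecture. The LangChain request (#4) is a thin wrapper around kb.query, while the dependency and platform issues (#127, #61) concern imports and the runtime stack rather than the component model.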

Configuration Entry Points

The README documents the main configuration dictionaries passed to add_document:

file_parsing_config — controls VLM usage, element exclusion, concurrency, and DPI.
semantic_sectioning_config — selects the LLM provider/model and toggles sectioning.
chunking_config — chunk_size and min_length_for_chunking.
auto_context_config — toggles for document/section summary generation.
rse_params — max_length, overall_max_length, minimum_value, irrelevant_chunk_penalty, decay_rate, top_k_for_document_selection, chunk_length_adjustment.
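
As a rough sketch, the dictionaries above might be assembled like this before calling add_document. The key names come from this manual; every value is an illustrative placeholder rather than a documented default, and the exact add_document signature shown in the trailing comment is an assumption.

```python
# Illustrative configuration dictionaries. Key names follow the manual;
# all values are placeholders, not documented defaults.
file_parsing_config = {
    "use_vlm": True,
    "vlm_config": {
        "provider": "gemini",
        "model": "gemini-2.0-flash",
        "exclude_elements": ["Header", "Footer"],
        "dpi": 150,                        # placeholder
        "vlm_max_concurrent_requests": 5,  # placeholder
    },
}
semantic_sectioning_config = {
    "use_semantic_sectioning": True,
    "llm_provider": "openai",
    "model": "gpt-4o-mini",
}
chunking_config = {"chunk_size": 800, "min_length_for_chunking": 2000}
auto_context_config = {"get_document_summary": True, "get_section_summaries": False}
rse_params = {
    "max_length": 15,
    "overall_max_length": 30,
    "minimum_value": 0.5,
    "irrelevant_chunk_penalty": 0.2,
    "decay_rate": 30,
    "top_k_for_document_selection": 10,
    "chunk_length_adjustment": True,
}

# Assumed call shape (not executed here):
# kb.add_document(doc_id="report", file_path="report.pdf",
#                 file_parsing_config=file_parsing_config,
#                 semantic_sectioning_config=semantic_sectioning_config,
#                 chunking_config=chunking_config,
#                 auto_context_config=auto_context_config)
```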

Persistence is automatic: the full KnowledgeBase configuration is written to a JSON file upon creation and update, making the object reconstructible across processes. Source: README.md.

Pluggable Retrieval Components

Related topics: Overview & System Architecture, Core Retrieval Innovations, dsParse: Multimodal File Parsing & VLM Integration

dsRAG is built around a pluggable component architecture in which every part of the retrieval pipeline can be swapped without rewriting the surrounding logic. A KnowledgeBase is composed of six configurable components — VectorDB, ChunkDB, Embedding, Reranker, LLM, and FileSystem — plus optional helpers such as a VLM client. Each component has a default implementation, several built-in alternatives, and a documented extension point for fully custom subclasses.

Purpose and Scope

The pluggable design serves two goals. First, it lets users match infrastructure to deployment constraints (for example, switching from an in-memory BasicVectorDB to a managed PineconeDB without touching the embedding or chunking code). Second, it isolates dependency-heavy integrations behind a thin interface, so a project that only needs OpenAI embeddings is not forced to install every optional SDK. This pattern is reinforced throughout the codebase, including the dsParse sub-module, where VLM clients, file systems, and sectioning models follow the same class-based abstraction.

The top-level README states:

"There are six key components that define the configuration of a KnowledgeBase, each of which are customizable: 1. VectorDB, 2. ChunkDB, 3. Embedding, 4. Reranker, 5. LLM, 6. FileSystem. There are defaults for each of these components, as well as alternative options included in the repo. You can also define fully custom components by subclassing the base classes and passing in an instance of that subclass to the KnowledgeBase constructor."

Source: README.md
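
The subclass-and-inject pattern in the quote can be illustrated with a self-contained toy. All class names here (`RerankerBase`, `PassThroughReranker`, `KeywordOverlapReranker`, `MiniKnowledgeBase`) are hypothetical stand-ins, and the method signature is an assumption; dsRAG's real base classes may differ.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class RerankerBase(ABC):
    """Hypothetical stand-in for a component base class."""

    @abstractmethod
    def rerank(self, query: str, results: List[dict]) -> List[dict]:
        ...


class PassThroughReranker(RerankerBase):
    """Default: leave the vector-search order untouched."""

    def rerank(self, query: str, results: List[dict]) -> List[dict]:
        return results


class KeywordOverlapReranker(RerankerBase):
    """Custom component: order results by shared words with the query."""

    def rerank(self, query: str, results: List[dict]) -> List[dict]:
        terms = set(query.lower().split())
        return sorted(
            results,
            key=lambda r: len(terms & set(r["text"].lower().split())),
            reverse=True,
        )


class MiniKnowledgeBase:
    """Toy container showing constructor injection only."""

    def __init__(self, reranker: Optional[RerankerBase] = None) -> None:
        self.reranker = reranker or PassThroughReranker()

    def query(self, query: str, candidates: List[dict]) -> List[dict]:
        return self.reranker.rerank(query, candidates)


candidates = [{"text": "weather report"}, {"text": "quarterly revenue growth"}]
kb = MiniKnowledgeBase(reranker=KeywordOverlapReranker())
ranked = kb.query("revenue growth", candidates)
```

The surrounding code never changes when the reranker is swapped, which is the property the pluggable design relies on.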

The Six Core Components

VectorDB

Stores dense embedding vectors alongside a small amount of metadata. Built-in options include BasicVectorDB, WeaviateVectorDB, ChromaDB, QdrantVectorDB, MilvusDB, and PineconeDB. ChromaDB additionally supports metadata query filters using operators such as equals, not_equals, in, not_in, greater_than, less_than, greater_than_equals, and less_than_equals.

Source: README.md
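
To make the operator semantics concrete, the sketch below evaluates a filter against a metadata dict in plain Python. It does not call ChromaDB; the `{"field", "operator", "value"}` filter shape and the `matches` helper are assumptions made for illustration.

```python
import operator

# Map each operator name listed above to a Python comparison.
OPERATORS = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "in": lambda field_value, allowed: field_value in allowed,
    "not_in": lambda field_value, banned: field_value not in banned,
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "greater_than_equals": operator.ge,
    "less_than_equals": operator.le,
}


def matches(metadata: dict, metadata_filter: dict) -> bool:
    """Return True if the metadata satisfies the (assumed-shape) filter."""
    field = metadata_filter["field"]
    if field not in metadata:
        return False
    compare = OPERATORS[metadata_filter["operator"]]
    return compare(metadata[field], metadata_filter["value"])


recent = {"field": "year", "operator": "greater_than", "value": 2020}
filings = {"field": "doc_type", "operator": "in", "value": ["10-K", "10-Q"]}

print(matches({"year": 2023}, recent))
print(matches({"doc_type": "8-K"}, filings))
```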

ChunkDB

Stores the original text for every chunk in a nested dictionary keyed on doc_id and chunk_index. Relevant Segment Extraction (RSE) reads from this store to rebuild longer passages at query time. The supported options are BasicChunkDB and SQLiteDB.

Source: README.md
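
A minimal sketch of the nested-dictionary layout and of how a segment can be rebuilt from it. The `chunk_text` field name and the `get_segment_text` helper are assumptions for illustration, not the BasicChunkDB API.

```python
from typing import Dict

# doc_id -> chunk_index -> chunk record
chunk_db: Dict[str, Dict[int, dict]] = {
    "doc_a": {
        0: {"chunk_text": "Alpha."},
        1: {"chunk_text": "Beta."},
        2: {"chunk_text": "Gamma."},
    },
}


def get_segment_text(db: Dict[str, Dict[int, dict]], doc_id: str,
                     chunk_start: int, chunk_end: int) -> str:
    """Join chunks chunk_start..chunk_end-1 (end exclusive) into one passage."""
    return "\n".join(
        db[doc_id][i]["chunk_text"] for i in range(chunk_start, chunk_end)
    )


segment = get_segment_text(chunk_db, "doc_a", 1, 3)
```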

Embedding

Defines the embedding model. Built-in clients include OpenAIEmbedding, CohereEmbedding, VoyageAIEmbedding, and OllamaEmbedding. The OllamaEmbedding option is the in-tree path for running embeddings against a local model server, which is relevant to community interest in sentence-transformers and other local models.

Source: README.md

Reranker

Re-ranks the top results returned by the vector store before RSE runs. Cohere and Voyage AI clients are shipped; the NoReranker class is provided for users who want to disable the rerank step entirely.

Source: README.md

LLM

Drives AutoContext contextual chunk headers, semantic sectioning, and (optionally) the response generation step. Models from OpenAI, Anthropic, and Gemini are supported.

Source: README.md

FileSystem

Abstracts where intermediate artefacts such as page images and elements.json are persisted. A LocalFileSystem ships in the dsParse sub-module and is consumed by parse_and_chunk through a file_system argument.

Source: dsrag/dsparse/README.md

VLM Clients and the Parsing Pipeline

The dsParse sub-module extends the same pluggable pattern to multimodal parsing. A class-based VLM client (for example, GeminiVLM) can be passed at the KnowledgeBase level via vlm_client=..., or per document through a serialized dict in file_parsing_config["vlm"]. A fallback client is supported through vlm_fallback. The legacy dict-based path (vlm_config with provider/model) is still supported; when both are provided, the class-based client takes precedence.

The relevant typed configuration is defined in dsrag/dsparse/models/types.py:

| TypedDict | Notable keys |
| --- | --- |
| `VLMConfig` | `provider`, `model`, `fallback_provider`, `fallback_model`, `exclude_elements`, `element_types`, `dpi`, `vlm_max_concurrent_requests` |
| `FileParsingConfig` | `use_vlm`, `vlm_config`, `always_save_page_images`, `vlm`, `vlm_fallback` |
| `SemanticSectioningConfig` | `use_semantic_sectioning`, `llm_provider`, `model`, `language` |
| `ChunkingConfig` | `chunk_size`, `min_length_for_chunking` |

Source: dsrag/dsparse/models/types.py
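
For readers unfamiliar with TypedDict, here is a local re-declaration of two of the groups above. It is a sketch, not a copy of dsrag/dsparse/models/types.py: the field types and the use of `total=False` (to make every key optional) are assumptions.

```python
from typing import TypedDict


class ChunkingConfig(TypedDict, total=False):
    # Field types are assumed for illustration.
    chunk_size: int
    min_length_for_chunking: int


class SemanticSectioningConfig(TypedDict, total=False):
    use_semantic_sectioning: bool
    llm_provider: str
    model: str
    language: str


# With total=False a partial dict is still a valid instance for type checkers.
sectioning: SemanticSectioningConfig = {
    "use_semantic_sectioning": True,
    "model": "gpt-4o-mini",
}
```

At runtime a TypedDict is an ordinary dict, so the typing only helps static checkers and editors.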

A typical class-based override looks like:

from dsrag.knowledge_base import KnowledgeBase
from dsrag.dsparse.file_parsing.vlm_clients import GeminiVLM

kb = KnowledgeBase(
    kb_id="my_kb",
    vlm_client=GeminiVLM(model="gemini-2.0-flash"),
)

Source: dsrag/dsparse/README.md

Extending the System

Custom components are introduced by subclassing the base class and passing an instance to the KnowledgeBase constructor. This is the same mechanism the bundled options use, and it is the recommended way to integrate with LangChain retrievers or local models that are not yet first-class. Community discussions (for example, the request to add a LangChain BaseRetriever subclass whose _get_relevant_documents method calls kb.query) follow exactly this pattern, as do proposals to surface additional local LLMs and embedding backends. The README's section on auto_context_config, semantic_sectioning_config, chunking_config, and rse_params documents the configuration surface that custom components must respect.

Source: README.md

Component Interaction at Query Time

flowchart LR
    Q[Query] --> KB[KnowledgeBase]
    KB --> VDB[VectorDB]
    VDB --> RR[Reranker]
    RR --> RSE[Relevant Segment Extraction]
    RSE --> CDB[ChunkDB]
    CDB --> Segs[Segments]
    KB -.uses.-> LLM[LLM / AutoContext]
    KB -.uses.-> FS[FileSystem]

The diagram illustrates the runtime contract: VectorDB returns candidate chunks, the Reranker re-orders them, and RSE reads the full text back from ChunkDB to assemble longer segments. The LLM participates in the offline pipeline (AutoContext headers, semantic sectioning) and optionally at response-generation time. The FileSystem is touched only during ingestion or when VLM parsing is enabled.

Source: README.md

Operational Notes and Failure Modes

Environment variables — The README and dsParse README call out that GEMINI_API_KEY is required for GeminiVLM and that a clear error is raised when it is missing. Custom components should follow the same pattern.
Optional dependencies — Vector databases are now distributed as optional install groups, mirroring how the pluggable architecture is intended to keep core installs lean.
Backward compatibility — The dsParse vlm/vlm_fallback class-based path supersedes the legacy provider/model dict path only when both are supplied, so existing configurations continue to work.
Cross-platform — The component abstraction keeps heavy native dependencies (for example, the poppler binary needed for VLM PDF parsing) out of the core install path.

Source: README.md, dsrag/dsparse/README.md

Core Retrieval Innovations

Related topics: Overview & System Architecture, Pluggable Retrieval Components

dsRAG is a retrieval engine for unstructured data, especially dense text like financial reports, legal documents, and academic papers. Three retrieval innovations sit at the heart of its pipeline and drive its reported jump from ~32% to ~96.6% accuracy on the FinanceBench benchmark (Source: README.md). This page documents those three mechanisms: Semantic Sectioning, AutoContext (contextual chunk headers), and Relevant Segment Extraction (RSE), plus how they are configured through the KnowledgeBase API.

Architecture Overview

The three innovations are applied at distinct stages of the retrieval pipeline. Semantic Sectioning reshapes the document before chunking, AutoContext enriches the chunk text before embedding, and RSE reconstructs longer passages after the vector + reranker search.

flowchart LR
    A[Raw Document] --> B[Semantic Sectioning]
    B --> C[Section-aware Chunking]
    C --> D[AutoContext Headers]
    D --> E[VectorDB + Reranker Search]
    E --> F[Relevant Segment Extraction]
    F --> G[Final Segments to LLM]

Source: README.md, dsrag/rse.py

Semantic Sectioning

Semantic sectioning is an offline, ingest-time step that uses an LLM to break a document into "semantically cohesive" sections. The implementation annotates the document with line numbers and prompts the LLM to identify the starting line for each section. Sections are expected to span a few paragraphs to a few pages, and the LLM also produces descriptive titles for each (Source: README.md).
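
The line-numbering idea can be sketched without any LLM call. In the example below, `section_starts` stands in for the model's structured output; both helper functions and the `[n]` annotation format are hypothetical, not the code in semantic_sectioning.py.

```python
from typing import List, Tuple


def annotate_with_line_numbers(lines: List[str]) -> str:
    """Prefix each line with its index so a model can refer to it."""
    return "\n".join(f"[{i}] {line}" for i, line in enumerate(lines))


def split_into_sections(lines: List[str],
                        section_starts: List[Tuple[int, str]]) -> List[dict]:
    """Turn (start_line, title) pairs into sections covering the lines."""
    ordered = sorted(section_starts)
    sections = []
    for idx, (start, title) in enumerate(ordered):
        # A section runs until the next section's start (or the document end).
        end = ordered[idx + 1][0] if idx + 1 < len(ordered) else len(lines)
        sections.append({
            "title": title,
            "start": start,
            "end": end - 1,  # inclusive index of the last line
            "content": "\n".join(lines[start:end]),
        })
    return sections


lines = [
    "Revenue grew 12%.",
    "Margins were flat.",
    "Risk factors include FX.",
    "Litigation is pending.",
]
prompt_body = annotate_with_line_numbers(lines)
sections = split_into_sections(lines, [(0, "Financial Results"), (2, "Risk Factors")])
```

Asking only for starting lines keeps the model's output short, since the section text itself is recovered from the original lines.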

The behavior is encoded in dsrag/dsparse/sectioning_and_chunking/semantic_sectioning.py. The SemanticSectioningConfig typed dict in dsrag/dsparse/models/types.py exposes use_semantic_sectioning, llm_provider, model, and language as the relevant parameters. Document text is processed in roughly 5,000-token mega-chunks in parallel, so even multi-hundred-page documents finish in 5–10 seconds. The default model is gpt-4o-mini, and gemini-2.0-flash or claude-3-5-haiku-latest are also listed as compatible options (Source: README.md).

The downstream benefit is twofold: section titles feed into the AutoContext headers, and section boundaries constrain the chunker in dsrag/dsparse/sectioning_and_chunking/chunking.py so that chunks do not straddle unrelated topics.

AutoContext (Contextual Chunk Headers)

AutoContext solves a well-known embedding problem: an isolated chunk often lacks the context needed to embed it accurately. The auto_context module in dsrag/auto_context.py generates a textual header per chunk containing document-level context (title, summary) and section-level context (section title and, optionally, a section summary), then prepends it before embedding. The chunk used for retrieval is the header plus the original text, while the stored chunk body remains the original (Source: README.md).
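
The header-plus-chunk idea is easy to show in isolation. The sketch below uses an invented header wording and hypothetical helper names; the actual format produced by dsrag/auto_context.py may differ.

```python
def build_chunk_header(document_title: str, document_summary: str = "",
                       section_title: str = "", section_summary: str = "") -> str:
    """Assemble document- and section-level context into one header string."""
    parts = [f"Document: {document_title}"]
    if document_summary:
        parts.append(f"Document summary: {document_summary}")
    if section_title:
        parts.append(f"Section: {section_title}")
    if section_summary:
        parts.append(f"Section summary: {section_summary}")
    return "\n".join(parts)


def text_for_embedding(header: str, chunk_text: str) -> str:
    """The embedded text is the header followed by the original chunk."""
    return f"{header}\n\n{chunk_text}"


chunk_text = "Net revenue increased 12% year over year."
header = build_chunk_header("ACME 10-K 2023", section_title="Results of Operations")
embedded_text = text_for_embedding(header, chunk_text)
stored_text = chunk_text  # the stored chunk body stays unchanged
```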

Key configuration flags exposed in KnowledgeBase.add_document include:

| Flag | Effect |
| --- | --- |
| `use_generated_title` | Use an LLM-generated document title instead of the user-supplied one |
| `get_document_summary` | Include an LLM-generated document summary in each header |
| `get_section_summaries` | Include LLM-generated section summaries in each header |
| `document_title_guidance`, `section_summarization_guidance` | Free-form prompt guidance strings |

Source: README.md

Because AutoContext is an LLM call per document (and sometimes per section), there is open community interest in using locally hosted models such as Llama 3-8B for this step (Source: issue #5). The LLM client is currently configured through the KnowledgeBase constructor, and the same concern — that the provider abstraction should make local model backends pluggable — also appears in requests for sentence-transformers embeddings (Source: issue #6).

Relevant Segment Extraction (RSE)

RSE is a query-time post-processing step. The vector search plus reranker returns a ranked list of individual chunks, but many real questions are answered by a contiguous block of text longer than one chunk. RSE clusters the top-ranked chunks, scores contiguous runs, and emits the highest-scoring runs as "segments" (Source: dsrag/rse.py, README.md).

The algorithm relies on ChunkDB to resolve the full text of each chunk by (doc_id, chunk_index), then applies a length-aware scoring heuristic so longer contiguous runs are not unfairly penalized. The main tuning knobs live under rse_params in the KnowledgeBase configuration:

| Parameter | Purpose |
| --- | --- |
| `max_length` | Maximum length of a single segment, in chunks |
| `overall_max_length` | Maximum total length across all returned segments |
| `minimum_value` | Relevance floor below which a segment is dropped |
| `irrelevant_chunk_penalty` | Penalty (0–1) applied when a low-scoring chunk sits inside a run |
| `overall_max_length_extension` | Per-query extension to `overall_max_length` |
| `decay_rate` | Exponential decay applied across segment boundaries |
| `top_k_for_document_selection` | How many distinct documents to consider |
| `chunk_length_adjustment` | Whether to scale chunk scores by chunk length before segment scoring |

Source: README.md
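
To show how a few of these parameters interact, here is a simplified, single-document sketch of segment selection. It is an illustration under stated assumptions rather than the algorithm in dsrag/rse.py: it uses a greedy search over contiguous runs, treats unretrieved chunks as having zero relevance, and ignores `decay_rate`, `chunk_length_adjustment`, and `top_k_for_document_selection`.

```python
from typing import Dict, List, Tuple


def chunk_values(relevance: Dict[int, float], n_chunks: int,
                 irrelevant_chunk_penalty: float) -> List[float]:
    """Subtract the penalty so weak or unretrieved chunks become negative."""
    return [relevance.get(i, 0.0) - irrelevant_chunk_penalty for i in range(n_chunks)]


def best_segments(values: List[float], max_length: int, overall_max_length: int,
                  minimum_value: float) -> List[Tuple[int, int]]:
    """Greedily pick non-overlapping (start, end) runs; end is exclusive."""
    segments: List[Tuple[int, int]] = []
    used = 0
    while used < overall_max_length:
        best, best_value = None, 0.0
        for start in range(len(values)):
            last_end = min(start + max_length,
                           start + overall_max_length - used,
                           len(values))
            for end in range(start + 1, last_end + 1):
                if any(start < e and end > s for s, e in segments):
                    break  # longer runs from this start would overlap too
                value = sum(values[start:end])
                if value >= minimum_value and (best is None or value > best_value):
                    best, best_value = (start, end), value
        if best is None:
            break
        segments.append(best)
        used += best[1] - best[0]
    return segments


# Chunks 2, 4, and 9 were retrieved; chunk 3 sits between two relevant chunks.
values = chunk_values({2: 0.9, 4: 0.8, 9: 0.7}, n_chunks=12,
                      irrelevant_chunk_penalty=0.2)
segments = best_segments(values, max_length=5, overall_max_length=6,
                         minimum_value=0.3)
print(segments)
```

In this toy input, chunk 3 is absorbed into the first segment because the relevance of its neighbors outweighs the penalty, which is the behavior the penalty parameter is meant to tune.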

Because RSE re-reads chunk text from ChunkDB, it is sensitive to how chunks are stored. The metadata layer in dsrag/metadata.py is what makes a segment addressable and presentable to the downstream generator.

How the Three Innovations Combine

The combined effect of section-aware chunking, contextual headers, and segment-level retrieval is the basis of dsRAG's reported benchmark results. The README's KITE table compares vanilla top-k retrieval, RSE alone, contextual headers alone, and the combined default configuration; the combined configuration is uniformly the strongest across the AI Papers, BVP Cloud 10-Ks, Sourcegraph handbook, and Supreme Court opinions datasets (Source: README.md).

For users who want to consume this pipeline from another framework, the recommended pattern is to subclass the host framework's retriever base class and call kb.query from _get_relevant_documents, so the same three innovations are reused unchanged (Source: issue #4). This is also what dsrag/auto_query.py is designed to support for query-side enrichment.

dsParse: Multimodal File Parsing & VLM Integration

Related topics: Overview & System Architecture, Pluggable Retrieval Components

Overview

dsParse is a sub-module of dsrag that performs multimodal file parsing, semantic sectioning, and chunking. It accepts a file path plus configuration and returns clean, structured chunks ready for embedding and retrieval. The module can be used on its own via the dsparse pip package, or transparently through a KnowledgeBase by setting use_vlm=True in file_parsing_config. Source: dsrag/dsparse/README.md.

The core motivation behind dsParse is to handle documents where vanilla text extraction fails or loses fidelity — scanned PDFs, complex layouts, dense tables, and figures. By delegating page understanding to a Vision Language Model (VLM), dsParse produces rich descriptions of visual content and structurally accurate text, dramatically improving downstream retrieval. Source: dsrag/dsparse/README.md.

Architecture and Data Flow

When VLM parsing is enabled, dsParse converts each PDF page into an image (via poppler), sends the image and a structured prompt to a VLM, and receives a categorized list of page elements. Elements are typed, optionally described (for visuals), and then concatenated into lines that flow into the semantic sectioner and chunker. The default VLM is gemini-2.0-flash, chosen for fast, cost-effective, near-SOTA performance. Source: dsrag/dsparse/README.md.

flowchart TD
    A[PDF / file_path] --> B[Poppler: convert to page images]
    B --> C[VLM Client: GeminiVLM]
    C --> D[Element list<br/>NarrativeText, Table, Figure, etc.]
    D --> E[Annotate with line numbers]
    E --> F[Semantic Sectioner LLM]
    F --> G[Sections w/ titles]
    G --> H[Chunker]
    H --> I[(sections, chunks)]
    I --> J[AutoContext chunk headers]
    J --> K[(Embedding + VectorDB)]

A non-VLM parsing path (non_vlm_file_parsing.py) remains available for cases where a VLM is undesirable (cost, latency, or platform restrictions). The selection between VLM and non-VLM parsing is controlled by the use_vlm flag inside FileParsingConfig. Source: dsrag/dsparse/models/types.py.

Element Types

Page content is sorted into eight element types by default: NarrativeText, Figure, Image, Table, Header, Footnote, Footer, and Equation. Users may define custom types by supplying an element_types list, or exclude existing ones through exclude_elements. By default, Header and Footer are excluded because they rarely carry semantic value and disrupt cross-page flow. Source: dsrag/dsparse/README.md.

The Element and Line types defined in models/types.py carry fields such as type, content, page_number, and is_visual — these flow through the pipeline and inform whether a chunk should include a visual description. Source: dsrag/dsparse/models/types.py.
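
A small sketch of the exclusion step, using plain dicts with the `type`, `content`, and `page_number` fields mentioned above. The `keep_elements` helper is hypothetical and only illustrates the default Header/Footer filtering.

```python
from typing import FrozenSet, List

DEFAULT_ELEMENT_TYPES = [
    "NarrativeText", "Figure", "Image", "Table",
    "Header", "Footnote", "Footer", "Equation",
]
DEFAULT_EXCLUDED: FrozenSet[str] = frozenset({"Header", "Footer"})


def keep_elements(elements: List[dict],
                  exclude_elements: FrozenSet[str] = DEFAULT_EXCLUDED) -> List[dict]:
    """Drop elements whose type is in the exclusion set, preserving order."""
    return [e for e in elements if e["type"] not in exclude_elements]


page_elements = [
    {"type": "Header", "content": "ACME Annual Report", "page_number": 3},
    {"type": "NarrativeText", "content": "Revenue grew 12%.", "page_number": 3},
    {"type": "Table", "content": "| Year | Revenue |", "page_number": 3},
    {"type": "Footer", "content": "Page 3", "page_number": 3},
]
kept = keep_elements(page_elements)
```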

Configuration

Configuration is exposed through three TypedDict groups: FileParsingConfig, SemanticSectioningConfig, and ChunkingConfig. All fields are optional and fall back to module-level defaults when omitted. Source: dsrag/dsparse/models/types.py.

| Config Group | Key Field | Purpose |
| --- | --- | --- |
| `FileParsingConfig` | `use_vlm` | Toggle VLM parsing (default False) |
| `FileParsingConfig` | `vlm` / `vlm_fallback` | Serialized class-based VLM clients |
| `FileParsingConfig` | `vlm_config` | Legacy dict: provider, model, dpi, concurrency |
| `FileParsingConfig` | `always_save_page_images` | Persist rasterized pages for reuse |
| `SemanticSectioningConfig` | `use_semantic_sectioning` | Toggle LLM-based sectioning |
| `SemanticSectioningConfig` | `llm_provider` / `model` | Sectioning LLM (default `gpt-4o-mini`) |
| `ChunkingConfig` | `chunk_size` | Max characters per chunk |
| `ChunkingConfig` | `min_length_for_chunking` | Skip chunking below this length |

Source: dsrag/dsparse/models/types.py.

VLM Clients

dsParse provides class-based VLM clients (e.g., GeminiVLM) that mirror the abstraction used for LLMs, embeddings, and rerankers. They support .to_dict() serialization so they can be persisted in configuration and rehydrated at runtime. Source: dsrag/dsparse/README.md.
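
The serialize-then-rehydrate round trip can be sketched with a toy client. `SketchVLM`, the `subclass_name` key, the `from_dict` constructor, and the registry lookup are all assumptions made for this example; they are not taken from dsParse's GeminiVLM.

```python
import json


class SketchVLM:
    """Hypothetical client used only to illustrate to_dict round-tripping."""

    def __init__(self, model: str, temperature: float = 0.0) -> None:
        self.model = model
        self.temperature = temperature

    def to_dict(self) -> dict:
        # Record the class name so the right class can be rebuilt later.
        return {
            "subclass_name": type(self).__name__,
            "model": self.model,
            "temperature": self.temperature,
        }

    @classmethod
    def from_dict(cls, config: dict) -> "SketchVLM":
        params = {k: v for k, v in config.items() if k != "subclass_name"}
        return cls(**params)


REGISTRY = {"SketchVLM": SketchVLM}

serialized = json.dumps(SketchVLM("gemini-2.0-flash").to_dict())
loaded = json.loads(serialized)
restored = REGISTRY[loaded["subclass_name"]].from_dict(loaded)
```

Because the serialized form is plain JSON, it can sit inside a per-document configuration dict and be rebuilt in another process.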

A vlm_fallback client may be supplied alongside the primary client. If requests still fail after the initial retries, the system alternates between the two clients. This mirrors the fallback patterns used elsewhere in dsRAG. Source: dsrag/dsparse/README.md.

Legacy dict-based configuration (e.g., vlm_config={"provider": "gemini", "model": "gemini-2.0-flash"}) remains fully supported. When both a serialized client and a legacy dict are supplied, the serialized client takes precedence. Source: dsrag/dsparse/README.md.

Usage Patterns

Standalone parsing. The parse_and_chunk function is the primary entry point and accepts file_path, file_parsing_config, and optionally a file_system parameter (e.g., LocalFileSystem(base_path="~/dsParse")) for persisting intermediate artifacts. Source: dsrag/dsparse/README.md.

KnowledgeBase integration. A VLM client can be attached at the KB level via KnowledgeBase(..., vlm_client=GeminiVLM(model="...")), then overridden per document by passing a serialized client under file_parsing_config["vlm"]. This is the recommended pattern when different documents need different models or cost profiles. Source: dsrag/dsparse/README.md.

Reusing pre-extracted images. When page images already exist in the configured FileSystem directory, pass vlm_config={"images_already_exist": True} to skip rasterization and avoid redundant API calls. Source: dsrag/dsparse/README.md.

Cost, Latency, and Common Pitfalls

Cost. VLM parsing with gemini-2.0-flash is approximately $0.10 per 1000 pages (assuming 4 × 258-token image tiles per page at standard DPI plus a ~500-token prompt). Semantic sectioning with gpt-4o-mini is roughly $0.15 per 1000 pages. Source: dsrag/dsparse/README.md.
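
The token assumptions in the cost estimate can be turned into a quick calculation. This sketch only counts input tokens from the figures quoted above; the per-token price is left as a parameter because actual provider pricing and output-token usage are not modeled here.

```python
TILES_PER_PAGE = 4     # image tiles per page, from the estimate above
TOKENS_PER_TILE = 258  # tokens per image tile
PROMPT_TOKENS = 500    # approximate prompt size per page


def input_tokens(pages: int) -> int:
    """Input tokens sent to the VLM for the given number of pages."""
    return pages * (TILES_PER_PAGE * TOKENS_PER_TILE + PROMPT_TOKENS)


def input_cost_usd(pages: int, usd_per_million_input_tokens: float) -> float:
    """Input-side cost only; the price argument is supplied by the caller."""
    return input_tokens(pages) / 1_000_000 * usd_per_million_input_tokens


print(input_tokens(1))     # tokens for one page
print(input_tokens(1000))  # tokens for 1000 pages
```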

Latency. A single page takes ~15–20 seconds for VLM parsing. Documents are page-parallelized within the rate-limit budget (vlm_max_concurrent_requests). Sectioning operates on ~5000-token mega-chunks (≈10 pages) processed in parallel, typically completing a few-hundred-page document in 5–10 seconds. Source: dsrag/dsparse/README.md.

Platform compatibility. dsParse depends on poppler for PDF rasterization (brew install poppler on macOS). Community issue #61 reports that some async runtimes (uvloop) fail on Windows, which is relevant when integrating dsParse into high-throughput async pipelines. Source: community issue #61.
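
One way to sidestep the Windows failure is a platform-conditional event loop choice. The sketch below is a generic pattern, not code from dsRAG: `make_event_loop` is a hypothetical helper that falls back to the standard asyncio loop when uvloop is unavailable.

```python
import asyncio
import sys


def make_event_loop() -> asyncio.AbstractEventLoop:
    """Return a uvloop-backed loop where possible, else the stdlib loop."""
    if sys.platform != "win32":
        try:
            import uvloop  # optional dependency; not supported on Windows
            return uvloop.new_event_loop()
        except ImportError:
            pass
    return asyncio.new_event_loop()


async def demo() -> str:
    await asyncio.sleep(0)
    return "ok"


loop = make_event_loop()
try:
    result = loop.run_until_complete(demo())
finally:
    loop.close()
```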

Optional dependencies. Community issue #127 highlights that some optional dependencies (e.g., google.generativeai) are imported eagerly elsewhere in the codebase rather than via LazyLoader. This is worth noting when assembling minimal environments for dsParse on its own. Source: community issue #127.
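
The lazy-import idea behind that issue can be sketched in a few lines. `LazyModule` below is a hypothetical stand-in written for this example; it is not the code of dsrag.utils.imports.LazyLoader.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


class LazyModule:
    """Defer importing a module until one of its attributes is first used."""

    def __init__(self, module_name: str) -> None:
        self._module_name = module_name
        self._module: Optional[ModuleType] = None

    def __getattr__(self, attr: str) -> Any:
        # Reached only for names not set in __init__, i.e. module attributes.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)


json_mod = LazyModule("json")             # nothing is imported yet
loaded_before = json_mod._module is not None
encoded = json_mod.dumps({"a": 1})        # first use triggers the import
loaded_after = json_mod._module is not None
```

With this pattern, a missing optional SDK raises an ImportError only when the feature that needs it is actually used.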

Environment variables. GEMINI_API_KEY is required for GeminiVLM; a clear error is raised at instantiation if missing. Other providers are expected to follow the same convention. Source: dsrag/dsparse/README.md.

Doramagic Pitfall Log

Found 13 structured pitfall item(s), including 2 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/113

2. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/117

3. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/73

4. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/127

5. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/116

6. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/D-Star-AI/dsRAG

7. Runtime risk: Runtime risk requires verification

Severity: medium
Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/124

8. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/D-Star-AI/dsRAG

9. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo (the project record flags that no demo is available).
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/D-Star-AI/dsRAG

10. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo (the project record flags that no demo is available).
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/D-Star-AI/dsRAG

11. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/D-Star-AI/dsRAG/issues/118

12. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/D-Star-AI/dsRAG

Source: Doramagic discovery, validation, and Project Pack records
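
The recommended check repeated throughout the log above (reproduce the official install and quickstart path in an isolated environment) can be sketched with Python's standard venv module. This is a minimal sketch under stated assumptions, not the project's documented procedure: the helper names and the .dsrag-check directory are hypothetical, and the package name dsrag is an assumption to confirm against the README before running.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment.

    The layout differs between Windows (Scripts/) and POSIX (bin/).
    """
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, package: str = "dsrag") -> list[str]:
    """Build the pip command that installs `package` into the environment.

    The default package name is an assumption; check the README first.
    """
    return [str(venv_python(env_dir)), "-m", "pip", "install", package]


def reproduce_install(env_dir: Path, package: str = "dsrag") -> None:
    """Create a fresh environment and install the package into it.

    clear=True wipes any existing directory so each run starts clean.
    check=True raises CalledProcessError if pip exits with an error.
    """
    venv.EnvBuilder(with_pip=True, clear=True).create(env_dir)
    subprocess.run(install_command(env_dir, package), check=True)


# Example (hypothetical directory name; requires network access):
# reproduce_install(Path(".dsrag-check"))
```

Because the environment is recreated on every run, a failure here points at the install path itself rather than at leftover state from an earlier attempt. The quickstart steps from the README can then be run with the interpreter returned by venv_python.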

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 9 (the count of project-level external discussion links exposed on this manual page)

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using dsRAG with real data or production workflows.

raise JSONDecodeError("Extra data", s, end) json.decoder.JSONDecodeError - github / github_issue
llm.py directly imports google.generativeai instead of using LazyLoader - github / github_issue
A bug at custom_term_mapping? - github / github_issue
Is ChunkDB really needed? - github / github_issue
WeaviateVectorDB fails to connect with Weaviate v4 client - missing grpc - github / github_issue
sqlite3.OperationalError: no such column: model_response_status - github / github_issue
About Performance of Semantic Chunk - github / github_issue
Import "dsrag.document_parsing" from the README example couldn't be reso - github / github_issue
Capability evidence risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
