deep-searcher Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

deep-searcher

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

Project Overview & System Architecture

Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies

1. Purpose and Scope

DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines private knowledge bases with optional web search to answer complex queries. The repository is organized around a modular agent architecture in which each agent encapsulates a different retrieval-and-reasoning strategy. The framework plugs in an LLM, an embedding model, and a vector database as swappable backends, and exposes a unified query interface to the end user. As described in the bundled evaluation notes, the project is positioned to handle complex multi-hop questions and supports recall-based evaluation against datasets such as 2WikiMultiHopQA. Source: evaluation/README.md:1-15

The system's high-level design is centered on three RAG agent implementations (NaiveRAG, ChainOfRAG, DeepSearch) and a routing layer (RAGRouter) that picks the best implementation for a given query. This plug-in approach makes the framework easy to extend with new agents, vector stores, or LLMs.

2. Core Architectural Components

The codebase is structured around a small set of abstract base classes and concrete implementations:

Component	File	Role
`BaseAgent`	deepsearcher/agent/base.py	Abstract root for any agent; defines `invoke(query, **kwargs)`.
`RAGAgent`	deepsearcher/agent/base.py	Subclass of `BaseAgent` for RAG-style agents; requires `retrieve()` and `query()` returning `(answer, results, token_usage)`.
`NaiveRAG`	deepsearcher/agent/naive_rag.py	Simple retrieve-then-summarize agent.
`ChainOfRAG`	deepsearcher/agent/chain_of_rag.py	Multi-step iterative RAG with reflection and supported-doc filtering.
`DeepSearch`	deepsearcher/agent/deep_search.py	Sub-query decomposition, async retrieve, rerank, reflection, gap-query generation.
`RAGRouter`	deepsearcher/agent/rag_router.py	LLM-driven router that picks a single agent for a given query.
`CollectionRouter`	referenced from each agent	Selects relevant vector-DB collections before retrieval.

BaseAgent only mandates an invoke method, while RAGAgent adds the contract retrieve(query, **kwargs) -> (List[RetrievalResult], int, dict) and query(query, **kwargs) -> (str, List[RetrievalResult], int). Source: deepsearcher/agent/base.py:30-77. Because every agent honors this contract, the agents are interchangeable inside the router.
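
The contract can be pictured with a small, self-contained sketch. It does not reproduce deepsearcher/agent/base.py: the RetrievalResult dataclass, the EchoAgent class, and the choice to let invoke delegate to query are hypothetical stand-ins, and only the method names and return shapes come from the description above.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class RetrievalResult:
    """Hypothetical stand-in for the project's retrieval result type."""
    text: str
    reference: str
    metadata: dict = field(default_factory=dict)


class BaseAgent(ABC):
    @abstractmethod
    def invoke(self, query: str, **kwargs) -> Any:
        """The only method the root class requires."""


class RAGAgent(BaseAgent):
    @abstractmethod
    def retrieve(self, query: str, **kwargs) -> Tuple[List[RetrievalResult], int, dict]:
        """Return (results, token_usage, dict of extra information)."""

    @abstractmethod
    def query(self, query: str, **kwargs) -> Tuple[str, List[RetrievalResult], int]:
        """Return (answer, results, token_usage)."""

    def invoke(self, query: str, **kwargs) -> Any:
        # Assumption for this sketch: invoke simply delegates to query.
        return self.query(query, **kwargs)


class EchoAgent(RAGAgent):
    """Toy agent: no LLM or vector DB, just the right return shapes."""

    def retrieve(self, query: str, **kwargs):
        hit = RetrievalResult(text=f"stub chunk for: {query}", reference="memory")
        return [hit], 0, {}

    def query(self, query: str, **kwargs):
        results, tokens, _ = self.retrieve(query, **kwargs)
        return f"answer built from {len(results)} chunk(s)", results, tokens
```

Any class that returns these tuple shapes could be handed to a router without the router knowing which strategy sits behind it.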

3. Agent Implementations and Their Strategies

3.1 NaiveRAG

NaiveRAG performs a single retrieval pass and asks the LLM to summarize the retrieved chunks. It supports optional route_collection (using CollectionRouter to pick collections) and text_window_splitter (which substitutes metadata["wider_text"] for each chunk in the summary prompt). The summary prompt instructs the LLM to behave as a content analysis expert that consolidates chunks into a "specific and detailed answer or report." Source: deepsearcher/agent/naive_rag.py:1-130
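
The effect of text_window_splitter on the summary prompt can be sketched as follows. This is an illustration under stated assumptions, not the code in naive_rag.py: chunks are modeled as plain dicts, and the function names and the prompt layout are hypothetical.

```python
from typing import Dict, List


def texts_for_summary(chunks: List[Dict], text_window_splitter: bool = True) -> List[str]:
    """Pick the text shown to the LLM for each retrieved chunk."""
    texts = []
    for chunk in chunks:
        metadata = chunk.get("metadata", {})
        if text_window_splitter and "wider_text" in metadata:
            # Prefer the wider window stored next to the chunk.
            texts.append(metadata["wider_text"])
        else:
            texts.append(chunk["text"])
    return texts


def build_summary_prompt(question: str, texts: List[str]) -> str:
    """Hypothetical prompt layout: the question followed by numbered chunks."""
    body = "\n".join(f"<chunk_{i}>\n{text}\n</chunk_{i}>" for i, text in enumerate(texts))
    return f"Question: {question}\n{body}"


chunks = [
    {"text": "short A", "metadata": {"wider_text": "context before. short A. context after."}},
    {"text": "short B", "metadata": {}},
]
```

With the flag enabled, the first chunk contributes its wider window and the second falls back to its raw text because no wider_text is stored for it.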

3.2 ChainOfRAG

ChainOfRAG runs up to max_iter rounds. At each iteration it generates a follow-up sub-query, retrieves documents, asks the LLM for an intermediate answer, lets the LLM select which documents supported that answer (_get_supported_docs), and then uses a reflection prompt (REFLECTION_PROMPT) to decide whether to stop early. Final answers are synthesized with FINAL_ANSWER_PROMPT. The class docstring notes it is "very suitable for handling concrete factual queries and multi-hop questions," inspired by the paper referenced in the file. Source: deepsearcher/agent/chain_of_rag.py:1-200
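
The loop described above can be sketched with the LLM and retrieval steps injected as plain callables. This is a simplified illustration, not the implementation in chain_of_rag.py: run_chain, its parameter names, and the demo stubs are all hypothetical, and the prompts themselves are left out.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (sub_query, intermediate_answer)


def run_chain(
    question: str,
    next_sub_query: Callable[[str, List[Step]], str],
    retrieve: Callable[[str], List[str]],
    answer_step: Callable[[str, List[str]], str],
    supported_docs: Callable[[str, str, List[str]], List[str]],
    is_sufficient: Callable[[str, List[Step]], bool],
    max_iter: int = 4,
    early_stopping: bool = False,
) -> Tuple[List[Step], List[str]]:
    history: List[Step] = []
    kept_docs: List[str] = []
    for _ in range(max_iter):
        sub_query = next_sub_query(question, history)
        docs = retrieve(sub_query)
        answer = answer_step(sub_query, docs)
        # Keep only the documents judged to support this intermediate answer.
        kept_docs.extend(supported_docs(sub_query, answer, docs))
        history.append((sub_query, answer))
        # Reflection step: optionally stop once the context looks sufficient.
        if early_stopping and is_sufficient(question, history):
            break
    return history, kept_docs


def demo(early_stopping: bool) -> Tuple[List[Step], List[str]]:
    """Run the loop with trivial stubs in place of the LLM and vector DB."""
    return run_chain(
        "toy question",
        next_sub_query=lambda q, h: f"step {len(h) + 1} of: {q}",
        retrieve=lambda sq: [f"doc about {sq}"],
        answer_step=lambda sq, docs: f"answer from {len(docs)} doc(s)",
        supported_docs=lambda sq, ans, docs: docs,
        is_sufficient=lambda q, h: len(h) >= 2,
        early_stopping=early_stopping,
    )
```

In the demo, the stubbed reflection reports sufficiency after two steps, so the run with early stopping ends after two iterations while the run without it uses all max_iter rounds.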

3.3 DeepSearch

DeepSearch is the most elaborate agent. It decomposes the original query into up to four sub-questions (SUB_QUERY_PROMPT), retrieves and reranks chunks (RERANK_PROMPT), reflects to find gaps (REFLECT_PROMPT), and iterates up to max_iter times. It exposes both retrieve (synchronous wrapper) and async_retrieve, using asyncio.run to bridge them. The class docstring positions it for "general and simple queries, such as given a topic and then writing a report, survey, or article." Source: deepsearcher/agent/deep_search.py:1-200
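
The synchronous-over-asynchronous bridge mentioned above can be illustrated with the standard library alone. The sketch below is not taken from deep_search.py: search_one is a hypothetical placeholder for a vector search, and fanning out sub-queries with asyncio.gather is an assumption about how concurrent retrieval could be organized.

```python
import asyncio
from typing import List


async def search_one(sub_query: str) -> List[str]:
    """Hypothetical placeholder for one asynchronous vector search."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return [f"chunk for: {sub_query}"]


async def async_retrieve(sub_queries: List[str]) -> List[str]:
    # Run all sub-query searches concurrently; gather preserves input order.
    batches = await asyncio.gather(*(search_one(q) for q in sub_queries))
    return [chunk for batch in batches for chunk in batch]


def retrieve(sub_queries: List[str]) -> List[str]:
    # Synchronous wrapper: start an event loop, run the coroutine, return.
    return asyncio.run(async_retrieve(sub_queries))
```

One consequence of this pattern is that asyncio.run raises a RuntimeError when it is called from a thread that already has a running event loop (for example inside a notebook cell), so callers in that situation would await the async variant directly.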

4. Query Flow and System Wiring

The end-to-end flow calls the LLM at two levels: once in the agent router, and then one or more times inside the chosen agent. RAGRouter._route formats a prompt that lists each agent's index and __description__ (auto-populated via the describe_class decorator) and asks the LLM to return a single index. If the LLM's output is not purely numeric, which is common with reasoning models, a fallback extracts the last digit. Source: deepsearcher/agent/rag_router.py:1-90
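
The parse-then-fall-back behavior can be sketched as below. This is a minimal illustration rather than the code in rag_router.py: parse_agent_index is a hypothetical helper, find_last_digit is re-implemented here from its name alone, and the 1-based index range is an assumption.

```python
import re
from typing import Optional


def find_last_digit(text: str) -> Optional[int]:
    """Return the last ASCII digit in the text, or None if there is none."""
    digits = re.findall(r"[0-9]", text)
    return int(digits[-1]) if digits else None


def parse_agent_index(llm_output: str, num_agents: int) -> int:
    """Parse the router reply; assumes agents are numbered 1..num_agents."""
    try:
        index: Optional[int] = int(llm_output.strip())
    except ValueError:
        # Verbose reply: fall back to the last digit that appears in it.
        index = find_last_digit(llm_output)
    if index is None or not 1 <= index <= num_agents:
        raise ValueError(f"cannot route from reply: {llm_output!r}")
    return index
```

A last-digit fallback only reads one character, so a verbose reply ending in "12" would be read as 2; that is harmless with a handful of agents but worth keeping in mind if the agent list grows past nine.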

Inside any RAG agent, when route_collection=True, the query first passes through CollectionRouter.invoke, which returns the selected collections and the routing token cost. Retrieved chunks are deduplicated through deduplicate_results before being passed to the LLM for answer generation. The terminal log line ==== FINAL ANSWER==== is emitted via the colored progress logger defined in the utility module. Source: deepsearcher/agent/naive_rag.py:1-130, deepsearcher/utils/log.py:1-90
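
Deduplication of retrieved chunks can be sketched as an order-preserving filter. The criterion used here, identical chunk text, is an assumption made for illustration; the project's deduplicate_results may identify duplicates differently, and dedupe_by_text is a hypothetical name.

```python
from typing import Dict, List


def dedupe_by_text(results: List[Dict]) -> List[Dict]:
    """Keep the first occurrence of each chunk text, preserving order."""
    seen = set()
    unique = []
    for result in results:
        if result["text"] not in seen:
            seen.add(result["text"])
            unique.append(result)
    return unique


hits = [
    {"text": "alpha", "collection": "docs"},
    {"text": "beta", "collection": "docs"},
    {"text": "alpha", "collection": "wiki"},  # same chunk found again
]
```

Keeping the first occurrence means that a chunk found in several collections or iterations reaches the LLM only once.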

flowchart TD
    A[User Query] --> B[RAGRouter._route]
    B --> C{LLM selects agent}
    C -->|1| D[NaiveRAG]
    C -->|2| E[ChainOfRAG]
    C -->|3| F[DeepSearch]
    D --> G[CollectionRouter]
    E --> G
    F --> G
    G --> H[Vector DB retrieve]
    H --> I[Deduplicate chunks]
    I --> J[LLM summarize/reflect]
    J --> K[Final answer]

5. Configuration, Logging, and Extensibility

All agents take the same constructor triad: an LLM (BaseLLM), an embedding model (BaseEmbedding), and a vector database (BaseVectorDB). Optional flags such as top_k, max_iter, early_stopping, route_collection, and text_window_splitter let users tune retrieval depth versus cost. The describe_class decorator on each agent supplies a human-readable description that the router consumes, which is how a new agent becomes selectable automatically. Source: deepsearcher/agent/base.py:1-30, deepsearcher/agent/rag_router.py:30-60
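
How a description decorator can feed a router prompt is sketched below. Only the decorator name and the __description__ attribute come from the text above; the decorator body, the toy agent classes, and the menu format are assumptions for illustration.

```python
from typing import List


def describe_class(description: str):
    """Class decorator that stores a description on the decorated class."""
    def decorator(cls):
        cls.__description__ = description
        return cls
    return decorator


@describe_class("Good for concrete factual and multi-hop questions.")
class ToyFactAgent:
    pass


@describe_class("Good for writing a report on a broad topic.")
class ToyReportAgent:
    pass


def build_router_menu(agents: List[object]) -> str:
    """Hypothetical menu format: one numbered line per agent."""
    lines = [
        f"[{i}]: {type(agent).__name__}: {type(agent).__description__}"
        for i, agent in enumerate(agents, start=1)
    ]
    return "\n".join(lines)
```

Because the menu is built from the attribute rather than from a hard-coded list, decorating a new agent class is enough for it to appear among the choices.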

The logging layer splits output into a "dev" logger (gated by set_dev_mode) and a "progress" logger that drives colored console output via termcolor and a ColoredFormatter. The progress_logger is what users see during a query call, while dev_mode controls verbose diagnostic logs. Source: deepsearcher/utils/log.py:1-110

For evaluation, the project ships an evaluation/ directory that runs Recall@K against 2WikiMultiHopQA, comparing DeepSearcher against naive RAG. The --pre_num argument controls the sample count, and --skip_load reuses an already-loaded vector DB on subsequent runs. Source: evaluation/README.md:1-30
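
For readers unfamiliar with the metric, a common definition of Recall@K is the fraction of gold (relevant) documents that appear among the top K retrieved items. The sketch below implements that definition; the evaluation script's exact formula is not shown in this manual and may differ, and the function name is hypothetical.

```python
from typing import Iterable, Sequence


def recall_at_k(retrieved: Sequence[str], gold: Iterable[str], k: int) -> float:
    """Fraction of gold documents found in the first k retrieved items."""
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & gold_set) / len(gold_set)
```

For example, with retrieved documents d1, d2, d3 and gold documents d1 and d3, Recall@2 is 1/2 because only d1 is in the top two, while Recall@3 is 2/2.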

Installation & Quickstart

Related topics: Project Overview & System Architecture, Deployment, CLI & FastAPI Service

This page documents how to install DeepSearcher, configure it for first use, and run the minimum example end-to-end. It is written for a developer who wants to evaluate the project locally before customizing providers or loading their own corpus.

1. Purpose and Scope

DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines a configurable LLM, an embedding model, and a vector database to answer questions over private documents and (optionally) web sources. The "Installation & Quickstart" workflow is the supported on-ramp for new users: it installs the Python package, populates the environment variables required by the providers, and demonstrates a single-shot query against a vector index using the default NaiveRAG agent.

The community frequently hits the same three friction points during installation — missing optional dependencies on Windows, broken syntax in copy-pasted snippets, and the lack of an official container image — and those are covered in the Troubleshooting section below.

Source: README.md

2. Prerequisites

DeepSearcher is a Python project distributed as a package and a console script. Before installing, confirm the following:

Python runtime — A recent CPython (3.10+ recommended). Community reports indicate that running the CLI on Python 3.13 (deepsearcher.exe) has produced import-time tracebacks inside the bundled console script wrapper, so a 3.11–3.12 interpreter is the safer default until upstream adjusts the entry point. Source: deepsearcher/cli.py (referenced in issue #255)
Operating system — The default vector backend (milvus_lite) is officially supported on Ubuntu ≥ 20.04 and macOS ≥ 11.0. Windows users must switch to an alternative backend such as Milvus standalone or a hosted Milvus/Zilliz Cloud instance. Source: pyproject.toml, community discussion in issue #67
Provider credentials — At minimum an LLM API key (OpenAI-compatible) and, depending on the chosen embedding and vector DB, additional secrets. These are read from environment variables defined in env.example.

3. Installation

3.1 Install from PyPI (recommended for new users)

pip install deepsearcher

This installs the package, the dependencies declared in pyproject.toml, and the deepsearcher console script used by the CLI entry point. Source: pyproject.toml

3.2 Install from source

Use this when you intend to modify agents, prompts, or providers:

git clone https://github.com/zilliztech/deep-searcher.git
cd deep-searcher
pip install -e .

The -e editable install is also the recommended setup for contributors evaluating the ChainOfRAG and DeepSearch agents, since both define their prompt templates as module-level string constants that are easy to iterate on. Source: deepsearcher/agent/chain_of_rag.py, deepsearcher/agent/deep_search.py

3.3 Container image (community-requested)

There is no official Docker image at the time of writing; the request is tracked in issue #78. For now, a local container can be built by authoring a Dockerfile that mirrors the editable install above and mounting a populated .env file. Source: pyproject.toml, issue #78

4. Configuration and First Run

4.1 Environment variables

Copy env.example to .env at the repository root and fill in the secrets for the providers you intend to use. The file lists, among others, the LLM API key, embedding model identifier, and vector database connection string. Source: env.example
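
Before the first run it can help to confirm that the expected variables are actually set. The helper below is a hypothetical convenience, not part of the package; the variable names a given setup needs depend on the chosen providers and should be taken from env.example.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not environ.get(name, "").strip()]
```

Calling missing_env_vars(["OPENAI_API_KEY"]) at the top of a script, for instance, returns an empty list when the key is present and a list naming it when it is not.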

4.2 Minimal program (`main.py`)

The repository ships a runnable script that loads a configuration, ingests a small set of local files, and answers a single query:

from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()
# Customize your config here;
# for more options, see the Configuration Details section.

init_config(config=config)

# (load documents, then:)
result = query("Your question here")
print(result)

Fix: The version of this snippet previously circulated in the README was missing a closing parenthesis on line 7 (issue #80). The form shown above is the corrected one. Source: issue #80

4.3 CLI usage

The package exposes a deepsearcher console script whose entry point is deepsearcher.cli:main. Once the package is installed, you can invoke it directly:

deepsearcher --help

This is the path taken by users hitting the deepsearcher.exe traceback reported in issue #255. Source: deepsearcher/cli.py

4.4 `examples/basic_example.py`

For a fully self-contained walkthrough — including provider selection, file ingestion, and an end-to-end query — run the basic example:

python examples/basic_example.py

It instantiates the Configuration object, calls init_config, loads a few sample files into the default vector database, and prints the answer returned by the default RAG agent (NaiveRAG). Source: examples/basic_example.py, deepsearcher/agent/naive_rag.py

5. Troubleshooting

Symptom	Likely cause	Fix
`ModuleNotFoundError: No module named 'milvus_lite'` on Windows	`milvus_lite` is not published for Windows	Upgrade `pymilvus` and switch to a Milvus backend that runs off-host (standalone Docker, Zilliz Cloud) — issue #67
Traceback inside `deepsearcher.exe` on Python 3.13	Bundled console-script wrapper incompatibility	Use Python 3.11–3.12, or invoke the module directly (`python -m deepsearcher.cli`) — issue #255
`SyntaxError` from the README snippet	Unbalanced parenthesis on line 7 of the quickstart	Apply the corrected snippet shown in §4.2 — issue #80
Generic LLM hallucinations in answers	Small, non-reasoning LLM behind the API	Use a cutting-edge reasoning model (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.), as the maintainers recommend

6. End-to-End Flow

The diagram below summarizes the path from pip install to a printed answer.

flowchart LR
    A[pip install deepsearcher] --> B[Copy env.example to .env]
    B --> C[Fill provider credentials]
    C --> D[python examples/basic_example.py]
    D --> E[Configuration + init_config]
    E --> F[Load documents into vector DB]
    F --> G[NaiveRAG agent query]
    G --> H[Printed answer + retrieved chunks]

LLM Provider Configuration

Related topics: Embedding Model Configuration, Extensibility, Troubleshooting & FAQ

Overview

DeepSearcher is designed to be LLM-agnostic. Every retrieval agent — NaiveRAG, ChainOfRAG, DeepSearch, and RAGRouter — accepts a single llm: BaseLLM argument that is used for prompt answering, sub-query decomposition, reranking, and reflection. Source: deepsearcher/agent/naive_rag.py, deepsearcher/agent/chain_of_rag.py, deepsearcher/agent/deep_search.py.

The BaseLLM abstract class is defined in deepsearcher/llm/base.py. Source: deepsearcher/llm/base.py. Concrete providers live in sibling modules — openai_llm.py, deepseek.py, anthropic_llm.py, and ollama.py — each wrapping a different upstream SDK while exposing the same interface. Source: deepsearcher/llm/openai_llm.py, deepsearcher/llm/deepseek.py, deepsearcher/llm/anthropic_llm.py, deepsearcher/llm/ollama.py. Provider selection happens in deepsearcher/configuration.py, where a YAML-driven Configuration object is materialized into the actual BaseLLM instance. Source: deepsearcher/configuration.py.

The BaseLLM Contract

Every LLM provider must implement the methods consumed by the agents. Inspecting agent call sites reveals the full contract:

Method	Where used	Purpose
`chat(messages)`	All agents	Send a prompt, return an object with `.content` (string) and `.total_tokens` (int)
`literal_eval(text)`	`ChainOfRAG`, `DeepSearch`, `CollectionRouter`	Parse a Python literal (list of strings) from the LLM output
`remove_think(text)`	`ChainOfRAG`, `DeepSearch`, `RAGRouter`	Strip `<think>…` blocks emitted by reasoning models
`find_last_digit(text)`	`RAGRouter`	Fallback to extract the trailing digit when parsing the routing decision

Source: deepsearcher/agent/chain_of_rag.py (uses literal_eval to parse follow-up questions and remove_think before literal_eval), deepsearcher/agent/deep_search.py (SUB_QUERY_PROMPT and REFLECT_PROMPT expect a Python list of strings back), deepsearcher/agent/rag_router.py (falls back to find_last_digit when int(...) fails on a reasoning model's verbose reply).

The presence of remove_think and find_last_digit in the contract is a direct response to the maintainers' recommendation that users select a cutting-edge *reasoning* model — such as OpenAI o-series, DeepSeek R1, or Claude 3.7 Sonnet — because the prompts depend on structured output (lists, integers) and small LLMs are prone to hallucinations. Source: deepsearcher/agent/rag_router.py.
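
The two parsing helpers can be pictured with the standard library. This sketch is not the project's code: it assumes that reasoning text is wrapped in a <think> block closed by </think>, and parse_string_list is a hypothetical wrapper that combines the stripping step with ast.literal_eval.

```python
import ast
import re
from typing import List


def remove_think(text: str) -> str:
    """Drop <think>...</think> blocks (the closing tag is an assumption)."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def parse_string_list(llm_output: str) -> List[str]:
    """Extract a Python list of strings from a possibly verbose reply."""
    cleaned = remove_think(llm_output)
    match = re.search(r"\[.*\]", cleaned, flags=re.DOTALL)
    if match is None:
        raise ValueError("no list literal found in the reply")
    # literal_eval only evaluates literals, so arbitrary code is not executed.
    value = ast.literal_eval(match.group(0))
    if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
        raise ValueError("reply is not a list of strings")
    return value
```

Stripping the reasoning block first matters because the reasoning itself may contain brackets or digits that would otherwise be mistaken for the answer. Note that ast.literal_eval can also raise SyntaxError or ValueError on malformed text, which callers would need to handle.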

Built-in Providers

The repository ships four first-party provider modules. Each module is a thin adapter over a third-party SDK and is referenced by name from the configuration loader.

OpenAI / OpenAI-compatible — deepsearcher/llm/openai_llm.py covers the OpenAI Chat Completions API and any vendor exposing an OpenAI-compatible endpoint (Azure OpenAI, local vLLM serving OpenAI-format models). Source: deepsearcher/llm/openai_llm.py.
DeepSeek — deepsearcher/llm/deepseek.py adapts the DeepSeek API, which is OpenAI-compatible and recommended by the maintainers for reasoning-heavy workloads. Source: deepsearcher/llm/deepseek.py.
Anthropic — deepsearcher/llm/anthropic_llm.py adapts the Claude Messages API. Source: deepsearcher/llm/anthropic_llm.py.
Ollama (local) — deepsearcher/llm/ollama.py runs models locally through the Ollama daemon. Community reports note that Ollama throughput can be a bottleneck for embedding-heavy workloads. Source: deepsearcher/llm/ollama.py.

A user request in issue #254 to add BurnCloud as an additional LLM provider has been opened against the project, illustrating how third-party providers can be contributed by adding a new module that subclasses BaseLLM. Source: issue #254 — BurnCloud seeks to contribute enhancements. Issue #247 similarly asks for local deployment of Qwen3-Embedding and a Qwen3 LLM served via vLLM with an OpenAI-compatible interface — i.e. using the existing openai_llm.py adapter against a self-hosted endpoint. Source: issue #247 — local Qwen3 deployment.

Configuration Lifecycle

LLM selection is driven by the Configuration class. The canonical entry point is:

from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()
# Customize your config here;
# for more options, see the Configuration Details section.

init_config(config=config)

Source: deepsearcher/configuration.py and the Quickstart snippet reproduced in issue #80 (which originally contained an unbalanced parenthesis that has since been fixed).

Internally, the configuration object resolves a provider_name to a concrete class registered in deepsearcher/llm/__init__.py, instantiates it with credentials (api_key, base_url, model, etc.), and the resulting BaseLLM instance is passed by reference into every agent. Because all agents receive the *same* llm object, switching providers requires only a configuration change — no agent code edits. Source: deepsearcher/llm/__init__.py.
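
The name-to-class dispatch described above follows a common registry pattern, sketched here with toy classes. Nothing below is copied from deepsearcher/llm/__init__.py: the ToyLLM classes, the PROVIDERS mapping, and build_llm are hypothetical and only illustrate the idea.

```python
from typing import Dict


class ToyLLM:
    """Hypothetical stand-in for a BaseLLM implementation."""

    def __init__(self, model: str, **options):
        self.model = model
        self.options = options


class ToyOpenAI(ToyLLM):
    pass


class ToyOllama(ToyLLM):
    pass


# Registry: provider name from the configuration -> concrete class.
PROVIDERS: Dict[str, type] = {"OpenAI": ToyOpenAI, "Ollama": ToyOllama}


def build_llm(provider_name: str, **options) -> ToyLLM:
    """Instantiate the class registered under provider_name."""
    try:
        cls = PROVIDERS[provider_name]
    except KeyError:
        known = ", ".join(sorted(PROVIDERS))
        raise ValueError(f"unknown provider {provider_name!r}; known: {known}") from None
    return cls(**options)
```

With this shape, switching providers is a one-word change in the configuration, and an unknown name fails early with a message that lists the registered alternatives.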

flowchart LR
    YAML["config.yaml<br/>(provider, model, api_key)"] --> Cfg[Configuration]
    Cfg --> Resolver["deepsearcher/llm/__init__.py<br/>provider dispatch"]
    Resolver --> Provider["Concrete BaseLLM<br/>(OpenAI / DeepSeek /<br/>Anthropic / Ollama)"]
    Provider --> Agents["NaiveRAG / ChainOfRAG /<br/>DeepSearch / RAGRouter"]
    Agents -->|chat / literal_eval /<br/>remove_think| Provider

Common Failure Modes and Community Pitfalls

Small or non-reasoning LLMs produce malformed structured output. The RAGRouter prompt asks for a single integer; ChainOfRAG and DeepSearch ask for a Python list of strings. When the model returns prose, int(...) raises ValueError and the router falls back to find_last_digit. If even that fails, the agent throws — a symptom users see as "the LLM is hallucinating". Source: deepsearcher/agent/rag_router.py.
CLI bootstrap errors. Issue #255 reports a deepsearcher.exe invocation that crashes with an import-time traceback. The Installation & Quickstart troubleshooting table ties the report to Python 3.13; a provider SDK that is missing or misconfigured is another possible cause worth ruling out. Source: issue #255.
Ollama throughput. Issue #247 cites Ollama as too slow for embedding and asks for a vLLM-backed local server instead, served via the OpenAI-compatible provider. Source: issue #247.
Collection routing authorization gap. Issue #267 notes that CollectionRouter selects collections based on the query alone and ignores caller authorization context, so provider-side guardrails must be enforced elsewhere. Source: issue #267, deepsearcher/agent/collection_router.py.

Embedding Model Configuration

Related topics: LLM Provider Configuration, Vector Database & Data Loader Configuration

Overview

The embedding model is a foundational component of the DeepSearcher retrieval pipeline. Every query that is sent to a vector database must first be transformed into a dense vector, and every indexed chunk must be stored as one. DeepSearcher abstracts this concern behind a single interface — BaseEmbedding — so the same agent code can be re-targeted against different providers (cloud APIs, local runtimes, or on-disk models) without modifying retrieval logic.

All RAG agents in the project (NaiveRAG, DeepSearch, and ChainOfRAG) accept an embedding_model: BaseEmbedding instance in their constructor. At retrieval time they call embed_query(...) on it; the batch method, embed_documents(...), is used when chunks are ingested. The agents also read embedding_model.dimension and pass it to the CollectionRouter, so that only collections with a matching dimensionality are considered and any new collections created in the vector store can be initialized with the same vector size. Source: deepsearcher/agent/naive_rag.py:1-90, deepsearcher/agent/deep_search.py:1-120, deepsearcher/agent/chain_of_rag.py:1-140.

Community activity shows strong interest in expanding the set of supported embedding backends. For example, issue #247 asks whether a local Qwen3-Embedding model can be deployed in place of Ollama, and the most recent release, titled Milvus_default_embedding_model (GPTCache model), introduces a new default embedding backend. The configuration layer is the single place where these choices are made.

The `BaseEmbedding` Interface

DeepSearcher defines an abstract base class for embedding models under deepsearcher/embedding/base.py. The class exposes the contract that every concrete provider must implement:

Member	Purpose
`dimension` (property)	The fixed vector size produced by the model; used by `CollectionRouter` and by vector DB initialization.
`embed_query(text: str)`	Embeds a single query string (used at retrieval time).
`embed_documents(texts: List[str])`	Embeds a batch of chunk strings (used at ingestion time).
Optional `is_normalized` flag	Some models (e.g. BGE) ship pre-normalized vectors, which the agent code consumes when computing similarity.

Concrete subclasses are plug-and-play. The agent code in naive_rag.py and deep_search.py only ever references self.embedding_model.embed_query(...) and self.embedding_model.dimension — it never imports a specific provider — so swapping a backend is purely a configuration concern. Source: deepsearcher/agent/naive_rag.py:34-90, deepsearcher/agent/deep_search.py:30-80.
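
A custom backend only has to satisfy the members listed in the table. The sketch below shows the shape of such a subclass with a deterministic toy model; it is not deepsearcher/embedding/base.py, the default embed_documents implementation is an assumption, and HashEmbedding is a hypothetical class whose vectors carry no semantic meaning.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import List


class BaseEmbedding(ABC):
    """Sketch of the interface: a dimension plus two embed methods."""

    @property
    @abstractmethod
    def dimension(self) -> int:
        """Fixed size of every vector the model produces."""

    @abstractmethod
    def embed_query(self, text: str) -> List[float]:
        """Embed one query string."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Assumption: by default, embed each chunk like a query.
        return [self.embed_query(text) for text in texts]


class HashEmbedding(BaseEmbedding):
    """Toy model: derives a vector from a SHA-256 digest of the text."""

    def __init__(self, dimension: int = 8):
        if not 1 <= dimension <= 32:  # a SHA-256 digest has 32 bytes
            raise ValueError("dimension must be between 1 and 32")
        self._dimension = dimension

    @property
    def dimension(self) -> int:
        return self._dimension

    def embed_query(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [digest[i] / 255.0 for i in range(self._dimension)]
```

A real subclass would replace the hashing with a call to the chosen model, but the agent-facing surface (dimension, embed_query, embed_documents) would stay the same.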

Available Providers

DeepSearcher ships with multiple BaseEmbedding implementations, each living in its own module under deepsearcher/embedding/:

Milvus (GPTCache) default embedding — deepsearcher/embedding/milvus_embedding.py. The newest default; leverages GPTCache model bindings exposed by the Milvus ecosystem. Selected by the most recent release tagged Milvus_default_embedding_model(GPTCache model).
OpenAI-compatible embedding — deepsearcher/embedding/openai_embedding.py. Targets text-embedding-3-* and similar OpenAI models; configurable through standard OPENAI_API_KEY / OPENAI_BASE_URL environment variables.
Voyage AI embedding — deepsearcher/embedding/voyage_embedding.py. Targets Voyage's embedding endpoints (e.g. voyage-3).
FastEmbed (local) — deepsearcher/embedding/fastembed_embdding.py. Runs models such as BGE locally without a network round-trip. This is the most common choice for fully offline deployments and is the closest match to what issue #247 requests for Qwen3-Embedding.

flowchart LR
    A[User Query] --> B[Agent retrieve]
    B --> C[CollectionRouter]
    C --> D[BaseEmbedding.embed_query]
    D --> E[(Vector DB)]
    E --> F[Top-k Chunks]
    F --> G[BaseLLM.chat]
    G --> H[Final Answer]

The diagram above shows where the embedding model sits in the critical path. The query embedding is computed on every retrieval call, whichever agent is selected, so a misconfigured embedding backend silently degrades every retrieval strategy at once.

Configuration and Wiring

Embedding configuration is performed at the Configuration layer (deepsearcher.configuration) and is propagated into the agents during initialization. The snippet in issue #80 illustrates the intended pattern:

from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()

# Customize your config here;
# for more options, see the Configuration Details section.
init_config(config=config)

In practice, the configuration object exposes a provider (e.g. openai, milvus, voyage, fastembed), a model_name, and provider-specific fields such as api_key or base_url. At runtime, init_config constructs the matching BaseEmbedding subclass and threads it into every RAG agent — NaiveRAG, DeepSearch, and ChainOfRAG — which all hold a reference to the same instance. This means changing the embedding in Configuration is the only edit required to retarget the entire stack. Source: deepsearcher/agent/chain_of_rag.py:40-110.

The dimension reported by the configured BaseEmbedding is also used to size new Milvus collections. If a different embedding model is selected mid-project, collections built with the previous dimension no longer match and are left out of the candidate list instead of triggering a vector-size mismatch error, so the documents typically need to be loaded again into a collection with the new dimension.

Common Failure Modes

A few recurring issues in the community trace back to embedding configuration choices:

ModuleNotFoundError: No module named 'milvus_lite' (#67) — the default backend on Linux/macOS is the Milvus Lite path; Windows users must either install a matching pymilvus build or switch the embedding provider away from milvus.
Local Qwen3-Embedding / vLLM (#247) — FastEmbed is the supported local path today; users wanting Qwen3-Embedding locally currently need a custom BaseEmbedding subclass, because the project's FastEmbed module does not yet wrap Qwen3.
CLI import errors (#255) — when the embedding provider is misconfigured at install time, the CLI fails to import before the agent layer is even reached. Confirming that the provider selected in Configuration has its required dependency installed (pymilvus, openai, voyageai, fastembed) is the first debugging step.

Vector Database & Data Loader Configuration

Related topics: LLM Provider Configuration, Embedding Model Configuration, RAG Agent System & Retrieval Strategies

Overview

In DeepSearcher, the vector database and data loader are not configured inside the agent modules, but they are central dependencies that every agent receives at initialization time. The agent classes in deepsearcher/agent/naive_rag.py, deepsearcher/agent/deep_search.py, and deepsearcher/agent/chain_of_rag.py all accept a vector_db: BaseVectorDB instance and an embedding_model: BaseEmbedding instance as required constructor arguments. The configuration surface is therefore expressed in code as object construction, while the high-level Configuration object (referenced in the quickstart snippet from issue #80) is the user-facing entry point that wires those objects together.

Community feedback highlights the practical pain points of this configuration. Issue #67 reports a ModuleNotFoundError: No module named 'milvus_lite' when the local Milvus backend is selected, and the maintainers note that milvus_lite only supports Ubuntu >= 20.04 and macOS >= 11.0, which constrains how the default vector database can be configured on Windows. The title of the latest release, "Milvus_default_embedding_model(GPTCache model)", further suggests that the default embedding model and vector backend are shipped together. Source: evaluation/README.md.

Vector Database Contract

All agents rely on a shared BaseVectorDB abstraction imported from deepsearcher.vector_db.base. The contract that the agents depend on is:

Capability used by agents	Source location
`vector_db.search(...)` returning `List[RetrievalResult]`	`deepsearcher/agent/naive_rag.py:67-86`, `deepsearcher/agent/deep_search.py:170-200`
`vector_db.list_collections(dim=...)` returning collection metadata	`deepsearcher/agent/collection_router.py:55-62`
`vector_db.default_collection` used as a fallback target	`deepsearcher/agent/collection_router.py:88-94`
`embedding_model.dimension` passed as the `dim` argument for collection listing	`deepsearcher/agent/naive_rag.py:48`, `deepsearcher/agent/collection_router.py:42`
`deduplicate_results(...)` to merge results across iterations or collections	`deepsearcher/agent/deep_search.py:155`, `deepsearcher/agent/chain_of_rag.py:140`

Because the agent layer only ever talks to BaseVectorDB and BaseEmbedding, the concrete data loader and vector database (Milvus, Milvus Lite, Qdrant, Azure Search, Oracle, etc., as enumerated in deepsearcher/vector_db/__init__.py in the broader project) can be swapped by changing the wired instance without modifying agent code. This is the design pattern that makes "configuration" effectively a matter of choosing the right concrete class and credential set. Source: deepsearcher/agent/naive_rag.py:36-58.

Collection Routing and Data Placement

The CollectionRouter in deepsearcher/agent/collection_router.py is the bridge between the agent layer and the physical vector database collections where loaded data lives. At construction time, it enumerates all available collections using self.vector_db.list_collections(dim=dim), where dim is taken from embedding_model.dimension so that the router only sees collections whose schema matches the active embedding model. This is the de-facto configuration check for whether a piece of loaded data is even visible to the current pipeline. Source: deepsearcher/agent/collection_router.py:42-58.
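
The visibility check can be made concrete with a small sketch. It models the vector store as a mapping from collection name to vector dimension, which is an assumption for illustration; split_by_dimension is a hypothetical helper, not a method of BaseVectorDB.

```python
from typing import Dict, List, Tuple


def split_by_dimension(
    collection_dims: Dict[str, int], dim: int
) -> Tuple[List[str], List[str]]:
    """Split collection names into (visible, hidden) for the active model."""
    visible = [name for name, d in collection_dims.items() if d == dim]
    hidden = [name for name, d in collection_dims.items() if d != dim]
    return visible, hidden


stored = {"docs_768": 768, "docs_1536": 1536}
```

Reporting the hidden list alongside the visible one gives an operator a way to notice that previously loaded data has dropped out of view after an embedding model change.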

At query time, the router calls the LLM with COLLECTION_ROUTE_PROMPT, asking it to pick a Python list of collection names from the candidate list. Two rules then extend that selection: any collection with an empty description is always added (the query itself is used as the search query), and the default_collection is always appended. The final list is deduplicated before being passed to the per-collection search loop. This explains why data loaders should populate the description field on collections: an empty description forces the collection to be searched for every query, which is the correct behavior for a default "catch-all" collection but a footgun for named, domain-specific collections. Source: deepsearcher/agent/collection_router.py:78-100.
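
The two post-selection rules can be sketched as a pure function. This is an illustration, not the code in collection_router.py: the function name, the dict of descriptions, and the step that discards names the LLM invented are assumptions.

```python
from typing import Dict, List


def finalize_collections(
    llm_selected: List[str],
    descriptions: Dict[str, str],
    default_collection: str,
) -> List[str]:
    """Apply the empty-description and default-collection rules, then dedupe."""
    # Assumption: ignore names the LLM returned that are not real collections.
    selected = [name for name in llm_selected if name in descriptions]
    # Rule 1: a collection without a description is always searched.
    selected += [name for name, text in descriptions.items() if not text]
    # Rule 2: the default collection is always appended.
    if default_collection:
        selected.append(default_collection)
    # Order-preserving deduplication.
    return list(dict.fromkeys(selected))


known = {"finance": "Quarterly reports", "misc": "", "hr": "HR policies"}
```

In this example an LLM choice of finance expands to finance, misc, and the default collection, which shows how an undescribed collection ends up in every search.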

A known limitation surfaces in issue #267, "Collection routing ignores caller authorization context": because the router relies solely on the LLM to pick collections, it has no way to enforce per-caller access control. Any user with query access effectively gets to search every collection the router knows about.

Agent-Level Configuration Knobs

The agent constructors expose the configuration levers that most directly affect vector DB behavior:

top_k (NaiveRAG, default 10) — number of chunks fetched per collection per query. Source: deepsearcher/agent/naive_rag.py:40-58.
max_iter (DeepSearch default 3, ChainOfRAG default 4) — caps the reflection/re-query loop that drives additional vector searches. Source: deepsearcher/agent/deep_search.py:36-58, deepsearcher/agent/chain_of_rag.py:46-70.
route_collection (default True on all three agents) — toggles the CollectionRouter. When False, the agent searches collection_router.all_collections directly with zero routing tokens. Source: deepsearcher/agent/naive_rag.py:70-78.
text_window_splitter (default True) — when enabled, the summarization step reads the wider_text metadata field produced by the splitter instead of the raw chunk text, which materially changes the context the LLM sees at answer time. Source: deepsearcher/agent/naive_rag.py:96-110.
early_stopping (ChainOfRAG, default False) — uses a reflection prompt to break the loop as soon as the intermediate context is judged sufficient, reducing redundant vector searches. Source: deepsearcher/agent/chain_of_rag.py:46-70.

The RAGRouter in deepsearcher/agent/rag_router.py sits one level above these knobs: it uses the __description__ attribute (registered via the @describe_class decorator in deepsearcher/agent/base.py) to pick which agent should handle a given query, which is the recommended way to expose the configuration trade-offs to end users without forcing them to choose an agent manually. Source: deepsearcher/agent/base.py:8-26.

Data Flow and Common Failure Modes

flowchart LR
    A[Load documents] --> B[Embedding model]
    B --> C[Vector DB collections<br/>with dim from embedding_model]
    Q[User query] --> R[CollectionRouter<br/>uses LLM + dim]
    R --> S[Search selected collections]
    S --> T[deduplicate_results]
    T --> U[Agent summarization<br/>uses wider_text if available]
    U --> A2[Final answer]

Two failure modes recur in community reports and are visible from the code:

Embedding/collection dimension mismatch. The router passes embedding_model.dimension to list_collections(dim=dim). If a previously loaded collection was built with a different embedding model (for example, switching from the GPTCache default mentioned in the latest release to Qwen3-Embedding as requested in issue #247), the dimension check will silently drop that collection from the candidate list, and the agent will return no results without an explicit error.
Missing backend dependency on Windows. Per issue #67, the milvus_lite extension required by the default local Milvus configuration is unavailable on Windows. The community-validated workaround is to upgrade pymilvus to a version that bundles the right native libraries, or to switch to a different BaseVectorDB implementation such as Qdrant or Azure Search.
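One way to make the first failure mode visible is a pre-flight check that fails loudly when no collection matches the active embedding dimension. The sketch below is an assumption-laden illustration: it models collections as a plain `name -> dimension` mapping, and both function names are hypothetical rather than part of DeepSearcher.

```python
def visible_collections(collections, embedding_dim):
    """Return names of collections whose vector dimension matches the model."""
    return [name for name, dim in collections.items() if dim == embedding_dim]


def require_visible_collections(collections, embedding_dim):
    """Fail fast instead of silently searching nothing."""
    matches = visible_collections(collections, embedding_dim)
    if not matches:
        raise ValueError(
            f"No collection has dimension {embedding_dim}; "
            f"existing dimensions: {sorted(set(collections.values()))}"
        )
    return matches
```

Running a check like this right after switching embedding models turns an empty result set into an actionable error message.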

RAG Agent System & Retrieval Strategies

Related topics: Project Overview & System Architecture, Vector Database & Data Loader Configuration

The deepsearcher.agent package is the orchestration layer that turns a natural-language question into a verified answer grounded in the user's private vector store (and, optionally, the public web). It defines a small hierarchy of abstract base classes and ships three concrete strategies that share the same inputs (an LLM, an embedding model, and a vector database) but apply very different retrieval and reasoning loops. Source: deepsearcher/agent/__init__.py:1-12.

Agent Class Hierarchy

At the root is BaseAgent, an abstract class with a single invoke(query, **kwargs) entry point. RAGAgent extends it and adds two structured methods: retrieve() returning (results, tokens, metadata), and query() returning (answer, results, tokens). All concrete RAG agents inherit from RAGAgent. Source: deepsearcher/agent/base.py:36-90.

A describe_class() decorator injects a __description__ string on each agent class. This description is later read by RAGRouter to decide which agent should handle a query, so every shipped agent ships with a curated one-liner explaining its strength. Source: deepsearcher/agent/base.py:18-34.
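A decorator of this kind can be very small. The following is a minimal sketch written from the description above, not copied from deepsearcher/agent/base.py; `ExampleAgent` and `build_router_prompt` are hypothetical, and the numbered prompt format is an assumption.

```python
def describe_class(description):
    """Class decorator that attaches a human-readable description."""
    def decorator(cls):
        cls.__description__ = description
        return cls
    return decorator


@describe_class("Answers simple lookups with a single retrieval pass.")
class ExampleAgent:
    """Hypothetical agent used only to demonstrate the decorator."""


def build_router_prompt(agent_classes):
    """Number each agent description the way a router prompt might list them."""
    return "\n".join(
        f"[{i + 1}]: {cls.__description__}" for i, cls in enumerate(agent_classes)
    )
```

Because the description lives on the class, a router can build its prompt from a list of agents without any separate registry.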

The Three Retrieval Strategies

NaiveRAG — Single-Pass Retrieval

NaiveRAG is the simplest implementation. It embeds the query, optionally uses CollectionRouter to narrow the search to relevant collections, pulls top_k chunks from vector_db, formats them inside <chunk_i>...</chunk_i> tags, and asks the LLM to write the final answer using SUMMARY_PROMPT. Source: deepsearcher/agent/naive_rag.py:28-110.

It supports two flags: route_collection (on by default, as on the other agents; set it to False to search every collection without routing) and text_window_splitter, which prefers the wider_text metadata field of each chunk so the LLM sees surrounding context instead of an isolated fragment. Source: deepsearcher/agent/naive_rag.py:18-110.
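The chunk formatting step can be illustrated as follows. This is a sketch under stated assumptions: `Chunk` is a hypothetical stand-in for the project's result type, the zero-based tag index is a guess, and the exact whitespace inside the tags may differ from the real prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk."""
    text: str
    metadata: dict = field(default_factory=dict)


def format_chunks(chunks, text_window_splitter=True):
    """Wrap each chunk in numbered tags, preferring wider_text when enabled."""
    parts = []
    for i, chunk in enumerate(chunks):
        # Prefer the wider window when the splitter produced one.
        if text_window_splitter and "wider_text" in chunk.metadata:
            body = chunk.metadata["wider_text"]
        else:
            body = chunk.text
        parts.append(f"<chunk_{i}>\n{body}\n</chunk_{i}>")
    return "\n".join(parts)
```

Toggling `text_window_splitter` changes only which string is placed inside the tags, which is why the flag affects answer quality without changing what is retrieved.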

ChainOfRAG — Iterative Sub-Query Decomposition

ChainOfRAG targets concrete, factual, multi-hop questions. At each iteration it (1) prompts the LLM for a single follow-up sub-query, (2) retrieves chunks for that sub-query, (3) generates an intermediate answer using only retrieved documents, (4) asks the LLM to pick the supporting docs via GET_SUPPORTED_DOCS_PROMPT, and (5) calls REFLECTION_PROMPT to decide whether to stop early or iterate again. Source: deepsearcher/agent/chain_of_rag.py:30-180.

Key configuration: max_iter (default 4), early_stopping (default False), route_collection (default True). When early_stopping=True, the loop terminates as soon as the reflection prompt returns "Yes", saving both tokens and latency. The final answer is composed from the union of deduplicated chunks plus all intermediate answer contexts. Source: deepsearcher/agent/chain_of_rag.py:120-200.
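The control flow of the loop, including the effect of early_stopping, can be sketched with caller-supplied stubs. Everything here is illustrative: the function and parameter names are hypothetical, the LLM and vector-DB calls are replaced by plain callables, and the supporting-document selection step is omitted for brevity.

```python
def chain_of_rag(query, next_sub_query, retrieve, answer, has_enough_info,
                 max_iter=4, early_stopping=False):
    """Run the follow-up loop; every callable is a caller-supplied stub."""
    history = []      # (sub_query, intermediate_answer) pairs
    all_chunks = []
    for _ in range(max_iter):
        sub_query = next_sub_query(query, history)
        chunks = retrieve(sub_query)
        history.append((sub_query, answer(sub_query, chunks)))
        all_chunks.extend(chunks)
        # With early stopping enabled, a positive reflection ends the loop.
        if early_stopping and has_enough_info(query, history):
            break
    # Order-preserving deduplication of everything retrieved.
    return list(dict.fromkeys(all_chunks)), history
```

With early_stopping left at False the loop always runs max_iter rounds, so enabling it trades one extra reflection call per round for potentially fewer retrieval rounds.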

DeepSearch — Reflective Multi-Iteration Search

DeepSearch is designed for "write me a report / survey / article" prompts. Its async_retrieve() first calls _generate_sub_queries() (up to 4 sub-questions, or the original query alone if already simple), then executes all vector searches in parallel with asyncio.gather. Source: deepsearcher/agent/deep_search.py:130-180.
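The fan-out pattern with asyncio.gather looks roughly like this. It is a minimal sketch: `search_one` is a hypothetical placeholder for an asynchronous vector search, not a DeepSearcher function.

```python
import asyncio


async def search_one(sub_query):
    """Hypothetical stand-in for one asynchronous vector search."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return [f"chunk for {sub_query}"]


async def search_all(sub_queries):
    """Launch every search concurrently and flatten the results in order."""
    per_query = await asyncio.gather(*(search_one(q) for q in sub_queries))
    return [chunk for chunks in per_query for chunk in chunks]
```

asyncio.gather returns results in the order the awaitables were passed, so the flattened list stays aligned with the sub-query order even though the searches overlap in time.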

Two LLM-driven filters shape the final corpus:

RERANK_PROMPT — accepts or rejects each individual chunk against the active sub-queries.
REFLECT_PROMPT — at the end of each iteration, asks the LLM whether further research is needed and, if so, returns up to 3 "gap queries" that feed the next iteration.

The loop runs for max_iter rounds (default 3), then a final SUMMARY_PROMPT consolidates everything. Note that DeepSearch always instantiates a CollectionRouter regardless of the route_collection flag in its constructor. Source: deepsearcher/agent/deep_search.py:30-60; deepsearcher/agent/deep_search.py:90-130.
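The interaction between the rerank filter and the reflection step can be sketched as a loop over stubbed callables. This is an illustration of the described flow under assumptions, not the code in deep_search.py: all names are hypothetical, and an empty list from `reflect` is taken to mean that no further research is needed.

```python
def deep_search_loop(sub_queries, search, keep_chunk, reflect, max_iter=3):
    """Iterate search -> rerank -> reflect; all callables are stubs."""
    kept = []
    queries = list(sub_queries)
    for _ in range(max_iter):
        for q in queries:
            # Rerank step: accept or reject each chunk individually.
            kept.extend(c for c in search(q) if keep_chunk(q, c))
        # Reflection step: at most 3 gap queries feed the next round.
        queries = reflect(kept)[:3]
        if not queries:
            break
    return list(dict.fromkeys(kept))
```

The deduplicated `kept` list is what a final summarization prompt would consume.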

Query and Collection Routing

RAGRouter selects one agent out of a user-supplied list. It builds a numbered prompt of agent descriptions and parses the LLM's chosen index, with a defensive fallback that scans for the last digit if a reasoning model wraps the answer in prose. Source: deepsearcher/agent/rag_router.py:15-75.
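The parsing fallback can be sketched as below. `parse_agent_index` is a hypothetical helper written from the description above; the one-based numbering of agents in the prompt and the range check are assumptions.

```python
def parse_agent_index(response, num_agents):
    """Return a zero-based agent index parsed from an LLM reply."""
    text = response.strip()
    if text.isdecimal():
        choice = int(text)
    else:
        # Fallback: take the last decimal digit found in the prose.
        digits = [ch for ch in text if ch.isdecimal()]
        if not digits:
            raise ValueError("no digit found in router response")
        choice = int(digits[-1])
    if not 1 <= choice <= num_agents:
        raise ValueError(f"agent index {choice} out of range")
    return choice - 1
```

A last-digit fallback is deliberately crude: it tolerates replies such as "I choose agent 3." but can misfire if the model mentions other numbers after its choice.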

Inside each agent, CollectionRouter performs a second, finer-grained routing step: choosing which vector-DB collections to search. It is constructed from (llm, vector_db, dim); when routing is disabled, the agent skips the LLM call and searches collection_router.all_collections directly. Source: deepsearcher/agent/naive_rag.py:50-80; deepsearcher/agent/chain_of_rag.py:30-60.

Data Flow

flowchart TD
    Q[User Query] --> RR[RAGRouter]
    RR -->|picks agent| NA[NaiveRAG]
    RR -->|picks agent| CR[ChainOfRAG]
    RR -->|picks agent| DS[DeepSearch]
    NA --> CR2[CollectionRouter]
    CR --> CR2
    DS --> CR2
    CR2 --> VDB[(Vector DB)]
    VDB --> CR2
    CR2 -->|top_k chunks| NA
    CR2 -->|top_k chunks| CR
    CR2 -->|top_k chunks| DS
    NA --> LLM[LLM Summary]
    CR -->|loop: follow-up + reflect| LLM
    DS -->|loop: reflect + gap queries| LLM
    LLM --> A[Final Answer + Citations]

Failure Modes and Community Notes

Routing & authorization. Community issue #267 reports that CollectionRouter ignores the caller's authorization context, so any user-visible collection may be searched even when it should be filtered. Treat collection names as non-sensitive labels until ACLs are layered on top. Source: deepsearcher/agent/collection_router.py; community context #267.
Reasoning-model prompts. The agents call llm.remove_think() before parsing integers and indices because reasoning models (o-series, DeepSeek R1, Claude 3.7 Sonnet) wrap answers in <think>...</think> tags. The RAGRouter falls back to find_last_digit when this still fails. Source: deepsearcher/agent/rag_router.py:55-70; deepsearcher/agent/deep_search.py:170-200.
Web vs. private data. DeepSearcher is intentionally a private-data-first RAG; web search is a complementary add-on. The current DeepSearch code reserves a search_res_from_internet = [] slot for a future web backend, which is the entry point relevant to feature request #270 for adding serpbase.dev. Source: deepsearcher/agent/deep_search.py:150-160; community context #270.
LLM quality matters. Small LLMs struggle with literal_eval, list-format responses, and the YES/NO reranker, producing hallucinations. The maintainers explicitly recommend reasoning-grade models.
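Stripping a reasoning model's thinking block before parsing can be done with a short regular expression. The sketch below is a hypothetical equivalent written for illustration; it is not the body of the project's remove_think helper.

```python
import re

# Non-greedy match so several separate think blocks are removed individually.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(response):
    """Drop <think>...</think> blocks and surrounding whitespace."""
    return _THINK_BLOCK.sub("", response).strip()
```

After this step the remaining text is usually short enough for strict parsers (integer conversion, list literals) to succeed.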

Deployment, CLI & FastAPI Service

Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies

DeepSearcher ships as a Python package that exposes a command-line entry point, a programmatic main.py driver, a reproducible Makefile workflow, and an opt-in container image via the project Dockerfile. This page documents the deployment surface area, how the pieces fit together, and the failure modes reported by the community.

1. Entry Points and High-Level Topology

The repository exposes the system through three complementary surfaces: a generated console script, a Python module entry, and an evaluation harness.

The console script deepsearcher is generated by the package's [project.scripts] (or console_scripts) metadata. When installed in a Python environment it becomes an executable that delegates to deepsearcher.cli:main. This is visible in community stack traces where the invocation resolves to C:\...\Scripts\deepsearcher.exe\__main__.py and immediately runs from deepsearcher.cli import main (community trace from issue #255).
main.py at the repository root is the canonical "load → query" driver used in the Quickstart guide. It imports Configuration and init_config from deepsearcher.configuration, then drives deepsearcher.online_query.query against the configured provider stack.
evaluation/evaluate.py is a separate entry point dedicated to batch benchmarking; it does not share runtime state with the CLI.

flowchart LR
    A[User Shell] -->|deepsearcher| B[deepsearcher/cli.py]
    A -->|python main.py| C[main.py driver]
    A -->|python evaluate.py| D[evaluation/evaluate.py]
    B --> E[Configuration + init_config]
    C --> E
    D --> E
    E --> F[Vector DB + LLM + Embedding]
    F --> G[Agents: NaiveRAG / ChainOfRAG / DeepSearch]
    G --> H[Answer + RetrievalResults]

Source: deepsearcher/cli.py, main.py, evaluation/evaluate.py

2. CLI Surface (`deepsearcher.cli`)

The CLI module is the most fragile of the entry points because it is what end users run after pip install. The traceback reproduced in issue #255 confirms three facts that operators must understand:

The console-script wrapper lives under the active Python environment's Scripts/ directory and re-exports deepsearcher.cli.main.
Importing deepsearcher.cli eagerly pulls in the full agent, vector-DB, and LLM dependency tree. A failure anywhere in that tree surfaces as a ModuleNotFoundError from the CLI before any user code runs.
On Windows with Python 3.13, the deepsearcher.exe shim and the underlying imports must both succeed; community reports show partial Windows breakage driven by milvus_lite (issue #67 — milvus_lite officially supports Ubuntu ≥ 20.04 and macOS ≥ 11.0).

Operators deploying the CLI on Windows should therefore pin to a Python version supported by pymilvus/milvus_lite, or run the CLI inside WSL/Linux where the native library resolves correctly.

Source: deepsearcher/cli.py, community reports in issues #255 and #67.

3. Programmatic Driver and FastAPI-Style Service

main.py is the reference deployment pattern: instantiate Configuration, mutate it for the target LLM / embedding / vector-DB providers, then call init_config(config) before issuing query(...) calls. The driver composes the agents defined in deepsearcher/agent/:

NaiveRAG — single-shot vector retrieval + summarization (see SUMMARY_PROMPT flow in deepsearcher/agent/naive_rag.py).
ChainOfRAG — iterative sub-query decomposition with reflection (REFLECTION_PROMPT, GET_SUPPORTED_DOCS_PROMPT) and optional early_stopping, see deepsearcher/agent/chain_of_rag.py.
DeepSearch — multi-aspect sub-query generation + gap-driven reflection (REFLECT_PROMPT), see deepsearcher/agent/deep_search.py.

These three classes share the RAGAgent contract defined in deepsearcher/agent/base.py (retrieve(...) returns (List[RetrievalResult], int, dict), query(...) returns (str, List[RetrievalResult], int)). Any HTTP layer wrapping DeepSearcher (FastAPI, Starlette, etc.) only needs to forward a string query into query(...) and serialize the returned tuple; no special protocol is required because the return shape is stable across agents.
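Serializing the returned tuple for an HTTP response might look like the following. This is a framework-agnostic sketch using only the standard library: `Result` is a hypothetical stand-in for RetrievalResult, and the JSON field names are assumptions rather than an established API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Result:
    """Hypothetical stand-in for a retrieval result."""
    text: str
    reference: str
    metadata: dict = field(default_factory=dict)


def to_json_payload(query_output):
    """Turn an (answer, results, tokens) tuple into a JSON string."""
    answer, results, tokens = query_output
    return json.dumps({
        "answer": answer,
        "results": [asdict(r) for r in results],
        "consumed_tokens": tokens,
    })
```

A web handler would call the agent's query(...) and return this payload as the response body.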

Source: main.py, deepsearcher/agent/base.py, deepsearcher/agent/naive_rag.py, deepsearcher/agent/chain_of_rag.py.

4. Containerization, Make Targets, and Evaluation Harness

The repository ships a Dockerfile and Makefile to make local and CI deployments reproducible. Issue #78 ("Build up an OFFICIAL Docker image please") is the most up-voted deployment request, indicating that the in-repo Dockerfile is currently the supported path rather than a published registry image.

The Makefile typically wraps the most common operator tasks (install, lint, test, run-evaluate). Combined with evaluation/evaluate.py, the supported evaluation flow documented in evaluation/README.md is:

python evaluate.py \
  --dataset 2wikimultihopqa \
  --config_yaml ./eval_config.yaml \
  --pre_num 5 \
  --output_dir ./eval_output

Key flags and behaviors per the README:

Flag	Purpose
`--dataset`	Selects the QA dataset (currently `2wikimultihopqa`).
`--config_yaml`	Path to a YAML file specifying LLM, embedding, and provider parameters.
`--pre_num`	Number of samples to evaluate; higher = more accurate but more token cost.
`--skip_load`	Reuse a previously loaded vector DB instead of re-ingesting.
`--output_dir`	Destination for recall plots (e.g. `plot_results/max_iter_vs_recall.png`).

The evaluation uses Recall@K: the percentage of relevant documents appearing in the top-K retrieved results. The README notes diminishing returns as max_iter increases — most models gain steeply between 2–4 iterations and plateau afterwards, with Claude-3-7-sonnet approaching near-perfect recall at 7 iterations on the 50-sample preview.
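For reference, Recall@K as defined above can be computed in a few lines. This is a generic sketch of the metric, not the code in evaluation/evaluate.py; the function name and the use of document identifiers as strings are assumptions.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("relevant must not be empty")
    hits = relevant_set.intersection(retrieved[:k])
    return len(hits) / len(relevant_set)
```

Because the denominator is the number of relevant documents, the score can only stay flat or rise as K grows.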

Source: Dockerfile, Makefile, evaluation/evaluate.py, evaluation/README.md.

5. Common Deployment Failure Modes

The community context surfaces four recurring deployment problems that operators should pre-empt:

Windows + Python 3.13 CLI breakage — issues #255 and #67. The console-script shim crashes before any user logic executes because milvus_lite lacks Windows wheels. Mitigation: run on Linux/macOS or via the supplied Dockerfile.
Quickstart syntax bug — issue #80 reports an unbalanced parenthesis in the published from deepsearcher.configuration import Configuration, init_config snippet. Always copy the snippet directly from the repo rather than cached docs.
No official Docker image — issue #78. Until an official image is published, build locally from the Dockerfile.
Local LLM/embedding deployment — issue #247 shows that users wanting Qwen3-Embedding or Qwen3 LLM locally must use vllm>=0.8.5 rather than Ollama for acceptable throughput, and must wire the provider through Configuration + init_config in main.py.

Source: community issues #255, #67, #80, #78, #247.

Extensibility, Troubleshooting & FAQ

Related topics: LLM Provider Configuration, Embedding Model Configuration, Vector Database & Data Loader Configuration

Overview

DeepSearcher is designed with extensibility as a first-class concern. Every core capability — large language models, embeddings, vector databases, retrieval agents, and web search backends — is abstracted behind a base class with a minimal interface. This lets contributors plug in new providers without forking the framework. At the same time, a predictable extension surface means troubleshooting stays tractable: most failures trace back to a misconfigured provider, a missing native dependency, or an LLM that is too weak for the prompt-following the framework expects.

This page consolidates how to extend the system, how to diagnose the most common runtime errors reported by the community, and answers to frequently asked questions drawn from the issue tracker.

Extension Points

Adding a New Agent

All retrieval agents inherit from RAGAgent, which in turn extends BaseAgent defined in deepsearcher/agent/base.py. The contract is intentionally small: implement retrieve(query, **kwargs) -> (List[RetrievalResult], int, dict) and query(query, **kwargs) -> (str, List[RetrievalResult], int).
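The shape of that contract can be illustrated with stand-in base classes. The sketch below does not import DeepSearcher: `BaseAgentSketch`, `RAGAgentSketch`, and `KeywordAgent` are hypothetical, plain strings replace RetrievalResult, and having invoke delegate to query is an assumption made for the example.

```python
from abc import ABC, abstractmethod


class BaseAgentSketch(ABC):
    @abstractmethod
    def invoke(self, query, **kwargs):
        """Single entry point shared by all agents."""


class RAGAgentSketch(BaseAgentSketch):
    @abstractmethod
    def retrieve(self, query, **kwargs):
        """Return (results, consumed_tokens, metadata)."""

    @abstractmethod
    def query(self, query, **kwargs):
        """Return (answer, results, consumed_tokens)."""

    def invoke(self, query, **kwargs):
        return self.query(query, **kwargs)


class KeywordAgent(RAGAgentSketch):
    """Toy agent: keyword match instead of vector search, no LLM calls."""

    def __init__(self, documents):
        self.documents = documents

    def retrieve(self, query, **kwargs):
        words = query.lower().split()
        results = [d for d in self.documents if any(w in d.lower() for w in words)]
        return results, 0, {}

    def query(self, query, **kwargs):
        results, tokens, _ = self.retrieve(query, **kwargs)
        answer = results[0] if results else "No matching document."
        return answer, results, tokens
```

A real agent would replace the keyword match with embedding plus vector-DB search and report the tokens its LLM calls consumed.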

To make an agent discoverable by the query router, decorate the class with @describe_class("…"). The decorator stores the description on cls.__description__, which RAGRouter reads at construction time when no explicit agent_descriptions list is supplied (deepsearcher/agent/rag_router.py).

Reference implementations to study:

NaiveRAG — single-pass retrieve + summarize (deepsearcher/agent/naive_rag.py).
ChainOfRAG — iterative follow-up queries with early stopping, suitable for multi-hop factual questions (deepsearcher/agent/chain_of_rag.py).
DeepSearch — sub-query decomposition, LLM-based reranking, and reflection to fill gaps (deepsearcher/agent/deep_search.py).

Adding Providers (LLM, Embedding, Vector DB, Web Search)

The same pattern repeats across modules: subclass the Base… abstract class, implement the required methods, then register the provider in the configuration layer so Configuration() can resolve it by name. Each provider module follows this shape (constructor takes a config object, methods return typed dataclasses such as RetrievalResult).

Community discussions around new providers frequently reference this pattern, including requests to add additional web search backends such as serpbase.dev (issue #270) and local embedding/LLM stacks like Qwen3-Embedding served via vllm or an OpenAI-compatible endpoint (issue #247). Both are accommodated by the existing base classes without changes to core logic.

Common Errors and Troubleshooting

The error categories below are the ones most frequently reported by users in the issue tracker.

1. `ModuleNotFoundError: No module named 'milvus_lite'`

milvus_lite is the default embedded vector database backend. Its prebuilt wheels only ship for Ubuntu ≥ 20.04 and macOS ≥ 11.0, which is why Windows users hit this error (issue #67). Remedies:

Upgrade to a recent pymilvus version that bundles a compatible wheel.
Switch to an alternative vector DB backend that runs on Windows (e.g., a remote Milvus/Zilliz Cloud instance) by changing the vector_db block of your configuration.

2. Quickstart `SyntaxError` from an Unbalanced Parenthesis

An older quickstart snippet shipped with a missing closing bracket, producing an immediate SyntaxError on import (issue #80). Always copy from the latest README or docs site; the snippet should now read:

from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()
init_config(config=config)

3. `deepsearcher.exe` Traceback on Windows

A launcher traceback at process start (e.g., from deepsearcher.cli import main failing inside Scripts/deepsearcher.exe/__main__.py) usually means a partial or broken install (issue #255). Recommended fix:

pip uninstall deepsearcher
pip install --upgrade deepsearcher

If the launcher still fails, run the library directly with python -m deepsearcher.cli to surface the real error.

4. Collection Routing Ignoring Authorization

CollectionRouter selects collections based on the query alone; it does not receive the caller's authorization context (issue #267). In multi-tenant deployments you must pre-filter collections in your application layer before constructing the agent, or extend CollectionRouter to accept an auth context.
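An application-layer pre-filter could be as simple as the sketch below. It is a hypothetical illustration: the ACL is modeled as a mapping from collection name to allowed groups, and nothing here is part of DeepSearcher's API.

```python
def allowed_collections(all_collections, caller_groups, acl):
    """Keep only collections whose ACL shares a group with the caller."""
    groups = set(caller_groups)
    return [
        name for name in all_collections
        if groups & set(acl.get(name, ()))  # unlisted collections are denied
    ]
```

Denying collections that have no ACL entry is the safer default in a multi-tenant setting, since a newly loaded collection stays invisible until someone grants access explicitly.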

5. Weak LLM Producing Hallucinated Routing

RAGRouter parses the agent index from the LLM output and falls back to "last digit" parsing when a reasoning model emits prose (deepsearcher/agent/rag_router.py). Smaller or non-reasoning LLMs frequently fail this step. The maintainers' guidance, mirrored in the issue template, is to use a frontier or reasoning model (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.) for both routing and generation.

FAQ

Q: Which agent should I use? NaiveRAG is the cheapest and works well for single-fact lookups. DeepSearch is the most thorough and is the default for general topic/report-style questions. ChainOfRAG strikes a middle ground for multi-hop factual queries that benefit from iterative refinement but do not need full sub-query decomposition (deepsearcher/agent/chain_of_rag.py).

Q: Can I run DeepSearcher fully offline? Yes, provided the LLM and embedding model are exposed via an OpenAI-compatible endpoint (e.g., vllm serving Qwen3-Embedding — issue #247). Point the LLM and embedding provider configurations at that endpoint.

Q: How is the evaluation suite run? The evaluation harness in evaluation/README.md supports the 2WikiMultiHopQA dataset out of the box and reports Recall@K for DeepSearcher versus a naive RAG baseline:

python evaluate.py \
  --dataset 2wikimultihopqa \
  --config_yaml ./eval_config.yaml \
  --pre_num 5 \
  --output_dir ./eval_output

Re-running after the first load can be accelerated with --skip_load.

Q: Where do logs go? The logger in deepsearcher/utils/log.py writes colored progress output via color_print and dev-level diagnostics via dev_logger. critical() raises RuntimeError, so it should only be used for fatal paths.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 12 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/255

2. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/270

3. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/67

4. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/zilliztech/deep-searcher

5. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/zilliztech/deep-searcher

6. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/zilliztech/deep-searcher

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/zilliztech/deep-searcher

9. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/254

10. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/267

11. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

12. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using deep-searcher with real data or production workflows.

Feature Request: Add serpbase.dev as a web search source for reliable Go - github / github_issue
Collection routing ignores caller authorization context - github / github_issue
run issue - github / github_issue
ModuleNotFoundError: No module named 'milvus_lite' - github / github_issue
BurnCloud seeks to contribute enhancements - Permission to submit PR - github / github_issue
Can it support local deployment of Qwen3-Embedding? - github / github_issue
Milvus_default_embedding_model(GPTCache model) - github / github_release
Configuration risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
