Doramagic Project Pack · Human Manual
deep-searcher
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Project Overview & System Architecture
Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies
Project Overview & System Architecture
1. Purpose and Scope
DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines private knowledge bases with optional web search to answer complex queries. The repository is organized around a modular agent architecture in which each agent encapsulates a different retrieval-and-reasoning strategy. The framework plugs in an LLM, an embedding model, and a vector database as swappable backends, and exposes a unified query interface to the end user. As described in the bundled evaluation notes, the project is positioned to handle complex multi-hop questions and supports recall-based evaluation against datasets such as 2WikiMultiHopQA. Source: evaluation/README.md:1-15
The system's high-level design is centered on three RAG agent implementations (NaiveRAG, ChainOfRAG, DeepSearch) and a routing layer (RAGRouter) that picks the best implementation for a given query. This plug-in approach makes the framework easy to extend with new agents, vector stores, or LLMs.
2. Core Architectural Components
The codebase is structured around a small set of abstract base classes and concrete implementations:
| Component | File | Role |
|---|---|---|
BaseAgent | deepsearcher/agent/base.py | Abstract root for any agent; defines invoke(query, **kwargs). |
RAGAgent | deepsearcher/agent/base.py | Subclass of BaseAgent for RAG-style agents; requires retrieve() and query() returning (answer, results, token_usage). |
NaiveRAG | deepsearcher/agent/naive_rag.py | Simple retrieve-then-summarize agent. |
ChainOfRAG | deepsearcher/agent/chain_of_rag.py | Multi-step iterative RAG with reflection and supported-doc filtering. |
DeepSearch | deepsearcher/agent/deep_search.py | Sub-query decomposition, async retrieve, rerank, reflection, gap-query generation. |
RAGRouter | deepsearcher/agent/rag_router.py | LLM-driven router that picks a single agent for a given query. |
CollectionRouter | referenced from each agent | Selects relevant vector-DB collections before retrieval. |
BaseAgent only mandates an invoke method, while RAGAgent adds the contract retrieve(query, kwargs) -> (List[RetrievalResult], int, dict) and query(query, kwargs) -> (str, List[RetrievalResult], int). Source: deepsearcher/agent/base.py:30-77 This contract is what makes the agents interchangeable inside the router.
3. Agent Implementations and Their Strategies
3.1 NaiveRAG
NaiveRAG performs a single retrieval pass and asks the LLM to summarize the retrieved chunks. It supports optional route_collection (using CollectionRouter to pick collections) and text_window_splitter (which substitutes metadata["wider_text"] for each chunk in the summary prompt). The summary prompt instructs the LLM to behave as a content analysis expert that consolidates chunks into a "specific and detailed answer or report." Source: deepsearcher/agent/naive_rag.py:1-130
3.2 ChainOfRAG
ChainOfRAG runs up to max_iter rounds. At each iteration it generates a follow-up sub-query, retrieves documents, asks the LLM for an intermediate answer, lets the LLM select which documents supported that answer (_get_supported_docs), and then uses a reflection prompt (REFLECTION_PROMPT) to decide whether to stop early. Final answers are synthesized with FINAL_ANSWER_PROMPT. The class docstring notes it is "very suitable for handling concrete factual queries and multi-hop questions," inspired by the paper referenced in the file. Source: deepsearcher/agent/chain_of_rag.py:1-200
3.3 DeepSearch
DeepSearch is the most elaborate agent. It decomposes the original query into up to four sub-questions (SUB_QUERY_PROMPT), retrieves and reranks chunks (RERANK_PROMPT), reflects to find gaps (REFLECT_PROMPT), and iterates up to max_iter times. It exposes both retrieve (synchronous wrapper) and async_retrieve, using asyncio.run to bridge them. The class docstring positions it for "general and simple queries, such as given a topic and then writing a report, survey, or article." Source: deepsearcher/agent/deep_search.py:1-200
4. Query Flow and System Wiring
The end-to-end flow uses the LLM twice: once at the agent-router level, and again inside the chosen agent. RAGRouter._route formats a prompt that lists each agent's index and __description__ (auto-populated via the describe_class decorator) and asks the LLM to return a single index. A fallback extracts the last digit if the LLM's output is not purely numeric — a common behavior with reasoning models. Source: deepsearcher/agent/rag_router.py:1-90
Inside any RAG agent, when route_collection=True, the query first passes through CollectionRouter.invoke, which returns the selected collections and the routing token cost. Retrieved chunks are deduplicated through deduplicate_results before being passed to the LLM for answer generation. The terminal log line ==== FINAL ANSWER==== is emitted via the colored progress logger defined in the utility module. Source: deepsearcher/agent/naive_rag.py:1-130, deepsearcher/utils/log.py:1-90
flowchart TD
A[User Query] --> B[RAGRouter._route]
B --> C{LLM selects agent}
C -->|1| D[NaiveRAG]
C -->|2| E[ChainOfRAG]
C -->|3| F[DeepSearch]
D --> G[CollectionRouter]
E --> G
F --> G
G --> H[Vector DB retrieve]
H --> I[Deduplicate chunks]
I --> J[LLM summarize/reflect]
J --> K[Final answer]5. Configuration, Logging, and Extensibility
All agents take the same constructor triad: an LLM (BaseLLM), an embedding model (BaseEmbedding), and a vector database (BaseVectorDB). Optional flags such as top_k, max_iter, early_stopping, route_collection, and text_window_splitter let users tune retrieval depth versus cost. The describe_class decorator on each agent supplies a human-readable description that the router consumes, which is how a new agent becomes selectable automatically. Source: deepsearcher/agent/base.py:1-30, deepsearcher/agent/rag_router.py:30-60
The logging layer splits output into a "dev" logger (gated by set_dev_mode) and a "progress" logger that drives colored console output via termcolor and a ColoredFormatter. The progress_logger is what users see during a query call, while dev_mode controls verbose diagnostic logs. Source: deepsearcher/utils/log.py:1-110
For evaluation, the project ships an evaluation/ directory that runs Recall@K against 2WikiMultiHopQA, comparing DeepSearcher against naive RAG. A pre_num argument controls sample count, and --skip_load reuses an already-loaded vector DB on subsequent runs. Source: evaluation/README.md:1-30
See Also
- Agents and Routing
- Configuration and Provider Setup
- Evaluation and Benchmarks
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Installation & Quickstart
Related topics: Project Overview & System Architecture, Deployment, CLI & FastAPI Service
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Project Overview & System Architecture, Deployment, CLI & FastAPI Service
Installation & Quickstart
This page documents how to install DeepSearcher, configure it for first use, and run the minimum example end-to-end. It is written for a developer who wants to evaluate the project locally before customizing providers or loading their own corpus.
1. Purpose and Scope
DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines a configurable LLM, an embedding model, and a vector database to answer questions over private documents and (optionally) web sources. The "Installation & Quickstart" workflow is the supported on-ramp for new users: it installs the Python package, populates the environment variables required by the providers, and demonstrates a single-shot query against a vector index using the default NaiveRAG agent.
The community frequently hits the same three friction points during installation — missing optional dependencies on Windows, broken syntax in copy-pasted snippets, and the lack of an official container image — and those are covered in the Troubleshooting section below.
Source: README.md
2. Prerequisites
DeepSearcher is a Python project distributed as a package and a console script. Before installing, confirm the following:
- Python runtime — A recent CPython (3.10+ recommended). Community reports indicate that running the CLI on Python 3.13 (
deepsearcher.exe) has produced import-time tracebacks inside the bundled console script wrapper, so a 3.11–3.12 interpreter is the safer default until upstream adjusts the entry point. Source: deepsearcher/cli.py (referenced in issue #255) - Operating system — The default vector backend (
milvus_lite) is officially supported on Ubuntu ≥ 20.04 and macOS ≥ 11.0. Windows users must switch to an alternative backend such as Milvus standalone or a hosted Milvus/Zilliz Cloud instance. Source: pyproject.toml, community discussion in issue #67 - Provider credentials — At minimum an LLM API key (OpenAI-compatible) and, depending on the chosen embedding and vector DB, additional secrets. These are read from environment variables defined in
env.example.
3. Installation
3.1 Install from PyPI (recommended for new users)
pip install deepsearcher
This installs the package, the optional provider integrations declared in pyproject.toml, and the deepsearcher console script used by the CLI entry point. Source: pyproject.toml
3.2 Install from source
Use this when you intend to modify agents, prompts, or providers:
git clone https://github.com/zilliztech/deep-searcher.git
cd deep-searcher
pip install -e .
The -e editable install is also the recommended setup for contributors evaluating the ChainOfRAG and DeepSearch agents, since both define their prompt templates as module-level string constants that are easy to iterate on. Source: deepsearcher/agent/chain_of_rag.py, deepsearcher/agent/deep_search.py
3.3 Container image (community-requested)
There is no official Docker image at the time of writing; the request is tracked in issue #78. For now, a local container can be built by authoring a Dockerfile that mirrors the editable install above and mounting a populated .env file. Source: pyproject.toml, issue #78
4. Configuration and First Run
4.1 Environment variables
Copy env.example to .env at the repository root and fill in the secrets for the providers you intend to use. The file lists, among others, the LLM API key, embedding model identifier, and vector database connection string. Source: env.example
4.2 Minimal program (`main.py`)
The repository ships a runnable script that loads a configuration, ingests a small set of local files, and answers a single query:
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here;
# more configuration see the Configuration Details section...
init_config(config=config)
# (load documents, then:)
result = query("Your question here")
print(result)
Fix: The version of this snippet previously circulated in the README was missing a closing parenthesis on line 7 (issue #80). The form shown above is the corrected one. Source: issue #80
4.3 CLI usage
The package exposes a deepsearcher console script whose entry point is deepsearcher.cli:main. Once the package is installed, you can invoke it directly:
deepsearcher --help
This is the path taken by users hitting the deepsearcher.exe traceback reported in issue #255. Source: deepsearcher/cli.py
4.4 `examples/basic_example.py`
For a fully self-contained walkthrough — including provider selection, file ingestion, and an end-to-end query — run the basic example:
python examples/basic_example.py
It instantiates the Configuration object, calls init_config, loads a few sample files into the default vector database, and prints the answer returned by the default RAG agent (NaiveRAG). Source: examples/basic_example.py, deepsearcher/agent/naive_rag.py
5. Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
ModuleNotFoundError: No module named 'milvus_lite' on Windows | milvus_lite is not published for Windows | Upgrade pymilvus and switch to a Milvus backend that runs off-host (standalone Docker, Zilliz Cloud) — issue #67 |
Traceback inside deepsearcher.exe on Python 3.13 | Bundled console-script wrapper incompatibility | Use Python 3.11–3.12, or invoke the module directly (python -m deepsearcher.cli) — issue #255 |
SyntaxError from the README snippet | Unbalanced parenthesis on line 7 of the quickstart | Apply the corrected snippet shown in §4.2 — issue #80 |
| Generic LLM hallucinations in answers | Small, non-reasoning LLM behind the API | Use a cutting-edge reasoning model (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.) as recommended in issue #267 |
6. End-to-End Flow
The diagram below summarizes the path from pip install to a printed answer.
flowchart LR
A[pip install deepsearcher] --> B[Copy env.example to .env]
B --> C[Fill provider credentials]
C --> D[python examples/basic_example.py]
D --> E[Configuration + init_config]
E --> F[Load documents into vector DB]
F --> G[NaiveRAG agent query]
G --> H[Printed answer + retrieved chunks]See Also
- Configuration & Providers
- RAG Agents (NaiveRAG, ChainOfRAG, DeepSearch, RAGRouter)
- Evaluation Guide
- Web Search Sources
Source: https://github.com/zilliztech/deep-searcher / Human Manual
LLM Provider Configuration
Related topics: Embedding Model Configuration, Extensibility, Troubleshooting & FAQ
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Embedding Model Configuration, Extensibility, Troubleshooting & FAQ
LLM Provider Configuration
Overview
DeepSearcher is designed to be LLM-agnostic. Every retrieval agent — NaiveRAG, ChainOfRAG, DeepSearch, and RAGRouter — accepts a single llm: BaseLLM argument that is used for prompt answering, sub-query decomposition, reranking, and reflection. Source: deepsearcher/agent/naive_rag.py, deepsearcher/agent/chain_of_rag.py, deepsearcher/agent/deep_search.py.
The BaseLLM abstract class is defined in deepsearcher/llm/base.py. Source: deepsearcher/llm/base.py. Concrete providers live in sibling modules — openai_llm.py, deepseek.py, anthropic_llm.py, and ollama.py — each wrapping a different upstream SDK while exposing the same interface. Source: deepsearcher/llm/openai_llm.py, deepsearcher/llm/deepseek.py, deepsearcher/llm/anthropic_llm.py, deepsearcher/llm/ollama.py. Provider selection happens in deepsearcher/configuration.py, where a YAML-driven Configuration object is materialized into the actual BaseLLM instance. Source: deepsearcher/configuration.py.
The BaseLLM Contract
Every LLM provider must implement the methods consumed by the agents. Inspecting agent call sites reveals the full contract:
| Method | Where used | Purpose |
|---|---|---|
chat(messages) | All agents | Send a prompt, return an object with .content (string) and .total_tokens (int) |
literal_eval(text) | ChainOfRAG, DeepSearch, CollectionRouter | Parse a Python literal (list of strings) from the LLM output |
remove_think(text) | ChainOfRAG, DeepSearch, RAGRouter | Strip <think>… blocks emitted by reasoning models |
find_last_digit(text) | RAGRouter | Fallback to extract the trailing digit when parsing the routing decision |
Source: deepsearcher/agent/chain_of_rag.py (uses literal_eval to parse follow-up questions and remove_think before literal_eval), deepsearcher/agent/deep_search.py (SUB_QUERY_PROMPT and REFLECT_PROMPT expect a Python list of strings back), deepsearcher/agent/rag_router.py (falls back to find_last_digit when int(...) fails on a reasoning model's verbose reply).
The presence of remove_think and find_last_digit in the contract is a direct response to the maintainers' recommendation that users select a cutting-edge *reasoning* model — such as OpenAI o-series, DeepSeek R1, or Claude 3.7 Sonnet — because the prompts depend on structured output (lists, integers) and small LLMs are prone to hallucinations. Source: deepsearcher/agent/rag_router.py.
Built-in Providers
The repository ships four first-party provider modules. Each module is a thin adapter over a third-party SDK and is referenced by name from the configuration loader.
- OpenAI / OpenAI-compatible —
deepsearcher/llm/openai_llm.pycovers the OpenAI Chat Completions API and any vendor exposing an OpenAI-compatible endpoint (Azure OpenAI, local vLLM serving OpenAI-format models). Source: deepsearcher/llm/openai_llm.py. - DeepSeek —
deepsearcher/llm/deepseek.pyadapts the DeepSeek API, which is OpenAI-compatible and recommended by the maintainers for reasoning-heavy workloads. Source: deepsearcher/llm/deepseek.py. - Anthropic —
deepsearcher/llm/anthropic_llm.pyadapts the Claude Messages API. Source: deepsearcher/llm/anthropic_llm.py. - Ollama (local) —
deepsearcher/llm/ollama.pyruns models locally through the Ollama daemon. Community reports note that Ollama throughput can be a bottleneck for embedding-heavy workloads. Source: deepsearcher/llm/ollama.py.
A user request in issue #254 to add BurnCloud as an additional LLM provider has been opened against the project, illustrating how third-party providers can be contributed by adding a new module that subclasses BaseLLM. Source: issue #254 — BurnCloud seeks to contribute enhancements. Issue #247 similarly asks for local deployment of Qwen3-Embedding and a Qwen3 LLM served via vLLM with an OpenAI-compatible interface — i.e. using the existing openai_llm.py adapter against a self-hosted endpoint. Source: issue #247 — local Qwen3 deployment.
Configuration Lifecycle
LLM selection is driven by the Configuration class. The canonical entry point is:
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here,
# more configuration see the Configuration Details section...
Source: deepsearcher/configuration.py and the Quickstart snippet reproduced in issue #80 (which originally contained an unbalanced parenthesis that has since been fixed).
Internally, the configuration object resolves a provider_name to a concrete class registered in deepsearcher/llm/__init__.py, instantiates it with credentials (api_key, base_url, model, etc.), and the resulting BaseLLM instance is passed by reference into every agent. Because all agents receive the *same* llm object, switching providers requires only a configuration change — no agent code edits. Source: deepsearcher/llm/__init__.py.
flowchart LR
YAML["config.yaml<br/>(provider, model, api_key)"] --> Cfg[Configuration]
Cfg --> Resolver["deepsearcher/llm/__init__.py<br/>provider dispatch"]
Resolver --> Provider["Concrete BaseLLM<br/>(OpenAI / DeepSeek /<br/>Anthropic / Ollama)"]
Provider --> Agents["NaiveRAG / ChainOfRAG /<br/>DeepSearch / RAGRouter"]
Agents -->|chat / literal_eval /<br/>remove_think| ProviderCommon Failure Modes and Community Pitfalls
- Small or non-reasoning LLMs produce malformed structured output. The
RAGRouterprompt asks for a single integer;ChainOfRAGandDeepSearchask for a Python list of strings. When the model returns prose,int(...)raisesValueErrorand the router falls back tofind_last_digit. If even that fails, the agent throws — a symptom users see as "the LLM is hallucinating". Source: deepsearcher/agent/rag_router.py. - CLI bootstrap errors. Issue #255 reports a
deepsearcher.exeinvocation crashing becausedeepsearcher.clicannot import — almost always a missing or mis-configured provider SDK installed afterdeepsearcheritself. Source: issue #255. - Ollama throughput. Issue #247 cites Ollama as too slow for embedding and asks for a vLLM-backed local server instead, served via the OpenAI-compatible provider. Source: issue #247.
- Collection routing authorization gap. Issue #267 notes that
CollectionRouterselects collections based on the query alone and ignores caller authorization context, so provider-side guardrails must be enforced elsewhere. Source: issue #267, deepsearcher/agent/collection_router.py.
See Also
- Vector Database Configuration — sibling configuration domain covering Milvus / Milvus-Lite.
- Embedding Model Configuration — the analogous
BaseEmbeddingabstraction, including the GPTCache default used in theMilvus_default_embedding_modelrelease. - Agent Architecture — how
BaseLLMis consumed byNaiveRAG,ChainOfRAG,DeepSearch, andRAGRouter. - Evaluation Harness —
evaluate.pyreads aconfig_yamlthat specifies LLM, embedding, and provider parameters for the 2WikiMultiHopQA benchmark.
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Embedding Model Configuration
Related topics: LLM Provider Configuration, Vector Database & Data Loader Configuration
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LLM Provider Configuration, Vector Database & Data Loader Configuration
Embedding Model Configuration
Overview
The embedding model is a foundational component of the DeepSearcher retrieval pipeline. Every query that is sent to a vector database must first be transformed into a dense vector, and every indexed chunk must be stored as one. DeepSearcher abstracts this concern behind a single interface — BaseEmbedding — so the same agent code can be re-targeted against different providers (cloud APIs, local runtimes, or on-disk models) without modifying retrieval logic.
All RAG agents in the project — NaiveRAG, DeepSearch, and ChainOfRAG — accept an embedding_model: BaseEmbedding instance in their constructor and call two methods on it: embed_query(...) (used at retrieval time) and an embed method used at ingestion time. They also read embedding_model.dimension to feed it into the CollectionRouter so that the right collection can be selected and any new collections created in the vector store can be initialized with a matching dimensionality. Source: deepsearcher/agent/naive_rag.py:1-90, deepsearcher/agent/deep_search.py:1-120, deepsearcher/agent/chain_of_rag.py:1-140.
Community evidence reflects strong interest in expanding the set of supported embedding backends. For example, issue #247 asks whether local Qwen3-Embedding can be deployed in place of Ollama, and the most recent release notes ship a new Milvus_default_embedding_model (the GPTCache-backed backend). The configuration layer is the single place where these choices are made.
The `BaseEmbedding` Interface
DeepSearcher defines an abstract base class for embedding models under deepsearcher/embedding/base.py. The class exposes the contract that every concrete provider must implement:
| Member | Purpose |
|---|---|
dimension (property) | The fixed vector size produced by the model; used by CollectionRouter and by vector DB initialization. |
embed_query(text: str) | Embeds a single query string (used at retrieval time). |
embed_documents(texts: List[str]) | Embeds a batch of chunk strings (used at ingestion time). |
Optional is_normalized flag | Some models (e.g. BGE) ship pre-normalized vectors, which the agent code consumes when computing similarity. |
Concrete subclasses are plug-and-play. The agent code in naive_rag.py and deep_search.py only ever references self.embedding_model.embed_query(...) and self.embedding_model.dimension — it never imports a specific provider — so swapping a backend is purely a configuration concern. Source: deepsearcher/agent/naive_rag.py:34-90, deepsearcher/agent/deep_search.py:30-80.
Available Providers
DeepSearcher ships with multiple BaseEmbedding implementations, each living in its own module under deepsearcher/embedding/:
- Milvus (GPTCache) default embedding —
deepsearcher/embedding/milvus_embedding.py. The newest default; leverages GPTCache model bindings exposed by the Milvus ecosystem. Selected by the most recent release taggedMilvus_default_embedding_model(GPTCache model). - OpenAI-compatible embedding —
deepsearcher/embedding/openai_embedding.py. Targetstext-embedding-3-*and similar OpenAI models; configurable through standardOPENAI_API_KEY/OPENAI_BASE_URLenvironment variables. - Voyage AI embedding —
deepsearcher/embedding/voyage_embedding.py. Targets Voyage's embedding endpoints (e.g.voyage-3). - FastEmbed (local) —
deepsearcher/embedding/fastembed_embdding.py. Runs models such as BGE locally without a network round-trip. This is the most common choice for fully offline deployments and is the closest match to what issue #247 requests for Qwen3-Embedding.
flowchart LR
A[User Query] --> B[Agent retrieve]
B --> C[CollectionRouter]
C --> D[BaseEmbedding.embed_query]
D --> E[(Vector DB)]
E --> F[Top-k Chunks]
F --> G[BaseLLM.chat]
G --> H[Final Answer]The diagram above shows where the embedding model sits in the critical path. Notice that the embedding is invoked once per query, regardless of which agent is selected — meaning a misconfigured embedding backend silently degrades every retrieval strategy at once.
Configuration and Wiring
Embedding configuration is performed at the Configuration layer (deepsearcher.configuration) and is propagated into the agents during initialization. The snippet in issue #80 illustrates the intended pattern:
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
# Customize your config here,
# more configuration see the Configuration Details section...
init_config(config=config)
In practice, the configuration object exposes a provider (e.g. openai, milvus, voyage, fastembed), a model_name, and provider-specific fields such as api_key or base_url. At runtime, init_config constructs the matching BaseEmbedding subclass and threads it into every RAG agent — NaiveRAG, DeepSearch, and ChainOfRAG — which all hold a reference to the same instance. This means changing the embedding in Configuration is the only edit required to retarget the entire stack. Source: deepsearcher/agent/chain_of_rag.py:40-110.
The dimension field that comes back from the configured BaseEmbedding is also used to size new Milvus collections on the fly; if a different embedding model is selected mid-project, the agent will create a new collection (and ignore the old one) rather than fail on a vector-size mismatch.
Common Failure Modes
A few recurring issues in the community trace back to embedding configuration choices:
ModuleNotFoundError: No module named 'milvus_lite'(#67) — the default backend on Linux/macOS is the Milvus Lite path; Windows users must either install a matchingpymilvusbuild or switch the embedding provider away frommilvus.- Local Qwen3-Embedding / vLLM (#247) — FastEmbed is the supported local path today; users wanting Qwen3-Embedding locally currently need a custom
BaseEmbeddingsubclass, because the project's FastEmbed module does not yet wrap Qwen3. - CLI import errors (#255) — when the embedding provider is misconfigured at install time, the CLI fails to import before the agent layer is even reached. Confirming that the provider selected in
Configurationhas its required dependency installed (pymilvus,openai,voyageai,fastembed) is the first debugging step.
See Also
- Agent Architecture — covers
NaiveRAG,DeepSearch, andChainOfRAGin detail. - Vector Database Configuration — discusses how the embedding
dimensionis consumed by the vector store. - Configuration Module — entry point for
init_configand provider selection.
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Vector Database & Data Loader Configuration
Related topics: LLM Provider Configuration, Embedding Model Configuration, RAG Agent System & Retrieval Strategies
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LLM Provider Configuration, Embedding Model Configuration, RAG Agent System & Retrieval Strategies
Vector Database & Data Loader Configuration
Overview
In DeepSearcher, the vector database and data loader are not first-class configuration topics handled inside the agent modules, but they are central dependencies that every agent constructs at initialization time. The agent classes in deepsearcher/agent/naive_rag.py, deepsearcher/agent/deep_search.py, and deepsearcher/agent/chain_of_rag.py all accept a vector_db: BaseVectorDB instance and an embedding_model: BaseEmbedding instance as required constructor arguments. The configuration surface is therefore expressed in code as object construction, while the high-level Configuration object (referenced in the quickstart snippet from issue #80) is the user-facing entry point that wires those objects together.
Community feedback highlights the practical pain points of this configuration. Issue #67 reports a ModuleNotFoundError: No module named 'milvus_lite' when the local Milvus backend is selected, and the maintainers note that milvus_lite only supports Ubuntu >= 20.04 and macOS >= 11.0, which constrains how the default vector database can be configured on Windows. The latest release, "Milvus_default_embedding_model(GPTCache model)", further indicates that the default embedding model and vector backend are tightly coupled and shipped as a coordinated unit. Source: evaluation/README.md.
Vector Database Contract
All agents rely on a shared BaseVectorDB abstraction imported from deepsearcher.vector_db.base. The contract that the agents depend on is:
| Capability used by agents | Source location |
|---|---|
vector_db.search(...) returning List[RetrievalResult] | deepsearcher/agent/naive_rag.py:67-86, deepsearcher/agent/deep_search.py:170-200 |
vector_db.list_collections(dim=...) returning collection metadata | deepsearcher/agent/collection_router.py:55-62 |
vector_db.default_collection used as a fallback target | deepsearcher/agent/collection_router.py:88-94 |
embedding_model.dimension passed as the dim argument for collection listing | deepsearcher/agent/naive_rag.py:48, deepsearcher/agent/collection_router.py:42 |
deduplicate_results(...) to merge results across iterations or collections | deepsearcher/agent/deep_search.py:155, deepsearcher/agent/chain_of_rag.py:140 |
Because the agent layer only ever talks to BaseVectorDB and BaseEmbedding, the concrete data loader and vector database (Milvus, Milvus Lite, Qdrant, Azure Search, Oracle, etc., as enumerated in deepsearcher/vector_db/__init__.py in the broader project) can be swapped by changing the wired instance without modifying agent code. This is the design pattern that makes "configuration" effectively a matter of choosing the right concrete class and credential set. Source: deepsearcher/agent/naive_rag.py:36-58.
Collection Routing and Data Placement
The CollectionRouter in deepsearcher/agent/collection_router.py is the bridge between the agent layer and the physical vector database collections where loaded data lives. At construction time, it enumerates all available collections using self.vector_db.list_collections(dim=dim), where dim is taken from embedding_model.dimension so that the router only sees collections whose schema matches the active embedding model. This is the de-facto configuration check for whether a piece of loaded data is even visible to the current pipeline. Source: deepsearcher/agent/collection_router.py:42-58.
At query time, the router calls the LLM with COLLECTION_ROUTE_PROMPT, asking it to pick a Python list of collection names from the candidate list. Two rules then extend that selection: any collection with an empty description is always added (the query itself is used as the search query), and the default_collection is always appended. The final list is deduplicated before being passed to the per-collection search loop. This explains why data loaders should populate the description field on collections: an empty description forces the collection to be searched for every query, which is the correct behavior for a default "catch-all" collection but a footgun for named, domain-specific collections. Source: deepsearcher/agent/collection_router.py:78-100.
A known limitation surfaces in issue #267, "Collection routing ignores caller authorization context": because the router relies solely on the LLM to pick collections, it has no way to enforce per-caller access control. Any user with query access effectively gets to search every collection the router knows about.
Agent-Level Configuration Knobs
The agent constructors expose the configuration levers that most directly affect vector DB behavior:
top_k(NaiveRAG, default 10) — number of chunks fetched per collection per query. Source: deepsearcher/agent/naive_rag.py:40-58.max_iter(DeepSearch default 3, ChainOfRAG default 4) — caps the reflection/re-query loop that drives additional vector searches. Source: deepsearcher/agent/deep_search.py:36-58, deepsearcher/agent/chain_of_rag.py:46-70.route_collection(default True on all three agents) — toggles theCollectionRouter. WhenFalse, the agent searchescollection_router.all_collectionsdirectly with zero routing tokens. Source: deepsearcher/agent/naive_rag.py:70-78.text_window_splitter(default True) — when enabled, the summarization step reads thewider_textmetadata field produced by the splitter instead of the raw chunk text, which materially changes the context the LLM sees at answer time. Source: deepsearcher/agent/naive_rag.py:96-110.early_stopping(ChainOfRAG, default False) — uses a reflection prompt to break the loop as soon as the intermediate context is judged sufficient, reducing redundant vector searches. Source: deepsearcher/agent/chain_of_rag.py:46-70.
The RAGRouter in deepsearcher/agent/rag_router.py sits one level above these knobs: it uses the __description__ attribute (registered via the @describe_class decorator in deepsearcher/agent/base.py) to pick which agent should handle a given query, which is the recommended way to expose the configuration trade-offs to end users without forcing them to choose an agent manually. Source: deepsearcher/agent/base.py:8-26.
Data Flow and Common Failure Modes
flowchart LR
A[Load documents] --> B[Embedding model]
B --> C[Vector DB collections<br/>with dim from embedding_model]
Q[User query] --> R[CollectionRouter<br/>uses LLM + dim]
R --> S[Search selected collections]
S --> T[deduplicate_results]
T --> U[Agent summarization<br/>uses wider_text if available]
U --> A2[Final answer]Two failure modes recur in community reports and are visible from the code:
- Embedding/collection dimension mismatch. The router passes
embedding_model.dimensiontolist_collections(dim=dim). If a previously loaded collection was built with a different embedding model (for example, switching from the GPTCache default mentioned in the latest release to Qwen3-Embedding as requested in issue #247), the dimension check will silently drop that collection from the candidate list, and the agent will return no results without an explicit error. - Missing backend dependency on Windows. Per issue #67, the
milvus_liteextension required by the default local Milvus configuration is unavailable on Windows. The community-validated workaround is to upgradepymilvusto a version that bundles the right native libraries, or to switch to a differentBaseVectorDBimplementation such as Qdrant or Azure Search.
See Also
- Agent Architecture — overview of
BaseAgentandRAGAgent - LLM Provider Configuration — selecting and configuring the language model backend
- Evaluation Guide — Recall@K methodology and
2WikiMultiHopQAbenchmarks described in evaluation/README.md
Source: https://github.com/zilliztech/deep-searcher / Human Manual
RAG Agent System & Retrieval Strategies
Related topics: Project Overview & System Architecture, Vector Database & Data Loader Configuration
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Project Overview & System Architecture, Vector Database & Data Loader Configuration
RAG Agent System & Retrieval Strategies
The deepsearcher.agent package is the orchestration layer that turns a natural-language question into a verified answer grounded in the user's private vector store (and, optionally, the public web). It defines a small hierarchy of abstract base classes and ships four concrete strategies that share the same inputs — an LLM, an embedding model, and a vector database — but apply very different retrieval and reasoning loops. Source: deepsearcher/agent/__init__.py:1-12.
Agent Class Hierarchy
At the root is BaseAgent, an abstract class with a single invoke(query, **kwargs) entry point. RAGAgent extends it and adds two structured methods: retrieve() returning (results, tokens, metadata), and query() returning (answer, results, tokens). All concrete RAG agents inherit from RAGAgent. Source: deepsearcher/agent/base.py:36-90.
A describe_class() decorator injects a __description__ string on each agent class. This description is later read by RAGRouter to decide which agent should handle a query, so every shipped agent ships with a curated one-liner explaining its strength. Source: deepsearcher/agent/base.py:18-34.
The Three Retrieval Strategies
NaiveRAG — Single-Pass Retrieval
NaiveRAG is the simplest implementation. It embeds the query, optionally uses CollectionRouter to narrow the search to relevant collections, pulls top_k chunks from vector_db, formats them inside <chunk_i>...</chunk_i> tags, and asks the LLM to write a final SUMMARY_PROMPT. Source: deepsearcher/agent/naive_rag.py:28-110.
It supports two flags: route_collection (off by default — must be opted in) and text_window_splitter which prefers the wider_text metadata field of each chunk so the LLM sees surrounding context instead of an isolated fragment. Source: deepsearcher/agent/naive_rag.py:18-110.
ChainOfRAG — Iterative Sub-Query Decomposition
ChainOfRAG targets concrete, factual, multi-hop questions. At each iteration it (1) prompts the LLM for a single follow-up sub-query, (2) retrieves chunks for that sub-query, (3) generates an intermediate answer using only retrieved documents, (4) asks the LLM to pick the supporting docs via GET_SUPPORTED_DOCS_PROMPT, and (5) calls REFLECTION_PROMPT to decide whether to stop early or iterate again. Source: deepsearcher/agent/chain_of_rag.py:30-180.
Key configuration: max_iter (default 4), early_stopping (default False), route_collection (default True). When early_stopping=True, the loop terminates as soon as the reflection prompt returns "Yes", saving both tokens and latency. The final answer is composed from the union of deduplicated chunks plus all intermediate answer contexts. Source: deepsearcher/agent/chain_of_rag.py:120-200.
DeepSearch — Reflective Multi-Iteration Search
DeepSearch is designed for "write me a report / survey / article" prompts. Its async_retrieve() first calls _generate_sub_queries() (up to 4 sub-questions, or the original query alone if already simple), then executes all vector searches in parallel with asyncio.gather. Source: deepsearcher/agent/deep_search.py:130-180.
Two LLM-driven filters shape the final corpus:
RERANK_PROMPT— accepts or rejects each individual chunk against the active sub-queries.REFLECT_PROMPT— at the end of each iteration, asks the LLM whether further research is needed and, if so, returns up to 3 "gap queries" that feed the next iteration.
The loop runs for max_iter rounds (default 3), then a final SUMMARY_PROMPT consolidates everything. Note that DeepSearch always instantiates a CollectionRouter regardless of the route_collection flag in its constructor. Source: deepsearcher/agent/deep_search.py:30-60; deepsearcher/agent/deep_search.py:90-130.
Query and Collection Routing
RAGRouter selects one agent out of a user-supplied list. It builds a numbered prompt of agent descriptions and parses the LLM's chosen index, with a defensive fallback that scans for the last digit if a reasoning model wraps the answer in prose. Source: deepsearcher/agent/rag_router.py:15-75.
Inside each agent, CollectionRouter performs a second, finer-grained routing step: choosing which vector-DB collections to search. It is constructed from (llm, vector_db, dim) and exposed via collection_router.all_collections when routing is disabled. Source: deepsearcher/agent/naive_rag.py:50-80; deepsearcher/agent/chain_of_rag.py:30-60.
Data Flow
flowchart TD
Q[User Query] --> RR[RAGRouter]
RR -->|picks agent| NA[NaiveRAG]
RR -->|picks agent| CR[ChainOfRAG]
RR -->|picks agent| DS[DeepSearch]
NA --> CR2[CollectionRouter]
CR --> CR2
DS --> CR2
CR2 --> VDB[(Vector DB)]
VDB --> CR2
CR2 -->|top_k chunks| NA
CR2 -->|top_k chunks| CR
CR2 -->|top_k chunks| DS
NA --> LLM[LLM Summary]
CR -->|loop: follow-up + reflect| LLM
DS -->|loop: reflect + gap queries| LLM
LLM --> A[Final Answer + Citations]Failure Modes and Community Notes
- Routing & authorization. Community issue #267 reports that
CollectionRouterignores the caller's authorization context, so any user-visible collection may be searched even when it should be filtered. Treat collection names as non-sensitive labels until ACLs are layered on top. Source: deepsearcher/agent/collection_router.py; community context #267. - Reasoning-model prompts. The prompts intentionally call
llm.remove_think()before parsing integers and indices because reasoning models (o-series, DeepSeek R1, Claude 3.7 Sonnet) wrap answers in<think>...tags. TheRAGRouterfalls back tofind_last_digitwhen this still fails. Source: deepsearcher/agent/rag_router.py:55-70; deepsearcher/agent/deep_search.py:170-200. - Web vs. private data. DeepSearcher is intentionally a private-data-first RAG; web search is a complementary add-on. The current
DeepSearchcode reserves asearch_res_from_internet = []slot for a future web backend, which is the entry point relevant to feature request #270 for adding serpbase.dev. Source: deepsearcher/agent/deep_search.py:150-160; community context #270. - LLM quality matters. Small LLMs struggle with
literal_eval, list-format responses, and the YES/NO reranker, producing hallucinations. The maintainers explicitly recommend reasoning-grade models.
See Also
- Vector DB backends and
RetrievalResultschema - Embedding model providers (e.g., Qwen3-Embedding, GPTCache)
- Configuration module:
deepsearcher.configuration - Online query API:
deepsearcher.online_query - Evaluation pipeline: evaluation/README.md
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Deployment, CLI & FastAPI Service
Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Installation & Quickstart, RAG Agent System & Retrieval Strategies
Deployment, CLI & FastAPI Service
Deep-Searcher is shipped as a Python package that exposes a command-line entry point, a programmatic main.py driver, a reproducible Makefile workflow, and an opt-in container image via the project Dockerfile. This page documents the deployment surface area, how the pieces fit together, and the failure modes reported by the community.
1. Entry Points and High-Level Topology
The repository exposes the system through three complementary surfaces: a generated console script, a Python module entry, and an evaluation harness.
- The console script
deepsearcheris generated by the package's[project.scripts](orconsole_scripts) metadata. When installed in a Python environment it becomes an executable that delegates todeepsearcher.cli:main. This is visible in community stack traces where the invocation resolves toC:\...\Scripts\deepsearcher.exe\__main__.pyand immediately runsfrom deepsearcher.cli import main(community trace from issue #255). main.pyat the repository root is the canonical "load → query" driver used in the Quickstart guide. It importsConfigurationandinit_configfromdeepsearcher.configuration, then drivesdeepsearcher.online_query.queryagainst the configured provider stack.evaluation/evaluate.pyis a separate entry point dedicated to batch benchmarking; it does not share runtime state with the CLI.
flowchart LR
A[User Shell] -->|deepsearcher| B[deepsearcher/cli.py]
A -->|python main.py| C[main.py driver]
A -->|python evaluate.py| D[evaluation/evaluate.py]
B --> E[Configuration + init_config]
C --> E
D --> E
E --> F[Vector DB + LLM + Embedding]
F --> G[Agents: NaiveRAG / ChainOfRAG / DeepSearch]
G --> H[Answer + RetrievalResults]Source: deepsearcher/cli.py, main.py, evaluation/evaluate.py
2. CLI Surface (`deepsearcher.cli`)
The CLI module is the most fragile of the entry points because it is what end users run after pip install. The traceback reproduced in issue #255 confirms three facts that operators must understand:
- The console-script wrapper lives under the active Python environment's
Scripts/directory and re-exportsdeepsearcher.cli.main. - Importing
deepsearcher.clieagerly pulls in the full agent, vector-DB, and LLM dependency tree. A failure anywhere in that tree surfaces as aModuleNotFoundErrorfrom the CLI before any user code runs. - On Windows with Python 3.13, the
deepsearcher.exeshim and the underlying imports must both succeed; community reports show partial Windows breakage driven bymilvus_lite(issue #67 —milvus_liteofficially supports Ubuntu ≥ 20.04 and macOS ≥ 11.0).
Operators deploying the CLI on Windows should therefore pin to a Python version supported by pymilvus/milvus_lite, or run the CLI inside WSL/Linux where the native library resolves correctly.
Source: deepsearcher/cli.py, community reports in issues #255 and #67.
3. Programmatic Driver and FastAPI-Style Service
main.py is the reference deployment pattern: instantiate Configuration, mutate it for the target LLM / embedding / vector-DB providers, then call init_config(config) before issuing query(...) calls. The driver composes the agents defined in deepsearcher/agent/:
NaiveRAG— single-shot vector retrieval + summarization (seeSUMMARY_PROMPTflow indeepsearcher/agent/naive_rag.py).ChainOfRAG— iterative sub-query decomposition with reflection (REFLECTION_PROMPT,GET_SUPPORTED_DOCS_PROMPT) and optionalearly_stopping, seedeepsearcher/agent/chain_of_rag.py.DeepSearch— multi-aspect sub-query generation + gap-driven reflection (REFLECT_PROMPT), seedeepsearcher/agent/deep_search.py.
These three classes share the RAGAgent contract defined in deepsearcher/agent/base.py (retrieve(...) returns (List[RetrievalResult], int, dict), query(...) returns (str, List[RetrievalResult], int)). Any HTTP layer wrapping Deep-Searcher (FastAPI, Starlette, etc.) only needs to forward a string query into query(...) and serialize the returned tuple — no special protocol is required because the return shape is stable across agents.
Source: main.py, deepsearcher/agent/base.py, deepsearcher/agent/naive_rag.py, deepsearcher/agent/chain_of_rag.py.
4. Containerization, Make Targets, and Evaluation Harness
The repository ships a Dockerfile and Makefile to make local and CI deployments reproducible. Issue #78 ("Build up an OFFICIAL Docker image please") is the most up-voted deployment request, indicating that the in-repo Dockerfile is currently the supported path rather than a published registry image.
The Makefile typically wraps the most common operator tasks (install, lint, test, run-evaluate). Combined with evaluation/evaluate.py, the supported evaluation flow documented in evaluation/README.md is:
python evaluate.py \
--dataset 2wikimultihopqa \
--config_yaml ./eval_config.yaml \
--pre_num 5 \
--output_dir ./eval_output
Key flags and behaviors per the README:
| Flag | Purpose |
|---|---|
--dataset | Selects the QA dataset (currently 2wikimultihopqa). |
--config_yaml | Path to a YAML file specifying LLM, embedding, and provider parameters. |
--pre_num | Number of samples to evaluate; higher = more accurate but more token cost. |
--skip_load | Reuse a previously loaded vector DB instead of re-ingesting. |
--output_dir | Destination for recall plots (e.g. plot_results/max_iter_vs_recall.png). |
The evaluation uses Recall@K: the percentage of relevant documents appearing in the top-K retrieved results. The README notes diminishing returns as max_iter increases — most models gain steeply between 2–4 iterations and plateau afterwards, with Claude-3-7-sonnet approaching near-perfect recall at 7 iterations on the 50-sample preview.
Source: Dockerfile, Makefile, evaluation/evaluate.py, evaluation/README.md.
5. Common Deployment Failure Modes
The community context surfaces four recurring deployment problems that operators should pre-empt:
- Windows + Python 3.13 CLI breakage — issues #255 and #67. The console-script shim crashes before any user logic executes because
milvus_litelacks Windows wheels. Mitigation: run on Linux/macOS or via the suppliedDockerfile. - Quickstart syntax bug — issue #80 reports an unbalanced parenthesis in the published
from deepsearcher.configuration import Configuration, init_configsnippet. Always copy the snippet directly from the repo rather than cached docs. - No official Docker image — issue #78. Until an official image is published, build locally from the
Dockerfile. - Local LLM/embedding deployment — issue #247 shows that users wanting Qwen3-Embedding or Qwen3 LLM locally must use
vllm>=0.8.5rather than Ollama for acceptable throughput, and must wire the provider throughConfiguration+init_configinmain.py.
Source: community issues #255, #67, #80, #78, #247.
See Also
- Agents & RAG Pipelines (NaiveRAG / ChainOfRAG / DeepSearch / RAGRouter)
- Configuration & Provider Wiring (LLM, Embedding, Vector DB)
- Vector Database Connectors (Milvus / Milvus-Lite)
- Evaluation Harness & Recall@K Benchmarking
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Extensibility, Troubleshooting & FAQ
Related topics: LLM Provider Configuration, Embedding Model Configuration, Vector Database & Data Loader Configuration
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LLM Provider Configuration, Embedding Model Configuration, Vector Database & Data Loader Configuration
Extensibility, Troubleshooting & FAQ
Overview
DeepSearcher is designed with extensibility as a first-class concern. Every core capability — large language models, embeddings, vector databases, retrieval agents, and web search backends — is abstracted behind a base class with a minimal interface. This lets contributors plug in new providers without forking the framework. At the same time, a predictable extension surface means troubleshooting stays tractable: most failures trace back to a misconfigured provider, a missing native dependency, or an LLM that is too weak for the prompt-following the framework expects.
This page consolidates how to extend the system, how to diagnose the most common runtime errors reported by the community, and answers to frequently asked questions drawn from the issue tracker.
Extension Points
Adding a New Agent
All retrieval agents inherit from RAGAgent, which in turn extends BaseAgent defined in deepsearcher/agent/base.py. The contract is intentionally small: implement retrieve(query, kwargs) -> (List[RetrievalResult], int, dict) and query(query, kwargs) -> (str, List[RetrievalResult], int).
To make an agent discoverable by the query router, decorate the class with @describe_class("…"). The decorator stores the description on cls.__description__, which RAGRouter reads at construction time when no explicit agent_descriptions list is supplied (deepsearcher/agent/rag_router.py).
Reference implementations to study:
- NaiveRAG — single-pass retrieve + summarize (deepsearcher/agent/naive_rag.py).
- ChainOfRAG — iterative follow-up queries with early stopping, suitable for multi-hop factual questions (deepsearcher/agent/chain_of_rag.py).
- DeepSearch — sub-query decomposition, LLM-based reranking, and reflection to fill gaps (deepsearcher/agent/deep_search.py).
Adding Providers (LLM, Embedding, Vector DB, Web Search)
The same pattern repeats across modules: subclass the Base… abstract class, implement the required methods, then register the provider in the configuration layer so Configuration() can resolve it by name. Each provider module follows this shape (constructor takes a config object, methods return typed dataclasses such as RetrievalResult).
Community discussions around new providers frequently reference this pattern, including requests to add additional web search backends such as serpbase.dev (issue #270) and local embedding/LLM stacks like Qwen3-Embedding served via vllm or an OpenAI-compatible endpoint (issue #247). Both are accommodated by the existing base classes without changes to core logic.
Common Errors and Troubleshooting
The error categories below are the ones most frequently reported by users in the issue tracker.
1. `ModuleNotFoundError: No module named 'milvus_lite'`
milvus_lite is the default embedded vector database backend. Its prebuilt wheels only ship for Ubuntu ≥ 20.04 and macOS ≥ 11.0, which is why Windows users hit this error (issue #67). Remedies:
- Upgrade to a recent
pymilvusversion that bundles a compatible wheel. - Switch to an alternative vector DB backend that runs on Windows (e.g., a remote Milvus/Zilliz Cloud instance) by changing the
vector_dbblock of your configuration.
2. Quickstart `SyntaxError` from an Unbalanced Parenthesis
An older quickstart snippet shipped with a missing closing bracket, producing an immediate SyntaxError on import (issue #80). Always copy from the latest README or docs site; the snippet should now read:
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query
config = Configuration()
init_config(config=config)
3. `deepsearcher.exe` Traceback on Windows
A launcher traceback at process start (e.g., from deepsearcher.cli import main failing inside Scripts/deepsearcher.exe/__main__.py) usually means a partial or broken install (issue #255). Recommended fix:
pip uninstall deepsearcher
pip install --upgrade deepsearcher
If the launcher still fails, run the library directly with python -m deepsearcher.cli to surface the real error.
4. Collection Routing Ignoring Authorization
CollectionRouter selects collections based on the query alone; it does not receive the caller's authorization context (issue #267). In multi-tenant deployments you must pre-filter collections in your application layer before constructing the agent, or extend CollectionRouter to accept an auth context.
5. Weak LLM Producing Hallucinated Routing
RAGRouter parses the agent index from the LLM output and falls back to "last digit" parsing when a reasoning model emits prose (deepsearcher/agent/rag_router.py). Smaller or non-reasoning LLMs frequently fail this step. The maintainers' guidance, mirrored in the issue template, is to use a frontier or reasoning model (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.) for both routing and generation.
FAQ
Q: Which agent should I use? NaiveRAG is the cheapest and works well for single-fact lookups. DeepSearch is the most thorough and is the default for general topic/report-style questions. ChainOfRAG strikes a middle ground for multi-hop factual queries that benefit from iterative refinement but do not need full sub-query decomposition (deepsearcher/agent/chain_of_rag.py).
Q: Can I run DeepSearcher fully offline? Yes, provided the LLM and embedding model are exposed via an OpenAI-compatible endpoint (e.g., vllm serving Qwen3-Embedding — issue #247). Point the LLM and embedding provider configurations at that endpoint.
Q: How is the evaluation suite run? The evaluation harness in evaluation/README.md supports the 2WikiMultiHopQA dataset out of the box and reports Recall@K against DeepSearcher versus a naive RAG baseline:
python evaluate.py \
--dataset 2wikimultihopqa \
--config_yaml ./eval_config.yaml \
--pre_num 5 \
--output_dir ./eval_output
Re-running after the first load can be accelerated with --skip_load.
Q: Where do logs go? The logger in deepsearcher/utils/log.py writes colored progress output via color_print and dev-level diagnostics via dev_logger. critical() raises RuntimeError, so it should only be used for fatal paths.
See Also
- Agent overview:
naive_rag,chain_of_rag,deep_search,rag_router - Configuration and provider registration
- Evaluation harness (
evaluation/README.md)
Source: https://github.com/zilliztech/deep-searcher / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
Doramagic Pitfall Log
Found 12 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.
1. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/255
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/270
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/67
4. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | https://github.com/zilliztech/deep-searcher
5. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/zilliztech/deep-searcher
6. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/zilliztech/deep-searcher
8. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/zilliztech/deep-searcher
9. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/254
10. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/267
11. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher
12. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Count of project-level external discussion links exposed on this manual page.
Open the linked issues or discussions before treating the pack as ready for your environment.
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using deep-searcher with real data or production workflows.
- Feature Request: Add serpbase.dev as a web search source for reliable Go - github / github_issue
- Collection routing ignores caller authorization context - github / github_issue
- run issue - github / github_issue
- ModuleNotFoundError: No module named 'milvus_lite' - github / github_issue
- BurnCloud seeks to contribute enhancements - Permission to submit PR - github / github_issue
- Can it support local deployment of Qwen3-Embedding? - github / github_issue
- Milvus_default_embedding_model(GPTCache model) - github / github_release
- Configuration risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence