# https://github.com/zilliztech/deep-searcher Project Manual

Generated at: 2026-06-26 10:21:57 UTC

## Table of Contents

- [Project Overview & System Architecture](#page-1)
- [Installation & Quickstart](#page-2)
- [LLM Provider Configuration](#page-3)
- [Embedding Model Configuration](#page-4)
- [Vector Database & Data Loader Configuration](#page-5)
- [RAG Agent System & Retrieval Strategies](#page-6)
- [Deployment, CLI & FastAPI Service](#page-7)
- [Extensibility, Troubleshooting & FAQ](#page-8)

<a id='page-1'></a>

## Project Overview & System Architecture

### Related Pages

Related topics: [Installation & Quickstart](#page-2), [RAG Agent System & Retrieval Strategies](#page-6)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)
- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/utils/log.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/utils/log.py)
- [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)
</details>

# Project Overview & System Architecture

## 1. Purpose and Scope

DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines private knowledge bases with optional web search to answer complex queries. The repository is organized around a modular agent architecture in which each agent encapsulates a different retrieval-and-reasoning strategy. The framework plugs in an LLM, an embedding model, and a vector database as swappable backends, and exposes a unified `query` interface to the end user. As described in the bundled evaluation notes, the project is positioned to handle complex multi-hop questions and supports recall-based evaluation against datasets such as 2WikiMultiHopQA. Source: [evaluation/README.md:1-15](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md?plain=1#L1-L15)

The system's high-level design is centered on three RAG agent implementations (NaiveRAG, ChainOfRAG, DeepSearch) and a routing layer (RAGRouter) that picks the best implementation for a given query. This plug-in approach makes the framework easy to extend with new agents, vector stores, or LLMs.

## 2. Core Architectural Components

The codebase is structured around a small set of abstract base classes and concrete implementations:

| Component | File | Role |
|---|---|---|
| `BaseAgent` | [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py) | Abstract root for any agent; defines `invoke(query, **kwargs)`. |
| `RAGAgent` | [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py) | Subclass of `BaseAgent` for RAG-style agents; requires `retrieve()` and `query()` returning `(answer, results, token_usage)`. |
| `NaiveRAG` | [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py) | Simple retrieve-then-summarize agent. |
| `ChainOfRAG` | [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py) | Multi-step iterative RAG with reflection and supported-doc filtering. |
| `DeepSearch` | [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py) | Sub-query decomposition, async retrieve, rerank, reflection, gap-query generation. |
| `RAGRouter` | [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py) | LLM-driven router that picks a single agent for a given query. |
| `CollectionRouter` | referenced from each agent | Selects relevant vector-DB collections before retrieval. |

`BaseAgent` only mandates an `invoke` method, while `RAGAgent` adds the contract `retrieve(query, **kwargs) -> (List[RetrievalResult], int, dict)` and `query(query, **kwargs) -> (str, List[RetrievalResult], int)`. This shared contract is what makes the agents interchangeable inside the router. Source: [deepsearcher/agent/base.py:30-77](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py#L30-L77)
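
The contract above can be pictured with a small, self-contained sketch. It does not import DeepSearcher: `RetrievalResult` is a stand-in dataclass whose fields are assumed, `EchoAgent` is a hypothetical toy agent, and having `invoke` delegate to `query` is an assumption made only for this illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class RetrievalResult:
    """Stand-in for the project's retrieval result type (fields are assumed)."""
    text: str
    reference: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseAgent(ABC):
    @abstractmethod
    def invoke(self, query: str, **kwargs) -> Any:
        """Generic entry point every agent must provide."""


class RAGAgent(BaseAgent):
    @abstractmethod
    def retrieve(self, query: str, **kwargs) -> Tuple[List[RetrievalResult], int, dict]:
        """Return (results, tokens_used, extra_metadata)."""

    @abstractmethod
    def query(self, query: str, **kwargs) -> Tuple[str, List[RetrievalResult], int]:
        """Return (answer, results, tokens_used)."""

    def invoke(self, query: str, **kwargs) -> Any:
        # Assumption for this sketch: invoke simply delegates to query.
        return self.query(query, **kwargs)


class EchoAgent(RAGAgent):
    """Hypothetical toy agent: 'retrieves' one canned chunk and echoes it."""

    def retrieve(self, query: str, **kwargs):
        chunk = RetrievalResult(text=f"note about {query}", reference="memo.txt")
        return [chunk], 0, {}

    def query(self, query: str, **kwargs):
        results, tokens, _ = self.retrieve(query, **kwargs)
        return results[0].text, results, tokens
```

Because every agent returns the same tuple shapes, a caller can unpack `answer, results, tokens = agent.query(...)` without knowing which strategy produced them.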

## 3. Agent Implementations and Their Strategies

### 3.1 NaiveRAG
`NaiveRAG` performs a single retrieval pass and asks the LLM to summarize the retrieved chunks. It supports optional `route_collection` (using `CollectionRouter` to pick collections) and `text_window_splitter` (which substitutes `metadata["wider_text"]` for each chunk in the summary prompt). The summary prompt instructs the LLM to behave as a content analysis expert that consolidates chunks into a "specific and detailed answer or report." Source: [deepsearcher/agent/naive_rag.py:1-130](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py#L1-L130)
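
As a rough illustration of the `text_window_splitter` behavior, the sketch below assembles the chunk section of a summary prompt and prefers `metadata["wider_text"]` when the flag is on. The function name, the dictionary-based chunk shape, and the `<chunk_i>` wrapper tags are assumptions for this example, not the project's actual prompt format.

```python
from typing import Dict, List


def build_summary_context(chunks: List[Dict], text_window_splitter: bool = True) -> str:
    """Join chunk texts for a summary prompt.

    When text_window_splitter is enabled and a chunk carries
    metadata["wider_text"], the wider window replaces the short chunk text.
    """
    parts = []
    for i, chunk in enumerate(chunks):
        text = chunk["text"]
        metadata = chunk.get("metadata", {})
        if text_window_splitter and "wider_text" in metadata:
            text = metadata["wider_text"]
        parts.append(f"<chunk_{i}>\n{text}\n</chunk_{i}>")
    return "\n".join(parts)
```

Passing the wider window gives the LLM more surrounding context per hit, at the cost of a longer prompt.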

### 3.2 ChainOfRAG
`ChainOfRAG` runs up to `max_iter` rounds. In each iteration it generates a follow-up sub-query, retrieves documents, asks the LLM for an intermediate answer, lets the LLM select which documents supported that answer (`_get_supported_docs`), and then uses a reflection prompt (`REFLECTION_PROMPT`) to decide whether to stop early. The final answer is synthesized with `FINAL_ANSWER_PROMPT`. The class description consumed by the router calls it "very suitable for handling concrete factual queries and multi-hop questions"; the design is inspired by the paper referenced in the file. Source: [deepsearcher/agent/chain_of_rag.py:1-200](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py#L1-L200)
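
The iteration described above can be sketched as a plain loop. This is a simplified illustration, not the project's code: the LLM and the search backend are injected as callables, the `KIND|...` prompt strings are placeholders for the real prompt templates, and the supported-document step is approximated by a substring check instead of an LLM call.

```python
from typing import Callable, List, Tuple


def chain_of_rag(
    question: str,
    ask_llm: Callable[[str], str],
    search: Callable[[str], List[str]],
    max_iter: int = 4,
    early_stopping: bool = True,
) -> Tuple[str, List[str]]:
    """Iterate: follow-up query -> retrieve -> intermediate answer -> reflect."""
    steps: List[Tuple[str, str]] = []  # (sub-query, intermediate answer)
    supported: List[str] = []
    for _ in range(max_iter):
        history = "; ".join(f"{q} => {a}" for q, a in steps)
        sub_query = ask_llm(f"FOLLOWUP|{question}|{history}")
        docs = search(sub_query)
        answer = ask_llm(f"ANSWER|{sub_query}|{docs}")
        # Approximation: a document "supports" the answer if it contains it.
        supported.extend(d for d in docs if answer in d and d not in supported)
        steps.append((sub_query, answer))
        if early_stopping and ask_llm(f"REFLECT|{question}|{steps}") == "yes":
            break
    final = ask_llm(f"FINAL|{question}|{steps}")
    return final, supported


def fake_llm(prompt: str) -> str:
    """Scripted stand-in for an LLM, keyed on the placeholder prompt kind."""
    kind = prompt.split("|")[0]
    if kind == "FOLLOWUP":
        return "who directed film X"
    if kind == "ANSWER":
        return "Jane Doe"
    if kind == "REFLECT":
        return "yes"
    return "Film X was directed by Jane Doe."


def fake_search(sub_query: str) -> List[str]:
    """Scripted stand-in for vector retrieval."""
    return ["Film X was directed by Jane Doe in 1990.", "unrelated text"]
```

With the scripted stand-ins, `chain_of_rag("Who directed film X?", fake_llm, fake_search)` stops after one round because the reflection step answers "yes".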

### 3.3 DeepSearch
`DeepSearch` is the most elaborate agent. It decomposes the original query into up to four sub-questions (`SUB_QUERY_PROMPT`), retrieves and reranks chunks (`RERANK_PROMPT`), reflects to find gaps (`REFLECT_PROMPT`), and iterates up to `max_iter` times. It exposes both `async_retrieve` and `retrieve`, where `retrieve` is a synchronous wrapper that bridges to the async path with `asyncio.run`. The class description consumed by the router positions it for "general and simple queries, such as given a topic and then writing a report, survey, or article." Source: [deepsearcher/agent/deep_search.py:1-200](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py#L1-L200)
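
The sync/async bridge mentioned above follows a common pattern, sketched here with hypothetical helper names. `search_one` stands in for a real vector-DB lookup; only `asyncio.gather` and `asyncio.run` are real library calls.

```python
import asyncio
from typing import List


async def search_one(sub_query: str) -> List[str]:
    """Hypothetical per-sub-query search; a real one would hit the vector DB."""
    await asyncio.sleep(0)  # yield control, standing in for I/O
    return [f"chunk for: {sub_query}"]


async def async_retrieve(sub_queries: List[str]) -> List[str]:
    # Run all sub-query searches concurrently; gather preserves input order.
    batches = await asyncio.gather(*(search_one(q) for q in sub_queries))
    return [chunk for batch in batches for chunk in batch]


def retrieve(sub_queries: List[str]) -> List[str]:
    """Synchronous wrapper that starts an event loop for the async path."""
    return asyncio.run(async_retrieve(sub_queries))
```

One consequence of this pattern is that `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop (for example inside a notebook cell or an async web handler), so such callers should await the async variant directly.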

## 4. Query Flow and System Wiring

The end-to-end flow calls the LLM at two levels: once in the agent router, and again (often several times) inside the chosen agent. `RAGRouter._route` formats a prompt that lists each agent's index and `__description__` (populated by the `describe_class` decorator) and asks the LLM to return a single index. If the reply is not purely numeric, which is common with reasoning models, a fallback extracts the last digit in the reply. Source: [deepsearcher/agent/rag_router.py:1-90](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py#L1-L90)
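
A minimal sketch of that parse-then-fallback step is shown below. The function names are chosen for this example, and treating the reply as a 1-based index is an assumption.

```python
def find_last_digit(text: str) -> int:
    """Return the last decimal digit that appears in text."""
    for ch in reversed(text):
        if ch.isdigit():
            return int(ch)
    raise ValueError("no digit found in the reply")


def parse_agent_index(reply: str) -> int:
    """Convert a 1-based LLM reply into a zero-based agent index."""
    try:
        return int(reply.strip()) - 1
    except ValueError:
        # Verbose replies: fall back to the last digit mentioned.
        return find_last_digit(reply) - 1
```

A last-digit fallback only distinguishes single-digit indices, so this approach is limited to at most nine selectable agents.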

Inside any RAG agent, when `route_collection=True`, the query first passes through `CollectionRouter.invoke`, which returns the selected collections and the routing token cost. Retrieved chunks are deduplicated through `deduplicate_results` before being passed to the LLM for answer generation. The terminal log line `==== FINAL ANSWER====` is emitted via the colored progress logger defined in the utility module. Source: [deepsearcher/agent/naive_rag.py:1-130](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py#L1-L130), [deepsearcher/utils/log.py:1-90](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/utils/log.py#L1-L90)
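
The deduplication step can be approximated as an order-preserving filter. The sketch below reuses the name `deduplicate_results` for readability, but it operates on plain dictionaries and keys on the chunk text, both of which are assumptions and not the project's implementation.

```python
from typing import Dict, List


def deduplicate_results(results: List[Dict]) -> List[Dict]:
    """Keep the first occurrence of each chunk text, preserving order."""
    seen = set()
    unique = []
    for result in results:
        if result["text"] not in seen:
            seen.add(result["text"])
            unique.append(result)
    return unique
```

Deduplicating before the summary call matters most for the iterative agents, where several sub-queries tend to retrieve the same chunk.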

```mermaid
flowchart TD
    A[User Query] --> B[RAGRouter._route]
    B --> C{LLM selects agent}
    C -->|1| D[NaiveRAG]
    C -->|2| E[ChainOfRAG]
    C -->|3| F[DeepSearch]
    D --> G[CollectionRouter]
    E --> G
    F --> G
    G --> H[Vector DB retrieve]
    H --> I[Deduplicate chunks]
    I --> J[LLM summarize/reflect]
    J --> K[Final answer]
```

## 5. Configuration, Logging, and Extensibility

All agents take the same constructor triad: an LLM (`BaseLLM`), an embedding model (`BaseEmbedding`), and a vector database (`BaseVectorDB`). Optional flags such as `top_k`, `max_iter`, `early_stopping`, `route_collection`, and `text_window_splitter` let users tune retrieval depth versus cost. The `describe_class` decorator on each agent supplies a human-readable description that the router consumes, which is how a new agent becomes selectable automatically. Source: [deepsearcher/agent/base.py:1-30](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py#L1-L30), [deepsearcher/agent/rag_router.py:30-60](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py#L30-L60)
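
One plausible shape for such a decorator is sketched below, together with a helper that turns the descriptions into a numbered menu for a routing prompt. The toy classes, the `router_menu` helper, and the menu format are hypothetical; only the `describe_class` name and the `__description__` attribute come from the text above.

```python
from typing import Iterable


def describe_class(description: str):
    """Class decorator that attaches a description for the router to read."""
    def decorator(cls):
        cls.__description__ = description
        return cls
    return decorator


@describe_class("Single-pass retrieve-then-summarize agent.")
class ToyNaiveAgent:
    pass


@describe_class("Iterative agent for multi-hop questions.")
class ToyChainAgent:
    pass


def router_menu(agent_classes: Iterable[type]) -> str:
    """Render a 1-based menu of agent descriptions for a routing prompt."""
    return "\n".join(
        f"[{i + 1}]: {cls.__description__}" for i, cls in enumerate(agent_classes)
    )
```

Under this design, registering a new agent with the router needs no prompt edits: decorating the class is enough for it to appear in the menu.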

The logging layer splits output into a "dev" logger (gated by `set_dev_mode`) and a "progress" logger that drives colored console output via `termcolor` and a `ColoredFormatter`. The `progress_logger` is what users see during a `query` call, while `dev_mode` controls verbose diagnostic logs. Source: [deepsearcher/utils/log.py:1-110](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/utils/log.py#L1-L110)
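
The two-logger split can be illustrated with the standard library alone. This sketch omits coloring entirely; the logger names, `dev_debug`, `progress`, and `ListHandler` are hypothetical, and only `set_dev_mode` mirrors a name from the text above.

```python
import logging
from typing import List

dev_logger = logging.getLogger("sketch.dev")
progress_logger = logging.getLogger("sketch.progress")
dev_logger.setLevel(logging.DEBUG)
progress_logger.setLevel(logging.INFO)

_dev_mode = False


def set_dev_mode(enabled: bool) -> None:
    """Toggle verbose diagnostics; progress output is unaffected."""
    global _dev_mode
    _dev_mode = enabled


def dev_debug(message: str) -> None:
    # Diagnostic lines are dropped unless dev mode is on.
    if _dev_mode:
        dev_logger.debug(message)


def progress(message: str) -> None:
    # Progress lines are always emitted; a real formatter would add color.
    progress_logger.info(message)


class ListHandler(logging.Handler):
    """Collects messages in memory so the gating is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())
```

Keeping the gate in a helper function means user-facing progress lines stay visible even when diagnostics are silenced.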

For evaluation, the project ships an `evaluation/` directory that measures Recall@K on 2WikiMultiHopQA and compares DeepSearcher against naive RAG. The `--pre_num` argument controls the sample count, and `--skip_load` reuses an already-loaded vector DB on subsequent runs. Source: [evaluation/README.md:1-30](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md?plain=1#L1-L30)
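
For readers unfamiliar with the metric, the sketch below computes Recall@K under its usual definition: the fraction of gold (relevant) items that appear among the top K retrieved items. The evaluation harness may compute it differently in detail; the function names here are chosen for this example.

```python
from typing import Iterable, List, Set, Tuple


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items found in the first k retrieved items."""
    if not relevant:
        return 0.0
    hits = relevant.intersection(retrieved[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(samples: Iterable[Tuple[List[str], Set[str]]], k: int) -> float:
    """Average Recall@K over (retrieved, relevant) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in samples]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, with `retrieved = ["a", "b", "c", "d"]` and gold set `{"a", "d", "z"}`, Recall@2 is 1/3 and Recall@4 is 2/3.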

## See Also

- Agents and Routing
- Configuration and Provider Setup
- Evaluation and Benchmarks

---

<a id='page-2'></a>

## Installation & Quickstart

### Related Pages

Related topics: [Project Overview & System Architecture](#page-1), [Deployment, CLI & FastAPI Service](#page-7)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/zilliztech/deep-searcher/blob/main/README.md)
- [pyproject.toml](https://github.com/zilliztech/deep-searcher/blob/main/pyproject.toml)
- [env.example](https://github.com/zilliztech/deep-searcher/blob/main/env.example)
- [main.py](https://github.com/zilliztech/deep-searcher/blob/main/main.py)
- [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py)
- [examples/basic_example.py](https://github.com/zilliztech/deep-searcher/blob/main/examples/basic_example.py)
- [deepsearcher/configuration.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/configuration.py)
</details>

# Installation & Quickstart

This page documents how to install DeepSearcher, configure it for first use, and run the minimum example end-to-end. It is written for a developer who wants to evaluate the project locally before customizing providers or loading their own corpus.

## 1. Purpose and Scope

DeepSearcher is a Retrieval-Augmented Generation (RAG) framework that combines a configurable LLM, an embedding model, and a vector database to answer questions over private documents and (optionally) web sources. The "Installation & Quickstart" workflow is the supported on-ramp for new users: it installs the Python package, populates the environment variables required by the providers, and demonstrates a single-shot query against a vector index using the default `NaiveRAG` agent.

Community reports point to three recurring friction points during installation: missing optional dependencies on Windows, broken syntax in copy-pasted snippets, and the lack of an official container image. The first two are covered in the [Troubleshooting](#5-troubleshooting) section below; the container image is discussed in §3.3.

Source: [README.md](https://github.com/zilliztech/deep-searcher/blob/main/README.md)

## 2. Prerequisites

DeepSearcher is a Python project distributed as a package and a console script. Before installing, confirm the following:

- **Python runtime**: A recent CPython (3.10+ recommended). Issue #255 reports that running the CLI launcher (`deepsearcher.exe`) on Python 3.13 produces an import-time traceback inside the generated console-script wrapper, so a 3.11–3.12 interpreter is the safer default until upstream adjusts the entry point. Source: [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py) (referenced in issue #255)
- **Operating system**: The default vector backend (`milvus_lite`) is officially supported on Ubuntu ≥ 20.04 and macOS ≥ 11.0. Windows users must switch to an alternative backend such as Milvus standalone or a hosted Milvus/Zilliz Cloud instance. Source: [pyproject.toml](https://github.com/zilliztech/deep-searcher/blob/main/pyproject.toml), community discussion in issue #67
- **Provider credentials**: At minimum an LLM API key (OpenAI-compatible) and, depending on the chosen embedding model and vector DB, additional secrets. These are read from the environment variables listed in `env.example`.

## 3. Installation

### 3.1 Install from PyPI (recommended for new users)

```bash
pip install deepsearcher
```

This installs the package, the dependencies declared in `pyproject.toml`, and the `deepsearcher` console script used by the CLI entry point. Source: [pyproject.toml](https://github.com/zilliztech/deep-searcher/blob/main/pyproject.toml)

### 3.2 Install from source

Use this when you intend to modify agents, prompts, or providers:

```bash
git clone https://github.com/zilliztech/deep-searcher.git
cd deep-searcher
pip install -e .
```

The `-e` editable install is also the recommended setup for contributors evaluating the `ChainOfRAG` and `DeepSearch` agents, since both define their prompt templates as module-level string constants that are easy to iterate on. Source: [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py), [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)

### 3.3 Container image (community-requested)

There is no official Docker image at the time of writing; the request is tracked in issue #78. For now, a local container can be built by authoring a `Dockerfile` that mirrors the editable install above and mounting a populated `.env` file. Source: [pyproject.toml](https://github.com/zilliztech/deep-searcher/blob/main/pyproject.toml), issue #78

## 4. Configuration and First Run

### 4.1 Environment variables

Copy `env.example` to `.env` at the repository root and fill in the secrets for the providers you intend to use. The file lists, among others, the LLM API key, embedding model identifier, and vector database connection string. Source: [env.example](https://github.com/zilliztech/deep-searcher/blob/main/env.example)
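
Before the first run it can help to confirm that the expected variables are actually set. The helper below is a hypothetical pre-flight check, not part of DeepSearcher; the variable names passed to it should be taken from `env.example` for the providers in use.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # OPENAI_API_KEY is only an example; list the variables your providers need.
    missing = missing_env_vars(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

Note that a `.env` file is not read by the Python interpreter on its own; the variables must be exported in the shell or loaded by whatever tooling the project uses.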

### 4.2 Minimal program (`main.py`)

The repository ships a runnable script that loads a configuration, ingests a small set of local files, and answers a single query:

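```python
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.offline_loading import load_from_local_files
from deepsearcher.online_query import query

config = Configuration()
# Customize the configuration here; see the Configuration Details
# section for the available options.

init_config(config=config)

# Ingest local documents first (the path is a placeholder), then query.
load_from_local_files(paths_or_directory="path/to/your/docs")
result = query("Your question here")
print(result)
```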

> **Fix:** The version of this snippet that previously circulated in the README was missing a closing parenthesis on its line 7 (issue #80). The form shown above is the corrected one. Source: issue #80

### 4.3 CLI usage

The package exposes a `deepsearcher` console script whose entry point is `deepsearcher.cli:main`. Once the package is installed, you can invoke it directly:

```bash
deepsearcher --help
```

This is the entry point involved in the `deepsearcher.exe` traceback reported in issue #255. Source: [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py)

### 4.4 `examples/basic_example.py`

For a fully self-contained walkthrough — including provider selection, file ingestion, and an end-to-end query — run the basic example:

```bash
python examples/basic_example.py
```

It instantiates the `Configuration` object, calls `init_config`, loads a few sample files into the default vector database, and prints the answer returned by the default RAG agent (`NaiveRAG`). Source: [examples/basic_example.py](https://github.com/zilliztech/deep-searcher/blob/main/examples/basic_example.py), [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)

## 5. Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| `ModuleNotFoundError: No module named 'milvus_lite'` on Windows | `milvus_lite` is not published for Windows | Upgrade `pymilvus` and switch to a Milvus backend that runs off-host (standalone Docker, Zilliz Cloud) — issue #67 |
| Traceback inside `deepsearcher.exe` on Python 3.13 | Bundled console-script wrapper incompatibility | Use Python 3.11–3.12, or invoke the module directly (`python -m deepsearcher.cli`) — issue #255 |
| `SyntaxError` from the README snippet | Unbalanced parenthesis on line 7 of the quickstart | Apply the corrected snippet shown in §4.2 — issue #80 |
| Generic LLM hallucinations in answers | Small, non-reasoning LLM behind the API | Use a cutting-edge reasoning model (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.) as recommended in issue #267 |

## 6. End-to-End Flow

The diagram below summarizes the path from `pip install` to a printed answer.

```mermaid
flowchart LR
    A[pip install deepsearcher] --> B[Copy env.example to .env]
    B --> C[Fill provider credentials]
    C --> D[python examples/basic_example.py]
    D --> E[Configuration + init_config]
    E --> F[Load documents into vector DB]
    F --> G[NaiveRAG agent query]
    G --> H[Printed answer + retrieved chunks]
```

## See Also

- Configuration & Providers
- RAG Agents (NaiveRAG, ChainOfRAG, DeepSearch, RAGRouter)
- Evaluation Guide
- Web Search Sources

---

<a id='page-3'></a>

## LLM Provider Configuration

### Related Pages

Related topics: [Embedding Model Configuration](#page-4), [Extensibility, Troubleshooting & FAQ](#page-8)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/llm/__init__.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/__init__.py)
- [deepsearcher/llm/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/base.py)
- [deepsearcher/llm/openai_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/openai_llm.py)
- [deepsearcher/llm/deepseek.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/deepseek.py)
- [deepsearcher/llm/anthropic_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/anthropic_llm.py)
- [deepsearcher/llm/ollama.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/ollama.py)
- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py)
- [deepsearcher/configuration.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/configuration.py)
- [deepsearcher/online_query.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/online_query.py)
</details>

# LLM Provider Configuration

## Overview

DeepSearcher is designed to be LLM-agnostic. Every retrieval agent — `NaiveRAG`, `ChainOfRAG`, `DeepSearch`, and `RAGRouter` — accepts a single `llm: BaseLLM` argument that is used for prompt answering, sub-query decomposition, reranking, and reflection. Source: [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py), [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py), [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py).

The `BaseLLM` abstract class is defined in `deepsearcher/llm/base.py`. Source: [deepsearcher/llm/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/base.py). Concrete providers live in sibling modules — `openai_llm.py`, `deepseek.py`, `anthropic_llm.py`, and `ollama.py` — each wrapping a different upstream SDK while exposing the same interface. Source: [deepsearcher/llm/openai_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/openai_llm.py), [deepsearcher/llm/deepseek.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/deepseek.py), [deepsearcher/llm/anthropic_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/anthropic_llm.py), [deepsearcher/llm/ollama.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/ollama.py). Provider selection happens in `deepsearcher/configuration.py`, where a YAML-driven `Configuration` object is materialized into the actual `BaseLLM` instance. Source: [deepsearcher/configuration.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/configuration.py).

## The BaseLLM Contract

Every LLM provider must implement the methods consumed by the agents. Inspecting agent call sites reveals the full contract:

| Method | Where used | Purpose |
| --- | --- | --- |
| `chat(messages)` | All agents | Send a prompt, return an object with `.content` (string) and `.total_tokens` (int) |
| `literal_eval(text)` | `ChainOfRAG`, `DeepSearch`, `CollectionRouter` | Parse a Python literal (list of strings) from the LLM output |
| `remove_think(text)` | `ChainOfRAG`, `DeepSearch`, `RAGRouter` | Strip `<think>…` blocks emitted by reasoning models |
| `find_last_digit(text)` | `RAGRouter` | Fallback to extract the trailing digit when parsing the routing decision |

Source: [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py) (uses `literal_eval` to parse follow-up questions and `remove_think` before `literal_eval`), [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py) (`SUB_QUERY_PROMPT` and `REFLECT_PROMPT` expect a Python list of strings back), [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py) (falls back to `find_last_digit` when `int(...)` fails on a reasoning model's verbose reply).

The presence of `remove_think` and `find_last_digit` in the contract is a direct response to the maintainers' recommendation that users select a cutting-edge *reasoning* model — such as OpenAI o-series, DeepSeek R1, or Claude 3.7 Sonnet — because the prompts depend on structured output (lists, integers) and small LLMs are prone to hallucinations. Source: [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py).
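
To make the structured-output handling concrete, here is a minimal sketch of stripping a reasoning block and then parsing a list of strings. It reuses the name `remove_think` for readability, but the regular expressions and the `parse_string_list` helper are assumptions for this example and may differ from the project's implementation.

```python
import ast
import re
from typing import List


def remove_think(text: str) -> str:
    """Drop <think>...</think> blocks that reasoning models may emit."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def parse_string_list(reply: str) -> List[str]:
    """Extract a Python list of strings from an LLM reply."""
    cleaned = remove_think(reply)
    # Tolerate surrounding prose or code fences by grabbing the bracketed span.
    match = re.search(r"\[.*\]", cleaned, flags=re.DOTALL)
    if match is None:
        raise ValueError("no list literal found in the reply")
    value = ast.literal_eval(match.group(0))
    if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
        raise ValueError("expected a list of strings")
    return value
```

`ast.literal_eval` only evaluates literals, so a malformed or malicious reply raises an exception instead of executing code, which is why it is preferable to `eval` for this job.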

## Built-in Providers

The repository ships four first-party provider modules. Each module is a thin adapter over a third-party SDK and is referenced by name from the configuration loader.

- **OpenAI / OpenAI-compatible** — `deepsearcher/llm/openai_llm.py` covers the OpenAI Chat Completions API and any vendor exposing an OpenAI-compatible endpoint (Azure OpenAI, local vLLM serving OpenAI-format models). Source: [deepsearcher/llm/openai_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/openai_llm.py).
- **DeepSeek** — `deepsearcher/llm/deepseek.py` adapts the DeepSeek API, which is OpenAI-compatible and recommended by the maintainers for reasoning-heavy workloads. Source: [deepsearcher/llm/deepseek.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/deepseek.py).
- **Anthropic** — `deepsearcher/llm/anthropic_llm.py` adapts the Claude Messages API. Source: [deepsearcher/llm/anthropic_llm.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/anthropic_llm.py).
- **Ollama (local)** — `deepsearcher/llm/ollama.py` runs models locally through the Ollama daemon. Community reports note that Ollama throughput can be a bottleneck for embedding-heavy workloads. Source: [deepsearcher/llm/ollama.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/ollama.py).

Issue #254 requests **BurnCloud** as an additional LLM provider, which illustrates how third-party providers can be contributed: add a new module that subclasses `BaseLLM`. Source: [issue #254: BurnCloud seeks to contribute enhancements](https://github.com/zilliztech/deep-searcher/issues/254). Issue #247 similarly asks for local deployment of Qwen3-Embedding and a Qwen3 LLM served via vLLM with an OpenAI-compatible interface, that is, pointing the existing `openai_llm.py` adapter at a self-hosted endpoint. Source: [issue #247: local Qwen3 deployment](https://github.com/zilliztech/deep-searcher/issues/247).

## Configuration Lifecycle

LLM selection is driven by the `Configuration` class. The canonical entry point is:

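```python
from deepsearcher.configuration import Configuration, init_config

config = Configuration()
# Customize the configuration here; see the Configuration Details
# section for more options. The provider and model below are example values.
config.set_provider_config("llm", "OpenAI", {"model": "o1-mini"})

# Materialize the configured providers (LLM, embedding, vector DB).
init_config(config=config)
```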

Source: [deepsearcher/configuration.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/configuration.py) and the Quickstart snippet reproduced in [issue #80](https://github.com/zilliztech/deep-searcher/issues/80) (which originally contained an unbalanced parenthesis that has since been fixed).

Internally, the configuration object resolves a `provider_name` to a concrete class registered in `deepsearcher/llm/__init__.py`, instantiates it with credentials (`api_key`, `base_url`, `model`, etc.), and the resulting `BaseLLM` instance is passed by reference into every agent. Because all agents receive the *same* `llm` object, switching providers requires only a configuration change — no agent code edits. Source: [deepsearcher/llm/__init__.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/llm/__init__.py).
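
The name-to-class resolution described above can be pictured with a small registry. This is a simplified stand-in and not the project's dispatch code: the dictionary-based registry, `register_provider`, `build_llm`, and `EchoLLM` are all hypothetical, and `chat` returns a plain string here instead of a response object.

```python
from typing import Callable, Dict, List


class BaseLLM:
    """Minimal stand-in for the abstract LLM interface."""

    def chat(self, messages: List[dict]) -> str:
        raise NotImplementedError


_PROVIDERS: Dict[str, Callable[..., BaseLLM]] = {}


def register_provider(name: str):
    """Class decorator that records a provider under a lookup name."""
    def decorator(cls):
        _PROVIDERS[name] = cls
        return cls
    return decorator


@register_provider("Echo")
class EchoLLM(BaseLLM):
    """Hypothetical provider that returns the last message unchanged."""

    def __init__(self, model: str = "echo-1", **_ignored) -> None:
        self.model = model

    def chat(self, messages: List[dict]) -> str:
        return messages[-1]["content"]


def build_llm(provider_name: str, **provider_config) -> BaseLLM:
    """Resolve a provider name to a class and instantiate it."""
    try:
        factory = _PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"unknown LLM provider: {provider_name!r}") from None
    return factory(**provider_config)
```

With this shape, `build_llm("Echo", model="m1")` yields an instance that any agent can use, and an unknown name fails early with a clear error instead of surfacing later inside an agent.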

```mermaid
flowchart LR
    YAML["config.yaml<br/>(provider, model, api_key)"] --> Cfg[Configuration]
    Cfg --> Resolver["deepsearcher/llm/__init__.py<br/>provider dispatch"]
    Resolver --> Provider["Concrete BaseLLM<br/>(OpenAI / DeepSeek /<br/>Anthropic / Ollama)"]
    Provider --> Agents["NaiveRAG / ChainOfRAG /<br/>DeepSearch / RAGRouter"]
    Agents -->|chat / literal_eval /<br/>remove_think| Provider
```

## Common Failure Modes and Community Pitfalls

- **Small or non-reasoning LLMs produce malformed structured output.** The `RAGRouter` prompt asks for a single integer; `ChainOfRAG` and `DeepSearch` ask for a Python list of strings. When the model returns prose, `int(...)` raises `ValueError` and the router falls back to `find_last_digit`. If even that fails, the agent throws — a symptom users see as "the LLM is hallucinating". Source: [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py).
- **CLI bootstrap errors.** Issue #255 reports a `deepsearcher.exe` invocation crashing because `deepsearcher.cli` cannot import — almost always a missing or mis-configured provider SDK installed after `deepsearcher` itself. Source: [issue #255](https://github.com/zilliztech/deep-searcher/issues/255).
- **Ollama throughput.** Issue #247 cites Ollama as too slow for embedding and asks for a vLLM-backed local server instead, served via the OpenAI-compatible provider. Source: [issue #247](https://github.com/zilliztech/deep-searcher/issues/247).
- **Collection routing authorization gap.** Issue #267 notes that `CollectionRouter` selects collections based on the query alone and ignores caller authorization context, so provider-side guardrails must be enforced elsewhere. Source: [issue #267](https://github.com/zilliztech/deep-searcher/issues/267), [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py).

## See Also

- [Vector Database Configuration](#page-5): sibling configuration domain covering Milvus / Milvus-Lite.
- [Embedding Model Configuration](#page-4): the analogous `BaseEmbedding` abstraction, including the GPTCache default used in the `Milvus_default_embedding_model` release.
- [Agent Architecture](#page-6): how `BaseLLM` is consumed by `NaiveRAG`, `ChainOfRAG`, `DeepSearch`, and `RAGRouter`.
- [Evaluation Harness](https://github.com/zilliztech/deep-searcher/tree/main/evaluation): `evaluate.py` reads a `config_yaml` that specifies LLM, embedding, and provider parameters for the 2WikiMultiHopQA benchmark.

---

<a id='page-4'></a>

## Embedding Model Configuration

### Related Pages

Related topics: [LLM Provider Configuration](#page-3), [Vector Database & Data Loader Configuration](#page-5)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/embedding/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/embedding/base.py)
- [deepsearcher/embedding/milvus_embedding.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/embedding/milvus_embedding.py)
- [deepsearcher/embedding/openai_embedding.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/embedding/openai_embedding.py)
- [deepsearcher/embedding/voyage_embedding.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/embedding/voyage_embedding.py)
- [deepsearcher/embedding/fastembed_embdding.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/embedding/fastembed_embdding.py)
</details>

# Embedding Model Configuration

## Overview

The embedding model is a foundational component of the DeepSearcher retrieval pipeline. Every query that is sent to a vector database must first be transformed into a dense vector, and every indexed chunk must be stored as one. DeepSearcher abstracts this concern behind a single interface — `BaseEmbedding` — so the same agent code can be re-targeted against different providers (cloud APIs, local runtimes, or on-disk models) without modifying retrieval logic.

All RAG agents in the project (`NaiveRAG`, `DeepSearch`, and `ChainOfRAG`) accept an `embedding_model: BaseEmbedding` instance in their constructor. At retrieval time they call `embed_query(...)` on it; the batch method `embed_documents(...)` is used at ingestion time. They also read `embedding_model.dimension` and pass it to the `CollectionRouter`, so that the right collection can be selected and any collection newly created in the vector store is initialized with a matching dimensionality. Source: [deepsearcher/agent/naive_rag.py:1-90](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py#L1-L90), [deepsearcher/agent/deep_search.py:1-120](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py#L1-L120), [deepsearcher/agent/chain_of_rag.py:1-140](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py#L1-L140).

Community activity shows strong interest in expanding the set of supported embedding backends. For example, issue [#247](https://github.com/zilliztech/deep-searcher/issues/247) asks whether local Qwen3-Embedding can be deployed in place of Ollama, and the most recent release adds a new `Milvus_default_embedding_model` (the GPTCache-backed backend). The configuration layer is the single place where these choices are made.

## The `BaseEmbedding` Interface

DeepSearcher defines an abstract base class for embedding models under `deepsearcher/embedding/base.py`. The class exposes the contract that every concrete provider must implement:

| Member | Purpose |
| --- | --- |
| `dimension` (property) | The fixed vector size produced by the model; used by `CollectionRouter` and by vector DB initialization. |
| `embed_query(text: str)` | Embeds a single query string (used at retrieval time). |
| `embed_documents(texts: List[str])` | Embeds a batch of chunk strings (used at ingestion time). |
| Optional `is_normalized` flag | Some models (e.g. BGE) ship pre-normalized vectors, which the agent code consumes when computing similarity. |

Concrete subclasses are plug-and-play. The agent code in `naive_rag.py` and `deep_search.py` only ever references `self.embedding_model.embed_query(...)` and `self.embedding_model.dimension` — it never imports a specific provider — so swapping a backend is purely a configuration concern. Source: [deepsearcher/agent/naive_rag.py:34-90](), [deepsearcher/agent/deep_search.py:30-80]().
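
The contract in the table can be sketched as a small abstract class. This is a minimal illustration under stated assumptions, not the code in `deepsearcher/embedding/base.py`: the names `SketchEmbedding` and `HashEmbedding` are hypothetical, and the hash-based vectors are a stand-in chosen only so the example runs without a provider SDK.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import List


class SketchEmbedding(ABC):
    """Hypothetical stand-in for the embedding contract described above."""

    @property
    @abstractmethod
    def dimension(self) -> int:
        """Fixed length of every vector this model returns."""

    @abstractmethod
    def embed_query(self, text: str) -> List[float]:
        """Embed a single query string."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed a batch of chunks; this default simply loops over embed_query."""
        return [self.embed_query(text) for text in texts]


class HashEmbedding(SketchEmbedding):
    """Toy provider: derives a deterministic vector from a SHA-256 digest."""

    def __init__(self, dim: int = 8) -> None:
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
        self._dim = dim

    @property
    def dimension(self) -> int:
        return self._dim

    def embed_query(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale each byte into [0, 1]; the values carry no semantic meaning.
        return [byte / 255.0 for byte in digest[: self._dim]]


model = HashEmbedding(dim=8)
vectors = model.embed_documents(["first chunk", "second chunk"])
```

Because calling code only touches `dimension`, `embed_query`, and `embed_documents`, a real provider could replace `HashEmbedding` without changes elsewhere, which is the property the section describes.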

## Available Providers

DeepSearcher ships with multiple `BaseEmbedding` implementations, each living in its own module under `deepsearcher/embedding/`:

- **Milvus (GPTCache) default embedding** — `deepsearcher/embedding/milvus_embedding.py`. The newest default; leverages GPTCache model bindings exposed by the Milvus ecosystem. Selected by the most recent release tagged `Milvus_default_embedding_model(GPTCache model)`.
- **OpenAI-compatible embedding** — `deepsearcher/embedding/openai_embedding.py`. Targets `text-embedding-3-*` and similar OpenAI models; configurable through standard `OPENAI_API_KEY` / `OPENAI_BASE_URL` environment variables.
- **Voyage AI embedding** — `deepsearcher/embedding/voyage_embedding.py`. Targets Voyage's embedding endpoints (e.g. `voyage-3`).
- **FastEmbed (local)** — `deepsearcher/embedding/fastembed_embdding.py`. Runs models such as BGE locally without a network round-trip. This is the most common choice for fully offline deployments and is the closest match to what issue [#247](https://github.com/zilliztech/deep-searcher/issues/247) requests for Qwen3-Embedding.

```mermaid
flowchart LR
    A[User Query] --> B[Agent retrieve]
    B --> C[CollectionRouter]
    C --> D[BaseEmbedding.embed_query]
    D --> E[(Vector DB)]
    E --> F[Top-k Chunks]
    F --> G[BaseLLM.chat]
    G --> H[Final Answer]
```

The diagram above shows where the embedding model sits in the critical path. Every vector search passes through `embed_query`: once per query in `NaiveRAG`, and once per sub-query in the iterative agents. A misconfigured embedding backend therefore silently degrades every retrieval strategy at once.

## Configuration and Wiring

Embedding configuration is performed at the `Configuration` layer (`deepsearcher.configuration`) and is propagated into the agents during initialization. The snippet in issue [#80](https://github.com/zilliztech/deep-searcher/issues/80) illustrates the intended pattern:

```python
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()

# Customize your config here,
# more configuration see the Configuration Details section...
init_config(config=config)
```

In practice, the configuration object exposes a `provider` (e.g. `openai`, `milvus`, `voyage`, `fastembed`), a `model_name`, and provider-specific fields such as `api_key` or `base_url`. At runtime, `init_config` constructs the matching `BaseEmbedding` subclass and threads it into every RAG agent — `NaiveRAG`, `DeepSearch`, and `ChainOfRAG` — which all hold a reference to the same instance. This means changing the embedding in `Configuration` is the only edit required to retarget the entire stack. Source: [deepsearcher/agent/chain_of_rag.py:40-110]().

The `dimension` value reported by the configured `BaseEmbedding` is also used to size new Milvus collections and to filter existing ones. If a different embedding model is selected mid-project, collections built with the old dimension are filtered out of routing rather than triggering a vector-size error, so previously loaded data has to be re-ingested under the new model before it becomes searchable again.

## Common Failure Modes

A few recurring issues in the community trace back to embedding configuration choices:

- **`ModuleNotFoundError: No module named 'milvus_lite'`** ([#67](https://github.com/zilliztech/deep-searcher/issues/67)) — the default backend on Linux/macOS is the Milvus Lite path; Windows users must either install a matching `pymilvus` build or switch the embedding provider away from `milvus`.
- **Local Qwen3-Embedding / vLLM** ([#247](https://github.com/zilliztech/deep-searcher/issues/247)) — FastEmbed is the supported local path today; users wanting Qwen3-Embedding locally currently need a custom `BaseEmbedding` subclass, because the project's FastEmbed module does not yet wrap Qwen3.
- **CLI import errors** ([#255](https://github.com/zilliztech/deep-searcher/issues/255)) — when the embedding provider is misconfigured at install time, the CLI fails to import before the agent layer is even reached. Confirming that the provider selected in `Configuration` has its required dependency installed (`pymilvus`, `openai`, `voyageai`, `fastembed`) is the first debugging step.

## See Also

- [RAG Agent System & Retrieval Strategies](#page-6): covers `NaiveRAG`, `DeepSearch`, and `ChainOfRAG` in detail.
- [Vector Database & Data Loader Configuration](#page-5): discusses how the embedding `dimension` is consumed by the vector store.
- [Configuration Module](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/configuration.py): entry point for `init_config` and provider selection.

---

<a id='page-5'></a>

## Vector Database & Data Loader Configuration

### Related Pages

Related topics: [LLM Provider Configuration](#page-3), [Embedding Model Configuration](#page-4), [RAG Agent System & Retrieval Strategies](#page-6)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)
- [deepsearcher/agent/__init__.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/__init__.py)
- [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)
</details>

# Vector Database & Data Loader Configuration

## Overview

In DeepSearcher, the vector database and data loader are not configured inside the agent modules, but they are central dependencies that every agent receives at initialization time. The agent classes in [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py), [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py), and [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py) all accept a `vector_db: BaseVectorDB` instance and an `embedding_model: BaseEmbedding` instance as required constructor arguments. The configuration surface is therefore expressed in code as object construction, while the high-level `Configuration` object (referenced in the quickstart snippet from issue #80) is the user-facing entry point that wires those objects together.

Community feedback highlights the practical pain points of this configuration. Issue #67 reports a `ModuleNotFoundError: No module named 'milvus_lite'` when the local Milvus backend is selected, and the maintainers note that `milvus_lite` only supports Ubuntu >= 20.04 and macOS >= 11.0, which constrains how the default vector database can be configured on Windows. The latest release, "Milvus_default_embedding_model(GPTCache model)", further indicates that the default embedding model and vector backend are tightly coupled and shipped as a coordinated unit. Source: [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md).

## Vector Database Contract

All agents rely on a shared `BaseVectorDB` abstraction imported from `deepsearcher.vector_db.base`. The contract that the agents depend on is:

| Capability used by agents | Source location |
| --- | --- |
| `vector_db.search(...)` returning `List[RetrievalResult]` | `deepsearcher/agent/naive_rag.py:67-86`, `deepsearcher/agent/deep_search.py:170-200` |
| `vector_db.list_collections(dim=...)` returning collection metadata | `deepsearcher/agent/collection_router.py:55-62` |
| `vector_db.default_collection` used as a fallback target | `deepsearcher/agent/collection_router.py:88-94` |
| `embedding_model.dimension` passed as the `dim` argument for collection listing | `deepsearcher/agent/naive_rag.py:48`, `deepsearcher/agent/collection_router.py:42` |
| `deduplicate_results(...)` to merge results across iterations or collections | `deepsearcher/agent/deep_search.py:155`, `deepsearcher/agent/chain_of_rag.py:140` |

Because the agent layer only ever talks to `BaseVectorDB` and `BaseEmbedding`, the concrete data loader and vector database (Milvus, Milvus Lite, Qdrant, Azure Search, Oracle, etc., as enumerated in `deepsearcher/vector_db/__init__.py` in the broader project) can be swapped by changing the wired instance without modifying agent code. This is the design pattern that makes "configuration" effectively a matter of choosing the right concrete class and credential set. Source: [deepsearcher/agent/naive_rag.py:36-58](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py).
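
To make the result-merging step in the table concrete, here is a minimal sketch of deduplication across iterations or collections. It assumes that the chunk text is the identity key, which may not match the project's `deduplicate_results`; the `Result` record and `deduplicate_by_text` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """Hypothetical retrieval record; the real RetrievalResult may differ."""
    text: str
    score: float


def deduplicate_by_text(results: List[Result]) -> List[Result]:
    """Keep the first occurrence of each distinct chunk text, preserving order."""
    seen = set()
    unique = []
    for result in results:
        if result.text not in seen:
            seen.add(result.text)
            unique.append(result)
    return unique


merged = deduplicate_by_text(
    [Result("chunk A", 0.9), Result("chunk B", 0.8), Result("chunk A", 0.7)]
)
```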

## Collection Routing and Data Placement

The `CollectionRouter` in [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py) is the bridge between the agent layer and the physical vector database collections where loaded data lives. At construction time, it enumerates all available collections using `self.vector_db.list_collections(dim=dim)`, where `dim` is taken from `embedding_model.dimension` so that the router only sees collections whose schema matches the active embedding model. This is the de-facto configuration check for whether a piece of loaded data is even visible to the current pipeline. Source: [deepsearcher/agent/collection_router.py:42-58](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py).

At query time, the router calls the LLM with `COLLECTION_ROUTE_PROMPT`, asking it to pick a Python list of collection names from the candidate list. Two rules then extend that selection: any collection with an empty `description` is always added (the query itself is used as the search query), and the `default_collection` is always appended. The final list is deduplicated before being passed to the per-collection search loop. This explains why data loaders should populate the `description` field on collections: an empty description forces the collection to be searched for every query, which is the correct behavior for a default "catch-all" collection but a footgun for named, domain-specific collections. Source: [deepsearcher/agent/collection_router.py:78-100](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py).
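
The two extension rules and the final deduplication can be sketched as a pure function. This is an illustration of the described behavior rather than the router's actual code: `finalize_collection_selection` is a hypothetical name, the LLM's pick is passed in as a plain list, and preserving first-seen order is a choice made here for readability.

```python
from typing import Dict, List


def finalize_collection_selection(
    llm_selected: List[str],
    descriptions: Dict[str, str],
    default_collection: str,
) -> List[str]:
    """Apply the two extension rules, then deduplicate while keeping order."""
    selected = list(llm_selected)
    # Rule 1: a collection with an empty description is always searched.
    selected += [name for name, desc in descriptions.items() if not desc]
    # Rule 2: the default collection is always appended.
    selected.append(default_collection)
    # dict.fromkeys drops duplicates and preserves first-seen order.
    return list(dict.fromkeys(selected))


descriptions = {
    "finance_reports": "Quarterly financial filings",
    "scratch": "",  # no description, so it is searched for every query
    "default": "General knowledge base",
}
chosen = finalize_collection_selection(["finance_reports"], descriptions, "default")
```

In this example `scratch` is searched even though the LLM did not pick it, which illustrates why leaving `description` empty on a domain-specific collection widens every search.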

A known limitation surfaces in issue #267, "Collection routing ignores caller authorization context": because the router relies solely on the LLM to pick collections, it has no way to enforce per-caller access control. Any user with query access effectively gets to search every collection the router knows about.

## Agent-Level Configuration Knobs

The agent constructors expose the configuration levers that most directly affect vector DB behavior:

- `top_k` (NaiveRAG, default 10) — number of chunks fetched per collection per query. Source: [deepsearcher/agent/naive_rag.py:40-58](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py).
- `max_iter` (DeepSearch default 3, ChainOfRAG default 4) — caps the reflection/re-query loop that drives additional vector searches. Source: [deepsearcher/agent/deep_search.py:36-58](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py), [deepsearcher/agent/chain_of_rag.py:46-70](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py).
- `route_collection` (default True on all three agents) — toggles the `CollectionRouter`. When `False`, the agent searches `collection_router.all_collections` directly with zero routing tokens. Source: [deepsearcher/agent/naive_rag.py:70-78](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py).
- `text_window_splitter` (default True) — when enabled, the summarization step reads the `wider_text` metadata field produced by the splitter instead of the raw chunk text, which materially changes the context the LLM sees at answer time. Source: [deepsearcher/agent/naive_rag.py:96-110](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py).
- `early_stopping` (ChainOfRAG, default False) — uses a reflection prompt to break the loop as soon as the intermediate context is judged sufficient, reducing redundant vector searches. Source: [deepsearcher/agent/chain_of_rag.py:46-70](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py).

The `RAGRouter` in [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py) sits one level above these knobs: it uses the `__description__` attribute (registered via the `@describe_class` decorator in [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)) to pick which agent should handle a given query, which is the recommended way to expose the configuration trade-offs to end users without forcing them to choose an agent manually. Source: [deepsearcher/agent/base.py:8-26](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py).

## Data Flow and Common Failure Modes

```mermaid
flowchart LR
    A[Load documents] --> B[Embedding model]
    B --> C[Vector DB collections<br/>with dim from embedding_model]
    Q[User query] --> R[CollectionRouter<br/>uses LLM + dim]
    R --> S[Search selected collections]
    S --> T[deduplicate_results]
    T --> U[Agent summarization<br/>uses wider_text if available]
    U --> A2[Final answer]
```

Two failure modes recur in community reports and are visible from the code:

1. **Embedding/collection dimension mismatch.** The router passes `embedding_model.dimension` to `list_collections(dim=dim)`. If a previously loaded collection was built with a different embedding model (for example, switching from the GPTCache default mentioned in the latest release to Qwen3-Embedding as requested in issue #247), the dimension check will silently drop that collection from the candidate list, and the agent will return no results without an explicit error.
2. **Missing backend dependency on Windows.** Per issue #67, the `milvus_lite` extension required by the default local Milvus configuration is unavailable on Windows. The community-validated workaround is to upgrade `pymilvus` to a version that bundles the right native libraries, or to switch to a different `BaseVectorDB` implementation such as Qdrant or Azure Search.
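
The first failure mode is silent by nature, so a pre-flight check can help surface it. The sketch below is one possible guard, not part of DeepSearcher: `CollectionMeta` and `visible_collections` are hypothetical names, and the filter only mimics what a dimension-restricted listing would return.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CollectionMeta:
    """Hypothetical record; the real collection metadata type may differ."""
    name: str
    dim: int


def visible_collections(all_collections: List[CollectionMeta], dim: int) -> List[CollectionMeta]:
    """Mimic a dimension-filtered listing and fail loudly when nothing matches."""
    matching = [c for c in all_collections if c.dim == dim]
    if not matching and all_collections:
        hidden = ", ".join(f"{c.name} (dim={c.dim})" for c in all_collections)
        raise ValueError(
            f"No collection matches embedding dimension {dim}; existing collections: {hidden}"
        )
    return matching


stored = [CollectionMeta("docs", 768)]
usable = visible_collections(stored, 768)
```

Calling `visible_collections(stored, 1024)` raises a `ValueError` that names the hidden collection, turning an empty result set into an actionable message.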

## See Also

- [RAG Agent System & Retrieval Strategies](#page-6): overview of `BaseAgent` and `RAGAgent`
- [LLM Provider Configuration](#page-3): selecting and configuring the language model backend
- Evaluation guide: Recall@K methodology and `2WikiMultiHopQA` benchmarks described in [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)

---

<a id='page-6'></a>

## RAG Agent System & Retrieval Strategies

### Related Pages

Related topics: [Project Overview & System Architecture](#page-1), [Vector Database & Data Loader Configuration](#page-5)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)
- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/agent/__init__.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/__init__.py)
</details>

# RAG Agent System & Retrieval Strategies

The `deepsearcher.agent` package is the orchestration layer that turns a natural-language question into an answer grounded in the user's private vector store (and, optionally, the public web). It defines a small hierarchy of abstract base classes and ships three concrete strategies that share the same inputs (an LLM, an embedding model, and a vector database) but apply very different retrieval and reasoning loops. Source: [deepsearcher/agent/__init__.py:1-12]().

## Agent Class Hierarchy

At the root is `BaseAgent`, an abstract class with a single `invoke(query, **kwargs)` entry point. `RAGAgent` extends it and adds two structured methods: `retrieve()` returning `(results, tokens, metadata)`, and `query()` returning `(answer, results, tokens)`. All concrete RAG agents inherit from `RAGAgent`. Source: [deepsearcher/agent/base.py:36-90]().

A `describe_class()` decorator injects a `__description__` string on each agent class. This description is later read by `RAGRouter` to decide which agent should handle a query, so every shipped agent ships with a curated one-liner explaining its strength. Source: [deepsearcher/agent/base.py:18-34]().
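
The decorator mechanism can be sketched in a few lines. This is a re-implementation for illustration, assuming the decorator does nothing beyond attaching the string; the two agent classes, their descriptions, and the `[i]: description` prompt layout in `build_router_prompt` are hypothetical.

```python
from typing import Callable, List


def describe_class(description: str) -> Callable[[type], type]:
    """Return a class decorator that attaches a __description__ attribute."""
    def decorator(cls: type) -> type:
        cls.__description__ = description
        return cls
    return decorator


@describe_class("Best for simple factual lookups.")
class QuickAgent:
    pass


@describe_class("Best for long reports and surveys.")
class ReportAgent:
    pass


def build_router_prompt(query: str, agents: List[type]) -> str:
    """List each agent's description under a 1-based index for the LLM to choose from."""
    lines = [f"[{i}]: {agent.__description__}" for i, agent in enumerate(agents, start=1)]
    return f"Question: {query}\nAgents:\n" + "\n".join(lines)


prompt = build_router_prompt("Write a survey of RAG methods", [QuickAgent, ReportAgent])
```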

## The Three Retrieval Strategies

### NaiveRAG — Single-Pass Retrieval

`NaiveRAG` is the simplest implementation. It embeds the query, optionally uses `CollectionRouter` to narrow the search to relevant collections, pulls `top_k` chunks from `vector_db`, formats them inside `<chunk_i>...</chunk_i>` tags, and asks the LLM to write the final answer using `SUMMARY_PROMPT`. Source: [deepsearcher/agent/naive_rag.py:28-110]().

It supports two flags: `route_collection` (on by default, as in the other two agents) and `text_window_splitter`, which prefers the `wider_text` metadata field of each chunk so the LLM sees surrounding context instead of an isolated fragment. Source: [deepsearcher/agent/naive_rag.py:18-110]().
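
The chunk-formatting step, including the `wider_text` preference, might look roughly like the following. This is a sketch under assumptions: `Chunk` and `format_chunks` are hypothetical names, and the zero-based tag numbering is a guess at what `<chunk_i>` expands to.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieval result."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def format_chunks(chunks: List[Chunk], text_window_splitter: bool = True) -> str:
    """Wrap each chunk in numbered tags, preferring wider_text when enabled."""
    parts = []
    for i, chunk in enumerate(chunks):
        if text_window_splitter and "wider_text" in chunk.metadata:
            body = chunk.metadata["wider_text"]
        else:
            body = chunk.text
        parts.append(f"<chunk_{i}>\n{body}\n</chunk_{i}>")
    return "\n".join(parts)


chunks = [
    Chunk(
        "Milvus is a vector database.",
        {"wider_text": "Intro. Milvus is a vector database. It scales."},
    ),
    Chunk("No window here."),
]
prompt_context = format_chunks(chunks)
```

With the flag disabled, the first chunk falls back to its raw text, so the sentence "It scales." no longer reaches the LLM.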

### ChainOfRAG — Iterative Sub-Query Decomposition

`ChainOfRAG` targets concrete, factual, multi-hop questions. At each iteration it (1) prompts the LLM for a single follow-up sub-query, (2) retrieves chunks for that sub-query, (3) generates an intermediate answer using only the retrieved documents, (4) asks the LLM to pick the supporting docs via `GET_SUPPORTED_DOCS_PROMPT`, and (5) when `early_stopping` is enabled, uses `REFLECTION_PROMPT` to decide whether to stop or iterate again. Source: [deepsearcher/agent/chain_of_rag.py:30-180]().

Key configuration: `max_iter` (default 4), `early_stopping` (default `False`), `route_collection` (default `True`). When `early_stopping=True`, the loop terminates as soon as the reflection prompt returns "Yes", saving both tokens and latency. The final answer is composed from the union of deduplicated chunks plus all intermediate answer contexts. Source: [deepsearcher/agent/chain_of_rag.py:120-200]().
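
The control flow of the iteration can be sketched with the LLM and vector-DB calls injected as plain callables. This is a simplified illustration, not the agent's implementation: `chain_of_rag_loop` and its parameter names are hypothetical, and the supporting-document selection step is omitted for brevity.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (sub_query, intermediate_answer) pairs


def chain_of_rag_loop(
    question: str,
    next_sub_query: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
    is_sufficient: Callable[[str, History], bool],
    max_iter: int = 4,
    early_stopping: bool = False,
) -> Tuple[List[str], History]:
    """Run the follow-up / retrieve / answer / reflect cycle."""
    all_chunks: List[str] = []
    history: History = []
    for _ in range(max_iter):
        sub_query = next_sub_query(question, history)
        chunks = retrieve(sub_query)
        history.append((sub_query, answer(sub_query, chunks)))
        all_chunks.extend(chunks)
        # Reflection only short-circuits the loop when early stopping is on.
        if early_stopping and is_sufficient(question, history):
            break
    return list(dict.fromkeys(all_chunks)), history


corpus = {"q1": ["doc-a", "doc-b"], "q2": ["doc-b", "doc-c"], "q3": ["doc-d"]}
stubs = dict(
    next_sub_query=lambda q, h: f"q{len(h) + 1}",
    retrieve=lambda sq: corpus.get(sq, []),
    answer=lambda sq, docs: f"answer from {len(docs)} docs",
    is_sufficient=lambda q, h: len(h) >= 2,
)
chunks, history = chain_of_rag_loop("who directed the film?", early_stopping=True, **stubs)
```

With the stub reflection returning true after two rounds, the loop stops early; with `early_stopping=False` the same stubs run all four iterations.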

### DeepSearch — Reflective Multi-Iteration Search

`DeepSearch` is designed for "write me a report / survey / article" prompts. Its `async_retrieve()` first calls `_generate_sub_queries()` (up to 4 sub-questions, or the original query alone if already simple), then executes all vector searches in parallel with `asyncio.gather`. Source: [deepsearcher/agent/deep_search.py:130-180]().

Two LLM-driven filters shape the final corpus:

- `RERANK_PROMPT` — accepts or rejects each individual chunk against the active sub-queries.
- `REFLECT_PROMPT` — at the end of each iteration, asks the LLM whether further research is needed and, if so, returns up to 3 "gap queries" that feed the next iteration.

The loop runs for `max_iter` rounds (default 3), then a final `SUMMARY_PROMPT` consolidates everything. Note that `DeepSearch` always instantiates a `CollectionRouter` regardless of the `route_collection` flag in its constructor. Source: [deepsearcher/agent/deep_search.py:30-60](); [deepsearcher/agent/deep_search.py:90-130]().
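
The parallel fan-out over sub-queries can be illustrated with `asyncio.gather`. The sketch assumes each vector search is an awaitable; `search_one`, `search_all`, and the in-memory `index` are hypothetical stand-ins, and the short sleep only simulates I/O latency.

```python
import asyncio
from typing import Dict, List


async def search_one(sub_query: str, index: Dict[str, List[str]]) -> List[str]:
    """Stand-in for one vector search."""
    await asyncio.sleep(0.01)
    return index.get(sub_query, [])


async def search_all(sub_queries: List[str], index: Dict[str, List[str]]) -> List[str]:
    """Launch every sub-query search concurrently and merge the results."""
    # gather returns results in the same order as the awaitables passed in.
    per_query = await asyncio.gather(*(search_one(q, index) for q in sub_queries))
    merged = [chunk for chunks in per_query for chunk in chunks]
    return list(dict.fromkeys(merged))  # order-preserving deduplication


index = {"history of RAG": ["c1", "c2"], "RAG evaluation": ["c2", "c3"]}
results = asyncio.run(
    search_all(["history of RAG", "RAG evaluation", "unknown topic"], index)
)
```

Because the searches overlap in time, total latency is closer to the slowest single search than to the sum of all of them.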

## Query and Collection Routing

`RAGRouter` selects one agent out of a user-supplied list. It builds a numbered prompt of agent descriptions and parses the LLM's chosen index, with a defensive fallback that scans for the last digit if a reasoning model wraps the answer in prose. Source: [deepsearcher/agent/rag_router.py:15-75]().
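
The index parsing with its defensive fallback can be sketched as follows. This is an illustrative re-implementation, assuming the prompt numbers agents from 1; `parse_agent_index` is a hypothetical name, and `find_last_digit` here is written from the description above rather than copied from the project.

```python
def find_last_digit(text: str) -> int:
    """Return the last ASCII digit in text as an int."""
    for char in reversed(text):
        if char in "0123456789":
            return int(char)
    raise ValueError("no digit found in LLM response")


def parse_agent_index(response: str, num_agents: int) -> int:
    """Convert a 1-based index in the LLM response to a 0-based list index."""
    try:
        choice = int(response.strip())
    except ValueError:
        # Fallback for models that wrap the index in explanatory prose.
        choice = find_last_digit(response)
    if not 1 <= choice <= num_agents:
        raise ValueError(f"index {choice} is outside 1..{num_agents}")
    return choice - 1


clean = parse_agent_index("2", num_agents=3)
wrapped = parse_agent_index("I think agent 3 fits best.", num_agents=3)
```

A single-digit fallback like this only works while the agent list has fewer than ten entries, which is a reasonable limit for a router prompt.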

Inside each agent, `CollectionRouter` performs a second, finer-grained routing step: choosing which vector-DB collections to search. It is constructed from `(llm, vector_db, dim)`; when routing is disabled, the agent skips the LLM call and searches `collection_router.all_collections` instead. Source: [deepsearcher/agent/naive_rag.py:50-80](); [deepsearcher/agent/chain_of_rag.py:30-60]().

## Data Flow

```mermaid
flowchart TD
    Q[User Query] --> RR[RAGRouter]
    RR -->|picks agent| NA[NaiveRAG]
    RR -->|picks agent| CR[ChainOfRAG]
    RR -->|picks agent| DS[DeepSearch]
    NA --> CR2[CollectionRouter]
    CR --> CR2
    DS --> CR2
    CR2 --> VDB[(Vector DB)]
    VDB --> CR2
    CR2 -->|top_k chunks| NA
    CR2 -->|top_k chunks| CR
    CR2 -->|top_k chunks| DS
    NA --> LLM[LLM Summary]
    CR -->|loop: follow-up + reflect| LLM
    DS -->|loop: reflect + gap queries| LLM
    LLM --> A[Final Answer + Citations]
```

## Failure Modes and Community Notes

- **Routing & authorization.** Community issue #267 reports that `CollectionRouter` ignores the caller's authorization context, so any user-visible collection may be searched even when it should be filtered. Treat collection names as non-sensitive labels until ACLs are layered on top. Source: [deepsearcher/agent/collection_router.py](); community context [#267](https://github.com/zilliztech/deep-searcher/issues/267).
- **Reasoning-model prompts.** The agents call `llm.remove_think()` on LLM responses before parsing integers and indices, because reasoning models (o-series, DeepSeek R1, Claude 3.7 Sonnet) can wrap answers in `<think>...` tags. The `RAGRouter` falls back to `find_last_digit` when this still fails. Source: [deepsearcher/agent/rag_router.py:55-70](); [deepsearcher/agent/deep_search.py:170-200]().
- **Web vs. private data.** DeepSearcher is intentionally a private-data-first RAG; web search is a complementary add-on. The current `DeepSearch` code reserves a `search_res_from_internet = []` slot for a future web backend, which is the entry point relevant to feature request #270 for adding serpbase.dev. Source: [deepsearcher/agent/deep_search.py:150-160](); community context [#270](https://github.com/zilliztech/deep-searcher/issues/270).
- **LLM quality matters.** Small LLMs struggle with `literal_eval`, list-format responses, and the YES/NO reranker, producing hallucinations. The maintainers explicitly recommend reasoning-grade models.
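
The response-cleaning and list-parsing steps mentioned in these notes can be sketched with the standard library. This is an assumption-laden illustration: `strip_think` and `parse_string_list` are hypothetical helpers, and the regular expression only approximates what `llm.remove_think()` might do.

```python
import ast
import re
from typing import List

THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(response: str) -> str:
    """Drop every <think>...</think> block and surrounding whitespace."""
    return THINK_BLOCK.sub("", response).strip()


def parse_string_list(response: str) -> List[str]:
    """Parse a Python-style list of strings from a cleaned LLM response."""
    value = ast.literal_eval(strip_think(response))
    if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
        raise ValueError("expected a list of strings")
    return value


raw = "<think>The user asks about two topics.</think>\n['topic one', 'topic two']"
sub_queries = parse_string_list(raw)
```

`ast.literal_eval` raises `ValueError` or `SyntaxError` on malformed output, so callers working with smaller models may want to catch both and retry.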

## See Also

- Vector DB backends and `RetrievalResult` schema
- Embedding model providers (e.g., Qwen3-Embedding, GPTCache)
- Configuration module: `deepsearcher.configuration`
- Online query API: `deepsearcher.online_query`
- Evaluation pipeline: [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)

---

<a id='page-7'></a>

## Deployment, CLI & FastAPI Service

### Related Pages

Related topics: [Installation & Quickstart](#page-2), [RAG Agent System & Retrieval Strategies](#page-6)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py)
- [main.py](https://github.com/zilliztech/deep-searcher/blob/main/main.py)
- [Dockerfile](https://github.com/zilliztech/deep-searcher/blob/main/Dockerfile)
- [Makefile](https://github.com/zilliztech/deep-searcher/blob/main/Makefile)
- [evaluation/evaluate.py](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/evaluate.py)
- [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)
- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
</details>

# Deployment, CLI & FastAPI Service

DeepSearcher is shipped as a Python package that exposes a command-line entry point, a programmatic `main.py` driver, a reproducible `Makefile` workflow, and an opt-in container image via the project `Dockerfile`. This page documents the deployment surface area, how the pieces fit together, and the failure modes reported by the community.

## 1. Entry Points and High-Level Topology

The repository exposes the system through three complementary surfaces: a generated console script, a Python module entry, and an evaluation harness.

- The console script `deepsearcher` is generated by the package's `[project.scripts]` (or `console_scripts`) metadata. When installed in a Python environment it becomes an executable that delegates to `deepsearcher.cli:main`. This is visible in community stack traces where the invocation resolves to `C:\...\Scripts\deepsearcher.exe\__main__.py` and immediately runs `from deepsearcher.cli import main` (community trace from issue [#255](https://github.com/zilliztech/deep-searcher/issues/255)).
- `main.py` at the repository root is the canonical "load → query" driver used in the Quickstart guide. It imports `Configuration` and `init_config` from `deepsearcher.configuration`, then drives `deepsearcher.online_query.query` against the configured provider stack.
- `evaluation/evaluate.py` is a separate entry point dedicated to batch benchmarking; it does not share runtime state with the CLI.

```mermaid
flowchart LR
    A[User Shell] -->|deepsearcher| B[deepsearcher/cli.py]
    A -->|python main.py| C[main.py driver]
    A -->|python evaluate.py| D[evaluation/evaluate.py]
    B --> E[Configuration + init_config]
    C --> E
    D --> E
    E --> F[Vector DB + LLM + Embedding]
    F --> G[Agents: NaiveRAG / ChainOfRAG / DeepSearch]
    G --> H[Answer + RetrievalResults]
```

Source: [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py), [main.py](https://github.com/zilliztech/deep-searcher/blob/main/main.py), [evaluation/evaluate.py](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/evaluate.py)

## 2. CLI Surface (`deepsearcher.cli`)

The CLI module is the most fragile of the entry points because it is what end users run after `pip install`. The traceback reproduced in issue [#255](https://github.com/zilliztech/deep-searcher/issues/255) confirms three facts that operators must understand:

1. The console-script wrapper lives under the active Python environment's `Scripts/` directory and re-exports `deepsearcher.cli.main`.
2. Importing `deepsearcher.cli` eagerly pulls in the full agent, vector-DB, and LLM dependency tree. A failure anywhere in that tree surfaces as a `ModuleNotFoundError` from the CLI before any user code runs.
3. On Windows with Python 3.13, the `deepsearcher.exe` shim and the underlying imports must both succeed; community reports show partial Windows breakage driven by `milvus_lite` (issue [#67](https://github.com/zilliztech/deep-searcher/issues/67) — `milvus_lite` officially supports Ubuntu ≥ 20.04 and macOS ≥ 11.0).

Operators deploying the CLI on Windows should therefore pin to a Python version supported by `pymilvus`/`milvus_lite`, or run the CLI inside WSL/Linux where the native library resolves correctly.

Source: [deepsearcher/cli.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/cli.py), community reports in issues [#255](https://github.com/zilliztech/deep-searcher/issues/255) and [#67](https://github.com/zilliztech/deep-searcher/issues/67).

## 3. Programmatic Driver and FastAPI-Style Service

`main.py` is the reference deployment pattern: instantiate `Configuration`, mutate it for the target LLM / embedding / vector-DB providers, then call `init_config(config)` before issuing `query(...)` calls. The driver composes the agents defined in `deepsearcher/agent/`:

- `NaiveRAG` — single-shot vector retrieval + summarization (see `SUMMARY_PROMPT` flow in `deepsearcher/agent/naive_rag.py`).
- `ChainOfRAG` — iterative sub-query decomposition with reflection (`REFLECTION_PROMPT`, `GET_SUPPORTED_DOCS_PROMPT`) and optional `early_stopping`, see `deepsearcher/agent/chain_of_rag.py`.
- `DeepSearch` — multi-aspect sub-query generation + gap-driven reflection (`REFLECT_PROMPT`), see `deepsearcher/agent/deep_search.py`.

These three classes share the `RAGAgent` contract defined in `deepsearcher/agent/base.py` (`retrieve(...)` returns `(List[RetrievalResult], int, dict)`, `query(...)` returns `(str, List[RetrievalResult], int)`). Any HTTP layer wrapping DeepSearcher (FastAPI, Starlette, etc.) only needs to forward a string query into `query(...)` and serialize the returned tuple; no special protocol is required because the return shape is stable across agents.

Source: [main.py](https://github.com/zilliztech/deep-searcher/blob/main/main.py), [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py), [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py), [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py).
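
The serialization half of such an HTTP layer can be sketched without any web framework. The example assumes the `(answer, results, tokens)` tuple shape described above; `ResultRecord`, its fields, `to_response_body`, and the JSON key names are hypothetical and would need to be mapped from the real `RetrievalResult`.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ResultRecord:
    """Hypothetical, JSON-friendly view of one retrieval result."""
    text: str
    reference: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_response_body(query_output: Tuple[str, List[ResultRecord], int]) -> str:
    """Serialize the (answer, results, tokens) tuple into a JSON string."""
    answer, results, tokens = query_output
    payload = {
        "answer": answer,
        "results": [asdict(record) for record in results],
        "consumed_tokens": tokens,
    }
    return json.dumps(payload, ensure_ascii=False)


body = to_response_body(
    ("Milvus is a vector database.", [ResultRecord("Milvus ...", "docs/milvus.md")], 128)
)
```

A FastAPI or Starlette handler would return this payload from a route that forwards the request's query string to the agent.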

## 4. Containerization, Make Targets, and Evaluation Harness

The repository ships a `Dockerfile` and `Makefile` to make local and CI deployments reproducible. Issue [#78](https://github.com/zilliztech/deep-searcher/issues/78) ("Build up an OFFICIAL Docker image please") is the most up-voted deployment request, indicating that the in-repo `Dockerfile` is currently the supported path rather than a published registry image.

The `Makefile` typically wraps the most common operator tasks (install, lint, test, run-evaluate). Combined with `evaluation/evaluate.py`, the supported evaluation flow documented in [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md) is:

```shell
python evaluate.py \
  --dataset 2wikimultihopqa \
  --config_yaml ./eval_config.yaml \
  --pre_num 5 \
  --output_dir ./eval_output
```

Key flags and behaviors per the README:

| Flag | Purpose |
| --- | --- |
| `--dataset` | Selects the QA dataset (currently `2wikimultihopqa`). |
| `--config_yaml` | Path to a YAML file specifying LLM, embedding, and provider parameters. |
| `--pre_num` | Number of samples to evaluate; a larger value gives a more reliable estimate but consumes more tokens. |
| `--skip_load` | Reuse a previously loaded vector DB instead of re-ingesting. |
| `--output_dir` | Destination for recall plots (e.g. `plot_results/max_iter_vs_recall.png`). |

The evaluation uses **Recall@K**: the percentage of relevant documents appearing in the top-K retrieved results. The README notes diminishing returns as `max_iter` increases — most models gain steeply between 2–4 iterations and plateau afterwards, with Claude-3-7-sonnet approaching near-perfect recall at 7 iterations on the 50-sample preview.
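
To make the metric concrete, here is a minimal sketch of Recall@K for a single sample and its mean over several samples. The function names and document IDs are illustrative assumptions; `evaluation/evaluate.py` may compute the metric differently.

```python
from typing import Iterable, List, Sequence, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Iterable[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0  # nothing to recall, so define the score as 0
    top_k = set(retrieved[:k])
    return len(top_k & relevant_set) / len(relevant_set)


def mean_recall_at_k(samples: List[Tuple[Sequence[str], Iterable[str]]], k: int) -> float:
    """Average Recall@K over (retrieved, relevant) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in samples]
    return sum(scores) / len(scores) if scores else 0.0


# Top-3 contains d1 and d3, which is two of the three relevant documents.
score = recall_at_k(["d1", "d7", "d3", "d9"], {"d1", "d3", "d5"}, k=3)
print(round(score, 3))
```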

Source: [Dockerfile](https://github.com/zilliztech/deep-searcher/blob/main/Dockerfile), [Makefile](https://github.com/zilliztech/deep-searcher/blob/main/Makefile), [evaluation/evaluate.py](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/evaluate.py), [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md).

## 5. Common Deployment Failure Modes

The community context surfaces four recurring deployment problems that operators should pre-empt:

1. **Windows + Python 3.13 CLI breakage** (issues [#255](https://github.com/zilliztech/deep-searcher/issues/255) and [#67](https://github.com/zilliztech/deep-searcher/issues/67)). The console-script shim crashes before any user logic executes. Issue #67 traces the failure to `milvus_lite` lacking Windows wheels; a partial or broken install can produce a similar launcher traceback. Mitigation: run on Linux/macOS or via the supplied `Dockerfile`.
2. **Quickstart syntax bug** — issue [#80](https://github.com/zilliztech/deep-searcher/issues/80) reports an unbalanced parenthesis in the published `from deepsearcher.configuration import Configuration, init_config` snippet. Always copy the snippet directly from the repo rather than cached docs.
3. **No official Docker image** — issue [#78](https://github.com/zilliztech/deep-searcher/issues/78). Until an official image is published, build locally from the `Dockerfile`.
4. **Local LLM/embedding deployment** — issue [#247](https://github.com/zilliztech/deep-searcher/issues/247) shows that users wanting Qwen3-Embedding or Qwen3 LLM locally must use `vllm>=0.8.5` rather than Ollama for acceptable throughput, and must wire the provider through `Configuration` + `init_config` in `main.py`.

Source: community issues [#255](https://github.com/zilliztech/deep-searcher/issues/255), [#67](https://github.com/zilliztech/deep-searcher/issues/67), [#80](https://github.com/zilliztech/deep-searcher/issues/80), [#78](https://github.com/zilliztech/deep-searcher/issues/78), [#247](https://github.com/zilliztech/deep-searcher/issues/247).

## See Also

- Agents & RAG Pipelines (NaiveRAG / ChainOfRAG / DeepSearch / RAGRouter)
- Configuration & Provider Wiring (LLM, Embedding, Vector DB)
- Vector Database Connectors (Milvus / Milvus-Lite)
- Evaluation Harness & Recall@K Benchmarking

---

<a id='page-8'></a>

## Extensibility, Troubleshooting & FAQ

### Related Pages

Related topics: [LLM Provider Configuration](#page-3), [Embedding Model Configuration](#page-4), [Vector Database & Data Loader Configuration](#page-5)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py)
- [deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)
- [deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)
- [deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)
- [deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)
- [deepsearcher/agent/collection_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/collection_router.py)
- [deepsearcher/utils/log.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/utils/log.py)
- [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md)
</details>

# Extensibility, Troubleshooting & FAQ

## Overview

DeepSearcher is designed with extensibility as a first-class concern. Every core capability — large language models, embeddings, vector databases, retrieval agents, and web search backends — is abstracted behind a base class with a minimal interface. This lets contributors plug in new providers without forking the framework. At the same time, a predictable extension surface means troubleshooting stays tractable: most failures trace back to a misconfigured provider, a missing native dependency, or an LLM that is too weak for the prompt-following the framework expects.

This page consolidates how to extend the system, how to diagnose the most common runtime errors reported by the community, and answers to frequently asked questions drawn from the issue tracker.

## Extension Points

### Adding a New Agent

All retrieval agents inherit from `RAGAgent`, which in turn extends `BaseAgent` defined in [deepsearcher/agent/base.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/base.py). The contract is intentionally small: implement `retrieve(query, **kwargs) -> (List[RetrievalResult], int, dict)` and `query(query, **kwargs) -> (str, List[RetrievalResult], int)`.

To make an agent discoverable by the query router, decorate the class with `@describe_class("…")`. The decorator stores the description on `cls.__description__`, which `RAGRouter` reads at construction time when no explicit `agent_descriptions` list is supplied ([deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)).
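
The contract and decorator described above can be sketched as follows. Everything here is a self-contained stand-in: `RetrievalResult`, `RAGAgent`, and `describe_class` are simplified local re-creations, and `KeywordAgent` is a hypothetical example class, not part of the project. A real agent would import the originals from the `deepsearcher` package instead.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RetrievalResult:
    """Stand-in result type (fields assumed for illustration)."""
    text: str
    reference: str


def describe_class(description: str):
    """Stand-in decorator: store a description on the class for router discovery."""
    def decorator(cls):
        cls.__description__ = description
        return cls
    return decorator


class RAGAgent(ABC):
    """Stand-in for the two-method agent contract."""

    @abstractmethod
    def retrieve(self, query: str, **kwargs) -> Tuple[List[RetrievalResult], int, dict]: ...

    @abstractmethod
    def query(self, query: str, **kwargs) -> Tuple[str, List[RetrievalResult], int]: ...


@describe_class("Answers questions by keyword-matching an in-memory corpus.")
class KeywordAgent(RAGAgent):
    """Hypothetical agent that does no LLM calls at all."""

    def __init__(self, corpus: List[RetrievalResult]):
        self.corpus = corpus

    def retrieve(self, query: str, **kwargs):
        words = query.lower().split()
        hits = [doc for doc in self.corpus if any(w in doc.text.lower() for w in words)]
        return hits, 0, {}  # zero tokens consumed, no extra metadata

    def query(self, query: str, **kwargs):
        hits, tokens, _ = self.retrieve(query, **kwargs)
        answer = hits[0].text if hits else "No matching document."
        return answer, hits, tokens


agent = KeywordAgent([
    RetrievalResult("Milvus is a vector database.", "a.md"),
    RetrievalResult("FastAPI is a web framework.", "b.md"),
])
answer, hits, tokens = agent.query("milvus")
print(answer, KeywordAgent.__description__)
```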

Reference implementations to study:

- **NaiveRAG** — single-pass retrieve + summarize ([deepsearcher/agent/naive_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/naive_rag.py)).
- **ChainOfRAG** — iterative follow-up queries with early stopping, suitable for multi-hop factual questions ([deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)).
- **DeepSearch** — sub-query decomposition, LLM-based reranking, and reflection to fill gaps ([deepsearcher/agent/deep_search.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/deep_search.py)).

### Adding Providers (LLM, Embedding, Vector DB, Web Search)

The same pattern repeats across modules: subclass the `Base…` abstract class, implement the required methods, then register the provider in the configuration layer so `Configuration()` can resolve it by name. Each provider module follows this shape (constructor takes a config object, methods return typed dataclasses such as `RetrievalResult`).
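
One way to picture "resolve a provider by name" is the registry sketch below. It is only an illustration of the general pattern: `BaseLLM`, `register`, `PROVIDERS`, `EchoLLM`, and `create_provider` are all hypothetical names with simplified signatures, and the project's configuration layer may resolve providers through a different mechanism.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class BaseLLM(ABC):
    """Hypothetical stand-in for a provider base class (signature simplified)."""

    @abstractmethod
    def chat(self, prompt: str) -> str: ...


# Hypothetical registry mapping a provider name to its class.
PROVIDERS: Dict[str, Type[BaseLLM]] = {}


def register(cls: Type[BaseLLM]) -> Type[BaseLLM]:
    """Record the class under its own name so it can be looked up later."""
    PROVIDERS[cls.__name__] = cls
    return cls


@register
class EchoLLM(BaseLLM):
    """Toy provider that echoes the prompt with an optional prefix."""

    def __init__(self, prefix: str = ""):
        self.prefix = prefix

    def chat(self, prompt: str) -> str:
        return self.prefix + prompt


def create_provider(settings: dict) -> BaseLLM:
    """Instantiate the provider named in settings, passing its config as kwargs."""
    cls = PROVIDERS[settings["provider"]]
    return cls(**settings.get("config", {}))


llm = create_provider({"provider": "EchoLLM", "config": {"prefix": "> "}})
print(llm.chat("hello"))
```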

Community discussions around new providers frequently reference this pattern, including requests to add additional web search backends such as `serpbase.dev` ([issue #270](https://github.com/zilliztech/deep-searcher/issues/270)) and local embedding/LLM stacks like `Qwen3-Embedding` served via `vllm` or an OpenAI-compatible endpoint ([issue #247](https://github.com/zilliztech/deep-searcher/issues/247)). Both are accommodated by the existing base classes without changes to core logic.

## Common Errors and Troubleshooting

The error categories below are the ones most frequently reported by users in the issue tracker.

### 1. `ModuleNotFoundError: No module named 'milvus_lite'`

`milvus_lite` is the default embedded vector database backend. Its prebuilt wheels only ship for **Ubuntu ≥ 20.04** and **macOS ≥ 11.0**, which is why Windows users hit this error ([issue #67](https://github.com/zilliztech/deep-searcher/issues/67)). Remedies:

- Upgrade to a recent `pymilvus` version that bundles a compatible wheel.
- Switch to an alternative vector DB backend that runs on Windows (e.g., a remote Milvus/Zilliz Cloud instance) by changing the `vector_db` block of your configuration.
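
Before changing configuration, it can help to confirm whether the module is importable at all. The check below is a small diagnostic sketch using only the standard library; the helper name and the wording of the messages are assumptions.

```python
import importlib.util
import platform


def milvus_lite_status() -> str:
    """Report whether milvus_lite can be imported in the current environment."""
    if importlib.util.find_spec("milvus_lite") is not None:
        return "available"
    if platform.system() == "Windows":
        # No prebuilt wheel is expected on this platform.
        return "missing: no Windows wheel, consider a remote vector DB backend"
    return "missing: try reinstalling or upgrading pymilvus"


status = milvus_lite_status()
print(status)
```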

### 2. Quickstart `SyntaxError` from an Unbalanced Parenthesis

An older quickstart snippet shipped with a missing closing parenthesis, producing an immediate `SyntaxError` when the script is parsed ([issue #80](https://github.com/zilliztech/deep-searcher/issues/80)). Always copy from the latest README or docs site; the snippet should now read:

```python
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.online_query import query

config = Configuration()
init_config(config=config)
```

### 3. `deepsearcher.exe` Traceback on Windows

A launcher traceback at process start (e.g., `from deepsearcher.cli import main` failing inside `Scripts/deepsearcher.exe/__main__.py`) usually means a partial or broken install ([issue #255](https://github.com/zilliztech/deep-searcher/issues/255)). Recommended fix:

```shell
pip uninstall deepsearcher
pip install --upgrade deepsearcher
```

If the launcher still fails, run the library directly with `python -m deepsearcher.cli` to surface the real error.

### 4. Collection Routing Ignoring Authorization

`CollectionRouter` selects collections based on the query alone; it does not receive the caller's authorization context ([issue #267](https://github.com/zilliztech/deep-searcher/issues/267)). In multi-tenant deployments you must pre-filter collections in your application layer before constructing the agent, or extend `CollectionRouter` to accept an auth context.
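
A minimal sketch of application-layer pre-filtering is shown below. The tenant-to-collection mapping and the `authorized_collections` helper are hypothetical; how the filtered list is then handed to the agent or router depends on your integration and is not shown.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical access-control table: tenant ID -> collections it may read.
TENANT_COLLECTIONS: Dict[str, Set[str]] = {
    "tenant_a": {"a_docs", "shared_faq"},
    "tenant_b": {"b_docs", "shared_faq"},
}


def authorized_collections(tenant_id: str, candidates: Iterable[str]) -> List[str]:
    """Keep only the candidate collections this tenant is allowed to search."""
    allowed = TENANT_COLLECTIONS.get(tenant_id, set())
    return [name for name in candidates if name in allowed]


# Unknown tenants get an empty list, so the default is to deny access.
visible = authorized_collections("tenant_a", ["a_docs", "b_docs", "shared_faq"])
print(visible)
```

Applying the filter before any query-based selection means an unauthorized collection never becomes a candidate, whatever the router would have chosen.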

### 5. Weak LLM Producing Hallucinated Routing

`RAGRouter` parses the agent index from the LLM output and falls back to "last digit" parsing when a reasoning model emits prose ([deepsearcher/agent/rag_router.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/rag_router.py)). Smaller or non-reasoning LLMs frequently fail this step. The maintainers' guidance, mirrored in the issue template, is to use a **frontier or reasoning model** (OpenAI o-series, DeepSeek R1, Claude 3.7 Sonnet, etc.) for both routing and generation.
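
The parse-then-fall-back behavior can be sketched roughly as follows. This is an illustrative re-creation, not the code in `rag_router.py`: the function name is hypothetical, and the 1-based numbering of agents in the LLM output is an assumption.

```python
def parse_agent_index(llm_output: str, num_agents: int) -> int:
    """Return a 0-based agent index parsed from a 1-based number in the LLM output."""
    text = llm_output.strip()
    try:
        choice = int(text)  # ideal case: the model answered with a bare number
    except ValueError:
        # Fallback for models that wrap the number in prose: take the last ASCII digit.
        digits = [ch for ch in text if ch in "0123456789"]
        if not digits:
            raise ValueError(f"No agent number found in: {llm_output!r}")
        choice = int(digits[-1])
    if not 1 <= choice <= num_agents:
        raise ValueError(f"Agent number {choice} is outside 1..{num_agents}")
    return choice - 1


print(parse_agent_index("2", 3))
print(parse_agent_index("After weighing the options, the best fit is agent 3.", 3))
```

Note that a last-digit fallback only handles single-digit choices and will pick up any stray digit at the end of the text, which is one reason weaker models route unreliably.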

## FAQ

**Q: Which agent should I use?**  
NaiveRAG is the cheapest and works well for single-fact lookups. DeepSearch is the most thorough and is the default for general topic/report-style questions. ChainOfRAG strikes a middle ground for multi-hop factual queries that benefit from iterative refinement but do not need full sub-query decomposition ([deepsearcher/agent/chain_of_rag.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/agent/chain_of_rag.py)).

**Q: Can I run DeepSearcher fully offline?**  
Yes, provided the LLM and embedding model are exposed via an OpenAI-compatible endpoint (e.g., `vllm` serving Qwen3-Embedding — [issue #247](https://github.com/zilliztech/deep-searcher/issues/247)). Point the LLM and embedding provider configurations at that endpoint.

**Q: How is the evaluation suite run?**  
The evaluation harness documented in [evaluation/README.md](https://github.com/zilliztech/deep-searcher/blob/main/evaluation/README.md) supports the 2WikiMultiHopQA dataset out of the box and reports Recall@K for DeepSearcher compared with a naive RAG baseline:

```shell
python evaluate.py \
  --dataset 2wikimultihopqa \
  --config_yaml ./eval_config.yaml \
  --pre_num 5 \
  --output_dir ./eval_output
```

Re-running after the first load can be accelerated with `--skip_load`.

**Q: Where do logs go?**  
The logger in [deepsearcher/utils/log.py](https://github.com/zilliztech/deep-searcher/blob/main/deepsearcher/utils/log.py) writes colored progress output via `color_print` and dev-level diagnostics via `dev_logger`. `critical()` raises `RuntimeError`, so it should only be used for fatal paths.

## See Also

- Agent overview: `naive_rag`, `chain_of_rag`, `deep_search`, `rag_router`
- Configuration and provider registration
- Evaluation harness (`evaluation/README.md`)

---


## Pitfall Log

Project: zilliztech/deep-searcher

Summary: Found 12 structured pitfall items, including 1 high-severity (blocking) item. Top priority: Configuration risk (requires verification).

## 1. Configuration risk - Configuration risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/255

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/270

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/67

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/zilliztech/deep-searcher

## 5. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/zilliztech/deep-searcher

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/zilliztech/deep-searcher

## 8. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/zilliztech/deep-searcher

## 9. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/254

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/zilliztech/deep-searcher/issues/267

## 11. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

## 12. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/zilliztech/deep-searcher

<!-- canonical_name: zilliztech/deep-searcher; human_manual_source: deepwiki_human_wiki -->