# https://github.com/SciPhi-AI/R2R Project Manual

Generated at: 2026-06-23 08:27:15 UTC

## Table of Contents

- [Introduction, Installation & SDK Usage](#page-1)
- [REST API, Services & Provider Architecture](#page-2)
- [Ingestion Modes, Parsers, Search & Agentic RAG](#page-3)
- [Deployment, Configuration, Extensibility & Troubleshooting](#page-4)

<a id='page-1'></a>

## Introduction, Installation & SDK Usage

### Related Pages

Related topics: [REST API, Services & Provider Architecture](#page-2), [Deployment, Configuration, Extensibility & Troubleshooting](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [py/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/sdk/README.md)
- [js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
- [js/sdk/package.json](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/package.json)
- [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)
- [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)
- [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)
- [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)
</details>

# Introduction, Installation & SDK Usage

## Overview

R2R (Reason to Retrieve) is an open-source retrieval-augmented generation (RAG) framework that exposes a RESTful API for multimodal content ingestion, hybrid search, knowledge-graph construction, and document management. Beyond standard RAG, R2R ships a **Deep Research / Agent API** that orchestrates multi-step reasoning across a vector store, a knowledge graph, and (optionally) the open web ([py/README.md:3-7](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)).

The project ships in two layers:

- **The server (`py/` package + Docker compose)** – hosts the FastAPI-based R2R service on `http://localhost:7272`.
- **Language SDKs** – a Python client (`r2r` on PyPI) and a JavaScript/TypeScript client (`r2r-js` on npm) that talk to that server over HTTP ([py/sdk/README.md:5-9](https://github.com/SciPhi-AI/R2R/blob/main/py/sdk/README.md), [js/sdk/README.md:36-37](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)).

```mermaid
flowchart LR
    A[Python/JS SDK Client] -->|HTTPS / JSON| B[R2R FastAPI Server :7272]
    B --> C[Documents Router]
    B --> D[Retrieval Router]
    B --> E[Graphs Router]
    D -->|RAG| F[LLM Provider]
    D -->|Vector + KG| G[(Postgres + pgvector)]
    C --> G
    E --> G
```

## Installation

### Light Mode (Single Python Process)

The fastest path runs R2R as a single Python process using SQLite and an in-memory vector store:

```bash
pip install r2r
export OPENAI_API_KEY=sk-...
python -m r2r.serve
```

Source: [py/README.md:18-23](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).

This mode is appropriate for local exploration but does **not** include Postgres, the knowledge-graph extraction pipeline, or persistent multi-user state.

### Full Mode (Docker Compose)

For production-grade workloads — multi-user auth, knowledge graphs, hybrid search, and persistent storage — the project recommends Docker Compose with the `full` configuration profile:

```bash
git clone git@github.com:SciPhi-AI/R2R.git && cd R2R
export R2R_CONFIG_NAME=full OPENAI_API_KEY=sk-...
docker compose -f compose.full.yaml --profile postgres up -d
```

Source: [py/README.md:25-28](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).

> **Community Note (Issue #2085):** Several users reported that "Chat or Search doesn't seem to work in self-hosted Docker mode" after uploading documents. When debugging Docker deployments, confirm that the `full` profile was selected (so Postgres is provisioned) and that the server's logs show successful connection to the vector store before assuming the SDK is misconfigured.

### Common Environment Variables

| Variable | Purpose | Notes |
|---|---|---|
| `OPENAI_API_KEY` | Auth for OpenAI models | Required for the default LLM/embedding providers ([py/README.md:20](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)). |
| `OPENAI_API_BASE` | Override the OpenAI base URL | Not honored in all code paths — see Issue #2020 below. |
| `R2R_CONFIG_NAME` | Selects server configuration profile (`light` vs `full`) | Used in the Docker quickstart ([py/README.md:26](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)). |

> **Community Note (Issue #2020):** A user attempted to point R2R at a self-hosted OpenAI-compatible endpoint via `OPENAI_API_BASE`, but ingestion failed with an async task error. The variable name expected by some OpenAI-compatible clients differs from R2R's internal configuration; you may need to set the base URL through the R2R config file instead of `OPENAI_API_BASE`.

## Python SDK Usage

Install the Python client with `pip install r2r`, then initialize it against a running server ([py/sdk/README.md:7-13](https://github.com/SciPhi-AI/R2R/blob/main/py/sdk/README.md)):

```python
from r2r import R2RClient

client = R2RClient("http://localhost:7272")
health = client.health()  # {"status": "ok"}

# Optional authentication
client.register("me@email.com", "my_password")
client.login("me@email.com", "my_password")

# Ingest a document
client.documents.create(file_path="/path/to/file.pdf")

# List ingested documents
client.documents.list()
```

Source: [py/sdk/README.md:15-34](https://github.com/SciPhi-AI/R2R/blob/main/py/sdk/README.md).

Once documents are ingested, the retrieval surface area is exposed through `client.retrieval`:

```python
# Semantic / hybrid search over chunks
results = client.retrieval.search(query="What is DeepSeek R1?")

# Single-turn RAG with citations
response = client.retrieval.rag(query="What is DeepSeek R1?")

# Multi-turn Deep Research agent (v3.6.5+ supports extended thinking)
response = client.retrieval.agent(
    message={"role": "user", "content": "What does DeepSeek R1 imply?"},
    rag_generation_config={
        "model": "anthropic/claude-3-7-sonnet-20250219",
        "extended_thinking": True,
        "thinking_budget": 4096,
        "temperature": 1,
        "max_tokens_to_sample": 16000,
    },
)
```

Source: [py/README.md:9-27](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).

The agent endpoint streams structured events back to the caller (`ThinkingEvent`, `MessageEvent`, `CitationEvent`, `FinalAnswerEvent`, `ToolCallEvent`, `ToolResultEvent`), all of which are exported as Pydantic models from the shared package ([py/shared/api/models/__init__.py:7-15](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)). The server-side handler at `retrieval_router.py` exposes both the `/retrieval/rag` and `/retrieval/agent` endpoints and returns a `StreamingResponse` when `stream=True` ([py/core/main/api/v3/retrieval_router.py:170-184](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).

## JavaScript / TypeScript SDK Usage

The JS SDK is published as `r2r-js` on npm and targets the same REST API ([js/sdk/README.md:36-37](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)):

```bash
npm install r2r-js
```

```javascript
const { r2rClient } = require("r2r-js");

const client = new r2rClient("http://localhost:7272");

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  await client.login("admin@example.com", "change_me_immediately");

  await client.ingestFiles(
    [
      { path: "examples/data/raskolnikov.txt", name: "raskolnikov.txt" },
      { path: "examples/data/karamozov.txt", name: "karamozov.txt" },
    ],
    { metadatas: [{ title: "raskolnikov" }, { title: "karamozov" }] },
  );
}

main().catch(console.error);
```

Source: [js/sdk/README.md:39-56](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md).

For agentic retrieval, the JS SDK's `retrieval.agent(...)` mirrors the Python call shape, using camelCase option names (`message`, `ragGenerationConfig`, `researchGenerationConfig`, `searchMode`, `searchSettings`, `taskPrompt`, `conversationId`, `ragTools`, `researchTools`, `mode`) where the Python client uses snake_case ([js/sdk/src/v3/clients/retrieval.ts:62-89](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)). The agent supports two operating modes, `rag` (default) and `research`; the latter adds a reasoning model, a critique tool, and a Python executor on top of the RAG toolset ([py/core/main/api/v3/retrieval_router.py:120-138](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).

## Common Pitfalls and Configuration Tips

- **HTML ingestion (Issue #2182):** R2R's standard ingestion paths accept files via `client.documents.create(file_path=...)` ([py/sdk/README.md:31-34](https://github.com/SciPhi-AI/R2R/blob/main/py/sdk/README.md)). As of v3.6.5 there is no first-class "URL → ingest" endpoint; users wanting HTML ingestion must scrape the page themselves and feed the saved file through `documents.create`. The `retrieval.agent(...)` endpoint does include a `web_scrape` tool that can be invoked during a research session, but it does not persist the scraped page as an ingested document ([py/core/main/api/v3/retrieval_router.py:128-138](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).

- **Ingestion without knowledge-graph extraction (Issue #2243):** If you want pure chunking + embedding without entity/relationship extraction, configure your ingestion pipeline to skip the graph stage. The ingestion pipeline's stages are configurable, so disabling the graph stage yields a faster, cheaper ingestion run that produces only chunks and embeddings.

- **Self-hosted Docker debugging (Issue #2085):** When chat/search is non-functional after a fresh Docker install, verify that `R2R_CONFIG_NAME=full` was exported, that the compose command included `--profile postgres`, that Postgres migrations have completed, and that the server is listening on port `7272` before pointing the SDK at it.

- **Documentation outage (Issue #2276):** The hosted docs at `r2r-docs.sciphi.ai` have been intermittently unavailable. The in-repo READMEs (`py/README.md`, `py/sdk/README.md`, `js/sdk/README.md`) and the source-level docstrings (e.g., the schema examples embedded in `py/shared/api/models/retrieval/responses.py`) remain the canonical reference until the site is restored.
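
For the Docker debugging tip above, it can help to confirm that the server answers on port `7272` before involving the SDK at all. The sketch below polls a URL using only the standard library. The `/v3/health` path mentioned in the usage note, the retry counts, and the helper names `http_probe` and `wait_until_healthy` are assumptions for illustration and are not taken from the R2R source.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failures, timeouts, and HTTP error codes.
        return False


def wait_until_healthy(
    probe: Callable[[], bool], attempts: int = 10, delay: float = 1.0
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A call such as `wait_until_healthy(lambda: http_probe("http://localhost:7272/v3/health"))` would then gate any SDK calls; substitute whatever health path your deployment actually exposes.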

## See Also

- Retrieval & Agent API reference (RAG modes, streaming events, response models) — coming soon
- Knowledge Graph ingestion and community extraction
- Document management endpoints (`/documents`, `/documents/{id}/search`)
- Configuration profiles (`light` vs `full`) and provider overrides
- Release notes for v3.6.5 (extended thinking, collection-aware chunking, k8s manifests)

---

<a id='page-2'></a>

## REST API, Services & Provider Architecture

### Related Pages

Related topics: [Ingestion Modes, Parsers, Search & Agentic RAG](#page-3), [Deployment, Configuration, Extensibility & Troubleshooting](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [py/core/main/api/v3/base_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/base_router.py)
- [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)
- [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)
- [py/core/main/api/v3/graph_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/graph_router.py)
- [py/core/main/api/v3/users_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/users_router.py)
- [py/core/main/api/v3/indices_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/indices_router.py)
- [py/core/main/services/retrieval_service.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/services/retrieval_service.py)
- [py/core/main/providers/llm.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/llm.py)
- [py/core/main/providers/database.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/database.py)
- [py/core/main/providers/auth.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/auth.py)
- [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)
- [js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts)
- [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)
- [py/shared/api/models/ingestion/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/ingestion/responses.py)
- [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
</details>

# REST API, Services & Provider Architecture

## Overview

R2R (SciPhi-AI/R2R) exposes its capabilities through a layered architecture: HTTP routers in `py/core/main/api/v3/` accept client requests, a **Services** layer orchestrates business logic, and a **Providers** layer plugs in concrete implementations (LLMs, vector databases, auth, file storage). The same surface is mirrored by official clients in [js/sdk/src/v3/clients/](https://github.com/SciPhi-AI/R2R/tree/main/js/sdk/src/v3/clients) and the Python client, so end-users can call RAG, ingestion, graph, and management endpoints without writing raw HTTP. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md) and [js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md).

```mermaid
flowchart LR
    Client[SDK / HTTP Client] -->|JSON / SSE| Router["v3 Routers<br/>(base_router.py)"]
    Router -->|Depends| Auth["Auth Provider<br/>(auth.py)"]
    Router -->|Depends| RateLimit["Rate Limit"]
    Router -->|calls| Service["Services<br/>(retrieval_service.py)"]
    Service --> Providers["Providers<br/>(llm, database, embedding)"]
    Providers --> External[(Vector DB, Postgres, Object Store)]
    Service -->|Pydantic models| Response[Wrapped Response]
```

## Router Layer (`py/core/main/api/v3/`)

Each domain is encapsulated in a dedicated router that inherits shared behavior from `base_router.py`. Routers define FastAPI endpoints using path operations decorated with `@self.router.post(...)` and a custom `@self.base_endpoint` decorator that handles authentication, rate limiting, error normalization, and response wrapping. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py) and [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py).

### Endpoint Decorators and Cross-Cutting Concerns

Endpoints typically chain two dependencies:

- `Depends(self.providers.auth.auth_wrapper())`: resolves the current `auth_user` from the bearer token and enforces superuser or per-user scoping (e.g. `request_user_ids = None if auth_user.is_superuser else [auth_user.id]` in `patch_metadata`). Source: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py).
- `Depends(self.rate_limit_dependency)`: applies per-endpoint throttling, as seen on `/retrieval/agent`, `/retrieval/embedding`, and document routes. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
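
The per-user scoping expression quoted above can be read in isolation. The following is a minimal sketch of that rule; the `AuthUser` dataclass and the `scoped_user_ids` helper are hypothetical stand-ins, not the classes used by the R2R auth provider.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class AuthUser:
    """Hypothetical stand-in for the user object resolved by the auth dependency."""

    id: UUID
    is_superuser: bool = False


def scoped_user_ids(auth_user: AuthUser) -> Optional[list[UUID]]:
    """Superusers are unrestricted (None); everyone else is limited to their own id."""
    return None if auth_user.is_superuser else [auth_user.id]


regular = AuthUser(id=uuid4())
admin = AuthUser(id=uuid4(), is_superuser=True)
print(scoped_user_ids(regular) == [regular.id])  # regular users see only their own rows
print(scoped_user_ids(admin) is None)            # None means "no user filter"
```

Returning `None` rather than a list of every user id keeps the downstream query simple: a missing filter is treated as unrestricted.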

### Streaming and Server-Sent Events

When `rag_generation_config.stream` is true, the router returns a `StreamingResponse` with `media_type="text/event-stream"`. A `stream_generator()` yields the upstream chunks in 1024-byte slices, handling `GeneratorExit` to release resources. The streaming protocol emits typed events: `search_results`, `message`, `citation`, `thinking` (when `extended_thinking` is enabled), and `final_answer`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py) and [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py) (where `RAGEvent`, `AgentEvent`, `ThinkingEvent`, `CitationEvent`, and `FinalAnswerEvent` are exported).
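
Because the stream arrives in fixed-size slices, a client cannot assume that one chunk equals one event. The sketch below shows one way to reassemble Server-Sent Events from arbitrary byte chunks. It assumes LF line endings and JSON `data:` payloads; the payload fields in the usage example are invented for illustration, and `parse_sse` is a hypothetical helper rather than part of either SDK.

```python
import json
from typing import Iterable, Iterator


def parse_sse(chunks: Iterable[bytes]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from an iterable of raw byte chunks.

    Chunks may end mid-event, so bytes are buffered until a blank line
    (the SSE event delimiter) has been received.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n\n" in buffer:
            raw_event, buffer = buffer.split(b"\n\n", 1)
            event_name = "message"  # SSE default when no "event:" line is present
            data_lines = []
            for line in raw_event.decode("utf-8").split("\n"):
                if line.startswith("event:"):
                    event_name = line[len("event:"):].strip()
                elif line.startswith("data:"):
                    data_lines.append(line[len("data:"):].strip())
            if data_lines:
                # Assumption: every data payload is a JSON object.
                yield event_name, json.loads("\n".join(data_lines))
```

For example, feeding the parser a stream that has been cut into 7-byte pieces still yields whole events:

```python
stream = b'event: thinking\ndata: {"text": "hmm"}\n\nevent: final_answer\ndata: {"text": "42"}\n\n'
pieces = [stream[i:i + 7] for i in range(0, len(stream), 7)]
for name, payload in parse_sse(pieces):
    print(name, payload)
```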

## Services Layer

Services own the orchestration of a single capability. For example, `services.retrieval.rag(...)` accepts a query, `search_settings`, `rag_generation_config`, `task_prompt`, `include_title_if_available`, and `include_web_search`, returning either a buffered `RAGResponse` or an async stream. The service selects the model when not specified (`rag_generation_config.model = self.config.app.quality_llm`) and prepares effective search settings via `self._prepare_search_settings(auth_user, search_mode, search_settings)`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).

The `completion` service is the LLM-only path: it accepts a `messages` list and a `GenerationConfig` (with `model`, `temperature`, `max_tokens`, `stream`) and returns a `WrappedLLMChatCompletion`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).

Response shapes are Pydantic models declared under `py/shared/api/models/`. The most relevant are `IngestionResponse` (message, task_id, document_ids), `RAGResponse` (generated_answer, search_results, citations), and `AgentResponse` (messages, conversation_id). All are wrapped in generic envelopes like `WrappedIngestionResponse = R2RResults[IngestionResponse]` and `WrappedRAGResponse`. Source: [py/shared/api/models/ingestion/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/ingestion/responses.py) and [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py).
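
The generic-envelope pattern can be sketched without Pydantic. The version below uses standard-library dataclasses purely to show how one envelope type is specialised per response; the real models are Pydantic classes, and the `results` field name on the envelope is an assumption here.

```python
from dataclasses import dataclass, field
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class IngestionResponse:
    """Illustrative payload with the fields named in the text above."""

    message: str
    task_id: Optional[str] = None
    document_ids: list[str] = field(default_factory=list)


@dataclass
class R2RResults(Generic[T]):
    """Generic envelope; the `results` attribute name is assumed for this sketch."""

    results: T


# Specialising the envelope gives each endpoint a precisely typed wrapper.
WrappedIngestionResponse = R2RResults[IngestionResponse]

wrapped = WrappedIngestionResponse(
    results=IngestionResponse(message="queued", task_id="t1", document_ids=["d1"])
)
print(wrapped.results.document_ids)
```

The benefit of the envelope is that every endpoint returns the same outer shape, so client code can unwrap responses uniformly while type checkers still see the specific inner payload.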

## Provider Layer

Providers are swappable infrastructure adapters registered on a `Providers` object that the router carries as `self.providers`. Major providers include:

| Provider | File | Responsibility |
|----------|------|---------------|
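| LLM | [py/core/main/providers/llm.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/llm.py) | Routes completions to OpenAI, Anthropic, or local models via LiteLLM; picks `quality_llm` by default. |
| Database | [py/core/main/providers/database.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/database.py) | Postgres + pgvector for documents, chunks, users, collections, conversations, graphs. |
| Auth | [py/core/main/providers/auth.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/providers/auth.py) | Bearer token validation and the `auth_wrapper()` dependency. |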
| Embeddings / Ingestion / Graph | under `providers/` | Vector embedding generation, file parsing pipelines, and entity/relationship extraction. |

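Because providers are constructor-injected into the router and the services, the same `base_endpoint` works across the `documents_router`, `retrieval_router`, `graph_router`, `users_router`, and `indices_router` (see [py/core/main/api/v3/indices_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/indices_router.py)). Source: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py).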
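
Constructor injection is easiest to see with toy implementations. The sketch below is not R2R code: the `LLMProvider` and `DatabaseProvider` protocols, their method names, and the `EchoLLM` and `InMemoryDatabase` classes are all hypothetical. It only illustrates how a service that receives a `Providers` object can be exercised with swapped-in backends.

```python
from dataclasses import dataclass
from typing import Protocol


class LLMProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


class DatabaseProvider(Protocol):
    def search(self, query: str, limit: int) -> list[str]: ...


@dataclass
class Providers:
    """Bundle of swappable adapters handed to routers and services."""

    llm: LLMProvider
    database: DatabaseProvider


class RetrievalService:
    def __init__(self, providers: Providers) -> None:
        self.providers = providers

    def rag(self, query: str) -> str:
        # Retrieve context first, then hand context plus question to the LLM.
        context = self.providers.database.search(query, limit=3)
        prompt = f"Context: {' | '.join(context)}\nQuestion: {query}"
        return self.providers.llm.complete(prompt)


class EchoLLM:
    """Toy LLM that upper-cases its prompt so the data flow is visible."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class InMemoryDatabase:
    """Toy store that returns documents sharing a word with the query."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def search(self, query: str, limit: int) -> list[str]:
        words = query.lower().split()
        hits = [d for d in self.docs if any(w in d.lower() for w in words)]
        return hits[:limit]
```

Swapping `EchoLLM` for a real adapter requires no change to `RetrievalService`, which is the property the provider layer is designed around.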

## Client SDKs

The official clients call the same router endpoints. The JS client uses `makeRequest("POST", "retrieval/agent", { data: ragData, headers, responseType: "stream" })` for streaming RAG and supports a `createSample()` helper that downloads `DeepSeek_R1.pdf` and ingests it via `POST /documents`. Source: [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts) and [js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts). The Python client mirrors these calls through `client.retrieval.rag(...)`, `client.retrieval.agent(...)`, `client.documents.create(...)`, and `client.documents.list()`. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).

## Common Failure Modes and Community Notes

- **No chat/search after Docker self-host:** the `pip install r2r` path is a *light* mode without Postgres; Docker self-hosting relies on `R2R_CONFIG_NAME=full` together with the `--profile postgres` compose profile. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md) and Issue #2085.
- **`OPENAI_API_BASE` ignored:** per Issue #2020, the LLM provider reads its base URL from its own configuration, not from `OPENAI_API_BASE`.
- **Graph extraction is on by default:** Issue #2243 requests a way to skip graph extraction and emit only chunks/embeddings; this is governed by ingestion config in the router, not a top-level flag.
- **HTML ingestion is not yet first-class:** Issue #2182 asks for URL-based scraping; until then, HTML must be ingested as a file via the documents endpoint.

## See Also

- [R2R README (Python)](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [R2R JS SDK README](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
- Ingestion pipeline and graph extraction
- Knowledge graph communities and entity/relationship models
- Authentication, collections, and access control

---

<a id='page-3'></a>

## Ingestion Modes, Parsers, Search & Agentic RAG

### Related Pages

Related topics: [REST API, Services & Provider Architecture](#page-2), [Deployment, Configuration, Extensibility & Troubleshooting](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)
- [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)
- [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)
- [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)
- [js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts)
- [js/sdk/src/types.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/types.ts)
- [js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
</details>

# Ingestion Modes, Parsers, Search & Agentic RAG

R2R (Reason to Retrieve) is a multimodal, agentic Retrieval-Augmented Generation platform. This page explains the four interconnected capabilities that drive the system end-to-end: **ingestion modes** (how content is fed in), **parsers** (how it is split), **search** (how it is retrieved), and **agentic RAG** (how an LLM orchestrates retrieval to answer complex queries). Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).

## Ingestion Modes and Document Lifecycle

R2R exposes a multi-format ingestion pipeline. The TypeScript SDK enumerates four supported `ingestionMode` values passed into the documents client: `"hi-res"`, `"fast"`, `"custom"`, and `"ocr"`. Source: [js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts) (see `createSample` signature: `ingestionMode?: "hi-res" | "fast" | "custom" | "ocr"`).

The high-level workflow is:

1. The user calls `client.documents.create(file_path=...)` to upload a file. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).
2. The REST document router (`/v3/documents`) handles appending, patching, and metadata updates. Source: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py) (see `patch_metadata` endpoint).
3. Supported file types include `.txt`, `.pdf`, `.json`, `.png`, `.mp3`, and more — R2R markets this as "multimodal ingestion". Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md).
4. Once ingested, the document router also supports a dedicated search endpoint (`POST /v3/documents/search`) that runs semantic similarity against automatically generated document summaries and supports PostgreSQL-style filters using `eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `like`, `ilike`, `in`, and `nin` operators. Source: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py).
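
To make the operator list in step 4 concrete, the sketch below evaluates the same ten operators against an in-memory record. It is a local illustration only: the `{field: {operator: value}}` filter shape, the `matches` helper, and the `LIKE` translation are assumptions, and the server evaluates filters in PostgreSQL rather than in Python.

```python
import operator
import re
from typing import Any, Callable


def _sql_like(pattern: str, value: str, ignore_case: bool = False) -> bool:
    """Match SQL LIKE patterns: % is any run of characters, _ is one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch) for ch in pattern
    )
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(regex, value, flags) is not None


# Each entry takes (record_value, filter_operand) and returns a bool.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "neq": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "like": lambda value, pattern: _sql_like(pattern, value),
    "ilike": lambda value, pattern: _sql_like(pattern, value, ignore_case=True),
    "in": lambda value, options: value in options,
    "nin": lambda value, options: value not in options,
}


def matches(record: dict[str, Any], filters: dict[str, dict[str, Any]]) -> bool:
    """True when every filtered field exists and satisfies all of its operators."""
    return all(
        field in record and OPERATORS[op](record[field], operand)
        for field, conditions in filters.items()
        for op, operand in conditions.items()
    )


doc = {"title": "DeepSeek R1 Report", "year": 2025, "type": "pdf"}
print(matches(doc, {"year": {"gte": 2024}, "type": {"in": ["pdf", "txt"]}}))
print(matches(doc, {"title": {"like": "deepseek%"}}))   # case-sensitive
print(matches(doc, {"title": {"ilike": "deepseek%"}}))  # case-insensitive
```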

### Community-Relevant Notes on Ingestion

- **HTML ingestion (Issue #2182):** Community members have requested first-class URL/HTML scraping ingestion. The retrieval agent does ship a `web_scrape` tool and a `web_search` tool, but the document ingestion path itself is geared toward uploaded files. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py) (`web_scrape` tool).
- **Skipping graph extraction (Issue #2243):** Users can tune ingestion settings to produce only chunks and embeddings, without triggering the knowledge-graph enrichment stage, by selecting an appropriate `ingestionMode` and adjusting per-document settings exposed through the documents router. Source: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py).

## Search: Basic, Advanced, and Custom Modes

The retrieval router accepts a `search_mode` parameter with three legal values: `"basic"`, `"advanced"`, and `"custom"`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py) (see search endpoint docstring: "Pre-configured search modes: `basic`: A simple semantic-based search. `advanced`: A more powerful hybrid search combining semantic and full-text. `custom`: Full control via `search_settings`.").

The TypeScript SDK mirrors this contract: `searchMode?: "basic" | "advanced" | "custom"`. Source: [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts).

Key search capabilities:

- **Hybrid search via Reciprocal Rank Fusion (RRF).** The default `hybrid_settings` shape is `{ full_text_weight: 1.0, semantic_weight: 5.0, full_text_limit: 200, rrf_k: 50 }`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
- **Graph-enhanced search.** Knowledge graph integration is enabled by default and controlled via `graph_search_settings` (e.g., `use_graph_search: true`, `kg_search_type: "local"`). Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
- **Advanced filtering.** Filters can combine document-type predicates and metadata ranges using a MongoDB-style `$and` / `$eq` / `$gt` operator syntax. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
- **Type-safe SDK settings.** `SearchSettings`, `HybridSearchSettings`, `GraphSearchSettings`, and `ChunkSearchSettings` are exported as strongly-typed interfaces in the JS SDK. Source: [js/sdk/src/types.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/types.ts).
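
The hybrid settings above are easier to reason about with numbers. The sketch below applies the textbook weighted Reciprocal Rank Fusion formula, `score(d) = Σ weight / (rrf_k + rank)`, using the default weights and `rrf_k` quoted in the list. How R2R applies these settings internally is not confirmed by the sources cited here, and `rrf_fuse` is a hypothetical helper.

```python
from collections import defaultdict


def rrf_fuse(
    semantic: list[str],
    full_text: list[str],
    semantic_weight: float = 5.0,
    full_text_weight: float = 1.0,
    rrf_k: int = 50,
) -> list[tuple[str, float]]:
    """Fuse two ranked id lists; each list contributes weight / (rrf_k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for weight, ranking in ((semantic_weight, semantic), (full_text_weight, full_text)):
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += weight / (rrf_k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = rrf_fuse(semantic=["a", "b", "c"], full_text=["c", "a"])
for doc_id, score in fused:
    print(doc_id, round(score, 4))
```

With these inputs, `a` scores 5/51 + 1/52, `c` scores 5/53 + 1/51, and `b` scores 5/52, so the fused order is `a`, `c`, `b`: the full-text hit is enough to lift `c` above `b` even though `b` ranked higher semantically.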

## Agentic RAG: RAG and Research Modes

The `/v3/retrieval/agent` endpoint is the centerpiece of R2R's agentic RAG. It exposes **two operating modes** selectable via the `mode` parameter. Source: [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts) (`mode?: "rag" | "research"`).

| Aspect | RAG Mode | Research Mode |
|---|---|---|
| Purpose | Knowledge-base Q&A | Deep analysis and reasoning |
| Generation config | `ragGenerationConfig` | `researchGenerationConfig` |
| Tool families | `ragTools` | `researchTools` |
| Reasoning system | Not used | Dedicated reasoning model |
| Code execution | Not used | `python_executor` |
| Critique pass | Not used | `critique` tool |

Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py) and [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts).

The available tool set is partitioned accordingly. RAG tools include `search_file_knowledge`, `search_file_descriptions`, `content`, `web_search`, and `web_scrape`. Research tools add `rag`, `reasoning`, `critique`, and `python_executor`. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
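
A client can catch mode and tool mismatches before sending a request. The sketch below validates a tool selection against the names listed above and builds a request dictionary. It treats the two tool families as separate sets, and the snake_case keys `rag_tools` and `research_tools` as well as the `build_agent_request` helper are assumptions for illustration, so check them against the SDK you are using.

```python
from typing import Optional

# Tool names as listed in the text; the per-mode partition is an assumption.
ALLOWED_TOOLS: dict[str, set[str]] = {
    "rag": {
        "search_file_knowledge",
        "search_file_descriptions",
        "content",
        "web_search",
        "web_scrape",
    },
    "research": {"rag", "reasoning", "critique", "python_executor"},
}


def build_agent_request(
    message: str, mode: str = "rag", tools: Optional[list[str]] = None
) -> dict:
    """Return a request dict for the agent endpoint, rejecting unknown modes or tools."""
    if mode not in ALLOWED_TOOLS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(ALLOWED_TOOLS)}")
    allowed = ALLOWED_TOOLS[mode]
    selected = sorted(allowed) if tools is None else list(tools)
    unknown = [name for name in selected if name not in allowed]
    if unknown:
        raise ValueError(f"tools {unknown} are not available in {mode!r} mode")
    return {
        "message": {"role": "user", "content": message},
        "mode": mode,
        f"{mode}_tools": selected,
    }


request = build_agent_request(
    "Summarize the DeepSeek R1 paper", mode="research", tools=["rag", "critique"]
)
print(request["mode"], request["research_tools"])
```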

The agent response schema is defined in `AgentResponse`, which carries `messages: list[Message]` and a `conversation_id: str` so multi-turn context is preserved across calls. Source: [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py).

```mermaid
flowchart LR
    A[User message] --> B{Agent mode}
    B -- "rag" --> C[rag_generation_config]
    B -- "research" --> D[research_generation_config]
    C --> E[RAG Tools<br/>search_file_knowledge<br/>web_search<br/>web_scrape]
    D --> F[Research Tools<br/>reasoning<br/>critique<br/>python_executor]
    E --> G[Hybrid + Graph Search]
    F --> G
    G --> H[LLM streaming<br/>or final answer]
    H --> I[AgentResponse<br/>messages + conversation_id]
```

### Streaming and Generation Config

When `rag_generation_config.stream` is `true`, the router returns a `StreamingResponse` of `text/event-stream` and chunks payloads in 1024-byte segments to avoid overwhelming the client. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
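
The slicing behaviour described above can be sketched as a plain generator. This is a synchronous illustration under stated assumptions: the server-side generator is presumably asynchronous, and `rechunk` with its `on_close` callback are hypothetical names. The `finally` clause shows where cleanup would run when a client disconnects and the generator is closed.

```python
from typing import Callable, Iterable, Iterator, Optional


def rechunk(
    stream: Iterable[bytes],
    size: int = 1024,
    on_close: Optional[Callable[[], None]] = None,
) -> Iterator[bytes]:
    """Re-slice each upstream chunk into pieces of at most `size` bytes."""
    try:
        for chunk in stream:
            for start in range(0, len(chunk), size):
                yield chunk[start:start + size]
    finally:
        # Runs on normal exhaustion and when the consumer closes the
        # generator early (GeneratorExit), so resources are always released.
        if on_close is not None:
            on_close()


sizes = [len(piece) for piece in rechunk([b"x" * 2500])]
print(sizes)  # → [1024, 1024, 452]
```

Slicing happens per upstream chunk, so a short chunk is forwarded as-is rather than being held back to fill a full 1024-byte slice.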

`GenerationConfig` supports `model`, `temperature`, `top_p`, `max_tokens_to_sample`, `stream`, `functions`, `tools`, `api_base`, `response_format`, plus reasoning-control flags: `extended_thinking`, `thinking_budget`, and `reasoning_effort`. Source: [js/sdk/src/types.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/types.ts). R2R v3.6.5 added support for "Context for rag tool and extended thinking with non-claude models", which directly affects the `extended_thinking` and `thinking_budget` fields. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md) (release notes referenced via community context).

## Common Failure Modes and Operational Notes

- **Docs site down (Issue #2276):** `r2r-docs.sciphi.ai` has been intermittently returning a `404 DEPLOYMENT_NOT_FOUND`. Self-hosters should rely on the in-repo `py/README.md` and the OpenAPI schema served by the running app until the hosted docs recover. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md) (canonical reference).
- **Self-hosted Docker retrieval failures (Issue #2085):** Reports of search/RAG returning empty results in Docker mode usually point to embedding/vector-store misconfiguration rather than the retrieval router itself. The router is provider-agnostic and delegates storage and embedding to the configured providers. Source: [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
- **`OPENAI_API_BASE` (Issue #2020):** The issue reports that setting the `OPENAI_API_BASE` environment variable did not redirect ingestion to a self-hosted OpenAI-compatible endpoint. The JS SDK's `GenerationConfig` does expose an `apiBase` field, so a base-URL override can be supplied per request for generation calls. Source: [js/sdk/src/types.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/types.ts) (`apiBase?: string`).
- **Self-host manifests:** v3.6.5 introduced kustomize-based Kubernetes manifests (PR #2150), useful for production deployments that need ingestion, search, and agentic RAG scaled independently. Source: [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md) (release notes referenced via community context).

## See Also

- [Getting Started with R2R](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [JS/TypeScript SDK Reference](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
- [Retrieval API Responses](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [Shared API Models](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)

---

<a id='page-4'></a>

## Deployment, Configuration, Extensibility & Troubleshooting

### Related Pages

Related topics: [Introduction, Installation & SDK Usage](#page-1), [Ingestion Modes, Parsers, Search & Agentic RAG](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
- [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)
- [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)
- [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)
- [js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts)
- [js/sdk/src/types.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/types.ts)
- [js/sdk/package.json](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/package.json)
- [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py)
- [docker/compose.yaml](https://github.com/SciPhi-AI/R2R/blob/main/docker/compose.yaml)
- [docker/compose.full.yaml](https://github.com/SciPhi-AI/R2R/blob/main/docker/compose.full.yaml)
- [docker/compose.full.swarm.yaml](https://github.com/SciPhi-AI/R2R/blob/main/docker/compose.full.swarm.yaml)
- [docker/env/r2r.env](https://github.com/SciPhi-AI/R2R/blob/main/docker/env/r2r.env)
- [docker/env/r2r-full.env](https://github.com/SciPhi-AI/R2R/blob/main/docker/env/r2r-full.env)
- [docker/user_tools/README.md](https://github.com/SciPhi-AI/R2R/blob/main/docker/user_tools/README.md)
</details>

# Deployment, Configuration, Extensibility & Troubleshooting

## Deployment

R2R supports two principal deployment paths: a lightweight Python-only installation for development and a containerized deployment for production-like workloads. The Python distribution is installed with `pip install r2r` and launched with `python -m r2r.serve` after exporting `OPENAI_API_KEY` ([py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)). The full stack, which includes Postgres, a vector database, and a graph database, is deployed with Docker Compose instead. Setting `R2R_CONFIG_NAME=full` selects the full profile, and the stack is started with `docker compose -f compose.full.yaml --profile postgres up -d` ([py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)).

A Docker Swarm variant (`docker/compose.full.swarm.yaml`) and a Kubernetes manifest set (referenced in the `docker/` directory) are also available for higher-scale topologies. The R2R HTTP server listens on port `7272` by default — the same endpoint the JavaScript SDK uses to construct its client ([js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)).
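
The full-profile launch described in this section can be scripted. The following is a minimal sketch, assuming Docker is on the `PATH` and that `compose.full.yaml` lives in the repository's `docker/` directory; `build_compose_command`, `build_environment`, and `start_full_stack` are hypothetical helper names, not part of R2R.

```python
import os
import subprocess
from pathlib import Path
from typing import Optional


def build_compose_command(compose_file: str = "compose.full.yaml",
                          profile: str = "postgres") -> list[str]:
    """Return the `docker compose` invocation quoted from py/README.md."""
    return ["docker", "compose", "-f", compose_file,
            "--profile", profile, "up", "-d"]


def build_environment(config_name: str = "full",
                      base: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Copy an environment (default: os.environ) and set R2R_CONFIG_NAME in it."""
    env = dict(os.environ if base is None else base)
    env["R2R_CONFIG_NAME"] = config_name
    return env


def start_full_stack(docker_dir: Path) -> None:
    """Launch the stack. Not called here, because it requires Docker."""
    subprocess.run(build_compose_command(), cwd=docker_dir,
                   env=build_environment(), check=True)
```

Passing the environment explicitly to `subprocess.run` ensures that `R2R_CONFIG_NAME` is visible to Compose regardless of which shell launched the script.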

## Configuration

R2R's configuration is environment-driven, with a named profile selecting bundled defaults. The `R2R_CONFIG_NAME` variable picks the active profile, while the `docker/env/r2r.env` and `docker/env/r2r-full.env` files override values for the light and full deployment modes respectively ([py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)).

Model selection for retrieval flows through two application-level config keys:

- `config.app.quality_llm` — the default model for RAG-mode generation, applied when the request does not specify a model in `rag_generation_config` ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).
- `config.app.planning_llm` — the default model for research-mode generation, applied similarly when `mode="research"` ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).
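
The fallback described in the two items above can be sketched as a small resolver. This is an illustration only: `AppConfig`, `resolve_model`, and the placeholder model names are hypothetical, and the router's actual logic may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical stand-in for the `config.app` section."""
    quality_llm: str
    planning_llm: str


def resolve_model(mode: str, requested_model: Optional[str], app: AppConfig) -> str:
    """An explicit per-request model wins; otherwise use the per-mode default."""
    if requested_model:
        return requested_model
    if mode == "rag":
        return app.quality_llm
    if mode == "research":
        return app.planning_llm
    raise ValueError(f"unknown mode: {mode!r}")


# Placeholder model names, chosen only for illustration.
app = AppConfig(quality_llm="example/quality-model",
                planning_llm="example/planning-model")
print(resolve_model("rag", None, app))
print(resolve_model("research", None, app))
print(resolve_model("rag", "example/custom-model", app))
```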

Request-time configuration is layered on top: callers can pass a full `GenerationConfig` (model, temperature, `extended_thinking`, `thinking_budget`, `top_p`, `max_tokens_to_sample`) per call. The retrieval router also offers three `search_mode` values, summarized below.

| `search_mode` | Default Behavior | Override Mechanism |
|---------------|------------------|--------------------|
| `basic` | Semantic-only vector search | `filters`, `limit` |
| `advanced` | Hybrid semantic + full-text with reciprocal rank fusion | `filters`, `limit` |
| `custom` | Caller-supplied `SearchSettings` | n/a |

Sources: [py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py) and [py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py).
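
To make the table concrete, the sketch below assembles a search request body for each mode. The JSON field names (`query`, `search_mode`, `search_settings`) and the helper `build_search_payload` are assumptions for illustration; check the OpenAPI schema of the running server before relying on them.

```python
import json
from typing import Any, Optional

# The three modes listed in the table above.
SEARCH_MODES = ("basic", "advanced", "custom")


def build_search_payload(
    query: str,
    search_mode: str,
    filters: Optional[dict[str, Any]] = None,
    limit: Optional[int] = None,
    search_settings: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a search request body (field names are assumptions)."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(
            f"search_mode must be one of {SEARCH_MODES}, got {search_mode!r}"
        )
    # Start from caller-supplied settings, then layer the common overrides on top.
    settings: dict[str, Any] = dict(search_settings or {})
    if filters is not None:
        settings["filters"] = filters
    if limit is not None:
        settings["limit"] = limit
    payload: dict[str, Any] = {"query": query, "search_mode": search_mode}
    if settings:
        payload["search_settings"] = settings
    return payload


# Hypothetical usage: hybrid search capped at five results.
body = build_search_payload("how does hybrid search work?", "advanced", limit=5)
print(json.dumps(body, indent=2))
```

Keeping `filters` and `limit` inside one settings object means the same helper serves all three modes: `basic` and `advanced` pass only the overrides, while `custom` passes a full settings dictionary.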

## Extensibility

R2R exposes a unified REST surface that is consumed by both the Python and JavaScript SDKs. The Python client supports ingestion, search, RAG, and agent invocations through resource-namespaced methods such as `client.documents.create` and `client.retrieval.search` ([py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)). The TypeScript SDK is installed via `npm install r2r-js` and instantiated with `new r2rClient("http://localhost:7272")` ([js/sdk/README.md](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)).

The agent endpoint is the most extensible surface. In RAG mode, callers may enable any combination of `search_file_knowledge`, `search_file_descriptions`, `get_file_content`, `web_search`, and `web_scrape`. In research mode, the available tools are `rag`, `reasoning`, `critique`, and `python_executor` ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)). The TypeScript `agent()` method mirrors this with the `ragTools` and `researchTools` arrays, alongside `searchMode`, `searchSettings`, `conversationId`, and `maxToolContextLength` options ([js/sdk/src/v3/clients/retrieval.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)).
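
The per-mode tool lists above lend themselves to client-side validation before a request is sent. The following sketch assumes snake_case body fields (`message`, `mode`, `rag_tools`, `research_tools`) and a chat-style message object; `build_agent_payload` is a hypothetical helper, not an SDK function.

```python
from typing import Any

# Tool names as listed for each mode in the paragraph above.
TOOLS_BY_MODE = {
    "rag": frozenset({
        "search_file_knowledge",
        "search_file_descriptions",
        "get_file_content",
        "web_search",
        "web_scrape",
    }),
    "research": frozenset({"rag", "reasoning", "critique", "python_executor"}),
}


def build_agent_payload(message: str, mode: str, tools: list[str]) -> dict[str, Any]:
    """Validate the tool list for the mode and assemble a request body."""
    if mode not in TOOLS_BY_MODE:
        raise ValueError(f"mode must be 'rag' or 'research', got {mode!r}")
    unknown = sorted(set(tools) - TOOLS_BY_MODE[mode])
    if unknown:
        raise ValueError(f"tools not available in {mode} mode: {unknown}")
    return {
        "message": {"role": "user", "content": message},  # assumed message shape
        "mode": mode,
        f"{mode}_tools": list(tools),  # "rag_tools" or "research_tools"
    }


payload = build_agent_payload(
    "Summarize the uploaded report.",
    mode="rag",
    tools=["search_file_knowledge", "get_file_content"],
)
print(payload)
```

Rejecting a mismatched tool locally (for example `python_executor` in RAG mode) gives a clearer error than waiting for the server to refuse the request.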

Document and entity data can be exported to CSV through dedicated endpoints, with the JS SDK exposing `exportEntities({ id, columns, filters })` for selective exports ([js/sdk/src/v3/clients/documents.ts](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/documents.ts)). Streaming extensions are also supported: setting `stream: true` on RAG and agent calls surfaces a sequence of typed events — `thinking`, `tool_call`, `tool_result`, `citation`, `message`, `final_answer` — that consumers can react to in real time ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)). The response envelope for these events is defined in [py/shared/api/models/retrieval/responses.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py) and exported through [py/shared/api/models/__init__.py](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/__init__.py).
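
A consumer of the streamed events might look like the sketch below. It assumes the events arrive in the standard Server-Sent Events wire format (`event:` and `data:` lines separated by blank lines) with JSON payloads; the payload shapes in the sample are invented for illustration, and `iter_sse_events` is a hypothetical helper.

```python
import json
from typing import Any, Iterable, Iterator, Optional

# Event names listed in the paragraph above.
KNOWN_EVENTS = frozenset(
    {"thinking", "tool_call", "tool_result", "citation", "message", "final_answer"}
)


def _decode(data_lines: list[str]) -> Any:
    """Join multi-line data and parse it as JSON, falling back to raw text."""
    text = "\n".join(data_lines)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, Any]]:
    """Yield (event_name, payload) pairs from Server-Sent Events lines."""
    name: Optional[str] = None
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and (name is not None or data):
            # A blank line ends one event; SSE defaults the name to "message".
            yield (name or "message", _decode(data))
            name, data = None, []
    if name is not None or data:  # stream ended without a trailing blank line
        yield (name or "message", _decode(data))


# Invented sample stream; real payloads will carry different fields.
sample = [
    "event: tool_call\n",
    'data: {"name": "web_search"}\n',
    "\n",
    "event: final_answer\n",
    'data: {"text": "done"}\n',
    "\n",
]
for event, payload in iter_sse_events(sample):
    label = event if event in KNOWN_EVENTS else f"unknown:{event}"
    print(label, payload)
```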

## Troubleshooting

Recurring community-reported issues map to the following diagnostics:

- **Self-hosted Docker chat/search returns nothing (issue #2085).** After bringing up the full stack with `docker compose -f compose.full.yaml --profile postgres up -d`, retrieval may appear unresponsive. Confirm that `R2R_CONFIG_NAME=full` was exported in the same shell that started Compose, and verify that ingestion completed without errors before issuing retrieval calls ([py/README.md](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)).
- **Custom `OPENAI_API_BASE` is ignored (issue #2020).** Users have reported that the upstream `OPENAI_API_BASE` environment variable is not honored. For self-hosted or OpenAI-compatible endpoints, set the provider's `base_url` inside the active R2R configuration profile rather than relying on the environment variable.
- **Ingesting without graph extraction (issue #2243).** The ingestion pipeline supports a chunk-and-embed-only path. Skip the knowledge-graph stage by configuring the ingestion options to disable entity/relationship extraction while retaining chunking and embedding ([py/core/main/api/v3/documents_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)).
- **HTML/web ingestion (issue #2182).** R2R does not expose a dedicated HTML ingest endpoint. The recommended approach is to use the `web_scrape` tool through the agent API, or to pre-fetch content externally and feed it through the document ingestion endpoint ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).
- **Hosted docs unreachable (issue #2276).** When the `r2r-docs.sciphi.ai` site is down, the in-repo `py/README.md`, `js/sdk/README.md`, and the docstrings inside `py/core/main/api/v3/` are the canonical references.

When debugging a stalled request, enabling `stream: true` on the RAG or agent call surfaces intermediate events — including `tool_call` and `tool_result` — that often pinpoint where the pipeline stops progressing ([py/core/main/api/v3/retrieval_router.py](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)).
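
One way to act on the streamed events is to record their names as they arrive and summarize where the sequence stopped. The sketch below assumes each `tool_call` is eventually answered by exactly one `tool_result`, which is an assumption about the stream rather than a documented guarantee; `diagnose_stream` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Optional


def diagnose_stream(event_names: Iterable[str]) -> dict[str, object]:
    """Summarize a recorded event sequence to show where a request stalled."""
    counts: Counter = Counter()
    last: Optional[str] = None
    for name in event_names:
        counts[name] += 1
        last = name
    # Tool calls that never received a result are the usual suspects.
    pending = max(counts["tool_call"] - counts["tool_result"], 0)
    return {
        "last_event": last,
        "pending_tool_calls": pending,
        "finished": counts["final_answer"] > 0,
    }


# Hypothetical recording of a request that hung during its second tool call.
report = diagnose_stream(["thinking", "tool_call", "tool_result", "tool_call"])
print(report)
```

A non-zero `pending_tool_calls` with `finished` set to `False` suggests the request is waiting on a tool, whereas a stream that stops after `thinking` points at the model call itself.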

## See Also

- [Project Overview (py/README.md)](https://github.com/SciPhi-AI/R2R/blob/main/py/README.md)
- [JavaScript SDK Quickstart (js/sdk/README.md)](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/README.md)
- [Retrieval Router (py/core/main/api/v3/retrieval_router.py)](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/retrieval_router.py)
- [Documents Router (py/core/main/api/v3/documents_router.py)](https://github.com/SciPhi-AI/R2R/blob/main/py/core/main/api/v3/documents_router.py)
- [Retrieval Response Models (py/shared/api/models/retrieval/responses.py)](https://github.com/SciPhi-AI/R2R/blob/main/py/shared/api/models/retrieval/responses.py)
- [JS SDK Retrieval Client (js/sdk/src/v3/clients/retrieval.ts)](https://github.com/SciPhi-AI/R2R/blob/main/js/sdk/src/v3/clients/retrieval.ts)

---


## Pitfall Log

Project: SciPhi-AI/R2R

Summary: Found 14 structured pitfall items, including 3 high/blocking items. Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2276

## 2. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2279

## 3. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2290

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/1820

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/SciPhi-AI/R2R

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2293

## 7. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/SciPhi-AI/R2R

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2289

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SciPhi-AI/R2R

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/SciPhi-AI/R2R

## 11. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/SciPhi-AI/R2R

## 12. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SciPhi-AI/R2R/issues/2295

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SciPhi-AI/R2R

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SciPhi-AI/R2R

