Doramagic Project Pack · Human Manual
ragapp
The easiest way to use Agentic RAG in any enterprise
Overview & System Architecture
Related topics: Agent System, Tools & LLM Providers, Data Pipeline: Ingestion, Retrieval & Generation, Deployment, Networking & Multi-RAGapp Management
RAGapp positions itself as "The easiest way to use Agentic RAG in any enterprise," aiming to combine the configurability of OpenAI's custom GPTs with a self-hostable Docker deployment backed by LlamaIndex. Source: README.md:1-15. The project is a single FastAPI image that bundles an Admin UI, an agentic retrieval engine, and a set of pluggable model providers so that operators can stand up a chat-with-your-data stack behind their own firewall.
High-Level Components
RAGapp is a monolith with two logical halves wired together in one Docker image:
- Backend (Python / FastAPI + LlamaIndex): exposes REST routes for configuration, agents, chat, indexing, and management, and runs the agentic workflow. Source: src/ragapp/backend/routers/management/agents.py:1-45 and src/ragapp/backend/workflows/single.py:1-20.
- Admin UI (Next.js 14 / React / TypeScript): a separate client that calls the backend's management API. The Next.js dev server is started via the standard scripts defined in src/ragapp/admin-ui/package.json:4-12, and the project layout follows the conventional create-next-app structure described in src/ragapp/admin-ui/README.md:1-12.
flowchart LR
User[User Chat Client] --> API[FastAPI Backend]
Admin[Admin UI - Next.js] --> API
API --> Engine[LlamaIndex Engine]
Engine --> Vector[(Vector Store)]
Engine --> LLM[LLM Provider]
Engine --> Tools[Tool Registry]
Tools --> Web[Wikipedia / DuckDuckGo]
Tools --> Gen[CodeGen / DocGen / Interpreter]
Tools --> Q[RAG Query Engine]
The Admin UI and the user-facing chat both terminate on the same FastAPI process: the README advertises a single port (8000) and a single image (ragapp/ragapp). Source: README.md:18-24.
Core Domain Concepts
Agents and Tools
The agent is the unit of behavior. On the backend it is modeled by AgentConfig, a Pydantic model that requires role and goal and collects per-tool ToolConfig entries. Source: src/ragapp/backend/models/agent.py:1-30. The same shape is mirrored on the client through a Zod AgentConfigSchema and helper API functions (getAgents, createAgent, updateAgent). Source: src/ragapp/admin-ui/client/agent.ts:1-50.
The tool registry is extensible and ships with several built-ins, each with a paired backend model and client config:
| Tool | Backend model | Client config | Purpose |
|---|---|---|---|
| QueryEngine | (LlamaIndex tool wrapper) | DEFAULT_QUERY_ENGINE_TOOL_CONFIG | Run RAG over indexed data |
| Wikipedia | models/tools/wikipedia.py | tools/wikipedia.ts | Enrich answers from Wikipedia |
| CodeGenerator | models/tools/code_generator.py | DEFAULT_CODE_GENERATOR_TOOL_CONFIG | Generate code in a sandbox |
| DocumentGenerator | models/tools/document_generator.py | tools/document_generator.ts | Produce PDF/HTML reports |
| ImageGenerator, OpenAPI, Interpreter, DuckDuckGo | referenced via the tool registry | same | Image, OpenAPI calls, E2B code interpreter, web search |
The list above is the exact set enumerated in DEFAULT_TOOL_CONFIG. Source: src/ragapp/admin-ui/client/agent.ts:1-50.
Configuration Surfaces
Two parallel configuration models back the system. ModelConfig carries the active provider, model, embedding model, and credentials, while ChatConfig carries conversation behavior such as the system prompt, conversation starters, next-question prompting, and inline citation prompting. Source: src/ragapp/backend/models/chat_config.py:1-60 and src/ragapp/backend/routers/management/config.py:1-50. The configuration router reads and writes these models with rollback_on_failure semantics, so a partially invalid update is reverted instead of corrupting runtime state. Source: src/ragapp/backend/routers/management/config.py:28-50.
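A minimal sketch of what rollback-on-failure semantics could look like, assuming the configuration lives in a plain dict. The rollback_on_failure name matches the flag mentioned above, but the context manager and the apply_update helper are hypothetical illustrations, not RAGapp's implementation.

```python
import copy
from contextlib import contextmanager


@contextmanager
def rollback_on_failure(state: dict):
    """Restore `state` to its pre-update contents if the block raises."""
    snapshot = copy.deepcopy(state)
    try:
        yield state
    except Exception:
        state.clear()
        state.update(snapshot)
        raise


def apply_update(state: dict, update: dict) -> None:
    """Hypothetical updater: writes fields one by one and rejects empty strings."""
    with rollback_on_failure(state) as draft:
        for key, value in update.items():
            if value == "":
                raise ValueError(f"{key} must not be empty")
            draft[key] = value


chat_config = {"system_prompt": "Be concise.", "conversation_starters": []}
try:
    # The first field is written, the second fails, and both are reverted.
    apply_update(chat_config, {"system_prompt": "Be thorough.", "custom_prompt": ""})
except ValueError:
    pass
```

Snapshotting before the write keeps the revert logic independent of which field failed validation.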
A separate AIProvider controller abstracts the model vendor; the T-Systems integration is one example, and the client enforces validation rules such as "API base must be a valid URL." Source: src/ragapp/admin-ui/client/providers/t-systems.ts:1-20.
Workflow and Multi-Agent Capability
The single-agent workflow handles streamed responses, detects an immediate tool call after the first token, and routes the request to handle_tool_calls if a function call is emitted. Source: src/ragapp/backend/workflows/single.py:1-40. The AgentManager.check_supported_multi_agents_model check plus the multi_agent_supported route gate the multi-agent orchestrator on whether the configured LLM is a function-calling model. Source: src/ragapp/backend/routers/management/agents.py:1-50.
Deployment Topologies
Two reference deployments are shipped:
- Single container: Ollama for inference and Qdrant as the vector store, both in the same Compose file. The MODEL environment variable selects the local model (default phi3); TRACKING_SNIPPET injects analytics into the chat UI. Source: deployments/single/README.md:1-30.
- Multiple RAGApps with Manager: a Traefik-fronted, Keycloak-protected control plane that can start and stop several RAGapp containers, persisting state under STATE_DIR. Source: deployments/multiple-ragapps/README.md:1-20.
Community-Driven Architectural Pressure Points
Several open issues highlight where the architecture is still stretching:
- OpenAI-compatible API surface: community request #265 asks for a /v1/chat/completions endpoint so third-party clients (Chatbox, etc.) can target RAGapp. The current chat endpoint family is described in the README, but a fully OpenAI-shaped route is not yet exposed.
- In-UI configuration: issue #149 asks for TOP_K, LLM_TEMPERATURE, and VECTOR_STORE_PROVIDER to be moved from .env to the admin UI. Today these flow through the ModelConfig / ChatConfig env layer. Source: src/ragapp/backend/models/chat_config.py:1-60.
- Hybrid search: issue #103 notes that hybrid retrieval is currently blocked by ChromaDB's in-process limitations (see chroma-core/chroma#1686).
- Query rewriting: issue #43 proposes an LLM pre-pass over user queries to improve similarity, which would slot into the workflow layer in src/ragapp/backend/workflows/single.py:1-40.
- Admin observability: issue #161 requests a retrieved-context panel and embedding-distance plot, a frontend concern that lives under src/ragapp/admin-ui/.
Together these threads describe a system that is opinionated about its agent/tool model and its single-image deploy story, while inviting extensions at the configuration, search, and API-compat boundaries.
See Also
- Agents & Tools Configuration
- Model Providers and Credentials
- Deployment Topologies
- API Reference: Management Endpoints
- Community Roadmap and Open Issues
Source: https://github.com/ragapp/ragapp / Human Manual
Agent System, Tools & LLM Providers
Related topics: Overview & System Architecture, Data Pipeline: Ingestion, Retrieval & Generation
Overview
RAGapp's Agent System is an agentic RAG layer built on top of LlamaIndex that allows administrators to configure one or more specialized AI agents through the Admin UI (port 8000 by default) without writing code. Each agent is bound to a role, a goal, an optional backstory, a system prompt, and a set of enabled tools. At runtime, agents can be used individually (single-agent mode) or composed by an orchestrator (multi-agent mode) that delegates user tasks to the most appropriate agent based on its role and goal.
The agent system is the runtime surface for several open community requests:
- OpenAI-compatible chat endpoint (#265): operators want to point third-party apps (e.g., Chatbox) at RAGapp; this requires the model and agent system to expose the standard /v1/chat/completions shape.
- Query rewriting with an LLM (#43): the system_prompt of each QueryEngine tool, plus per-tool custom_prompt overrides, is the natural injection point for query pre-processing prompts.
- UI-exposed parameters (#149): LLM_TEMPERATURE, TOP_K, and provider selection are surfaced via the ChatConfig / ModelConfig endpoints that the agent system reads on startup.
Agent Configuration Model
The Pydantic model AgentConfig (src/ragapp/backend/models/agent.py) is the canonical agent definition persisted to config/agents.yaml (path declared in src/ragapp/backend/constants.py).
| Field | Type | Required | Notes |
|---|---|---|---|
| agent_id | str | optional on input | Generated server-side via uuid.uuid4() |
| name | str | yes | Sanitized at runtime to ^[a-zA-Z0-9_-]+$ (OpenAI tool name constraint) |
| role | str (min_length=1) | yes | Used by the orchestrator to pick the right agent |
| goal | str (min_length=1) | yes | Appended to the description handed to the orchestrator |
| backstory | str | no | Interpolated into the default prompt template |
| system_prompt | Optional[str] | no | Overrides the templated prompt when set |
| tools | Dict[str, ToolConfig] | no | Map of tool name → {enabled, config} |
| created_at | int (epoch seconds) | no | Backwards-compatible with datetime inputs |
The default system prompt template is:
You are a {role}, {backstory}. Your goal is: {goal}
Source: src/ragapp/backend/models/agent.py.
A separate, client-side template is used in the Admin UI when constructing new agents — "You are {role}. {backstory}\nYour personal goal is: {goal}" (src/ragapp/admin-ui/client/agent.ts).
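The field table and the backend template can be sketched with standard-library dataclasses. This is an illustration under stated assumptions, not the Pydantic model itself: the class names carry a Sketch suffix to mark them as hypothetical, and sanitize_name assumes disallowed characters become underscores, which the source does not specify.

```python
import re
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional

# Backend default template quoted above.
DEFAULT_PROMPT_TPL = "You are a {role}, {backstory}. Your goal is: {goal}"


@dataclass
class ToolConfigSketch:
    enabled: bool = False
    config: dict = field(default_factory=dict)


@dataclass
class AgentConfigSketch:
    name: str
    role: str
    goal: str
    backstory: str = ""
    system_prompt: Optional[str] = None
    tools: Dict[str, ToolConfigSketch] = field(default_factory=dict)
    agent_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: int = field(default_factory=lambda: int(time.time()))

    def __post_init__(self) -> None:
        # Mirrors the min_length=1 constraint on role and goal.
        if not self.role or not self.goal:
            raise ValueError("role and goal are required")

    def get_system_prompt(self) -> str:
        # An explicit system_prompt overrides the templated prompt.
        if self.system_prompt:
            return self.system_prompt
        return DEFAULT_PROMPT_TPL.format(
            role=self.role, backstory=self.backstory, goal=self.goal
        )


def sanitize_name(name: str) -> str:
    # Assumption: characters outside ^[a-zA-Z0-9_-]+$ are replaced with "_".
    return re.sub(r"[^a-zA-Z0-9_-]", "_", name)


agent = AgentConfigSketch(
    name="Research Bot",
    role="researcher",
    goal="find primary sources",
    backstory="an archivist with a decade of experience",
)
```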
Tool System
Each agent carries a typed tool dictionary. On the backend, every tool is described by a Pydantic subclass of ToolConfig with an enabled flag, a config dict, and optional custom_prompt / validation logic. On the client, the same tools are mirrored by Zod schemas.
The eight tools currently registered in the Admin UI's ToolsSchema (src/ragapp/admin-ui/client/agent.ts) are:
| Tool | Source of config | Notes |
|---|---|---|
| ImageGenerator | Image provider tool | Generates images from prompts |
| OpenAPI | OpenAPI spec tool | Calls external REST APIs described by an OpenAPI document |
| Interpreter (E2B) | E2B code interpreter | Executes Python in a sandboxed kernel |
| DuckDuckGo | Web search | No API key required |
| Wikipedia | LlamaHub wikipedia.WikipediaToolSpec | Free-text lookup; see src/ragapp/backend/models/tools/wikipedia.py |
| QueryEngine | RAG over indexed documents | Given a priority hint in the orchestrator (see below) |
| CodeGenerator | Code-gen tool | Validates that api_key is present when enabled (Source: src/ragapp/backend/models/tools/code_generator.py) |
| DocumentGenerator | Document-gen tool | Used to author files end-to-end |
The QueryEngine tool is special-cased in src/ragapp/backend/workflows/orchestrator.py: the orchestrator appends "\nThis is a preferred tool to use" to its description so that the LLM biases toward using the RAG retriever before falling back to external tools. The same priority mechanism is also exposed per-tool in the client schema as a priority: number field (e.g., DEFAULT_WIKIPEDIA_TOOL_CONFIG.priority = 2).
Tool custom prompts are merged into the agent's final system prompt by AgentPromptManager.generate_agent_system_prompt (src/ragapp/backend/controllers/agent_prompt_manager.py). The template (SYSTEM_PROMPT_WITH_TOOLS_TPL in src/ragapp/backend/constants.py) injects each enabled tool's custom_prompt as:
===<ToolName>===
<custom_prompt>
===<ToolName>===
This is the seam where query-rewriting prompts (community request #43) and TOP-K / retriever overrides (#149) can be injected per-agent without code changes.
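The delimiter format above can be illustrated with a short sketch. It assumes the tool blocks are simply appended after the agent prompt; the actual layout of SYSTEM_PROMPT_WITH_TOOLS_TPL is not shown in the source, and both function names here are hypothetical.

```python
from typing import Dict


def render_tool_prompts(tools: Dict[str, dict]) -> str:
    """Wrap each enabled tool's custom_prompt in ===<ToolName>=== delimiters."""
    blocks = []
    for name, tool in tools.items():
        prompt = tool.get("custom_prompt")
        if tool.get("enabled") and prompt:
            blocks.append(f"==={name}===\n{prompt}\n==={name}===")
    return "\n".join(blocks)


def build_system_prompt(agent_prompt: str, tools: Dict[str, dict]) -> str:
    # Assumption: tool blocks follow the agent prompt on new lines.
    tool_section = render_tool_prompts(tools)
    return f"{agent_prompt}\n{tool_section}" if tool_section else agent_prompt


tools = {
    "QueryEngine": {"enabled": True, "custom_prompt": "Rewrite the query before searching."},
    "Wikipedia": {"enabled": False, "custom_prompt": "Prefer English articles."},
    "DuckDuckGo": {"enabled": True, "custom_prompt": None},
}
final_prompt = build_system_prompt("You are a researcher.", tools)
```

Disabled tools and tools without a custom prompt contribute nothing, so an agent with no tool prompts keeps its original system prompt unchanged.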
Multi-Agent Orchestration Workflow
flowchart LR
UI[Admin UI<br/>agent.ts] -- POST /api/management/agents --> API[FastAPI Router<br/>agents.py]
API --> Mgr[AgentManager<br/>controllers/agents.py]
Mgr -- persist --> YAML[(config/agents.yaml)]
YAML -- load on chat --> Orch[Orchestrator<br/>workflows/orchestrator.py]
Orch --> LLM[LLM via LlamaIndex Settings]
Orch -- delegate by role/goal --> A1[FunctionCallingAgent A]
Orch -- delegate by role/goal --> A2[FunctionCallingAgent B]
A1 --> T1[Enabled Tools]
A2 --> T2[Enabled Tools]
T1 --> Engine[QueryEngine / External APIs]
When a user sends a chat message, create_orchestrator (src/ragapp/backend/workflows/orchestrator.py) loads every agent from agents.yaml, materializes their enabled tools through ToolFactory, and constructs a FunctionCallingAgent for each one. Each agent receives a description of the form:
"<role>\n and its goals are <goal>"
which the LlamaIndex AgentOrchestrator uses to route user tasks. The runtime conversation loop — including streaming, immediate tool-call detection, and tool-call execution — lives in FunctionCallingAgent.astream_chat / handle_tool_calls (src/ragapp/backend/workflows/single.py).
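The two description strings used for routing, the agent description above and the QueryEngine suffix described earlier in this section, can be sketched as plain string builders. The function names are hypothetical, and matching the tool by the literal name "QueryEngine" is an assumption about how the special case is detected.

```python
PREFERRED_SUFFIX = "\nThis is a preferred tool to use"


def agent_description(role: str, goal: str) -> str:
    """Description handed to the orchestrator for routing."""
    return f"{role}\n and its goals are {goal}"


def tool_description(tool_name: str, description: str) -> str:
    """Bias the LLM toward the RAG retriever by marking it as preferred."""
    if tool_name == "QueryEngine":  # assumption: matched by tool name
        return description + PREFERRED_SUFFIX
    return description


desc = agent_description("researcher", "find primary sources")
rag = tool_description("QueryEngine", "Search the indexed documents.")
wiki = tool_description("Wikipedia", "Look up a topic on Wikipedia.")
```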
Multi-agent prerequisites
Multi-agent mode is only viable when the configured LLM supports native function/tool calling. Two endpoints enforce this:
- GET /api/management/agents/multi_agent_supported: returns Settings.llm.metadata.is_function_calling_model (src/ragapp/backend/routers/management/agents.py).
- POST /api/management/config/models: refuses model changes with HTTP 400 when multi-agent mode is active and the new model is not function-calling (src/ragapp/backend/routers/management/config.py).
If a model switch invalidates the embeddings, the same config endpoint calls reset_index to force a rebuild of the vector index.
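The guard logic for a model change can be sketched as a pure function. Everything here is hypothetical: the ModelChoice type, the function name, and in particular the assumption that a change of embedding model or embedding dimension is what invalidates the index (the source only says that reset_index is called when embeddings are invalidated).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ModelChoice:
    is_function_calling: bool
    embedding_model: str
    embedding_dim: int


def check_model_change(
    multi_agent_active: bool, current: ModelChoice, new: ModelChoice
) -> Tuple[int, bool]:
    """Return (http_status, needs_reindex) for a proposed model change."""
    if multi_agent_active and not new.is_function_calling:
        # Multi-agent mode needs native tool calling, so reject the change.
        return 400, False
    # Assumption: a different embedding model or dimension invalidates the index.
    needs_reindex = (new.embedding_model, new.embedding_dim) != (
        current.embedding_model,
        current.embedding_dim,
    )
    return 200, needs_reindex


current = ModelChoice(True, "text-embedding-bge-m3", 1536)
rejected = check_model_change(True, current, ModelChoice(False, "text-embedding-bge-m3", 1536))
reindexed = check_model_change(False, current, ModelChoice(False, "other-embedding", 768))
```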
LLM Provider Integration
Providers are configured through the ModelConfig / ChatConfig pair and read by LlamaIndex's Settings on startup. The single-deployment compose file ships with Ollama and Qdrant (deployments/single/README.md), and the MODEL env var selects the chat model (default phi3, recommended llama3).
The Admin UI ships a per-provider Zod schema; for example, the T-Systems LLMHub integration (src/ragapp/admin-ui/client/providers/t-systems.ts) is registered under model_provider: "t-systems" and carries:
| Field | Default |
|---|---|
| model | gpt-35-turbo |
| embedding_model | text-embedding-bge-m3 |
| embedding_dim | 1536 |
| t_systems_llmhub_api_key | _(required, validated)_ |
| t_systems_llmhub_api_base | https://llm-server.llmhub.t-systems.net/v2 |
A TRACKING_SCRIPT env var (deployments/single/README.md) can be set to inject any analytics snippet (e.g., Microsoft Clarity) into the chat UI.
The open feature request for an OpenAI-compatible /v1/chat/completions endpoint (#265) sits one layer above the agent system: it would take a thin adapter that calls the orchestrator with the user-supplied messages array and serializes the streamed response into OpenAI's SSE format.
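Since this endpoint does not exist in RAGapp yet, the following is only a sketch of the serialization half of such an adapter. The to_openai_sse function and its parameters are hypothetical; the chunk layout follows the publicly documented OpenAI streaming format (chat.completion.chunk objects sent as data: lines and terminated by data: [DONE]).

```python
import json
import time
from typing import Iterable, Iterator, Optional


def to_openai_sse(
    tokens: Iterable[str],
    model: str,
    completion_id: str = "chatcmpl-sketch",  # hypothetical placeholder id
    created: Optional[int] = None,
) -> Iterator[str]:
    """Wrap streamed text tokens in OpenAI-style server-sent events."""
    created = int(time.time()) if created is None else created

    def frame(delta: dict, finish_reason: Optional[str]) -> str:
        chunk = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": model,
            "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
        }
        return f"data: {json.dumps(chunk)}\n\n"

    for token in tokens:
        yield frame({"content": token}, None)
    # An empty delta with finish_reason="stop" closes the message.
    yield frame({}, "stop")
    yield "data: [DONE]\n\n"


frames = list(to_openai_sse(["Hel", "lo"], model="phi3", created=0))
```

In a real adapter the tokens iterable would come from the orchestrator's streamed response rather than a list.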
See Also
- Configuration & Chat Settings
- Vector Store & Indexing
- Deployment Topologies
- REST API Reference
Data Pipeline: Ingestion, Retrieval & Generation
Related topics: Overview & System Architecture, Agent System, Tools & LLM Providers, Deployment, Networking & Multi-RAGapp Management
Overview
RAGApp implements a classic Retrieval-Augmented Generation (RAG) pipeline split into three cooperating stages: document ingestion (loaders and configuration), retrieval (query engines, streaming responses, source nodes), and generation (LLM streaming, tool calls, and multi-agent orchestration). The data pipeline is configured at deployment time through environment variables and YAML, then tuned at runtime through the management API exposed by FastAPI routers.
The pipeline is consumed through two top-level entry points:
- The management API (/api/management/...) for configuration (src/ragapp/backend/routers/management/config.py) and agent management (src/ragapp/backend/routers/management/agents.py).
- The chat streaming API (/api/chat), which uses Vercel-style streaming response converters (src/ragapp/backend/routers/chat/vercel_response.py).
flowchart LR
A[Documents] -->|Ingestion: LoaderManager| B[Vector Store]
C[User Query] -->|Retrieval| B
B --> D[Source Nodes]
D --> E[LLM + Tools]
E -->|Streaming| F[Vercel Stream]
F --> G[Chat UI]
E -.->|Tool Calls| H[Wikipedia / DocumentGenerator / QueryEngine]
1. Ingestion: Loaders and Configuration
Ingestion is driven by LoaderManager, which reads and persists a YAML configuration file referenced by the LOADER_CONFIG_FILE constant (src/ragapp/backend/controllers/loader.py). The manager exposes three responsibilities:
- load_config_file(): loads the YAML configuration on startup.
- update_loader(loader_config): writes a single FileLoader entry back to the YAML file and propagates API keys to the environment through loader_config.update_env_api_key().
- get_loader(loader_name): returns either the full configuration dictionary or a typed FileLoader instance.
Only the file loader is currently supported by the manager's typed accessor; any other name raises ValueError(f"Unsupported loader {loader_name}!") (Source: src/ragapp/backend/controllers/loader.py:36-44).
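The accessor behavior can be sketched with an in-memory dict standing in for the YAML file. This is a simplified illustration: the Sketch classes are hypothetical, the options field on the loader is invented for the example, YAML persistence and API-key propagation are omitted, and returning the full dictionary when no name is given is an assumption about how the two return shapes are selected.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional, Union


@dataclass
class FileLoaderSketch:
    # Hypothetical field; the real FileLoader fields are not listed in the source.
    options: dict = field(default_factory=dict)


class LoaderManagerSketch:
    def __init__(self, config: Optional[dict] = None) -> None:
        # Stands in for the contents of the YAML loader config file.
        self._config = dict(config or {})

    def update_loader(self, loader: FileLoaderSketch) -> None:
        self._config["file"] = asdict(loader)

    def get_loader(self, loader_name: Optional[str] = None) -> Union[dict, FileLoaderSketch]:
        if loader_name is None:  # assumption: no name means "full configuration"
            return dict(self._config)
        if loader_name == "file":
            return FileLoaderSketch(**self._config.get("file", {}))
        raise ValueError(f"Unsupported loader {loader_name}!")


manager = LoaderManagerSketch()
manager.update_loader(FileLoaderSketch(options={"recursive": True}))
file_loader = manager.get_loader("file")
```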
The deployment presets also shape ingestion. The single-container deployment pairs RAGApp with Ollama and Qdrant, and downloads the selected model into a shared volume (deployments/single/README.md). The MODEL environment variable selects the chat model; if omitted the default phi3 is used, which is "less capable than llama3 but faster to download" (Source: deployments/single/README.md:7-11). The multi-ragapps deployment instead persists the full state of all services (RAGApps, Manager, Keycloak) under the directory set by STATE_DIR (Source: deployments/multiple-ragapps/README.md:23-30).
Community interest in configurable ingestion behavior is documented in issue #149, which requests exposing backend environment variables such as LLM_TEMPERATURE, TOP_K, and VECTOR_STORE_PROVIDER as admin-UI settings.
2. Retrieval: Query Engines and Source Nodes
Retrieval is performed by query engines that are surfaced as tools to agents. Each tool is declared both on the server (Pydantic model) and on the client (Zod schema) so that the admin UI can render and toggle them.
The Wikipedia tool is a llamahub-typed tool with name="wikipedia" and a description instructing the agent to "gather more information about a topic from the query" (Source: src/ragapp/backend/models/tools/wikipedia.py:6-15). The corresponding client schema, WikipediaToolConfig, is defined in src/ragapp/admin-ui/client/tools/wikipedia.ts and ships with a default priority: 2 and enabled: false flag.
The Vercel streaming converter in src/ragapp/backend/routers/chat/vercel_response.py is the canonical place where retrieval results are converted into client-facing events:
- convert_text(token) serializes each streaming token into a Vercel-compatible text frame.
- convert_data(data) serializes structured payloads such as source_nodes and suggested_questions into data frames.
- The ChatEngineVercelStreamResponse._create_stream method awaits the engine response, processes source nodes through _process_response_nodes, then yields them as a data event before streaming tokens (Source: src/ragapp/backend/routers/chat/vercel_response.py:55-95).
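The framing described in the list above can be sketched as follows. The source does not state which frame prefixes RAGapp emits; this sketch assumes the Vercel AI SDK data stream conventions of 0: for text parts and 8: for annotation data. The shape of the sources payload and the stream_frames helper are also assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator, List

TEXT_PREFIX = "0:"  # assumption: Vercel AI SDK text part
DATA_PREFIX = "8:"  # assumption: Vercel AI SDK message annotation part


def convert_text(token: str) -> str:
    """Serialize one streamed token as a text frame."""
    return f"{TEXT_PREFIX}{json.dumps(token)}\n"


def convert_data(data: dict) -> str:
    """Serialize a structured payload as a single-element data frame."""
    return f"{DATA_PREFIX}[{json.dumps(data)}]\n"


def stream_frames(source_nodes: List[dict], tokens: Iterable[str]) -> Iterator[str]:
    # Source nodes are emitted first so the client can render citations
    # while the answer tokens are still arriving.
    yield convert_data({"type": "sources", "data": {"nodes": source_nodes}})
    for token in tokens:
        yield convert_text(token)


out = list(stream_frames([{"id": "n1", "score": 0.8}], ["Hello", " world"]))
```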
Issue #161 requests an admin information panel that visualizes retrieved context and embedding distance, complementing the existing source-node stream.
3. Generation: LLM Streaming, Tools, and Agents
Generation is orchestrated by the AgentConfig model and the workflow classes that drive it. An agent is required to have a non-empty role and goal, and may optionally carry a backstory, system_prompt, and a dictionary of ToolConfig entries (Source: src/ragapp/backend/models/agent.py:17-35). When system_prompt is not provided, get_system_prompt() falls back to the default template "You are a {role}, {backstory}. Your goal is: {goal}" defined in the same file.
Tools available to agents
| Tool | Type | Default enabled | Source |
|---|---|---|---|
| Wikipedia | llamahub | false | src/ragapp/backend/models/tools/wikipedia.py |
| DocumentGenerator | local | false | src/ragapp/backend/models/tools/document_generator.py |
| QueryEngine | local | false | src/ragapp/admin-ui/client/agent.ts |
| CodeGenerator | local | false | src/ragapp/admin-ui/client/agent.ts |
| ImageGenerator, OpenAPI, Interpreter, DuckDuckGo | various | false | src/ragapp/admin-ui/client/agent.ts |
The DocumentGenerator tool has a custom_prompt that instructs the agent to use the tool whenever a report is requested and to return a relative path of the form /api/files/output/tool/<file> (Source: src/ragapp/backend/models/tools/document_generator.py:14-23). The default client config mirrors this with label: "Document Generator" and priority: 2 (src/ragapp/admin-ui/client/tools/document_generator.ts).
The client-side agent.ts file is also the source of truth for the CRUD API used by the admin UI: getAgents, createAgent, and updateAgent all POST/GET against ${getBaseURL()}/api/management/agents (Source: src/ragapp/admin-ui/client/agent.ts:73-100).
Workflow execution
The single-agent workflow (src/ragapp/backend/workflows/single.py) runs a streaming generator that checks for an immediate tool call. If the first LLM chunk contains tool calls, the workflow resolves them through handle_tool_calls, mapping tool names via tools_by_name = {tool.metadata.get_name(): tool for tool in self.tools}. Unknown tools are converted into a synthetic ChatMessage(role="tool", content=f"Tool {tool_call.tool_name} does not exist", ...) so the LLM can recover gracefully (Source: src/ragapp/backend/workflows/single.py:60-80).
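The unknown-tool fallback can be sketched without LlamaIndex. The ToolCall and ToolMessage types below are simplified stand-ins for the library's own classes, and tools are modeled as plain callables, so this shows the dispatch pattern rather than the actual workflow code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    tool_name: str
    tool_kwargs: dict = field(default_factory=dict)


@dataclass
class ToolMessage:
    role: str
    content: str


def handle_tool_calls(
    tools_by_name: Dict[str, Callable[..., object]], calls: List[ToolCall]
) -> List[ToolMessage]:
    """Run each requested tool; report unknown tools back to the LLM."""
    messages = []
    for call in calls:
        tool = tools_by_name.get(call.tool_name)
        if tool is None:
            # A synthetic tool message lets the LLM recover instead of crashing.
            messages.append(ToolMessage("tool", f"Tool {call.tool_name} does not exist"))
            continue
        messages.append(ToolMessage("tool", str(tool(**call.tool_kwargs))))
    return messages


tools_by_name = {"add": lambda a, b: a + b}
results = handle_tool_calls(
    tools_by_name,
    [ToolCall("add", {"a": 2, "b": 3}), ToolCall("subtract", {"a": 2, "b": 3})],
)
```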
The agents management router additionally exposes /check_supported_model and /multi_agent_supported to verify that the configured MODEL_PROVIDER / MODEL pair supports function calling — the prerequisite for multi-agent mode (Source: src/ragapp/backend/routers/management/agents.py:21-39).
4. Configuration Surface and Common Failure Modes
Configuration sources
Two configuration mechanisms coexist:
- Environment variables read through Pydantic BaseEnvConfig (e.g. CUSTOM_PROMPT and the default next-question / citation prompts in src/ragapp/backend/models/chat_config.py).
- YAML loader file mutated by LoaderManager and written back atomically by _update_config_file (Source: src/ragapp/backend/controllers/loader.py:46-48).
The T-Systems provider demonstrates how a model provider extends a BaseConfigSchema with API-key and base-URL fields, validating URLs and requiring non-empty secrets (Source: src/ragapp/admin-ui/client/providers/t-systems.ts:7-22). The default T-Systems config pins the embedding model to text-embedding-bge-m3 with embedding_dim: 1536, even though the embedding name suggests a different native dimension.
Common failure modes
- Unsupported model in multi-agent mode. Calling /api/management/agents/multi_agent_supported returns false when Settings.llm.metadata.is_function_calling_model is False; the UI must fall back to single-agent mode (Source: src/ragapp/backend/routers/management/agents.py:41-46).
- Stream cancellation. The Vercel response converter catches asyncio.CancelledError, calls event_handler.cancel_run(), and logs "Stopping workflow" so partial runs are not orphaned (Source: src/ragapp/backend/routers/chat/vercel_response.py:42-50).
- Unknown tool calls. A tool name not present in tools_by_name produces a tool chat message describing the missing tool rather than raising (Source: src/ragapp/backend/workflows/single.py:69-77).
- Hybrid search gap. Issue #103 notes that hybrid search is currently blocked on ChromaDB upstream; ChromaDB is preferred for its ability to run in-process.
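The stream-cancellation handling in the list above follows a common asyncio pattern, sketched here with a stub handler. The EventHandlerStub class and run_workflow function are hypothetical; re-raising the CancelledError after cleanup is a choice made in this sketch, and the source does not say whether RAGapp does the same.

```python
import asyncio
import logging
from typing import Awaitable

logger = logging.getLogger("sketch")


class EventHandlerStub:
    """Hypothetical stand-in for the workflow's event handler."""

    def __init__(self) -> None:
        self.cancelled = False

    def cancel_run(self) -> None:
        self.cancelled = True


async def run_workflow(handler: EventHandlerStub, work: Awaitable[object]) -> object:
    try:
        return await work
    except asyncio.CancelledError:
        # The client disconnected: stop the underlying run, then propagate.
        logger.warning("Stopping workflow")
        handler.cancel_run()
        raise


async def demo() -> bool:
    handler = EventHandlerStub()
    task = asyncio.create_task(run_workflow(handler, asyncio.sleep(10)))
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return handler.cancelled
```

Calling asyncio.run(demo()) exercises the cancellation path end to end without waiting for the ten-second sleep.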
See Also
- Configuration & Providers
- Agent System & Tools
- Chat Streaming API
- Deployment Topologies
Deployment, Networking & Multi-RAGapp Management
Related topics: Overview & System Architecture, Data Pipeline: Ingestion, Retrieval & Generation
Overview
RAGapp ships with two first-class deployment topologies that share the same image (ragapp/ragapp) but differ in operational scale: a single-container deployment for self-contained experiments and small teams, and a multi-RAGapp deployment that uses a Manager service to orchestrate many RAGapp instances behind a reverse proxy with authentication. The repository also exposes a programmatic management API for both, enabling configuration to be driven from the Admin UI or external tooling.
The project positions itself as "the easiest way to use Agentic RAG in any enterprise," packaged as a Docker image with an Admin UI exposed on port 8000 (Source: README.md). The release 0.1.5 ("ragbox") further indicates an ongoing focus on multi-deployment reliability — for example, the 0.1.4 patch fixed an issue where "cannot show document files in multiple deployments mode" (Source: package.json).
Single RAGapp Deployment
The single deployment profile lives in deployments/single/ and bundles RAGapp with two well-known open-source services: Ollama for local LLM inference and Qdrant as the vector store. The README highlights the simplicity of this topology: set a MODEL environment variable and start the stack (Source: deployments/single/README.md).
MODEL=llama3 docker-compose up
When MODEL is unset, the default phi3 model is used; the setup container is responsible for pulling the model into the bundled ollama/ volume, which can take several minutes on first boot. The deployment also supports injecting an arbitrary TRACKING_SCRIPT (e.g., Microsoft Clarity) to capture chat session analytics, and a separate OLLAMA_HOST variable to point RAGapp at an external Ollama server.
This topology maps closely to the request captured in community issue #265 — users want to be able to point third-party apps (like Chatbox) at an OpenAI-compatible endpoint. While the single deployment itself does not yet implement /v1/chat/completions, it is the natural substrate for that feature because Ollama already exposes an OpenAI-compatible surface that RAGapp can proxy.
Multi-RAGapp Deployment with Manager
For enterprise use, the deployments/multiple-ragapps/ directory provides a complete multi-tenant stack (Source: deployments/multiple-ragapps/README.md):
- A Manager UI to create, start, and stop individual RAGapp containers
- Traefik as a reverse proxy that routes external traffic to the correct RAGapp
- Keycloak for authentication and user management
- A configurable STATE_DIR (defaulting to ${PWD}/data) that persists RAGapp data, configurations, and the Manager's bookkeeping
Bringing it up requires creating a dedicated Docker network and pulling the published images:
cd deployments/multiple-ragapps
docker pull ragapp/ragapp
docker compose pull
docker network create ragapp-network
docker compose up
On Windows, STATE_DIR must be set to an absolute path in .env because Docker Compose does not expand relative paths consistently. The same TRACKING_SCRIPT mechanism from the single deployment is also exposed here.
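A pre-flight check for the Windows requirement can be written with the standard library alone. This is an optional helper sketch, not part of the deployment; the function name is hypothetical.

```python
from pathlib import PureWindowsPath


def is_valid_windows_state_dir(value: str) -> bool:
    """True when STATE_DIR is an absolute Windows path (drive plus root)."""
    return PureWindowsPath(value).is_absolute()


candidates = ["C:/ragapp/data", "./data", "${PWD}/data"]
valid = [c for c in candidates if is_valid_windows_state_dir(c)]
```

PureWindowsPath applies Windows path rules on any operating system, so the check can run on the machine that prepares the .env file.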
The Manager service itself is implemented under src/manager/. Its backend orchestrates the Docker containers while a Next.js-based Admin Dashboard (Source: src/manager/frontend/package.json) provides the user-facing control surface. The dashboard's createAgent.tsx component, for example, renders a modal dialog that calls a createRAGAppService function — illustrating that "Add App" is a first-class UX primitive in the multi-tenant model (Source: src/manager/frontend/src/components/sections/createAgent.tsx).
flowchart LR
User[End User] --> Traefik[Traefik Reverse Proxy]
Traefik --> Keycloak[Keycloak SSO]
Keycloak --> R1[RAGapp #1]
Traefik --> R2[RAGapp #2]
Traefik --> Rn[RAGapp #n]
Admin[Operator] --> Manager[Manager UI]
Manager --> Docker[(Docker Engine)]
Docker --> R1
Docker --> R2
Docker --> Rn
R1 -.shared state.-> StateDir[(STATE_DIR volume)]
R2 -.shared state.-> StateDir
Rn -.shared state.-> StateDir
Networking, Configuration & Management API
Both deployment modes converge on a common management surface exposed under /api/management/. The backend routers define the contracts:
- /api/management/agents: list, create, update, and delete RAGapp agent configurations, plus a check_supported_model endpoint that validates the current MODEL_PROVIDER / MODEL against the multi-agents feature set (Source: src/ragapp/backend/routers/management/agents.py).
- /api/management/config/chat and /api/management/config/models: read and update the chat UI configuration and the underlying model configuration, with rollback_on_failure=True for safety on the chat endpoint (Source: src/ragapp/backend/routers/management/config.py).
The Admin UI consumes these endpoints through typed client modules such as agent.ts, which defines the ToolsSchema (ImageGenerator, OpenAPI, Interpreter, DuckDuckGo, Wikipedia, QueryEngine, CodeGenerator, DocumentGenerator) and provides getAgents, createAgent, and updateAgent helpers (Source: src/ragapp/admin-ui/client/agent.ts). This Zod-validated contract is what the Manager and the Admin UI share.
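External tooling can target the same endpoints. The sketch below only builds the requests, using paths named in this manual and assuming an instance on localhost port 8000; the helper names and the example payload fields are illustrative, and the exact request body the backend accepts should be checked against the AgentConfig model.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: default single-container port


def multi_agent_supported_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET request that reports function-calling support."""
    return urllib.request.Request(
        f"{base_url}/api/management/agents/multi_agent_supported", method="GET"
    )


def create_agent_request(agent: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request that creates an agent from a JSON payload."""
    return urllib.request.Request(
        f"{base_url}/api/management/agents",
        data=json.dumps(agent).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


check = multi_agent_supported_request()
create = create_agent_request({"name": "researcher", "role": "researcher", "goal": "find sources"})
```

Passing either object to urllib.request.urlopen sends it to a running instance.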
The cross-cutting environment variables that govern networking and behavior are summarized below.
| Variable | Scope | Purpose |
|---|---|---|
| MODEL | Both deployments | Selects the LLM pulled by the setup container (e.g., llama3); defaults to phi3 |
| MODEL_PROVIDER | Both deployments | Identifies the provider used by /check_supported_model |
| OLLAMA_HOST | Single deployment | Points RAGapp at an external Ollama instance |
| TRACKING_SCRIPT | Both deployments | Injects an analytics script (e.g., Clarity) into the chat UI |
| STATE_DIR | Multi-RAGapp deployment | Persists RAGapp data, Manager state, and Keycloak realm data |
| ENVIRONMENT=dev | Development only | Switches the backend into dev mode for make dev workflows |
Common Failure Modes & Community Signals
Several recurring failure modes surface from both the code and community discussions:
- Stale local model after switching MODEL. The setup container only pulls the model into the ollama/ volume on first boot; changing MODEL requires either re-running the setup or clearing the volume (Source: deployments/single/README.md).
- Windows path expansion. STATE_DIR is not expanded relative to PWD on Windows, so it must be an absolute path in .env (Source: deployments/multiple-ragapps/README.md).
- UI rendering in multi-tenant mode. The 0.1.4 patch explicitly fixed "cannot show document files in multiple deployments mode," suggesting that the Manager and per-RAGapp Admin UI have historically diverged in how they handle file rendering (Source: package.json).
- Configuration drift between .env and Admin UI. Community issue #149 requests exposing backend variables such as TOP_K, LLM_TEMPERATURE, and VECTOR_STORE_PROVIDER as Admin UI settings so operators do not have to re-deploy the container. The EnvConfigManager path in config.py already supports in-place updates with rollback, providing the foundation for that work (Source: src/ragapp/backend/routers/management/config.py).
- Multi-agent model compatibility. The multi_agent_supported endpoint reads Settings.llm.metadata.is_function_calling_model, meaning that enabling the multi-agent mode requires a function-calling-capable model. Misconfiguration here results in a silent capability mismatch surfaced only at runtime (Source: src/ragapp/backend/routers/management/agents.py).
See Also
- README.md — top-level project introduction and quickstart
- deployments/single/README.md — single-container deployment with Ollama + Qdrant
- deployments/multiple-ragapps/README.md — multi-tenant deployment with Manager, Traefik, and Keycloak
- src/manager/README.md — Manager service development notes
- Community issue #265 — OpenAI-compatible /v1/chat/completions request
- Community issue #149 — exposing TOP_K and other backend settings in the Admin UI
Source: https://github.com/ragapp/ragapp / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 13 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: identity.distribution | https://github.com/ragapp/ragapp
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/289
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/287
4. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/271
5. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/ragapp/ragapp
6. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/283
7. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/ragapp/ragapp
8. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/ragapp/ragapp
9. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/ragapp/ragapp
10. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/282
11. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/ragapp/ragapp/issues/293
12. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/ragapp/ragapp
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using ragapp with real data or production workflows.
- Could not fetch Ollama models. Make sure the Ollama base URL is accessib - github / github_issue
- Add OpenRouter adapter for LLM - github / github_issue
- Path traversal in knowledge file upload allows writing files outside dat - github / github_issue
- Add governance and audit trails for enterprise RAG deployments - github / github_issue
- Question / suggestion: use WFGY 16-problem map as an optional RAG failur - github / github_issue
- Add 'BaseURL' for OpenAI provider - github / github_issue
- Add OpenRouter adapter for LLM - github / github_issue
- Response always empty - github / github_issue
- Add user memory - github / github_issue
- Release v0.1.5 - github / github_release
- Release v0.1.4 - github / github_release
- Release v0.1.3 - github / github_release
Source: Project Pack community evidence and pitfall evidence