quivr Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

quivr

Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.

Quivr-Core Overview & Quick Start

Related topics: Brain, RAG Engine & LangGraph Workflows, LLM Endpoints, Tools & Configuration

Purpose and Scope

quivr-core is the standalone RAG (Retrieval-Augmented Generation) engine that powers Quivr.com. It is packaged as an installable Python library and is the "brain" of the larger Quivr monorepo. The README describes the goal succinctly: *"This is the core of Quivr, the brain of Quivr.com"* Source: README.md:31-33.

The project is positioned as a developer-facing library: *"We take care of the RAG so you can focus on your product. Simply install quivr-core and add it to your project"* Source: README.md:23-27. The library is published as quivr-core on PyPI and is independently versioned — the latest tagged core release is v0.0.33 (2025-02-03) Source: core/CHANGELOG.md:1-9.

Quivr-core provides:

A Brain abstraction for ingesting files and asking questions.
Pluggable file processors (e.g. SimpleTxtProcessor).
A configurable workflow engine driven by YAML.
A tool registry for extending LLM capabilities (search, custom tools).
Integration with Megaparse, LangChain, Anthropic, OpenAI, Mistral, and Ollama.

High-Level Architecture

The library is organized around three pillars: Brains (user-facing entry point), Processors (file ingestion and chunking), and Workflows (YAML-defined RAG pipelines).

flowchart LR
    User[Developer] -->|pip install| QCore[quivr-core package]
    QCore --> Brain[Brain.from_files]
    Brain --> Processor[File Processor Registry]
    Processor --> Simple[SimpleTxtProcessor]
    Processor --> Custom[Custom Processors]
    Brain --> Workflow[YAML Workflow Config]
    Workflow --> LLM[LLM Endpoint<br/>OpenAI/Anthropic/Mistral/Ollama]
    Workflow --> Tools[Tool Registry]
    Tools --> Search[Internet Search]
    Tools --> CustomTools[Custom Tools]
    Brain --> Answer[ask -> Answer]

The configuration backbone is the QuivrBaseConfig Pydantic model, which forbids unknown keys and provides a from_yaml classmethod Source: core/quivr_core/base_config.py:13-49. Every workflow, splitter, and processor config inherits from this class, ensuring strict schema validation across the project.
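The "forbid unknown keys" behaviour can be illustrated without Pydantic. The following is a minimal stdlib sketch, not the library's implementation: `StrictConfig`, `SplitterSettings`, `from_dict`, and the default values are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class StrictConfig:
    """Stdlib stand-in for a config base class that rejects unknown keys."""

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "StrictConfig":
        allowed = {f.name for f in fields(cls)}
        unknown = set(data) - allowed
        if unknown:
            # Mirrors the strict-schema idea: a typo in a key is an error.
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        return cls(**data)


@dataclass
class SplitterSettings(StrictConfig):  # hypothetical subclass
    chunk_size: int = 400  # placeholder default, not taken from the library
    chunk_overlap: int = 100  # placeholder default, not taken from the library


print(SplitterSettings.from_dict({"chunk_size": 256}))

try:
    SplitterSettings.from_dict({"chunk_sise": 256})  # misspelled key
except ValueError as exc:
    print(exc)
```

Rejecting unknown keys at load time means a misspelled option fails immediately instead of being silently ignored.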

Quick Start

Prerequisites

Python 3.10 or newer is required Source: README.md:48-50.
An LLM provider API key (e.g. OPENAI_API_KEY). Quivr supports *"APIs from Anthropic, OpenAI, and Mistral. It also supports local models using Ollama"* Source: README.md:83-85.

Installation

pip install quivr-core

Source: core/README.md:5-11 confirms the package is quivr-core, distributed under the Apache 2.0 License.

Minimal Example: Ask a Question About a File

The README advertises a "30 seconds installation" that creates a working RAG with five lines of code Source: README.md:52-79:

import tempfile

from quivr_core import Brain

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
        temp_file.write("Gold is a liquid of blue-like colour.")
        temp_file.flush()

        brain = Brain.from_files(
            name="test_brain",
            file_paths=[temp_file.name],
        )

        answer = brain.ask("what is gold? answer in french")
        print("answer:", answer)

The Brain.from_files factory handles ingestion, chunking, embedding, and storage. Once constructed, brain.ask(question) runs the configured workflow and returns the answer.
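To make the four stages concrete, here is a toy stdlib sketch of ingest, chunk, embed, store, and retrieve. It is not quivr-core code: `ToyBrain`, `embed`, and `cosine` are hypothetical, the "embedding" is a bag-of-words count, and no LLM is involved.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (not a real embedder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyBrain:
    """Hypothetical miniature of the ingest -> chunk -> embed -> store flow."""

    def __init__(self) -> None:
        self.store: list[tuple[str, Counter]] = []

    def ingest(self, text: str, chunk_words: int = 8) -> None:
        words = text.split()
        for i in range(0, len(words), chunk_words):
            chunk = " ".join(words[i : i + chunk_words])
            self.store.append((chunk, embed(chunk)))  # embed and store

    def retrieve(self, question: str, k: int = 1) -> list[str]:
        q = embed(question)
        ranked = sorted(self.store, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


brain = ToyBrain()
brain.ingest("Gold is a liquid of blue-like colour. Silver is a solid metal.")
print(brain.retrieve("what is gold"))
```

In the real library the retrieved chunks would be passed to an LLM to produce the answer; the sketch stops at retrieval.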

Configuring a Workflow (Basic RAG)

Workflows are declared in YAML. The README example basic_rag_workflow.yaml defines nodes such as START, filter_history, and rewrite Source: README.md:91-103. The workflow_config block names the pipeline, lists its nodes, and declares edges between them. Configurable retrieval workflows were introduced in v0.0.17 Source: core/CHANGELOG.md:33-35.
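The nodes-and-edges structure can be sketched as plain Python data. The node names below follow the README example; the exact YAML schema, the workflow name, and the `linear_path` helper are assumptions made for illustration.

```python
# Hypothetical in-memory form of a workflow_config block.
workflow = {
    "name": "basic RAG",
    "nodes": [
        {"name": "START", "edges": ["filter_history"]},
        {"name": "filter_history", "edges": ["rewrite"]},
        {"name": "rewrite", "edges": ["retrieve"]},
        {"name": "retrieve", "edges": ["generate_rag"]},
        {"name": "generate_rag", "edges": ["END"]},
    ],
}


def linear_path(config: dict) -> list[str]:
    """Follow the first edge of each node from START until END is reached."""
    edges = {node["name"]: node["edges"] for node in config["nodes"]}
    path, current = ["START"], "START"
    while current != "END":
        if current not in edges or not edges[current]:
            raise ValueError(f"Node has no outgoing edge: {current}")
        current = edges[current][0]
        if current in path:
            raise ValueError(f"Cycle detected at node: {current}")
        path.append(current)
    return path


print(" -> ".join(linear_path(workflow)))
```

A check like this catches an edge that points at an undefined node before the pipeline runs.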

Processors and Splitters

Each file type has a dedicated processor. The default text processor, SimpleTxtProcessor, registers against FileExtension.txt and uses a recursive character splitter Source: core/quivr_core/processor/implementations/simple_txt_processor.py:24-49. The splitter enforces chunk_overlap < chunk_size and exposes its settings via SplitterConfig. Processor metadata — including the class name and splitter config — is reported through processor_metadata for observability.
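The reason for the `chunk_overlap < chunk_size` constraint is easiest to see in a fixed-window splitter: the window advances by `chunk_size - chunk_overlap`, so an overlap equal to or larger than the size would never move forward. The sketch below is a simplified character-window splitter, not the recursive character splitter the processor uses; `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of chunk_size characters, stepping by size minus overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
```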

Tool Registry

Beyond retrieval, the library supports callable tools via a ToolRegistry Source: core/quivr_core/llm_tools/entity.py:24-33:

ToolsCategory groups tools with a shared name, description, default, and factory callable.
ToolWrapper pairs a LangChain BaseTool with input/output formatters.
ToolRegistry.register_tool and create_tool allow registering and instantiating tools by name; unknown names raise ValueError.

This makes internet search, Zendesk workflows (added in v0.0.33 Source: core/CHANGELOG.md:5-8), and third-party tools pluggable without modifying the core engine.
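The register-then-create pattern described above can be sketched in a few lines. This is an illustrative stand-in, not the library's `ToolRegistry`: `MiniToolRegistry`, `make_shout_tool`, and the plain-callable tool type are hypothetical, and real tools would be LangChain `BaseTool` objects.

```python
from typing import Any, Callable

Tool = Callable[[str], str]
ToolFactory = Callable[[dict[str, Any]], Tool]


class MiniToolRegistry:
    """Hypothetical stand-in for a name -> factory tool registry."""

    def __init__(self) -> None:
        self._factories: dict[str, ToolFactory] = {}

    def register_tool(self, name: str, factory: ToolFactory) -> None:
        self._factories[name] = factory

    def create_tool(self, name: str, config: dict[str, Any]) -> Tool:
        if name not in self._factories:
            # Unknown names fail loudly instead of returning a dummy tool.
            raise ValueError(f"Tool {name} is not supported.")
        return self._factories[name](config)


def make_shout_tool(config: dict[str, Any]) -> Tool:
    suffix = config.get("suffix", "!")
    return lambda text: text.upper() + suffix


registry = MiniToolRegistry()
registry.register_tool("shout", make_shout_tool)
shout = registry.create_tool("shout", {"suffix": "?"})
print(shout("hello"))
```

Storing factories rather than instances lets each caller supply its own configuration at creation time.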

Example Applications

The repository ships with runnable examples that exercise different surfaces of quivr-core:

Example	Interface	Purpose	Source
`simple_question`	Script	Single-shot Q&A over a file	examples/simple_question/README.md
`chatbot`	Chainlit UI	Upload-and-chat interface over uploaded files	examples/chatbot/README.md
`chatbot_voice`	Chainlit + voice	Voice-driven variant of the chatbot	examples/chatbot_voice/README.md

The chatbot example uses rye for environment management and prompts the user to upload a .txt file before answering questions about its contents Source: examples/chatbot/README.md:1-35.

Known Limitations and Community Notes

Several community discussions shape how quivr-core is positioned and used:

Model providers: Azure OpenAI support has been requested and tracked (issue #650); v0.0.32 added o3-mini Source: core/CHANGELOG.md:11-13.
Long conversations: Issue #3135 highlights the need for conversation memory; quivr-core provides per-call Brain.ask and configurable workflows that can be extended with external memory layers.
Pydantic v1 removal: Completed in v0.0.28 Source: core/CHANGELOG.md:78-80, confirming the project has migrated to Pydantic v2.
Tokenizer caching: Added in v0.0.31 to bound cache size Source: core/CHANGELOG.md:21-30, useful for high-throughput deployments.
Multi-modal ingestion: Open feature request (issue #3684) for video/audio via Whisper and vision models — not yet implemented in quivr-core.
Vector database flexibility: The original Quivr platform tied users to Supabase (issues #181, #484, #618), but quivr-core is library-only and does not impose that dependency.

Brain, RAG Engine & LangGraph Workflows

Related topics: Quivr-Core Overview & Quick Start, Processors, Files & Storage, LLM Endpoints, Tools & Configuration

Overview

Quivr's quivr-core library is the RAG engine that powers the wider Quivr product. Its central abstraction is the Brain class, which represents a per-user knowledge base backed by a vector store, an embedding model, and a language model. Around that, the RAG module implements a multi-stage LangGraph workflow that decomposes user input, optionally augments it with web search or other tools, retrieves relevant context, and produces a final answer with citations and follow-up questions.

The promise documented in the project README is that a developer can go from pip install quivr-core to "ingest your files and ask questions" with minimal code, while still being able to customize the RAG pipeline at every stage (README.md:1-40). The community-raised issues around Supabase and vector database choice (#484, #181) and "long conversations losing context" (#3135) all concern this layer, since Brain and the RAG workflow are the components that own retrieval, memory and tool orchestration.

The `Brain` Class

Brain is the main entry point exported by quivr_core (core/quivr_core/brain/__init__.py:1-3). It is created from one of three sources:

Files on disk — Brain.from_files(name, file_paths=[...]) picks a file-extension-aware processor and chunks the documents.
Pre-built LangChain documents — Brain.afrom_langchain_documents(name, langchain_documents=[...]) skips the loader step and uses caller-supplied langchain_core.documents.Document objects. It builds a default vector DB if none is supplied, otherwise delegates to vector_db.aadd_documents (core/quivr_core/brain/brain.py:afrom_langchain_documents).
A persisted brain — deserialization is handled by a dedicated module so that a brain can survive across process restarts.

Internally, every Brain carries a UUID id, a name, a storage backend, an llm, an embedder, and a vector_db. These are the only fields the workflow consumes (core/quivr_core/brain/brain.py:afrom_langchain_documents).

Search

Brain.asearch(query, n_results=5, filter=None, fetch_n_neighbors=20) returns a list[SearchResult]. The fetch_n_neighbors parameter is used during the vector fetch step before optional reranking, which is what allows the engine to over-fetch from a vector store and then narrow results with a reranker (core/quivr_core/brain/brain.py:asearch). The filter argument accepts a callable or a metadata dict, mirroring LangChain's retriever interface.
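The over-fetch-then-narrow idea can be sketched with two stand-in scoring functions. Nothing here is quivr-core code: `cheap_fetch` plays the role of the vector fetch, `phrase_score` plays the role of a reranker, and `CORPUS` is made-up data.

```python
from typing import Callable

CORPUS = [
    "gold price today",
    "gold is a metal",
    "silver is a metal",
    "what is gold made of",
    "blue liquid",
]


def cheap_fetch(query: str, limit: int) -> list[str]:
    """Stand-in for the vector fetch: rank by shared words, keep `limit` docs."""
    words = set(query.split())
    ranked = sorted(CORPUS, key=lambda d: len(words & set(d.split())), reverse=True)
    return ranked[:limit]


def phrase_score(query: str, doc: str) -> float:
    """Stand-in for a reranker: reward documents containing the exact phrase."""
    return 1.0 if query in doc else 0.0


def search(
    query: str,
    n_results: int = 5,
    fetch_n_neighbors: int = 20,
    fetch: Callable[[str, int], list[str]] = cheap_fetch,
    score: Callable[[str, str], float] = phrase_score,
) -> list[str]:
    candidates = fetch(query, fetch_n_neighbors)  # over-fetch a wide pool
    reranked = sorted(candidates, key=lambda d: score(query, d), reverse=True)
    return reranked[:n_results]  # narrow to what the caller asked for


print(search("is a metal", n_results=2, fetch_n_neighbors=4))
```

Fetching more candidates than are returned gives the second, more precise scorer room to promote documents the first pass ranked lower.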

File processing

The default processor dynamically instantiates loader classes from langchain_community.document_loaders for a wide range of extensions — including BibtexLoader, CSVLoader, Docx2txtLoader, NotebookLoader, PythonLoader, UnstructuredEPubLoader, UnstructuredExcelLoader, UnstructuredHTMLLoader, UnstructuredMarkdownLoader, UnstructuredODTLoader, UnstructuredPDFLoader, UnstructuredPowerPointLoader, and TextLoader (core/quivr_core/processor/implementations/default.py:1-30). Token counting uses tiktoken's cl100k_base encoding. For richer PDF extraction, a TikaProcessor is also shipped, configured by default to talk to http://localhost:9998/tika and tunable through the TIKA_SERVER_URL environment variable (core/quivr_core/processor/implementations/tika_processor.py:1-40).

RAG Engine & LangGraph Workflow

The RAG engine is implemented in core/quivr_core/rag/quivr_rag_langgraph.py as QuivrQARAGLangGraph. The constructor takes a RetrievalConfig, an LLMEndpoint, a VectorStore, and an optional BaseDocumentCompressor reranker (defaulting to IdempotentCompressor when none is supplied) (core/quivr_core/rag/quivr_rag_langgraph.py:QuivrQARAGLangGraph.__init__).

Rerankers and retrievers

get_reranker(**kwargs) resolves the reranker from the retrieval config. It supports DefaultRerankers.COHERE (via CohereRerank) and DefaultRerankers.JINA (via JinaRerank), with explicit model, top_n and api_key parameters. Any unknown supplier falls back to IdempotentCompressor, which performs no reranking (core/quivr_core/rag/quivr_rag_langgraph.py:get_reranker).
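The fallback behaviour can be sketched as a lookup with a no-op default. This is an assumption-laden illustration rather than the library's `get_reranker`: `pick_reranker`, `SUPPLIERS`, and the toy `shortest_first` reranker are hypothetical, and the warning on unknown suppliers is a suggestion, not documented behaviour.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("reranker_sketch")

Reranker = Callable[[str, list[str]], list[str]]


def identity_reranker(query: str, docs: list[str]) -> list[str]:
    """No-op stand-in: returns the documents in their original order."""
    return list(docs)


def make_shortest_first(top_n: int) -> Reranker:
    """Toy reranker used only to make the example observable."""
    return lambda query, docs: sorted(docs, key=len)[:top_n]


# Hypothetical supplier table; real suppliers would wrap Cohere or Jina clients.
SUPPLIERS: dict[str, Callable[[int], Reranker]] = {"shortest_first": make_shortest_first}


def pick_reranker(supplier: Optional[str], top_n: int = 3) -> Reranker:
    factory = SUPPLIERS.get((supplier or "").lower())
    if factory is None:
        if supplier:
            # Surfacing the fallback makes a misconfigured supplier visible.
            logger.warning("Unknown reranker supplier %r, using no-op", supplier)
        return identity_reranker
    return factory(top_n)


print(pick_reranker("shortest_first", top_n=1)("q", ["long document", "short"]))
print(pick_reranker("typo-supplier")("q", ["b", "a"]))
```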

Prompt-driven multi-stage workflow

The workflow is a state machine driven by ChatPromptTemplate instances registered in core/quivr_core/rag/prompts.py. The TemplatePromptName enum holds the names of every step, including SPLIT_PROMPT, UPDATE_PROMPT, and the answer-grading prompts. The SPLIT_PROMPT instructs the model to split a user turn into standalone, self-contained tasks and to condense behavior-shaping instructions into a separate string, returning a list of Tasks plus a found flag (core/quivr_core/rag/prompts.py:SPLIT_PROMPT). The UPDATE_PROMPT rewrites the system prompt and tool list based on activated tools (core/quivr_core/rag/prompts.py:UPDATE_PROMPT).

Pydantic models TasksCompletion, FinalAnswer and UpdatedPromptAndTools are used to coerce LLM output into structured fields. FinalAnswer carries reasoning_answer, answer, and all_tasks_completed so the workflow can decide whether to retry, fetch more context, or call a tool (core/quivr_core/rag/entities/models.py:FinalAnswer).
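A structured answer of this shape lets the workflow branch on a boolean rather than parse free text. The sketch below uses a dataclass and JSON instead of Pydantic; the three field names come from the description above, while `FinalAnswerSketch`, `parse_final_answer`, `next_step`, and the step labels are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class FinalAnswerSketch:
    reasoning_answer: str
    answer: str
    all_tasks_completed: bool


def parse_final_answer(raw: str) -> FinalAnswerSketch:
    """Coerce a JSON string (as an LLM might emit) into typed fields."""
    data = json.loads(raw)
    return FinalAnswerSketch(
        reasoning_answer=str(data["reasoning_answer"]),
        answer=str(data["answer"]),
        all_tasks_completed=bool(data["all_tasks_completed"]),
    )


def next_step(result: FinalAnswerSketch) -> str:
    # Hypothetical step labels; the real workflow defines its own nodes.
    return "finish" if result.all_tasks_completed else "gather_more_context"


raw = '{"reasoning_answer": "context covers it", "answer": "Paris", "all_tasks_completed": true}'
parsed = parse_final_answer(raw)
print(parsed.answer, next_step(parsed))
```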

flowchart LR
    A[User input + chat history] --> B[SPLIT_PROMPT<br/>Tasks]
    B --> C[UPDATE_PROMPT<br/>UpdatedPromptAndTools]
    C --> D{Task completable?}
    D -- yes --> E[Retrieve & rerank<br/>VectorStore + Reranker]
    D -- no --> F[Activate tool<br/>WebSearch / Other]
    F --> E
    E --> G[FinalAnswer<br/>citations, follow-ups]
    G --> H[RAGResponseMetadata<br/>sources, workflow_step]

Tool registry

Tools are categorised through TOOLS_CATEGORIES and a TOOLS_LISTS registry in core/quivr_core/llm_tools/llm_tools.py. LLMToolFactory.create_tool(tool_name, config) looks up the tool, falling back to the category's default_tool if the caller passes a category name (e.g. "Web Search") (core/quivr_core/llm_tools/llm_tools.py:LLMToolFactory). Web search is implemented by create_tavily_tool, which wraps the Tavily search engine, normalises results into langchain_core.documents.Document objects, and exposes them through WebSearchTools (core/quivr_core/llm_tools/web_search_tools.py:create_tavily_tool, WebSearchTools).

Response shape

ParsedRAGResponse and ParsedRAGChunkResponse wrap answer strings with RAGResponseMetadata, which holds citations, followup_questions, sources, ChatLLMMetadata, workflow_step, and optional LangchainMetadata for Langfuse tracing (core/quivr_core/rag/entities/models.py:ParsedRAGResponse, RAGResponseMetadata). Every consumer, including the example chatbots, reads this structure when displaying a turn.

Configuration, Customization & Community Context

Retrieval, reranking, and LLM behaviour are all driven by the YAML-based RetrievalConfig referenced in the README's "30 seconds installation" walkthrough. Editing the YAML, reloading it into a RetrievalConfig, and calling Brain.ask(question, retrieval_config=retrieval_config) again is the supported path for A/B testing different pipelines (README.md:30-90).

The chatbot example in examples/chatbot and the voice variant in examples/chatbot_voice both demonstrate the same end-to-end pattern: build a brain from uploaded files, call ask, and render answer.answer to the user (examples/chatbot/README.md:1-30, examples/chatbot_voice/README.md:1-30). The examples/quivr-whisper README shows an alternative front-end that transcribes audio with OpenAI Whisper and queries the same Quivr RAG API (examples/quivr-whisper/README.md:1-30).

Known limitations surfaced by the community

Concern	Where it lives in the engine
Vector DB lock-in / self-hosting (#484, #618, #181)	`Brain.vector_db` is pluggable; choose a non-managed `VectorStore` and pass it to `afrom_langchain_documents` (core/quivr_core/brain/brain.py:afrom_langchain_documents)
Long conversations lose context (#3135)	`chat_history` flows into `SPLIT_PROMPT`; memory must be passed in by the caller (core/quivr_core/rag/prompts.py:SPLIT_PROMPT)
Multi-modal ingestion (audio/video, #3684)	Currently limited to text loaders; multi-modal would need new processor subclasses under `processor/implementations/` (core/quivr_core/processor/implementations/default.py:1-30)
Custom Rerankers / LLMs	`get_reranker` supports Cohere and Jina out of the box; new suppliers extend `DefaultRerankers` (core/quivr_core/rag/quivr_rag_langgraph.py:get_reranker)

A common failure mode is silently falling back to IdempotentCompressor when a reranker is misconfigured — the workflow still completes, but quality drops without a visible error. Other recurring issues (project maintenance pace, EU AI Act compliance) are tracked at the platform level rather than the engine (Issue #3681, Issue #3667).

Processors, Files & Storage

Related topics: Brain, RAG Engine & LangGraph Workflows, LLM Endpoints, Tools & Configuration

Purpose and Scope

The processor subsystem is the ingestion layer of quivr-core. It accepts user files of many formats, normalizes them into LangChain Document objects, chunks them, and feeds them to the vector store that backs a Brain. According to the project README, *"quivr works with any file, you can use it with PDF, TXT, Markdown, etc and even add your own parsers"* (Source: README.md).

The subsystem has three concerns:

File abstraction — a uniform QuivrFile object that exposes a path and a recognized extension.
Processor implementations — one parser per file family, registered against the supported extensions.
Splitting & storage — chunking of parsed text and handoff to the vector database used by the Brain.

File Abstraction and the Extension Registry

QuivrFile wraps an on-disk file together with a strongly-typed FileExtension enum, so dispatch in the registry can be done with simple equality checks rather than fragile string matching (Source: core/quivr_core/files/file.py).

The processor registry maps each FileExtension value to a concrete ProcessorBase subclass. Processors declare the set of extensions they handle via the supported_extensions class attribute, and the registry resolves the correct processor at ingest time. Each implementation is a thin dynamic wrapper built by the _build_processor factory in default.py, which pairs a LangChain document loader with the extensions it understands (Source: core/quivr_core/processor/implementations/default.py).
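Dispatch by a typed extension can be sketched as an enum plus a dictionary. The names below (`Ext`, `TxtProcessor`, `register`, `processor_for`) are hypothetical and the sketch omits the loader and splitter; only the `supported_extensions` attribute mirrors the description above.

```python
from enum import Enum
from pathlib import Path


class Ext(str, Enum):
    txt = ".txt"
    md = ".md"
    pdf = ".pdf"


REGISTRY: dict[Ext, type] = {}


def register(proc_cls: type) -> type:
    """Class decorator: bind the processor to each extension it declares."""
    for ext in proc_cls.supported_extensions:
        REGISTRY[ext] = proc_cls
    return proc_cls


@register
class TxtProcessor:
    supported_extensions = [Ext.txt]


def processor_for(path: str) -> type:
    suffix = Path(path).suffix.lower()
    try:
        ext = Ext(suffix)  # typed lookup instead of ad-hoc string matching
    except ValueError:
        raise ValueError(f"Unrecognized extension: {suffix!r}") from None
    if ext not in REGISTRY:
        raise ValueError(f"No processor registered for {ext.value}")
    return REGISTRY[ext]


print(processor_for("notes.TXT").__name__)
```

The two error branches distinguish an extension the enum does not know from one that is known but has no processor registered.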

The default registry covers the formats listed below. The implementation file shows how each loader is bound to its extensions through _build_processor:

File family	Extensions	Loader used
CSV	`.csv`	`CSVLoader`
Word	`.docx`	`Docx2txtLoader`
Excel	`.xlsx`, `.xlsm`	`UnstructuredExcelLoader`
PowerPoint	`.pptx`	`UnstructuredPowerPointLoader`
Markdown	`.md`, `.mdx`, `.markdown`	`UnstructuredMarkdownLoader`
EPUB	`.epub`	`UnstructuredEPubLoader`
BibTeX	`.bib`	`BibtexLoader`
ODT	`.odt`	`UnstructuredODTLoader`
HTML	`.html`	`UnstructuredHTMLLoader`
Python	`.py`	`PythonLoader`
Notebook	`.ipynb`	`NotebookLoader`
PDF (default)	`.pdf`	`UnstructuredPDFLoader`

Source: core/quivr_core/processor/implementations/default.py:1-60

Alternative and Specialized Processors

Beyond the default table, the codebase ships additional processors that can be selected for specific needs:

SimpleTxtProcessor — a lightweight, dependency-free loader for .txt files that reads via aiofiles and applies a custom recursive_character_splitter instead of the LangChain splitter. It is useful in minimal environments where the full unstructured stack is undesirable (Source: core/quivr_core/processor/implementations/simple_txt_processor.py).
TikaProcessor — delegates parsing of .pdf files to an external Apache Tika server, configurable via the TIKA_SERVER_URL environment variable (default http://localhost:9998/tika). It uses httpx.AsyncClient with a configurable timeout and retry count. The docstring recommends running Tika with docker run -d -p 9998:9998 apache/tika (Source: core/quivr_core/processor/implementations/tika_processor.py).
MegaparseProcessor — integrates with the external Megaparse service for high-fidelity document parsing, mentioned in the README as a supported ingestion path (Source: README.md).
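The environment-variable default and the retry behaviour described for the Tika-backed processor can be sketched without any network access. This is not the processor's code: `tika_url`, `with_retries`, and `flaky_parse` are hypothetical helpers, and the real implementation uses an async HTTP client rather than a plain callable.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def tika_url() -> str:
    """Read the server URL from the environment, with the documented default."""
    return os.environ.get("TIKA_SERVER_URL", "http://localhost:9998/tika")


def with_retries(call: Callable[[], T], max_retries: int = 3, delay_s: float = 0.0) -> T:
    """Call `call` up to max_retries times, retrying on ConnectionError."""
    last_error = None
    for _ in range(max_retries):
        try:
            return call()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay_s)
    raise RuntimeError(f"Request failed after {max_retries} attempts") from last_error


attempts = {"count": 0}


def flaky_parse() -> str:
    """Simulated request that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("server not reachable yet")
    return "parsed text"


print(tika_url())
print(with_retries(flaky_parse))
```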

Splitter Configuration and the Ingestion Pipeline

Every processor accepts a splitter_config: SplitterConfig and an optional pre-built TextSplitter. When no splitter is supplied, the processor constructs a RecursiveCharacterTextSplitter.from_tiktoken_encoder using the chunk size and chunk overlap from SplitterConfig (Source: core/quivr_core/processor/implementations/default.py).

SplitterConfig and the surrounding configuration objects are defined in the RAG config module. ParserConfig bundles SplitterConfig with MegaparseConfig, which is in turn wrapped by IngestionConfig. This lets a Brain ingest files with one consistent chunking policy while still allowing per-format overrides (Source: core/quivr_core/rag/entities/config.py).

flowchart LR
    A[QuivrFile] --> B{ProcessorBase}
    B --> C[LangChain Loader]
    C --> D[RecursiveCharacterTextSplitter]
    D --> E[ProcessedDocument]
    E --> F[VectorStore via Brain]
    F --> G[LangChain Documents]

The process_file_inner method on every processor loads the file with the configured loader, splits the resulting documents, and returns a ProcessedDocument that the Brain can hand to the vector database (Source: core/quivr_core/processor/implementations/default.py).

Storage and Integration with Brain

Processed chunks are persisted through a pluggable vector store. The Brain class accepts an external vector_db; when none is provided, Brain.afrom_langchain_documents calls build_default_vectordb to construct one in-process using the supplied embedder (Source: core/quivr_core/brain/brain.py).

This decoupled design is what enables the popular community request to *"allow the user to choose which db she uses locally"*, including open-source backends such as Milvus and ChromaDB (Source: Issue #484). The retrieval side then performs similarity search with optional reranking — controlled by RetrievalConfig.reranker_config — which exposes Cohere and Jina rerankers alongside the default idempotent compressor (Source: core/quivr_core/rag/entities/config.py).

The chatbot example demonstrates the end-to-end flow: a user uploads a .txt file, the processor splits it, the Brain stores the embeddings, and the Chainlit UI lets the user ask questions whose answers are grounded in the parsed chunks (Source: examples/chatbot/README.md).

Common Failure Modes and Configuration Pitfalls

No loader matched — when a file extension is not in any registered processor's supported_extensions, ingestion will fail. Users who want a new format must register a new processor instance.
External services unreachable — TikaProcessor and MegaparseProcessor depend on external HTTP services; connection errors and timeouts (5.0 s by default) are the most common runtime failures (Source: core/quivr_core/processor/implementations/tika_processor.py).
Oversized chunks — SplitterConfig defaults can produce chunks that exceed the target LLM context. Tuning chunk_size and chunk_overlap is required for long documents.
Tokenizer mismatch — processors default to cl100k_base via tiktoken, which is correct for OpenAI models but may be inaccurate for non-OpenAI providers (Source: core/quivr_core/processor/implementations/default.py).

LLM Endpoints, Tools & Configuration

Related topics: Brain, RAG Engine & LangGraph Workflows, Quivr-Core Overview & Quick Start

Overview

quivr-core ships a pluggable runtime that decouples the RAG pipeline from the underlying language model, the tools that augment the model, and the YAML-driven configuration that controls them. Three subsystems collaborate at answer-generation time:

LLMEndpoint — a cached factory that instantiates and reuses a chat model (OpenAI, Anthropic, Mistral, Meta, Groq, or any OpenAI-compatible base_url).
LLMTools registry — a category-based registry that exposes LangChain BaseTool wrappers (Tavily web search, cited_answer, etc.) to the RAG workflow.
RetrievalConfig / WorkflowConfig — Pydantic models loaded from YAML (or Python) that wire rerankers, LLM parameters, history depth, and LangGraph node sequences together.

The README advertises this flexibility explicitly: *"Customize your RAG: Quivr allows you to customize your RAG, add internet search, add tools, etc."* Source: README.md.

flowchart LR
    YAML[RetrievalConfig YAML] --> RC[RetrievalConfig]
    RC --> LLM[LLMEndpointConfig]
    RC --> RR[RerankerConfig]
    RC --> WF[WorkflowConfig]
    LLM --> EP[LLMEndpoint<br/>cached singleton]
    EP --> RAG[QuivrQARAGLangGraph]
    RR --> RAG
    WF --> RAG
    RAG --> Tools[LLMToolFactory]
    Tools --> T1[Tavily Web Search]
    Tools --> T2[cited_answer]
    RAG --> Stream[RAGResponse / chunks]

LLM Endpoints

LLMEndpoint wraps a chat model behind a from_config(...) factory. The factory hashes the configuration and caches the resulting instance, so repeated calls with the same parameters reuse the same client — an optimization added in release core-0.0.30 (*"adding cache to LLMEndpoint"*). Source: core/quivr_core/llm/llm_endpoint.py:71-95.
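Caching by configuration hash can be sketched as follows. This is a stdlib illustration under stated assumptions, not the library's code: `EndpointSketch` is hypothetical, the config is a plain dict, and the key is a SHA-256 of its sorted JSON form.

```python
import hashlib
import json
from typing import Any


class EndpointSketch:
    """Hypothetical endpoint whose instances are reused per configuration."""

    _cache: dict[str, "EndpointSketch"] = {}

    def __init__(self, config: dict[str, Any]) -> None:
        self.config = dict(config)

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "EndpointSketch":
        # Sorting keys makes the hash independent of dict insertion order.
        key = hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()
        if key not in cls._cache:
            cls._cache[key] = cls(config)
        return cls._cache[key]


a = EndpointSketch.from_config({"model": "gpt-4o", "temperature": 0.3})
b = EndpointSketch.from_config({"temperature": 0.3, "model": "gpt-4o"})
c = EndpointSketch.from_config({"model": "gpt-4o", "temperature": 0.7})
print(a is b, a is c)
```

Any change to a parameter produces a different key, so callers never receive a client built with stale settings.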

LLMEndpointConfig accepts a model, llm_api_key, llm_base_url, temperature, max_output_tokens, and supplier. When the supplier is OPENAI_COMPATIBLE, the factory builds a ChatOpenAI pointed at a custom base_url (Ollama, vLLM, Azure, etc.). Source: core/quivr_core/llm/llm_endpoint.py:55-80 and core/quivr_core/rag/entities/config.py:21-58.

Supported Model Suppliers

LLMEndpoint ships preset metadata for several providers; the table below summarises the entries that drive context-window budgeting and tokenizer selection.

Supplier	Example Models	`max_context_tokens`	`tokenizer_hub`
OpenAI	`gpt-4o`, `gpt-3.5-turbo`	per-model preset	`Quivr/text-embedding-ada-002`
Anthropic	`claude-opus-4`, `claude-sonnet-4`, `claude-3-7-sonnet`	200000	`Quivr/claude-tokenizer`
Mistral	`mistral-large`	100000	`Quivr/claude-tokenizer`
Meta	`llama-3.1`, `llama-3`, `code-llama`	128000 / 8192 / 16384	`Quivr/Meta-Llama-3.1-Tokenizer`
Groq	`llama-3.3-70b`, `llama-3.1-70b`	128000	`Quivr/Meta-Llama-3.1-Tokenizer`

Source: core/quivr_core/rag/entities/config.py:30-120.
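Because preset keys such as `llama-3` and `llama-3.1` share a prefix, a lookup by model name needs a tie-breaking rule. The sketch below uses longest-prefix matching with values copied from the table above; the matching rule itself and the `max_context_for` helper are assumptions, not confirmed library behaviour.

```python
from typing import Optional

# Values copied from the table above; the prefix-matching rule is an assumption.
MAX_CONTEXT_TOKENS = {
    "llama-3.1": 128_000,
    "llama-3": 8_192,
    "code-llama": 16_384,
    "claude-3-7-sonnet": 200_000,
}


def max_context_for(model: str) -> Optional[int]:
    """Return the preset for the longest key that prefixes the model name."""
    matches = [key for key in MAX_CONTEXT_TOKENS if model.startswith(key)]
    if not matches:
        return None  # unknown model: the caller must supply a limit
    return MAX_CONTEXT_TOKENS[max(matches, key=len)]


print(max_context_for("llama-3.1-70b-instruct"))
print(max_context_for("llama-3-8b"))
```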

LLMEndpoint.info() returns an LLMInfo record (model, base URL, temperature, supports_function_calling) used by callers that need to choose between tool-using and plain-completion paths. The supports_func_calling() method is the runtime check that gates whether the workflow can attach tools. Source: core/quivr_core/llm/llm_endpoint.py:100-115.

Tool System

Tools live in core/quivr_core/llm_tools/. The architecture is a small registry pattern:

ToolWrapper (llm_tools/entity.py) bundles a LangChain BaseTool with format_input and format_output callables that translate between the LLM tool-call schema and Quivr's Document representation.
ToolRegistry registers factory functions per tool name and exposes create_tool(name, config).
ToolsCategory groups tools under a human-readable category ("Web Search", "Other") and provides a default_tool for shorthand lookups.

Source: core/quivr_core/llm_tools/llm_tools.py:15-30 and core/quivr_core/llm_tools/web_search_tools.py:25-55.

Web Search

WebSearchTools registers TavilySearchResults (named WebSearchToolsList.TAVILY). create_tavily_tool returns a ToolWrapper that converts the raw response into a list of Document objects, populating file_name and original_file_name with the result URL. Source: core/quivr_core/llm_tools/web_search_tools.py:1-55.

Other Tools

OtherTools is the category for non-network tools; the current entry is cited_answer (defined in core/quivr_core/rag/entities/models.py). Source: core/quivr_core/llm_tools/other_tools.py:1-20.

LLMToolFactory.create_tool(tool_name, config) walks TOOLS_CATEGORIES, dispatching on exact tool name or category alias (case-insensitive). It raises ValueError(f"Tool {tool_name} is not supported.") for unknown names. Source: core/quivr_core/llm_tools/llm_tools.py:22-30.
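The name-or-category dispatch can be sketched as a single resolution function. The category names come from the text above; the `CATEGORIES` layout, the lowercase tool identifiers, and `resolve_tool_name` are hypothetical simplifications.

```python
# Hypothetical category table; tool identifiers are lowercased for the sketch.
CATEGORIES = {
    "Web Search": {"tools": ["tavily"], "default_tool": "tavily"},
    "Other": {"tools": ["cited_answer"], "default_tool": "cited_answer"},
}


def resolve_tool_name(tool_name: str) -> str:
    """Map an exact tool name or a category alias to a concrete tool name."""
    wanted = tool_name.lower()
    for category, spec in CATEGORIES.items():
        if wanted in spec["tools"]:
            return wanted  # exact tool name
        if wanted == category.lower():
            return spec["default_tool"]  # category alias, case-insensitive
    raise ValueError(f"Tool {tool_name} is not supported.")


print(resolve_tool_name("web search"))
print(resolve_tool_name("cited_answer"))
```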

Retrieval and Workflow Configuration

RetrievalConfig is the user-facing knob. It bundles:

reranker_config (RerankerConfig) — supplier, model, top_n, api_key. Supported suppliers include cohere and jina; an unrecognized supplier returns the no-op IdempotentCompressor. Source: core/quivr_core/rag/quivr_rag_langgraph.py:35-55.
llm_config (LLMEndpointConfig) — model + credentials + sampling.
max_history (default 10) — previous turns fed back into the rewrite step. Source: core/quivr_core/rag/entities/config.py:60-75.
k (default 40) — chunks returned by the retriever before reranking.
workflow_config (WorkflowConfig) — a list of named LangGraph nodes (e.g., START → filter_history → rewrite → retrieve → generate_rag → END). Source: README.md and core/quivr_core/rag/entities/config.py:60-80.

RetrievalConfig.from_yaml(path) hydrates a YAML file like basic_rag_workflow.yaml into this Pydantic model. The __init__ hook calls llm_config.set_api_key(force_reset=True) to reconcile env-var credentials at instantiation. Source: core/quivr_core/rag/entities/config.py:65-72.

Workflow Nodes and Prompts

The default RAG workflow has five nodes. The prompts.py module assembles the system prompts that drive filter_history, rewrite, and generate_rag. The SPLIT_PROMPT, for example, instructs the model to *"collect and condense all the instructions into a single string"* and to split the user input into standalone tasks. Source: core/quivr_core/rag/prompts.py:60-95.

ChatHistory (core/quivr_core/rag/entities/chat.py) backs max_history. It stores ChatMessage instances, supports reverse-chronological retrieval via get_chat_history(newest_first=True), and is keyed by a chat_id/brain_id pair. Source: core/quivr_core/rag/entities/chat.py:18-55.
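The interaction between stored messages, `newest_first` ordering, and a history cap can be sketched with a small dataclass. This is not the library's `ChatHistory`: `ChatHistorySketch` and `last_turns` are hypothetical, messages are plain `(role, text)` tuples, and treating one turn as a question/answer pair is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChatHistorySketch:
    chat_id: str
    brain_id: str
    messages: list[tuple[str, str]] = field(default_factory=list)  # (role, text)

    def append(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def get_chat_history(self, newest_first: bool = False) -> list[tuple[str, str]]:
        return list(reversed(self.messages)) if newest_first else list(self.messages)

    def last_turns(self, max_history: int) -> list[tuple[str, str]]:
        """Keep the most recent max_history question/answer pairs."""
        if max_history <= 0:
            return []  # guard: a slice starting at -0 would return everything
        return self.messages[-2 * max_history :]


history = ChatHistorySketch(chat_id="chat-1", brain_id="brain-1")
for i in range(3):
    history.append("user", f"question {i}")
    history.append("assistant", f"answer {i}")

print(history.get_chat_history(newest_first=True)[0])
print(history.last_turns(max_history=1))
```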

Document Processing

Tools and LLMs are only useful if documents make it into the vector store. core/quivr_core/processor/implementations/default.py builds a ProcessorInit for every supported FileExtension (.pdf, .csv, .docx, .html, .md, .epub, .pptx, .xlsx, .odt, .py, .ipynb, .bib, plain text). The default splitter is RecursiveCharacterTextSplitter, configured by SplitterConfig and chunk-size counted with the cl100k_base tiktoken encoding. Source: core/quivr_core/processor/implementations/default.py:1-60 and core/quivr_core/rag/entities/config.py:73-80.

Failure Modes and Community Notes

Supabase / self-hosting debates dominate the issue tracker (#484, #181, #612, #618). The README's *"Your data, your control"* claim is qualified by the fact that quivr-core itself does not hard-require Supabase — the dependency lives in the surrounding Quivr.com product. Source: README.md.
Azure OpenAI is requested in #650; the OPENAI_COMPATIBLE supplier in LLMEndpoint is the current path to point base_url at an Azure deployment. Source: core/quivr_core/llm/llm_endpoint.py:55-80.
Project momentum: Issue #3681 asks *“Is this project abandoned?”*, citing an apparent gap in core releases after v0.0.33 (Feb 2025) and a long-standing frontend bug (#2004). This is a context note for anyone evaluating the project today. Source: community context.
CSP / Next.js styling bug #2004 affects the Next.js front-end, not quivr-core; it is included for completeness because users commonly conflate the two repos.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 10 structured pitfall items, including 3 high/blocking items. Top priority: Installation risk (requires verification).

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/QuivrHQ/quivr/issues/3684

2. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/QuivrHQ/quivr/issues/3667

3. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/QuivrHQ/quivr/issues/2004

4. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/QuivrHQ/quivr

5. Runtime risk: Runtime risk requires verification

Severity: medium
Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/QuivrHQ/quivr/issues/3686

6. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/QuivrHQ/quivr

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/QuivrHQ/quivr

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/QuivrHQ/quivr

9. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/QuivrHQ/quivr

10. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/QuivrHQ/quivr

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12. Count of project-level external discussion links exposed on this manual page.

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using quivr with real data or production workflows.

[[Bug]:](https://github.com/QuivrHQ/quivr/issues/3687) - github / github_issue
[[Bug]:](https://github.com/QuivrHQ/quivr/issues/3686) - github / github_issue
H " - github / github_issue
[[Feature]: Multi-Modal RAG (Video/Audio Ingestion via Whisper & Vision M](https://github.com/QuivrHQ/quivr/issues/3684) - github / github_issue
[[Bug]:](https://github.com/QuivrHQ/quivr/issues/2004) - github / github_issue
Improving user experience in long conversations - github / github_issue
Is this project abandoned? - github / github_issue
EU AI Act Compliance Scan Results — Sharing Findings for Feedback - github / github_issue
core: v0.0.33 - github / github_release
core: v0.0.32 - github / github_release
core: v0.0.31 - github / github_release
core: v0.0.30 - github / github_release

Source: Project Pack community evidence and pitfall evidence
