Doramagic Project Pack · Human Manual

quivr

Quivr follows a modular architecture with the quivr-core package as its central component. The architecture is designed around a workflow-based system where different processing nodes are ...

Introduction to Quivr

Related topics: Getting Started, System Architecture Overview

Quivr is an open-source project that helps you build your "second brain" by leveraging the power of Generative AI. It provides a Retrieval-Augmented Generation (RAG) framework that enables users to ingest documents and ask questions about their content using natural language. Source: README.md

The core philosophy of Quivr is to handle all the complexity of RAG implementations so developers can focus on their products. It provides an opinionated, fast, and efficient RAG system that supports multiple file types, various LLM providers, and customizable workflows. Source: README.md

Key Features

Quivr's main capabilities for building document-aware AI applications are:

| Feature | Description |
|---|---|
| Opinionated RAG | Pre-configured RAG pipeline optimized for speed and efficiency |
| Multi-LLM Support | Works with OpenAI, Anthropic, Mistral, Gemma, and local models via Ollama |
| Any File Type | Supports PDF, TXT, Markdown, and custom parsers |
| Customizable Workflows | Extend RAG with internet search, tools, and custom processing |
| Megaparse Integration | Optional integration with Megaparse for advanced document ingestion |
| Language Detection | Automatic detection of document language for better processing |

Source: README.md

Architecture Overview

Quivr follows a modular architecture with the quivr-core package as its central component. The architecture is designed around a workflow-based system where different processing nodes are connected to form a complete RAG pipeline.

graph TD
    A[User Input] --> B[Brain.ask]
    B --> C[RetrievalConfig]
    C --> D[Workflow Engine]
    D --> E[Processing Nodes]
    E --> F[LLM Response]
    
    G[Documents] --> H[Processor]
    H --> I[Chunks]
    I --> J[Vector Store]
    J --> D

Package Structure

The main components of the Quivr architecture include:

| Component | Purpose |
|---|---|
| quivr_core.Brain | Central class for managing document collections and answering questions |
| ProcessorBase | Abstract base class for document processors |
| RetrievalConfig | Configuration for retrieval and workflow settings |
| QuivrFile | File wrapper for document handling |
| ProcessedDocument | Container for processed document chunks |

Source: core/README.md

Getting Started

Prerequisites

Before installing Quivr, ensure you have:

  • Python 3.10 or newer
  • An API key for your chosen LLM provider (OpenAI, Anthropic, or Mistral)

Source: README.md

Installation

Install the quivr-core package using pip:

pip install quivr-core

Source: README.md

Quick Start Example

The following example demonstrates creating a simple RAG application with Quivr in approximately 5 lines of code:

import tempfile
from quivr_core import Brain

with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()

    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )

    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)

Source: examples/simple_question/simple_question.py

Core Components

Brain Class

The Brain class is the central interface for interacting with Quivr. It manages document ingestion, storage, and querying.

Key Methods:

| Method | Description |
|---|---|
| Brain.from_files() | Create a brain from one or more files |
| brain.ask() | Ask a question about the ingested documents |
| brain.print_info() | Display information about the brain |

Typical Workflow:

graph LR
    A[Create Brain] --> B[from_files]
    B --> C[Process Documents]
    C --> D[Store Chunks]
    D --> E[Ask Questions]
    E --> F[Get Answers]

Source: README.md

Document Processors

Processors handle the parsing and chunking of different file types. The ProcessorBase abstract class defines the interface that all processors must implement.

class ProcessorBase(ABC):
    @abstractmethod
    async def process_file_inner(self, file: QuivrFile) -> ProcessedDocument[R]:
        raise NotImplementedError

Source: core/quivr_core/processor/processor_base.py

Processing Pipeline:

During document processing, Quivr performs the following operations:

  1. Parse the input file using the appropriate processor
  2. Split documents into chunks based on SplitterConfig
  3. Add metadata including chunk index, version info, and detected language
  4. Sanitize content by removing null characters and encoding issues

Source: core/quivr_core/processor/processor_base.py
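Steps 3 and 4 above can be sketched with the standard library alone. This is a minimal illustration, not Quivr's implementation: `sanitize` and `enrich_chunks` are hypothetical helper names, and plain dictionaries stand in for the real document objects.

```python
def sanitize(text: str) -> str:
    """Drop null characters and replace anything that cannot be encoded as UTF-8."""
    without_nulls = text.replace("\x00", "")
    return without_nulls.encode("utf-8", errors="replace").decode("utf-8")


def enrich_chunks(chunks: list[str], version: str) -> list[dict]:
    """Attach a chunk index and a version string to each sanitized chunk."""
    return [
        {
            "page_content": sanitize(chunk),
            "metadata": {"chunk_index": idx, "quivr_core_version": version},
        }
        for idx, chunk in enumerate(chunks)
    ]


docs = enrich_chunks(["first\x00 chunk", "second chunk"], version="dev")
print(docs[0]["page_content"])
print(docs[1]["metadata"])
```

Language detection is left out of the sketch because it depends on a detection library that this manual does not name.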

Simple Txt Processor

The SimpleTxtProcessor is a built-in processor for handling plain text files. It uses recursive character splitting to divide documents into manageable chunks:

def recursive_character_splitter(
    doc: Document, chunk_size: int, chunk_overlap: int
) -> list[Document]:
    assert chunk_overlap < chunk_size, "chunk_overlap is greater than chunk_size"

    if len(doc.page_content) <= chunk_size:
        return [doc]

    chunk = Document(page_content=doc.page_content[:chunk_size], metadata=doc.metadata)
    remaining = Document(
        page_content=doc.page_content[chunk_size - chunk_overlap :],
        metadata=doc.metadata,
    )

    return [chunk] + recursive_character_splitter(remaining, chunk_size, chunk_overlap)

Source: core/quivr_core/processor/implementations/simple_txt_processor.py
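To see how the overlap behaves, the splitter can be run on a short string. The sketch below repeats the function so that it is self-contained; the `Document` dataclass is a minimal stand-in assumed for illustration, not the class Quivr imports.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Minimal stand-in for the Document type used by the splitter."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def recursive_character_splitter(doc, chunk_size, chunk_overlap):
    assert chunk_overlap < chunk_size, "chunk_overlap must be smaller than chunk_size"
    if len(doc.page_content) <= chunk_size:
        return [doc]
    chunk = Document(page_content=doc.page_content[:chunk_size], metadata=doc.metadata)
    remaining = Document(
        page_content=doc.page_content[chunk_size - chunk_overlap:],
        metadata=doc.metadata,
    )
    return [chunk] + recursive_character_splitter(remaining, chunk_size, chunk_overlap)


chunks = recursive_character_splitter(Document("abcdefghij"), chunk_size=4, chunk_overlap=1)
texts = [c.page_content for c in chunks]
print(texts)  # ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last `chunk_overlap` characters of the previous one. Because the function recurses once per chunk, a very long document combined with a small `chunk_size` could approach Python's default recursion limit.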

Configuration

Environment Setup

Set your API keys as environment variables before creating a brain:

import os
os.environ["OPENAI_API_KEY"] = "my_openai_apikey"

Source: README.md
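A small guard can catch a missing key before any brain is created. The sketch below is an assumption-laden illustration: `require_api_key` and `PROVIDER_KEYS` are hypothetical names, and the variable names come from the provider tables in this manual.

```python
import os

# Environment variable names as listed in this manual's provider tables.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def require_api_key(provider: str, env=None) -> str:
    """Return the API key for `provider`, or raise a clear error if it is unset."""
    env = os.environ if env is None else env
    variable = PROVIDER_KEYS[provider]
    key = env.get(variable, "")
    if not key:
        raise RuntimeError(f"{variable} is not set; export it before creating a brain")
    return key


# Passing a dict instead of os.environ keeps the example independent of the real environment.
print(require_api_key("openai", env={"OPENAI_API_KEY": "my_openai_apikey"}))
```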

Retrieval Configuration

The RetrievalConfig class allows customization of the RAG workflow. Configuration can be loaded from YAML files:

from quivr_core.config import RetrievalConfig

config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)

Source: README.md

Workflow Configuration

Workflows are defined using YAML files that specify processing nodes and their connections:

workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: ["retrieve"]

Workflow Node Types:

| Node | Function |
|---|---|
| START | Entry point for the workflow |
| filter_history | Filters conversation history |
| rewrite | Rewrites the query for better retrieval |
| retrieve | Fetches relevant document chunks |

Source: README.md
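The node-and-edge structure can be explored without Quivr itself. The following sketch writes the same nodes as a Python dictionary and follows the edges from `START`; `linear_path` is a hypothetical helper for illustration and says nothing about how Quivr's workflow engine traverses the graph.

```python
# Same node names as the YAML fragment above, written as a Python dict.
workflow_config = {
    "name": "standard RAG",
    "nodes": [
        {"name": "START", "edges": ["filter_history"]},
        {"name": "filter_history", "edges": ["rewrite"]},
        {"name": "rewrite", "edges": ["retrieve"]},
    ],
}


def linear_path(config: dict, start: str = "START") -> list[str]:
    """Follow the first edge of each node until a node has no outgoing edges."""
    edges = {node["name"]: node["edges"] for node in config["nodes"]}
    path, current = [start], start
    while edges.get(current):
        current = edges[current][0]
        if current in path:  # guard against cycles in a malformed config
            raise ValueError(f"cycle detected at {current!r}")
        path.append(current)
    return path


print(linear_path(workflow_config))
```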

Advanced Usage

Custom Workflow with Rich Console

For interactive applications, Quivr can be combined with the rich library for enhanced console output:

from quivr_core import Brain
from quivr_core.config import RetrievalConfig
from rich.console import Console
from rich.panel import Panel
from rich.prompt import Prompt

brain = Brain.from_files(
    name="my smart brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)

config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)

console = Console()
while True:
    question = Prompt.ask("[bold cyan]Question[/bold cyan]")
    if question.lower() == "exit":
        break
    answer = brain.ask(question, retrieval_config=retrieval_config)
    console.print(f"[bold green]Answer[/bold green]: {answer.answer}")

Source: README.md

Supported LLM Providers

Quivr integrates with multiple LLM providers:

| Provider | Model Support | Configuration |
|---|---|---|
| OpenAI | GPT-4, GPT-3.5 | OPENAI_API_KEY |
| Anthropic | Claude family | ANTHROPIC_API_KEY |
| Mistral | Mistral models | MISTRAL_API_KEY |
| Ollama | Local models | OLLAMA_BASE_URL |

Source: README.md

Version History

Quivr follows semantic versioning for the quivr-core package. Key releases include:

| Version | Date | Key Changes |
|---|---|---|
| 0.0.27 | 2024-12-16 | Max context tokens enforcement, megaparse SDK integration |
| 0.0.19 | 2024-10-21 | Beginning of quivr-core development |
| 0.0.13 | 2024-08-01 | Added parsers and tox tests |
| 0.0.2 | 2024-07-09 | Initial quivr-core package release |

Source: core/CHANGELOG.md

Documentation and Community

Additional resources for learning and contributing to Quivr:

| Resource | Description |
|---|---|
| Official Documentation | Comprehensive guides and API reference |
| GitHub Issues | Bug reports and feature requests |
| Discord Community | Real-time support and discussions |
| Good First Issues | Beginner-friendly contribution opportunities |

Source: README.md

License

Quivr is licensed under the Apache 2.0 License, making it freely available for commercial and personal use. Source: core/README.md

Source: https://github.com/QuivrHQ/quivr / Human Manual

Getting Started

Related topics: Introduction to Quivr, Installation

Overview

Quivr is an open-source RAG (Retrieval-Augmented Generation) framework that enables developers to build AI-powered "second brain" applications. The quivr-core package provides the core RAG functionality, allowing users to ingest documents and query them using large language models. Source: README.md:1-15

The framework is designed to be opinionated, fast, and efficient, abstracting away the complexity of document processing, vector storage, and LLM integration so developers can focus on building their products. Source: README.md:27-30

Prerequisites

Before getting started with Quivr, ensure your environment meets the following requirements:

| Requirement | Version | Description |
|---|---|---|
| Python | 3.10+ | The programming language runtime |
| pip | Latest | Package installer for Python |
| API Key | - | Required for cloud LLM providers (OpenAI, Anthropic, Mistral) |

Source: README.md:44-50

Supported LLM Providers

Quivr supports multiple LLM providers:

| Provider | API Type | Notes |
|---|---|---|
| OpenAI | API Key | Set OPENAI_API_KEY environment variable |
| Anthropic | API Key | Supports Claude models |
| Mistral | API Key | Supports Mistral AI models |
| Ollama | Local | For running models locally |

Source: README.md:58-62

Installation

Package Installation

Install the core package using pip:

pip install quivr-core

Source: README.md:53-55

Verify Installation

To verify the installation worked correctly, you can import the package:

from quivr_core import Brain
print("Quivr-core installed successfully!")

Core Concepts

Brain

The Brain is the central entity in Quivr that manages document ingestion and querying. It encapsulates:

  • LLM Integration: The language model used for generating answers
  • Embedder: The embedding model for vectorizing documents
  • Vector Database: Storage for document embeddings and retrieval
  • File Processors: Components that parse various file formats

Source: core/quivr_core/brain/brain.py:1-50

QuivrFile

Files are represented as QuivrFile objects with the following attributes:

| Attribute | Type | Description |
|---|---|---|
| id | UUID | Unique identifier for the file |
| brain_id | UUID | ID of the brain this file belongs to |
| path | Path | File system path |
| original_filename | str | Original filename |
| file_size | int | File size in bytes |
| file_extension | FileExtension | File type enum |
| file_sha1 | str | SHA1 hash for deduplication |
| additional_metadata | dict | Custom metadata |

Source: core/quivr_core/files/file.py:50-70
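The size, extension, and hash attributes can all be derived from the file on disk. The sketch below shows one way to do that with the standard library and how identical content yields identical digests, which is what makes the hash usable for deduplication. `describe_file` is a hypothetical helper; this manual does not show how Quivr computes these values.

```python
import hashlib
import tempfile
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Collect name, size, extension, and SHA1 digest for a file on disk."""
    data = path.read_bytes()
    return {
        "original_filename": path.name,
        "file_size": len(data),
        "file_extension": path.suffix,
        "file_sha1": hashlib.sha1(data).hexdigest(),
    }


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.txt"
    second = Path(tmp) / "b.txt"
    first.write_text("same content")
    second.write_text("same content")
    info_a, info_b = describe_file(first), describe_file(second)

# Two files with different names but equal bytes share one digest.
print(info_a["file_sha1"] == info_b["file_sha1"])
```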

Processor Pipeline

Documents go through a processing pipeline that:

  1. Parses the file content based on file type
  2. Splits content into manageable chunks
  3. Detects language for each chunk
  4. Embeds chunks for vector storage
  5. Stores in the vector database

Source: core/quivr_core/processor/processor_base.py:40-60

Quick Start Guide

Minimal Example (5 Lines of Code)

The fastest way to get started with Quivr:

import tempfile
from quivr_core import Brain

with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()

    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )

    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)

Source: examples/simple_question/simple_question.py:1-20

Interactive Chat Example

For a more complete example that sets the API key in code and passes an explicit retrieval configuration:

import os
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

from quivr_core import Brain
from quivr_core.config import RetrievalConfig

brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./document.pdf", "./notes.txt"],
)

answer = brain.ask(
    "What is the main topic of these documents?",
    retrieval_config=RetrievalConfig()
)
print(answer.answer)

Source: README.md:85-100

PDF Processing Example

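The following example processes a PDF file using a fake LLM and a deterministic fake embedder, which is useful for testing without calling an external API: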

from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.rag.entities.config import LLMEndpointConfig
from quivr_core.llm.llm_endpoint import LLMEndpoint

brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)

answer = brain.ask("What is this document about?")
print(answer.answer)

Source: examples/pdf_parsing_tika.py:1-25

Workflow Architecture

Basic RAG Workflow

The following diagram illustrates the basic RAG workflow in Quivr:

graph TD
    A[User Query] --> B[Filter History]
    B --> C[Query Rewrite]
    C --> D[Retrieval]
    D --> E[LLM Generation]
    E --> F[Response]
    
    G[Document Ingestion] --> H[File Processing]
    H --> I[Chunking]
    I --> J[Embedding]
    J --> K[Vector Storage]
    
    D --> K

Data Flow

graph LR
    A[Files] -->|Parse| B[Documents]
    B -->|Chunk| C[Chunks]
    C -->|Embed| D[Vectors]
    D -->|Store| E[Vector DB]
    
    F[Query] -->|Embed| G[Query Vector]
    G -->|Search| E
    E -->|Results| H[Context]
    H -->|Generate| I[Answer]
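The embed, store, and search steps in the data flow can be illustrated with a toy example. This is only a sketch of the idea: the bag-of-words `embed` function and the in-memory list are assumptions standing in for a real embedding model and vector database, and none of the names below come from Quivr.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (real embedders return dense floats)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, store: list[tuple[str, Counter]], n_results: int = 1) -> list[str]:
    """Return the n_results chunks whose vectors are closest to the query vector."""
    query_vector = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:n_results]]


chunks = ["gold is a yellow metal", "quivr is a rag framework"]
store = [(chunk, embed(chunk)) for chunk in chunks]  # "Vector DB"
context = search("what is quivr", store)
print(context)
```

In the real pipeline the retrieved context is then passed to the LLM to generate the answer.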

Configuration

Environment Variables

Configure your API keys as environment variables:

export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"  # Optional
export MISTRAL_API_KEY="your-mistral-key"       # Optional

Custom Retrieval Configuration

Create a YAML configuration file for customized retrieval strategies:

workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: ["retrieve"]
    - name: "retrieve"
      edges: ["generate_rag"]
    - name: "generate_rag"
      edges: ["END"]
  llm:
    temperature: 0.7

Source: README.md:65-85

RetrievalConfig Usage

from quivr_core.config import RetrievalConfig

# Load from YAML file
retrieval_config = RetrievalConfig.from_yaml("./basic_rag_workflow.yaml")

# Use with brain.ask()
answer = brain.ask(
    question="Your question here",
    retrieval_config=retrieval_config
)

Supported File Types

Quivr works with various file formats through pluggable processors:

| File Type | Extension | Processor |
|---|---|---|
| Plain Text | .txt | SimpleTxtProcessor |
| PDF | .pdf | TikaProcessor |
| Markdown | .md | MarkdownProcessor |
| CSV | .csv | CSVProcessor |
| JSON | .json | JSONProcessor |

Source: README.md:31-35

Advanced Usage

Streaming Responses

For real-time response streaming:

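import asyncio

from quivr_core import Brain

brain = Brain.from_files(
    name="streaming_brain",
    file_paths=["./documents/report.pdf"],  # placeholder path to a single file
)


async def main() -> None:
    # ask_streaming is an async generator, so it must be consumed with `async for`
    async for chunk in brain.ask_streaming("Explain the findings"):
        print(chunk.answer, end="", flush=True)


asyncio.run(main())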

Custom Processors

Implement your own processor by extending ProcessorBase:

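from quivr_core.files.file import QuivrFile
from quivr_core.processor.processor_base import ProcessedDocument, ProcessorBase


class CustomProcessor(ProcessorBase):
    supported_extensions = [".custom"]

    @property
    def processor_metadata(self) -> dict:
        # Merged into every chunk's metadata during processing
        return {"processor_cls": "CustomProcessor"}

    async def process_file_inner(self, file: QuivrFile) -> ProcessedDocument:
        # Implement custom parsing logic here and return the processed chunks
        raise NotImplementedError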
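The abstract-method pattern behind custom processors can be tried without installing Quivr. In the sketch below, `FakeFile` and `SketchProcessorBase` are hypothetical stand-ins for `QuivrFile` and `ProcessorBase`, and the return type is simplified to a list of strings; only the method name `process_file_inner` is taken from this manual.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class FakeFile:
    """Hypothetical stand-in for QuivrFile: just a name and its text."""
    name: str
    text: str


class SketchProcessorBase(ABC):
    """Hypothetical stand-in for ProcessorBase with the same abstract method name."""
    supported_extensions: list[str] = []

    @abstractmethod
    async def process_file_inner(self, file: FakeFile) -> list[str]:
        raise NotImplementedError


class LineProcessor(SketchProcessorBase):
    supported_extensions = [".custom"]

    async def process_file_inner(self, file: FakeFile) -> list[str]:
        # One chunk per non-empty line
        return [line.strip() for line in file.text.splitlines() if line.strip()]


chunks = asyncio.run(
    LineProcessor().process_file_inner(FakeFile("notes.custom", "alpha\n\nbeta\n"))
)
print(chunks)
```

Instantiating the base class directly raises `TypeError`, which is how the abstract method forces every processor to supply its own parsing logic.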

Project Structure

quivr/
├── README.md                    # Main documentation
├── core/
│   ├── README.md               # quivr-core package info
│   └── quivr_core/
│       ├── brain/
│       │   └── brain.py        # Brain class implementation
│       ├── files/
│       │   └── file.py         # QuivrFile dataclass
│       ├── processor/
│       │   ├── processor_base.py
│       │   └── implementations/
│       └── rag/
│           ├── quivr_rag.py    # RAG implementation
│           └── prompts.py      # LLM prompts
├── examples/
│   ├── simple_question/        # Basic usage examples
│   ├── chatbot/                # Chainlit chatbot example
│   └── pdf_parsing_tika.py     # PDF processing example

Next Steps

After completing the Getting Started guide:

  1. Explore Examples: Check the examples/ directory for more use cases
  2. Read Documentation: Visit core.quivr.com for full documentation
  3. Join Community: Connect with other users on Discord
  4. Contribute: Check open issues for contribution opportunities

Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| ImportError: No module named 'quivr_core' | Run pip install quivr-core |
| API Key not found | Set environment variable before running |
| File parsing fails | Verify file format and size (max 20MB for examples) |
| Slow retrieval | Adjust n_results parameter or use local embeddings |

Getting Help

Source: https://github.com/QuivrHQ/quivr / Human Manual

Installation

Related topics: Getting Started, LLM Integration

This guide covers all aspects of installing and setting up Quivr, including the core package, examples, and required dependencies.

Overview

Quivr is a RAG (Retrieval-Augmented Generation) framework that enables users to create AI-powered knowledge bases from various file types. The installation process varies depending on your use case:

  • Core package installation for integrating Quivr into existing Python projects
  • Example applications for testing and learning purposes
  • Development setup for contributing to the project

Source: core/README.md

Prerequisites

System Requirements

| Requirement | Minimum | Recommended |
|---|---|---|
| Python Version | 3.8+ | 3.10+ |
| Operating System | Linux, macOS, Windows | Linux/macOS |
| RAM | 4 GB | 8 GB+ |
| Disk Space | 500 MB | 1 GB+ |

Note: While the core package officially requires Python 3.10 or newer for full compatibility, some examples (such as chatbot_voice) support Python 3.8 or higher.

Source: core/README.md, examples/chatbot_voice/README.md

API Keys

Quivr supports multiple LLM providers. You must configure at least one API key:

| Provider | Environment Variable | Required |
|---|---|---|
| OpenAI | OPENAI_API_KEY | Yes (if using OpenAI) |
| Anthropic | ANTHROPIC_API_KEY | Yes (if using Anthropic) |
| Mistral | MISTRAL_API_KEY | Yes (if using Mistral) |

Source: core/README.md

Installing the Core Package

The simplest way to install Quivr Core is via pip:

pip install quivr-core

Verify the installation by checking that the package is importable:

from quivr_core import Brain

Source: core/README.md

Package Contents

The quivr-core package includes:

  • Brain class: The main interface for creating and managing knowledge bases
  • RAG components: Retrieval, processing, and answer generation pipelines
  • Built-in processors: Support for PDF, TXT, Markdown, and other common formats
  • Default configurations: Ready-to-use settings for quick setup

Source: core/README.md, core/quivr_core/processor/processor_base.py

Installation for Examples

Chatbot Example with Chainlit

The chatbot example demonstrates file upload and Q&A capabilities using Chainlit.

#### Using rye (Recommended)

# Clone or navigate to the chatbot directory
cd examples/chatbot

# Install dependencies with rye
rye sync

# Activate the virtual environment
source .venv/bin/activate

#### Using pip

# Navigate to the chatbot directory
cd examples/chatbot

# Install from requirements
pip install -r requirements.txt

Source: examples/chatbot/README.md

Voice Chatbot Example

The voice chatbot example adds voice interaction capabilities.

# Navigate to the voice chatbot directory
cd examples/chatbot_voice

# Install dependencies
pip install -r requirements.lock

Source: examples/chatbot_voice/README.md

Flask-based Example (quivr-whisper)

This example uses Flask for a web server implementation.

# Install Flask and dependencies
pip install flask openai requests python-dotenv

Source: examples/quivr-whisper/README.md

Simple Question Example

The simplest example demonstrating basic usage:

# Save the following as a Python script and run it
from quivr_core import Brain

brain = Brain.from_files(
    name="test_brain",
    file_paths=["your_file.txt"],
)

answer = brain.ask("Your question here")
print(answer)

Requires python-dotenv for loading environment variables.

Source: examples/simple_question/simple_question.py

Environment Configuration

Required Environment Variables

Create a .env file in your project root:

# LLM API Keys (at least one required)
OPENAI_API_KEY=your_openai_api_key
# ANTHROPIC_API_KEY=your_anthropic_key
# MISTRAL_API_KEY=your_mistral_key

# Optional: For specific integrations
QUIVR_API_KEY=your_quivr_api_key
QUIVR_CHAT_ID=your_chat_id
QUIVR_BRAIN_ID=your_brain_id
QUIVR_URL=https://api.quivr.app

Loading Environment Variables

Use python-dotenv to load environment variables:

import dotenv
dotenv.load_dotenv()

Source: examples/quivr-whisper/README.md, examples/simple_question/simple_question.py

Installation Flow

graph TD
    A[Start Installation] --> B{Choose Installation Type}
    
    B --> C[Core Package Only]
    B --> D[Examples & Demos]
    B --> E[Development Setup]
    
    C --> F[pip install quivr-core]
    F --> G[Set API Keys]
    G --> H[Ready to Integrate]
    
    D --> I{Which Example?}
    
    I --> J[Chatbot with Chainlit]
    I --> K[Voice Chatbot]
    I --> L[Flask App]
    
    J --> M[rye sync or pip install]
    K --> N[pip install -r requirements.lock]
    L --> O[pip install flask openai requests]
    
    M --> P[Run with chainlit run main.py]
    N --> Q[Run with chainlit run main.py]
    O --> R[Run with flask run]
    
    E --> S[Clone Repository]
    S --> T[Navigate to core/]
    T --> U[Install from pyproject.toml]
    U --> V[Run Tests]

Verifying Installation

Quick Verification

After installation, verify that Quivr is correctly installed:

from importlib.metadata import version

print(version("quivr-core"))  # Prints the installed version

Full Installation Test

Create a test script to verify all components:

import tempfile
from quivr_core import Brain

# Create a temporary test file
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
    f.write("Quivr is a RAG framework for building AI knowledge bases.")
    temp_path = f.name

# Create a brain from the file
brain = Brain.from_files(
    name="test_brain",
    file_paths=[temp_path],
)

# Test the ask function
answer = brain.ask("What is Quivr?")
print(f"Answer: {answer}")

# Print brain info
brain.print_info()

Source: core/README.md, examples/simple_question/simple_question.py

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| ImportError: No module named 'quivr_core' | Package not installed | Run pip install quivr-core |
| AuthenticationError | Invalid API key | Verify your API key is correct |
| Version mismatch | Incompatible Python version | Ensure Python 3.10+ is used |
| File not found | Incorrect file path | Check the file path exists |

Checking Installed Version

The Quivr version is automatically tracked in document metadata:

from importlib.metadata import PackageNotFoundError, version

try:
    qvr_version = version("quivr-core")
except PackageNotFoundError:
    # Package metadata is missing, e.g. when running from a source checkout
    qvr_version = "dev"

Source: core/quivr_core/processor/processor_base.py

Next Steps

After successful installation:

  1. Create your first Brain: Load documents and create a knowledge base
  2. Configure RAG: Customize retrieval strategies using YAML configuration files
  3. Explore Examples: Test different example applications to understand capabilities
  4. Read Documentation: Visit core.quivr.com for advanced usage

Source: core/README.md, examples/chatbot/README.md

System Architecture Overview

Related topics: Core Components, Brain Class, RAG Implementation

Introduction

Quivr is an open-source framework for building RAG (Retrieval Augmented Generation) applications that enables users to create personal "brains" from various document types. The system leverages generative AI to provide intelligent question-answering capabilities over user-provided documents.

The architecture is designed around three core pillars:

  • Document Processing: Extracting and chunking content from files
  • Vector Storage: Storing embeddings for semantic search
  • LLM-powered Answer Generation: Using large language models to generate answers based on retrieved context

High-Level Architecture

graph TD
    A[User Files] --> B[File Processing]
    B --> C[Chunking & Embedding]
    C --> D[Vector Database]
    E[User Query] --> F[Embedding]
    F --> G[Semantic Search]
    G --> H[Context Assembly]
    H --> I[LLM Generation]
    I --> J[Answer Response]
    
    D --> G

Core Components

Brain Class

The Brain class is the central orchestrator of the Quivr framework. It manages the entire lifecycle from file ingestion to query answering.

File: core/quivr_core/brain/brain.py

#### Key Responsibilities

| Responsibility | Description |
|---|---|
| File Ingestion | Processes files through various parsers |
| Vector Storage | Manages the vector database for embeddings |
| Query Processing | Handles search and retrieval operations |
| Answer Generation | Orchestrates LLM-based response generation |

#### Core Methods

| Method | Type | Purpose |
|---|---|---|
| from_files | Synchronous | Create a Brain from file paths |
| afrom_files | Async | Async creation from files |
| afrom_langchain_documents | Async | Create from LangChain Document objects |
| asearch | Async | Search for relevant documents |
| ask_streaming | Async Generator | Stream answers to questions |

#### Initialization Parameters

class Brain:
    def __init__(
        self,
        id: UUID,
        name: str,
        storage: BrainStorage | None,
        llm: LLMEndpoint,
        embedder: Embeddings,
        vector_db: QuivrVectorStore,
    )

Source: core/quivr_core/brain/brain.py:1-100

File Processing Pipeline

The processor system handles extraction and transformation of various file formats.

File: core/quivr_core/processor/processor_base.py

#### Processing Flow

graph LR
    A[QuivrFile] --> B[process_file]
    B --> C[process_file_inner]
    C --> D[ProcessedDocument]
    D --> E[Metadata Enrichment]
    E --> F[Chunk Output]

#### Metadata Enrichment

During processing, each chunk receives comprehensive metadata:

doc.metadata = {
    "chunk_index": idx,
    "quivr_core_version": qvr_version,
    "language": detect_language(text=...),
    **file.metadata,
    **doc.metadata,
    **self.processor_metadata,
}

Source: core/quivr_core/processor/processor_base.py:20-35
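The order of the entries in that dictionary matters, because later `**` unpackings override earlier keys. The sketch below uses made-up values (all names and values are illustrative assumptions) to show the resulting precedence.

```python
# Illustrative values only; none of these come from Quivr.
file_metadata = {"language": "unknown", "source": "upload"}
doc_metadata = {"source": "parser", "page": 3}
processor_metadata = {"processor_cls": "SketchProcessor"}

merged = {
    "chunk_index": 0,
    "language": "en",
    **file_metadata,       # overrides "language" set above
    **doc_metadata,        # overrides "source" from file_metadata
    **processor_metadata,  # applied last, so it wins any remaining conflict
}

print(merged["language"], merged["source"])
```

With this ordering, a key already present in the file or document metadata replaces the computed default of the same name.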

QuivrFile Data Model

File: core/quivr_core/files/file.py

The QuivrFile class represents uploaded files with their associated metadata.

class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]

| Field | Type | Description |
|---|---|---|
| id | UUID | Unique identifier |
| brain_id | UUID | Associated brain identifier |
| path | Path | File system path |
| original_filename | str | Original file name |
| file_size | int | File size in bytes |
| file_extension | FileExtension | File type |
| file_sha1 | str | SHA1 hash for deduplication |

Source: core/quivr_core/files/file.py:1-50

Retrieval and Answer Generation

RAG Architecture

The RAG (Retrieval Augmented Generation) system combines semantic search with LLM-powered answer generation.

File: core/quivr_core/rag/quivr_rag_langgraph.py

#### RAG Workflow

graph TD
    A[User Question] --> B[Filter History]
    B --> C[Query Rewrite]
    C --> D[Retrieval]
    D --> E[Context Assembly]
    E --> F[LLM Generation]
    F --> G[Streaming Response]
    
    H[System Prompt] --> F
    I[Chat History] --> B
    J[File Filters] --> D

#### Configuration

The system uses RetrievalConfig for configuring the RAG pipeline:

| Parameter | Type | Default | Description |
|---|---|---|---|
| llm_config | LLMEndpointConfig | Brain's LLM | LLM model configuration |
| temperature | float | 0.7 | Generation temperature |
| n_results | int | 5 | Number of retrieval results |

Source: core/quivr_core/rag/quivr_rag.py:1-50

Streaming Answer Generation

The system supports streaming responses for real-time answer delivery:

async for response in rag_instance.answer_astream(
    run_id=run_id,
    question=question,
    system_prompt=system_prompt or None,
    history=chat_history,
    list_files=list_files,
    metadata=metadata,
):
    if not response.last_chunk:
        yield response

Source: core/quivr_core/brain/brain.py:150-170
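The filtering pattern in that loop can be reproduced with plain `asyncio`. In the sketch below, `Chunk` and `fake_answer_astream` are hypothetical stand-ins for the streamed response type and the RAG stream; only the `answer` and `last_chunk` attribute names are taken from the snippet above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed response chunk."""
    answer: str
    last_chunk: bool = False


async def fake_answer_astream() -> AsyncIterator[Chunk]:
    for piece in ["Gold ", "is ", "a metal."]:
        yield Chunk(answer=piece)
    yield Chunk(answer="", last_chunk=True)  # terminal marker


async def stream_without_marker() -> AsyncIterator[Chunk]:
    # Re-yield every chunk except the terminal marker
    async for response in fake_answer_astream():
        if not response.last_chunk:
            yield response


async def collect() -> str:
    return "".join([chunk.answer async for chunk in stream_without_marker()])


text = asyncio.run(collect())
print(text)
```

Because the terminal marker is dropped inside the generator, a caller of this pattern never sees a chunk with `last_chunk` set and can simply print each piece as it arrives.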

Prompt Templates

File: core/quivr_core/rag/prompts.py

The system uses structured prompts with multiple context sections:

| Section | Purpose |
|---|---|
| user_metadata | User-specific context |
| ticket_metadata | Query-related metadata |
| similar_tickets | Reference information |
| ticket_history | Conversation history |
| additional_information | Extra context |
| client_query | The actual question |

Default instructions ensure consistent response quality:

  • Verbosity matching similar responses
  • Proper formatting with paragraphs, bold, italic
  • Language consistency with the query
  • No signature at end of response

LLM Integration

Default LLM Configuration

The system supports multiple LLM providers:

| Provider | Configuration |
|---|---|
| OpenAI | OPENAI_API_KEY environment variable |
| Anthropic | Anthropic API support |
| Mistral | Mistral API support |
| Ollama | Local model support |

Source: README.md - Configuration section

Embedder Abstraction

Embedders are abstracted to support multiple backends:

if embedder is None:
    embedder = default_embedder()

Vector DB initialization with embeddings:

if vector_db is None:
    vector_db = await build_default_vectordb(langchain_documents, embedder)

Source: core/quivr_core/brain/brain.py:60-70

Search Capabilities

Async Search Method

async def asearch(
    self,
    query: str | Document,
    n_results: int = 5,
    filter: Callable | Dict[str, Any] | None = None,
    fetch_n_neighbors: int = 20,
) -> list[SearchResult]

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | `str \| Document` | Required | Search query |
| n_results | int | 5 | Number of results |
| filter | `Callable \| Dict` | None | Custom filtering |
| fetch_n_neighbors | int | 20 | Extended fetch for re-ranking |

Source: core/quivr_core/brain/brain.py:70-85

State Management

Brain ID Generation

Each brain receives a unique identifier:

brain_id = uuid4()

Workspace and Chat Context

The system tracks conversation context:

metadata = LangchainMetadata(
    langfuse_trace_id=str(run_id),
    langfuse_user_id=str(self.workspace_id),
    langfuse_session_id=str(self.chat_id),
)

Installation and Dependencies

Core Package

pip install quivr-core

Quick Start Example

from quivr_core import Brain

brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)

answer = brain.ask("What is the main topic?")

Summary

The Quivr architecture implements a modular RAG pipeline where:

  1. Files are processed and chunked with metadata enrichment
  2. Embeddings are generated and stored in a vector database
  3. Queries trigger semantic search with configurable filters
  4. LLMs generate contextual answers from retrieved content
  5. Streaming enables real-time response delivery

The design prioritizes flexibility through dependency injection, allowing custom LLM providers, embedders, and vector stores while maintaining a consistent API for brain creation and querying.

Source: https://github.com/QuivrHQ/quivr / Human Manual

Core Components

Related topics: System Architecture Overview, Brain Class, LLM Integration

Quivr is an open-source RAG (Retrieval-Augmented Generation) framework that enables users to create "brains" from various document types and query them using natural language. The core components form the architectural foundation that powers document processing, vector storage, retrieval, and LLM-based answer generation.

Architecture Overview

Quivr's architecture follows a modular design with clear separation of concerns across several key layers:

graph TD
    A[User Query] --> B[Brain.ask]
    B --> C[RetrievalConfig]
    C --> D[QuivrQARAGLangGraph]
    D --> E[Vector Store Retrieval]
    E --> F[Context Chunks]
    F --> G[LLM Generation]
    G --> H[Streaming Response]
    
    I[Files] --> J[Processor]
    J --> K[ProcessedDocument]
    K --> L[Vector DB Indexing]
    L --> E
    
    M[LLM Config] --> G
    N[Embedder] --> L

The system is built around three primary abstractions:

| Component | Purpose | Key Files |
| --- | --- | --- |
| Brain | Central orchestrator managing files, vectors, and LLM interactions | brain.py |
| RAG Engine | Handles retrieval and answer generation pipeline | quivr_rag.py |
| Processors | Parse and chunk various file formats | processor_base.py |

Brain Class

Related topics: RAG Implementation, File Processing and Parsers, Storage System

The Brain class is the central abstraction in Quivr for building Retrieval Augmented Generation (RAG) systems. It serves as the primary interface for creating knowledge bases from documents, performing semantic search, and generating AI-powered answers based on retrieved context.

Overview

The Brain class encapsulates:

  • Vector storage for semantic document indexing
  • LLM integration for answer generation
  • Embedder configuration for document vectorization
  • File processing pipeline for document ingestion
  • Chat history management for conversational context

A Brain can be created from files (PDF, TXT, Markdown, etc.) or from LangChain documents, then queried to generate context-aware responses.

Architecture

graph TD
    A[Files / Documents] --> B[Brain Class]
    B --> C[Vector DB]
    B --> D[LLM]
    B --> E[Embedder]
    B --> F[Storage]
    G[Query] --> B
    B --> H[QuivrQARAGLangGraph]
    H --> I[Answer]
    C --> H

Creating a Brain

From Files

The most common way to create a Brain is from a collection of files:

from quivr_core import Brain

brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)

Source: examples/simple_question/simple_question.py:1-14

From LangChain Documents

For programmatic document creation:

from langchain_core.documents import Document
from quivr_core import Brain

documents = [Document(page_content="Hello, world!")]
brain = await Brain.afrom_langchain_documents(name="My Brain", langchain_documents=documents)

Source: core/quivr_core/brain/brain.py:1-150

With Custom LLM and Embedder

Override default configurations with custom implementations:

from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.rag.entities.config import LLMEndpointConfig
from quivr_core.llm.llm_endpoint import LLMEndpoint

brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)

Source: examples/pdf_parsing_tika.py:1-20

Core Methods

Asking Questions

#### Synchronous

answer = brain.ask(
    "what is gold? answer in french",
    retrieval_config=retrieval_config
)
print("answer:", answer)

#### Asynchronous Streaming

async for chunk in brain.ask_streaming("What is the meaning of life?"):
    print(chunk.answer)

Source: core/quivr_core/brain/brain.py:150-250

#### Async Search

results = await brain.asearch(
    query="your search query",
    n_results=5,
    filter=None,
    fetch_n_neighbors=20
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | `str \| Document` | Required | The search query |
| n_results | int | 5 | Number of results to return |
| filter | `Callable \| Dict \| None` | None | Optional filter for results |
| fetch_n_neighbors | int | 20 | Number of neighbors to fetch |

Source: core/quivr_core/brain/brain.py:100-120

Storage and Persistence

Saving a Brain

Brains can be persisted to disk for later reuse:

save_path = await brain.save("/home/user/.local/quivr")

Source: examples/save_load_brain.py:1-22

Loading a Brain

brain_loaded = Brain.load(save_path)
brain_loaded.print_info()

Source: examples/save_load_brain.py:18

File Processing

QuivrFile Entity

The Brain processes files through the QuivrFile dataclass:

class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]

Source: core/quivr_core/files/file.py:20-35

File Metadata

During processing, each chunk receives metadata including:

| Field | Description |
| --- | --- |
| chunk_index | Position of the chunk in the document |
| quivr_core_version | Version of quivr-core used |
| language | Detected language of the content |
| original_file_name | Source filename |

Source: core/quivr_core/processor/processor_base.py:1-50

Retrieval Configuration

The RetrievalConfig controls the RAG pipeline behavior:

from quivr_core.config import RetrievalConfig

config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)

YAML Configuration Example

workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: [...]
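
The node and edge layout above describes a directed graph. As a rough illustration of how such a layout can be walked once it is loaded into a Python dictionary, here is a minimal sketch. It is not how quivr-core parses the file: the `linear_path` helper is hypothetical, and the `example_retrieve` and `END` nodes are placeholders standing in for the edges the YAML above elides.

```python
# Hypothetical in-memory form of a workflow config. The nodes after
# "rewrite" are illustrative placeholders, not taken from the YAML above.
workflow_config = {
    "name": "standard RAG",
    "nodes": [
        {"name": "START", "edges": ["filter_history"]},
        {"name": "filter_history", "edges": ["rewrite"]},
        {"name": "rewrite", "edges": ["example_retrieve"]},
        {"name": "example_retrieve", "edges": ["END"]},
    ],
}


def linear_path(config: dict, start: str = "START", end: str = "END") -> list[str]:
    """Follow the first outgoing edge of each node from start until end."""
    edges = {node["name"]: node["edges"] for node in config["nodes"]}
    path, current, seen = [start], start, {start}
    while current != end:
        if current not in edges or not edges[current]:
            raise ValueError(f"node {current!r} has no outgoing edge")
        current = edges[current][0]
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        path.append(current)
    return path


print(linear_path(workflow_config))
```

Walking the graph this way is a quick sanity check that every node is reachable and that the workflow terminates before the config is handed to the RAG pipeline.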

Complete Usage Example

import tempfile
from quivr_core import Brain

with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()

    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )

    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)

Source: README.md

Attributes Summary

| Attribute | Type | Description |
| --- | --- | --- |
| id | UUID | Unique identifier for the brain |
| name | str | Human-readable name |
| vector_db | VectorStore | Vector storage for embeddings |
| llm | LLMEndpoint | Language model for generation |
| embedder | Embeddings | Embedding model for vectorization |
| storage | BrainStorage | Persistence layer |

Module Export

The Brain class is exported from the main quivr_core package:

from quivr_core import Brain

Source: core/quivr_core/brain/__init__.py:1-5

RAG Implementation

Related topics: Brain Class

Overview

The RAG (Retrieval-Augmented Generation) implementation in Quivr is a flexible, opinionated framework for building question-answering pipelines. It combines vector-based document retrieval with Large Language Model (LLM) generation to answer questions over uploaded documents and files.

The implementation supports both synchronous and streaming responses, configurable retrieval strategies, and integration with multiple LLM providers including OpenAI, Anthropic, Mistral, and local models via Ollama.

Architecture

graph TD
    A[User Query] --> B[Brain.ask]
    B --> C[RAG Pipeline]
    C --> D[Retrieval Phase]
    C --> E[Generation Phase]
    D --> F[Vector DB Search]
    F --> G[Relevant Chunks]
    E --> H[LLM Processing]
    G --> H
    H --> I[Streaming Response]
    I --> J[ParsedRAGChunkResponse]

Core Components

Brain Class

The Brain class serves as the main entry point for RAG operations. It orchestrates document processing, retrieval, and generation.

Key Methods:

| Method | Description | Return Type |
| --- | --- | --- |
| from_files() | Create a brain from file paths | Brain |
| afrom_langchain_documents() | Create a brain from LangChain documents | Brain |
| ask() | Synchronous question answering | RAGResponse |
| asearch() | Search for relevant documents | list[SearchResult] |

Source: core/quivr_core/brain/brain.py:1-100

RAG Pipeline Flow

sequenceDiagram
    participant User
    participant Brain
    participant Retriever
    participant VectorDB
    participant LLM
    participant Response
    
    User->>Brain: ask(question)
    Brain->>Retriever: retrieve(query)
    Retriever->>VectorDB: similarity_search()
    VectorDB-->>Retriever: relevant_chunks
    Retriever-->>Brain: context_chunks
    Brain->>LLM: generate(context, question)
    LLM-->>Response: ParsedRAGChunkResponse
    Response-->>User: Streaming Answer

Streaming Response System

ParsedRAGChunkResponse

The streaming response is built around the ParsedRAGChunkResponse data class, which provides incremental chunks of the generated answer along with metadata.

Chunk Structure:

| Field | Type | Description |
| --- | --- | --- |
| answer | str | The partial or complete answer text |
| metadata | dict | Sources and additional context |
| last_chunk | bool | Indicates final chunk of response |
| sources | list | Retrieved document sources |
| chunk_id | int | Sequence number of the chunk |

Source: core/quivr_core/rag/quivr_rag.py:1-50

Answer Streaming Implementation

The answer_astream method implements asynchronous streaming of LLM responses:

async def answer_astream(self, query: str, ...) -> AsyncGenerator[ParsedRAGChunkResponse]:
    # Processing logic yields ParsedRAGChunkResponse chunks
    yield ParsedRAGChunkResponse(
        answer="",
        metadata=get_chunk_metadata(rolling_message, sources),
        last_chunk=True,
    )

Streaming Characteristics:

  • Yields multiple ParsedRAGChunkResponse objects during generation
  • Each chunk contains accumulated rolling_message content
  • Uses chunk_id for tracking sequence order
  • Final chunk marked with last_chunk=True
  • Metadata includes retrieved sources for citation

Source: core/quivr_core/rag/quivr_rag.py:50-100
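
The producer and consumer sides of this streaming pattern can be sketched with the standard library alone. The `Chunk` dataclass below is a hypothetical stand-in for ParsedRAGChunkResponse, and `fake_answer_stream` replaces the real LLM call. For brevity each chunk carries one new piece of text rather than the accumulated rolling message.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator


@dataclass
class Chunk:
    """Hypothetical stand-in for ParsedRAGChunkResponse."""
    answer: str
    last_chunk: bool = False
    metadata: dict = field(default_factory=dict)


async def fake_answer_stream(tokens: list[str], sources: list[str]) -> AsyncIterator[Chunk]:
    """Yield one chunk per token, then an empty final chunk carrying the sources."""
    for token in tokens:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield Chunk(answer=token)
    yield Chunk(answer="", last_chunk=True, metadata={"sources": sources})


async def collect(tokens: list[str], sources: list[str]) -> tuple[str, list[str]]:
    """Consume the stream: join partial answers, read sources from the final chunk."""
    parts: list[str] = []
    found_sources: list[str] = []
    async for chunk in fake_answer_stream(tokens, sources):
        if chunk.last_chunk:
            found_sources = chunk.metadata["sources"]
        else:
            parts.append(chunk.answer)
    return "".join(parts), found_sources


text, srcs = asyncio.run(collect(["Gold ", "is ", "a metal."], ["doc.txt"]))
print(text, srcs)
```

Keeping the sources on the final chunk lets a client render text as it arrives and attach citations once generation has finished.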

Retrieval Configuration

RetrievalConfig

The retrieval behavior is controlled through RetrievalConfig which supports YAML-based configuration:

workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"

Source: README.md

Configuration Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_results | int | 5 | Number of documents to retrieve |
| fetch_n_neighbors | int | 20 | Number of neighbors to fetch from vector DB |
| filter | `Callable \| Dict` | None | Optional metadata filtering |

Source: core/quivr_core/rag/entities/config.py

Search Operation

Async Search Method

async def asearch(
    self,
    query: str | Document,
    n_results: int = 5,
    filter: Callable | Dict[str, Any] | None = None,
    fetch_n_neighbors: int = 20,
) -> list[SearchResult]:

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | `str \| Document` | Yes | The search query |
| n_results | int | No | Maximum results to return |
| filter | `Callable \| Dict` | No | Metadata filter condition |
| fetch_n_neighbors | int | No | Initial fetch size before re-ranking |

Source: core/quivr_core/brain/brain.py:50-80
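
The relationship between fetch_n_neighbors and n_results is easier to see in a toy version. The sketch below over-fetches candidates, applies an optional filter, and then truncates. It is an assumption about the general pattern, not the quivr-core implementation: the `search` function, the `Record` alias, and the order in which filtering and truncation happen are all illustrative.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]


def search(
    scored: list[tuple[float, Record]],
    n_results: int = 5,
    filter: Optional[Callable[[Record], bool]] = None,
    fetch_n_neighbors: int = 20,
) -> list[Record]:
    """Take the top fetch_n_neighbors by score, filter them, keep n_results."""
    candidates = sorted(scored, key=lambda pair: pair[0], reverse=True)[:fetch_n_neighbors]
    records = [record for _, record in candidates]
    if filter is not None:
        records = [record for record in records if filter(record)]
    return records[:n_results]


corpus = [
    (0.9, {"id": 1, "language": "fr"}),
    (0.8, {"id": 2, "language": "en"}),
    (0.7, {"id": 3, "language": "en"}),
    (0.6, {"id": 4, "language": "fr"}),
]
print(search(corpus, n_results=1, filter=lambda r: r["language"] == "en"))
```

Fetching more neighbors than are finally returned leaves room for the filter to discard candidates without starving the result list.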

Document Processing Pipeline

Processor Base

The processor_base.py handles document chunking and metadata enrichment:

async def process_file(self, file: QuivrFile) -> ProcessedDocument:
    # Delegate format-specific parsing and chunking to the concrete processor
    docs = await self.process_file_inner(file)
    qvr_version = version("quivr-core")

    # Enrich every chunk; keys unpacked later override earlier ones
    for idx, doc in enumerate(docs.chunks, start=1):
        doc.metadata = {
            "chunk_index": idx,
            "quivr_core_version": qvr_version,
            "language": detect_language(text=...).value,
            **file.metadata,
            **doc.metadata,
        }
    return docs

Metadata Enrichment:

  • chunk_index: Sequential position of chunk
  • quivr_core_version: Version of quivr-core
  • language: Auto-detected language of content
  • Original filename embedded in content for reference

Source: core/quivr_core/processor/processor_base.py:1-50
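
One detail of the metadata dictionary is worth spelling out: when a dict literal unpacks several mappings, later keys override earlier ones. The sketch below reproduces only that merge order with plain dictionaries. The `enrich` helper and the sample values are hypothetical.

```python
def enrich(chunk_metadata: dict, file_metadata: dict, idx: int, version: str, language: str) -> dict:
    """Merge metadata in the same order as the excerpt above."""
    return {
        "chunk_index": idx,
        "quivr_core_version": version,
        "language": language,
        **file_metadata,   # overrides the three computed keys on conflict
        **chunk_metadata,  # overrides everything before it on conflict
    }


merged = enrich(
    chunk_metadata={"language": "fr", "page": 3},
    file_metadata={"original_file_name": "a.pdf", "language": "en"},
    idx=1,
    version="0.0.0",
    language="de",
)
print(merged["language"])
```

With this ordering, a value already present on the chunk takes precedence over the file-level value and over the freshly detected language.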

Prompt Templates

User Prompt Template

The prompt system uses structured templates with metadata injection:

<user_metadata>
{user_metadata}
</user_metadata>

<ticket_metadata>
{ticket_metadata}
</ticket_metadata>

<similar_tickets>
{similar_tickets}
</similar_tickets>

<ticket_history>
{ticket_history}
</ticket_history>

<additional_information>
{additional_information}
</additional_information>

<client_query>
{client_query}
</client_query>

Source: core/quivr_core/rag/prompts.py
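
To illustrate how a template with these sections might be filled, here is a minimal sketch built on plain string formatting. The `render_prompt` helper is hypothetical, and the real prompts in quivr-core are presumably defined through LangChain prompt classes rather than this function.

```python
SECTIONS = [
    "user_metadata",
    "ticket_metadata",
    "similar_tickets",
    "ticket_history",
    "additional_information",
    "client_query",
]


def render_prompt(**values: str) -> str:
    """Wrap each section value in its tag; missing sections stay empty."""
    blocks = []
    for name in SECTIONS:
        body = values.get(name, "")
        blocks.append(f"<{name}>\n{body}\n</{name}>")
    return "\n\n".join(blocks)


prompt = render_prompt(client_query="What is gold?", ticket_history="(none)")
print(prompt)
```

Emitting every section, even when empty, keeps the prompt shape stable so the model always sees the same tag order.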

Default Instructions

The system includes default instructions that guide response generation:

| Instruction | Description |
| --- | --- |
| Conciseness | Use same level of detail as similar responses |
| Formatting | Proper paragraphs, bold, italic for readability |
| Language | Respond in same language as user query |
| Consistency | Maintain terminology consistency |
| No Signature | Signature added separately after response |

LangGraph Integration

The implementation includes a LangGraph-based RAG pipeline (quivr_rag_langgraph.py) for more complex workflows:

graph LR
    A[Query] --> B[History Filter]
    B --> C[Query Rewrite]
    C --> D[Retrieval]
    D --> E[Answer Generation]
    E --> F[Response]

Features:

  • Multi-step processing pipelines
  • Conversation history integration
  • Query rewriting for better retrieval
  • Customizable workflow nodes

Source: core/quivr_core/rag/quivr_rag_langgraph.py

LLM Configuration

LLMEndpointConfig

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | Model identifier |
| llm_base_url | str | API endpoint URL |
| temperature | float | Generation temperature (default: 0.7) |

Source: core/quivr_core/rag/entities/config.py

Supported Providers

  • OpenAI: GPT-4, GPT-3.5-turbo
  • Anthropic: Claude models
  • Mistral: Mistral AI models
  • Ollama: Local model support

Usage Example

from quivr_core import Brain
from quivr_core.config import RetrievalConfig

# Create brain from files
brain = Brain.from_files(
    name="my_brain",
    file_paths=["./document.pdf", "./notes.txt"],
)

# Configure retrieval
retrieval_config = RetrievalConfig.from_yaml("./workflow.yaml")

# Ask a question
answer = brain.ask(
    "What is the main topic of these documents?",
    retrieval_config=retrieval_config
)

print(answer.answer)

Source: examples/simple_question/simple_question.py

Key Design Patterns

PatternImplementation
Async/AwaitAll I/O operations are asynchronous
StreamingResponses streamed via async generators
Dependency InjectionLLM and embedder are configurable
Configuration-drivenWorkflows defined via YAML
Metadata EnrichmentAutomatic language detection and versioning

LLM Integration

Related topics: Core Components, System Architecture Overview

Overview

The LLM Integration module in Quivr provides a flexible, abstracted interface for interacting with Large Language Models (LLMs) from various providers. This module enables Quivr's RAG (Retrieval-Augmented Generation) pipeline to leverage different LLM backends without requiring changes to the core business logic. The integration supports OpenAI, Anthropic, Mistral, Meta (Llama), Groq, and local models via Ollama.

The primary goals of the LLM Integration are:

  • Provider Abstraction: Uniform API regardless of the underlying LLM provider
  • Configuration Management: Centralized configuration for model parameters, token limits, and provider-specific settings
  • Runtime Flexibility: Support for both synchronous and asynchronous operations
  • Embeddings Integration: Seamless integration with embedding models for semantic search

Source: core/quivr_core/llm/llm_endpoint.py

Architecture

High-Level Components

The LLM Integration consists of several key components:

| Component | Purpose | Location |
| --- | --- | --- |
| LLMEndpoint | Main wrapper class for LLM interactions | quivr_core/llm/llm_endpoint.py |
| LLMEndpointConfig | Configuration dataclass for LLM settings | quivr_core/rag/entities/config.py |
| LLMConfig | Per-model configuration with token limits | quivr_core/rag/entities/config.py |
| DefaultModelSuppliers | Enum for supported LLM providers | quivr_core/rag/entities/config.py |

Class Diagram

classDiagram
    class LLMEndpoint {
        +llm: BaseChatModel
        +llm_config: LLMEndpointConfig
        +__init__(llm, llm_config)
        +get_client() BaseChatModel
    }
    
    class LLMEndpointConfig {
        +model: str
        +llm_base_url: str
        +temperature: float
        +max_output_tokens: int
        +model_type: LLMEndpointType
    }
    
    class LLMConfig {
        +max_context_tokens: int
        +max_output_tokens: int
        +tokenizer_hub: str
    }
    
    class DefaultModelSuppliers {
        <<enumeration>>
        OPENAI
        ANTHROPIC
        MISTRAL
        META
        GROQ
        OLLAMA
    }
    
    LLMEndpoint --> LLMEndpointConfig
    LLMEndpointConfig ..> DefaultModelSuppliers

Sources: core/quivr_core/llm/llm_endpoint.py, core/quivr_core/rag/entities/config.py

Configuration

LLMEndpointConfig Parameters

The LLMEndpointConfig class provides configuration for LLM endpoints:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | Required | Model identifier (e.g., "gpt-4o", "claude-3-opus") |
| llm_base_url | str | "local" | Base URL for the LLM API endpoint |
| temperature | float | 0.7 | Sampling temperature for generation (0.0-2.0) |
| max_output_tokens | int | 4096 | Maximum tokens in the generated response |
| model_type | LLMEndpointType | LLMEndpointType.CHAT | Type of LLM endpoint |

Source: core/quivr_core/rag/entities/config.py

Default Model Configurations

Quivr provides pre-configured settings for various models through the DefaultModelSuppliers enum and associated LLMConfig dictionaries:

| Provider | Model | Max Context Tokens | Max Output Tokens | Tokenizer Hub |
| --- | --- | --- | --- | --- |
| OpenAI | gpt-4o | 128,000 | 16,384 | Quivr/claude-tokenizer |
| OpenAI | gpt-4-turbo | 128,000 | 4,096 | Quivr/claude-tokenizer |
| Anthropic | claude-3.5-sonnet | 200,000 | 8,192 | Quivr/claude-tokenizer |
| Anthropic | claude-3-opus | 200,000 | 4,096 | Quivr/claude-tokenizer |
| Mistral | mistral-large | 32,000 | N/A | - |
| Meta | llama-3.1 | 128,000 | 4,096 | Quivr/Meta-Llama-3.1-Tokenizer |
| Meta | llama-3 | 8,192 | 2,048 | Quivr/llama3-tokenizer-new |
| Groq | llama-3.3-70b | 128,000 | 32,768 | Quivr/Meta-Llama-3.1-Tokenizer |

Source: core/quivr_core/rag/entities/config.py

Environment Variables

API keys should be set as environment variables before initializing the LLM:

export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Example usage in code:

import os
os.environ["OPENAI_API_KEY"] = "myopenai_apikey"

Source: README.md

Usage Patterns

Basic Integration with Brain

The most common pattern is to pass an LLMEndpoint instance when creating a Brain:

import os

from langchain_openai import ChatOpenAI
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

# The API key is read from the environment rather than hardcoded
brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./documents/*.pdf"],
    llm=LLMEndpoint(
        llm_config=LLMEndpointConfig(model="gpt-4o"),
        llm=ChatOpenAI(model="gpt-4o", api_key=str(os.getenv("OPENAI_API_KEY"))),
    ),
)

Source: examples/simple_question_megaparse.py

Using Fake LLM for Testing

For testing purposes, Quivr supports fake LLM implementations:

from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
from langchain_core.embeddings import DeterministicFakeEmbedding

brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)

Source: examples/pdf_parsing_tika.py

Custom Embeddings Integration

When using custom LLM configurations, you can also specify custom embedders:

import os

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

# LLM Configuration
llm = LLMEndpoint(
    llm_config=LLMEndpointConfig(model="gpt-4o"),
    llm=ChatOpenAI(model="gpt-4o", api_key=str(os.getenv("OPENAI_API_KEY"))),
)

# Embedder Configuration
embedder = OpenAIEmbeddings(model="text-embedding-3-large")

brain = Brain.from_files(
    name="custom_brain",
    file_paths=["./data/documents/"],
    llm=llm,
    embedder=embedder,
)

Source: examples/simple_question_megaparse.py

Supported Providers

OpenAI

OpenAI models are supported through the langchain_openai package:

from langchain_openai import ChatOpenAI
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

llm = LLMEndpoint(
    llm=ChatOpenAI(model="gpt-4o", api_key=api_key),
    llm_config=LLMEndpointConfig(
        model="gpt-4o",
        temperature=0.7,
    ),
)

Supported models: gpt-4o, gpt-4-turbo, gpt-3.5-turbo

Anthropic

Anthropic models require the langchain-anthropic package:

from langchain_anthropic import ChatAnthropic
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

# api_key is assumed to hold your Anthropic API key
llm = LLMEndpoint(
    llm=ChatAnthropic(model="claude-3-5-sonnet-20241022", anthropic_api_key=api_key),
    llm_config=LLMEndpointConfig(model="claude-3.5-sonnet"),
)

Mistral

Mistral models are supported via their API:

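from langchain_mistralai import ChatMistralAI
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

# api_key is assumed to hold your Mistral API key
llm = LLMEndpoint(
    llm=ChatMistralAI(model="mistral-large-latest", mistral_api_key=api_key),
    llm_config=LLMEndpointConfig(model="mistral-large"),
)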

Local Models (Ollama)

For local inference using Ollama:

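from langchain_ollama import ChatOllama
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig

# Assumes an Ollama server is listening on its default local port
llm = LLMEndpoint(
    llm=ChatOllama(model="llama3.1", base_url="http://localhost:11434"),
    llm_config=LLMEndpointConfig(model="llama-3.1", llm_base_url="http://localhost:11434"),
)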

Source: README.md

RAG Workflow Integration

The LLM Integration is a core component of Quivr's RAG pipeline. When processing a user query, the LLM is responsible for generating the final answer based on retrieved context.

graph TD
    A[User Query] --> B[Brain.ask]
    B --> C[Vector Search]
    C --> D[Retrieve Relevant Chunks]
    D --> E[LLMEndpoint]
    E --> F[Generate Answer]
    F --> G[RAG Response]
    
    H[LLMEndpointConfig] --> E
    I[Chat History] --> E
    J[System Prompt] --> E

The LLM receives contextual information from the retrieval step and generates responses that are then formatted and returned to the user. The quivr_rag_langgraph.py module orchestrates this workflow using LangGraph for complex graph-based processing.

Sources: core/quivr_core/rag/quivr_rag.py, core/quivr_core/rag/quivr_rag_langgraph.py

Default LLM Fallback

If no LLM is explicitly provided when creating a Brain, Quivr automatically initializes a default LLM:

# From brain.py source code
if llm is None:
    llm = default_llm()

This fallback mechanism ensures that users can get started quickly without explicit configuration, while still allowing advanced users to customize their LLM setup.

Source: core/quivr_core/brain/brain.py
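
The fallback is an instance of optional dependency injection: the caller may supply a collaborator, and a default is built only when none is given. A minimal sketch of the pattern follows. `ChatModel`, `EchoModel`, `default_model`, and `build_brain` are hypothetical names used for illustration and do not exist in quivr-core.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class ChatModel(Protocol):
    """Anything with a complete() method can act as the model."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Hypothetical model that echoes the prompt with a prefix."""
    prefix: str = "default"

    def complete(self, prompt: str) -> str:
        return f"[{self.prefix}] {prompt}"


def default_model() -> ChatModel:
    """Build the model used when the caller passes nothing."""
    return EchoModel()


def build_brain(name: str, llm: Optional[ChatModel] = None) -> dict:
    """Use the injected model if present, otherwise fall back to the default."""
    if llm is None:
        llm = default_model()
    return {"name": name, "llm": llm}


print(build_brain("quick")["llm"].complete("hi"))
print(build_brain("custom", EchoModel("mine"))["llm"].complete("hi"))
```

Constructing the default lazily, inside the `None` check, means no default client is created when a custom one is supplied.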

Best Practices

API Key Security

  • Store API keys in environment variables, never hardcode them
  • Use .env files with proper .gitignore entries for local development
  • Consider using secret management services (AWS Secrets Manager, HashiCorp Vault) for production

Model Selection

  • Choose models based on your use case requirements (speed vs. quality)
  • Use smaller models for simple queries to reduce costs
  • Reserve larger models (e.g., GPT-4o, Claude 3.5) for complex reasoning tasks

Token Management

  • Monitor max_context_tokens to avoid exceeding model limits
  • Set appropriate max_output_tokens based on expected response length
  • Use the RetrievalConfig to control how many chunks are fed to the LLM
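
The token budget arithmetic behind these points can be sketched as follows. This is a simplified illustration, not quivr-core code: `fit_chunks` is a hypothetical helper, tokens are approximated by whitespace-separated words, and the sketch assumes the context window must hold the prompt, the retrieved chunks, and the generated output together.

```python
def fit_chunks(
    chunks: list[str],
    max_context_tokens: int,
    max_output_tokens: int,
    prompt_tokens: int = 0,
) -> list[str]:
    """Keep chunks in order until the remaining context budget is used up."""
    budget = max_context_tokens - max_output_tokens - prompt_tokens
    kept: list[str] = []
    used = 0
    for chunk in chunks:
        cost = len(chunk.split())  # crude word count standing in for a tokenizer
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


print(fit_chunks(["a b c", "d e", "f g h i"], max_context_tokens=10, max_output_tokens=4))
```

A real pipeline would count tokens with the model's own tokenizer, but the subtraction that produces the budget stays the same.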

Temperature Settings

| Temperature | Use Case | Characteristics |
| --- | --- | --- |
| 0.0 - 0.3 | Factual/Deterministic | More focused, consistent responses |
| 0.4 - 0.7 | General Purpose | Balanced creativity and consistency |
| 0.8 - 1.0 | Creative/Brainstorming | More varied, potentially surprising outputs |

Troubleshooting

Common Issues

  1. API Key Not Found

Error: OPENAI_API_KEY environment variable not set

Solution: Ensure the API key is set before running the script:

export OPENAI_API_KEY="your-key"

  2. Model Context Limit Exceeded

Solution: Reduce the number of retrieved chunks or adjust the chunk size used during processing.

  3. Rate Limiting

Solution: Implement retry logic or use rate-limiting middleware.

Source: README.md
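
For the rate-limiting case, retry logic usually means exponential backoff. The sketch below shows one way to write it with the standard library. `RateLimitError` is a hypothetical placeholder for whatever exception your provider's client raises, and `with_retries` is not part of quivr-core.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, doubling the wait after each rate-limit failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A call such as `with_retries(lambda: brain.ask("..."))` would wrap a question in this policy, provided the lambda translates the provider's error into `RateLimitError`. The `sleep` parameter is injectable so the backoff schedule can be tested without waiting.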

File Processing and Parsers

Related topics: Storage System

Quivr's file processing system provides an extensible architecture for ingesting, parsing, and chunking various document formats for use in Retrieval-Augmented Generation (RAG) workflows. The processor subsystem sits at the core of Quivr's data ingestion pipeline, transforming raw files into structured document chunks ready for embedding and retrieval.

Architecture Overview

The file processing architecture follows a plugin-based design pattern where processors are registered by file extension and lazily loaded on demand. This separation of concerns allows new file format support to be added without modifying core processing logic.

graph TD
    A[QuivrFile Input] --> B[Processor Registry]
    B --> C{File Extension}
    C -->|.txt| D[SimpleTxtProcessor]
    C -->|.pdf| E[MegaparseProcessor]
    C -->|Other| F[Default Processors]
    D --> G[ProcessedDocument]
    E --> G
    F --> G
    G --> H[Document Chunks]
    H --> I[RAG Pipeline]

Core Components

| Component | Purpose | Location |
| --- | --- | --- |
| ProcessorBase | Abstract base class for all processors | processor_base.py |
| ProcessedDocument | Generic container for processing results | processor_base.py |
| ProcessorRegistry | Maps extensions to processor classes | registry.py |
| SplitterConfig | Configuration for text chunking | splitter.py |

Source: processor_base.py:1-75, registry.py

Storage System

Related topics: File Processing and Parsers, Brain Class

Overview

The Storage System in Quivr is a modular abstraction layer responsible for managing file uploads, storage, and retrieval across different storage backends. It provides a clean separation between the RAG (Retrieval-Augmented Generation) logic and the underlying file persistence mechanisms, enabling Quivr to work with various storage implementations while maintaining a consistent API.

The system is designed to handle:

  • File upload and deduplication using SHA-1 hashing
  • Asynchronous file access
  • Multiple storage backends (currently supporting local filesystem)
  • File metadata tracking and organization by brain

Source: core/quivr_core/storage/storage_base.py:1-47
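
Hash-based deduplication, the first item in the list above, can be sketched in a few lines with `hashlib`. The `DedupStore` class is hypothetical, and the behaviour of `exists_ok` shown here (skip silently versus raise) is an assumption based on the parameter name rather than on the quivr-core source.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that refuses byte-identical uploads."""

    def __init__(self) -> None:
        self.hashes: set[str] = set()
        self.files: list[tuple[str, str]] = []  # (name, sha1)

    def upload(self, name: str, content: bytes, exists_ok: bool = False) -> bool:
        """Store the file and return True, or return False for a tolerated duplicate."""
        sha1 = hashlib.sha1(content).hexdigest()
        if sha1 in self.hashes:
            if exists_ok:
                return False
            raise FileExistsError(f"{name} duplicates an already stored file")
        self.hashes.add(sha1)
        self.files.append((name, sha1))
        return True


store = DedupStore()
print(store.upload("a.txt", b"same bytes"))
print(store.upload("copy.txt", b"same bytes", exists_ok=True))
```

Because the hash is computed over the content, a renamed copy of the same file is still detected as a duplicate.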

Architecture

High-Level Architecture

graph TD
    A[Brain] --> B[StorageBase]
    B --> C[LocalStorage]
    B --> D[Future: CloudStorage]
    B --> E[Future: S3Storage]
    C --> F[QuivrFile]
    C --> G[File System]
    F --> H[Metadata]
    F --> I[SHA-1 Hash]

Component Hierarchy

classDiagram
    class StorageBase {
        <<abstract>>
        +name: str
        +nb_files() int
        +get_files() List~QuivrFile~
        +upload_file(file, exists_ok)*
    }
    
    class LocalStorage {
        +name: str = "local_storage"
        +files: List~QuivrFile~
        +hashes: Set~str~
        +copy_flag: bool
        +dir_path: Path
    }
    
    class QuivrFile {
        +id: UUID
        +brain_id: UUID
        +path: Path
        +original_filename: str
        +file_size: int
        +file_extension: FileExtension
        +file_sha1: str
        +additional_metadata: dict
    }
    
    StorageBase <|-- LocalStorage

Core Components

StorageBase (Abstract Base Class)

The StorageBase is an abstract base class that defines the contract for all storage implementations. It enforces a consistent interface across different storage backends through Python's ABC (Abstract Base Class) mechanism.

Source: core/quivr_core/storage/storage_base.py:19-47

#### Required Subclass Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| name | str | Human-readable name of the storage type |

#### Abstract Methods

| Method | Return Type | Description |
| --- | --- | --- |
| nb_files() | int | Returns the total number of files stored |
| get_files() | List[QuivrFile] | Asynchronously retrieves all files in storage |
| upload_file(file, exists_ok) | None | Uploads a file to the storage |

The base class enforces that subclasses must define the name attribute through __init_subclass__:

def __init_subclass__(cls, **kwargs):
    for required in ("name",):
        if not getattr(cls, required):
            raise TypeError(
                f"Can't instantiate abstract class {cls.__name__} without {required} attribute defined"
            )
    return super().__init_subclass__(**kwargs)

Source: core/quivr_core/storage/storage_base.py:19-27
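
The check above runs when a subclass is defined, not when it is instantiated. The following is a minimal sketch of that pattern on a stand-in base class. `SketchStorageBase`, `NamedStorage` and `define_unnamed_storage` are hypothetical names, not part of quivr-core, and the sketch assumes `getattr(cls, required, None)` so that a missing attribute surfaces as `TypeError` rather than `AttributeError`.

```python
from abc import ABC, abstractmethod


class SketchStorageBase(ABC):
    """Hypothetical stand-in for StorageBase; not the quivr-core class."""

    name: str

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass definition.
        for required in ("name",):
            # Assumption of this sketch: default to None so that a
            # missing attribute raises TypeError, not AttributeError.
            if not getattr(cls, required, None):
                raise TypeError(
                    f"Can't define {cls.__name__} without {required} attribute"
                )
        super().__init_subclass__(**kwargs)

    @abstractmethod
    def nb_files(self) -> int: ...


class NamedStorage(SketchStorageBase):
    name = "named_storage"

    def nb_files(self) -> int:
        return 0


def define_unnamed_storage() -> type:
    # Defining the class is enough to trigger the check.
    class UnnamedStorage(SketchStorageBase):
        def nb_files(self) -> int:
            return 0

    return UnnamedStorage
```

Calling `define_unnamed_storage()` raises `TypeError` before any instance exists, which is why a misconfigured backend fails at import time.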

LocalStorage Implementation

LocalStorage is the concrete implementation of StorageBase that persists files to the local filesystem.

Source: core/quivr_core/storage/local_storage.py:1-50

#### Attributes

| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | "local_storage" | Storage type identifier |
| files | List[QuivrFile] | [] | In-memory list of stored files |
| hashes | Set[str] | set() | Set of SHA-1 hashes for deduplication |
| copy_flag | bool | True | If True, copy files; if False, create symlinks |
| dir_path | Path | See below | Directory path for file storage |

#### Directory Resolution

The storage directory is resolved in the following order:

  1. Explicitly provided dir_path argument
  2. Environment variable QUIVR_LOCAL_STORAGE
  3. Default path: ~/.cache/quivr/files

if dir_path is None:
    self.dir_path = Path(
        os.getenv("QUIVR_LOCAL_STORAGE", "~/.cache/quivr/files")
    )
else:
    self.dir_path = dir_path
os.makedirs(self.dir_path, exist_ok=True)

Source: core/quivr_core/storage/local_storage.py:38-46
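
The resolution order can be sketched as a small standalone function. `resolve_storage_dir` is a hypothetical helper, not part of quivr-core. It also calls `expanduser()`, because `pathlib.Path` keeps a leading `~` literal; whether the library expands it elsewhere is not shown in the excerpt above, so treat that step as an assumption of this sketch.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_storage_dir(dir_path: Optional[Path] = None) -> Path:
    """Hypothetical helper mirroring the resolution order described above."""
    if dir_path is None:
        # The environment variable wins over the built-in default.
        raw = os.getenv("QUIVR_LOCAL_STORAGE", "~/.cache/quivr/files")
        dir_path = Path(raw)
    # Assumption of this sketch: expand "~" explicitly, since Path()
    # and os.makedirs() both treat it as a literal directory name.
    return dir_path.expanduser()
```

Passing an explicit path bypasses the environment variable entirely, which is convenient in tests that must not touch the user's cache directory.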

QuivrFile Data Model

QuivrFile represents a file stored in the Quivr system with all associated metadata.

Source: core/quivr_core/files/file.py:48-74

#### Class Slots

The class uses __slots__ for memory-efficient attribute storage:

| Slot | Type | Description |
| --- | --- | --- |
| id | UUID | Unique identifier for the file |
| brain_id | `UUID \| None` | ID of the brain this file belongs to |
| path | Path | Actual filesystem path to the file |
| original_filename | str | Original name of the uploaded file |
| file_size | `int \| None` | Size of the file in bytes |
| file_extension | `FileExtension \| str` | File extension/type |
| file_sha1 | str | SHA-1 hash of the file content |
| additional_metadata | dict | Custom metadata dictionary |

Source: core/quivr_core/files/file.py:49-57

#### Constructor Parameters

def __init__(
    self,
    id: UUID,
    original_filename: str,
    path: Path,
    file_sha1: str,
    file_extension: FileExtension | str,
    brain_id: UUID | None = None,
    file_size: int | None = None,
    metadata: dict[str, Any] | None = None,
) -> None:

Source: core/quivr_core/files/file.py:59-70

#### Async File Access

The QuivrFile class provides an async context manager for reading file contents:

@asynccontextmanager
async def open(self) -> AsyncGenerator[AsyncIterable[bytes], None]:
    f = await aiofiles.open(self.path, mode="rb")
    try:
        yield f
    finally:
        await f.close()

Source: core/quivr_core/files/file.py:76-83
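
Callers consume this context manager with `async with`. The sketch below shows the calling pattern on a stand-in class that uses only the standard library. `SketchFile`, `read_all` and `demo` are hypothetical names; the stand-in opens the file in a worker thread instead of using aiofiles, so reads go through `asyncio.to_thread` here, whereas an aiofiles handle would be read with `await f.read()`.

```python
import asyncio
import tempfile
from contextlib import asynccontextmanager
from pathlib import Path
from typing import AsyncIterator, BinaryIO


class SketchFile:
    """Hypothetical stand-in for QuivrFile; only the path is modeled."""

    def __init__(self, path: Path) -> None:
        self.path = path

    @asynccontextmanager
    async def open(self) -> AsyncIterator[BinaryIO]:
        # Open in a worker thread so the event loop is not blocked.
        f = await asyncio.to_thread(open, self.path, "rb")
        try:
            yield f
        finally:
            # The handle is closed even if the caller's block raises.
            await asyncio.to_thread(f.close)


async def read_all(sfile: SketchFile) -> bytes:
    async with sfile.open() as f:
        return await asyncio.to_thread(f.read)


async def demo() -> bytes:
    # Write a small temporary file and read it back through the sketch.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.bin"
        path.write_bytes(b"hello")
        return await read_all(SketchFile(path))
```

Run it from synchronous code with `asyncio.run(demo())`.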

File Upload Workflow

sequenceDiagram
    participant Client
    participant LocalStorage
    participant FileSystem
    
    Client->>LocalStorage: upload_file(QuivrFile)
    LocalStorage->>LocalStorage: Check SHA-1 hash
    alt file exists and exists_ok=False
        LocalStorage-->>Client: FileExistsError
    else file doesn't exist
        LocalStorage->>FileSystem: Copy or Symlink
        FileSystem-->>LocalStorage: Success
        LocalStorage->>LocalStorage: Update files list
        LocalStorage->>LocalStorage: Add hash to hashes set
    end

Upload Process Details

  1. File Path Construction: Files are stored at {dir_path}/{brain_id}/{file_id}
  2. Deduplication: Before upload, the system checks if a file with the same SHA-1 hash already exists using file_sha1:

```python
if file.file_sha1 in self.hashes and not exists_ok:
    raise FileExistsError(...)
```

  3. Storage Strategy: Based on copy_flag:
    • True: Copy file content to destination
    • False: Create symbolic link to original file
  4. File Registration: After successful upload, the file is added to the internal tracking list.

Source: core/quivr_core/storage/local_storage.py:48-70
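
The upload steps above can be sketched end to end with the standard library. `SketchLocalStorage` and `demo` are hypothetical names, not the quivr-core implementation. For brevity the sketch is synchronous, stores files flat under `dir_path` by their original name rather than under `{brain_id}/{file_id}`, and computes the hash itself.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


class SketchLocalStorage:
    """Hypothetical stand-in for LocalStorage; tracks hashes in memory."""

    def __init__(self, dir_path: Path, copy_flag: bool = True) -> None:
        self.dir_path = dir_path
        self.copy_flag = copy_flag
        self.hashes: set = set()
        self.files: list = []
        os.makedirs(self.dir_path, exist_ok=True)

    def upload(self, src: Path, exists_ok: bool = False) -> Path:
        sha1 = hashlib.sha1(src.read_bytes()).hexdigest()
        # Deduplication: reject known content unless the caller opts in.
        if sha1 in self.hashes and not exists_ok:
            raise FileExistsError(f"file {src.name} already uploaded")
        dst = self.dir_path / src.name
        if not dst.exists():
            # Storage strategy: copy the bytes or link to the original.
            if self.copy_flag:
                shutil.copy2(src, dst)
            else:
                os.symlink(src.resolve(), dst)
        # Registration: remember the stored path and its hash.
        self.files.append(dst)
        self.hashes.add(sha1)
        return dst


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "doc.txt"
        src.write_text("hello")
        storage = SketchLocalStorage(Path(tmp) / "store")
        storage.upload(src)
        try:
            storage.upload(src)  # same content, exists_ok=False
            duplicate_rejected = False
        except FileExistsError:
            duplicate_rejected = True
        return len(storage.files), duplicate_rejected
```

Because the hash set lives in memory, a sketch like this only detects duplicates uploaded through the same instance.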

File Creation Helper

The get_file_path() function creates a QuivrFile instance from a filesystem path:

async def get_file_path(
    path: Path,
    brain_id: UUID | None = None
) -> QuivrFile:

Processing Steps

| Step | Operation |
| --- | --- |
| 1 | Get file size using os.path.getsize() |
| 2 | Compute SHA-1 hash by reading file bytes |
| 3 | Extract or generate UUID for file ID |
| 4 | Determine file extension |
| 5 | Create and return QuivrFile instance |

file_size = os.path.getsize(path)
# aiofiles handles are async context managers, so they need "async with"
async with aiofiles.open(path, mode="rb") as f:
    file_sha1 = hashlib.sha1(await f.read()).hexdigest()

try:
    id = UUID(path.name)
except ValueError:
    id = uuid4()

Source: core/quivr_core/storage/file.py:19-43
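
Steps 2 and 3 can be tried in isolation with the standard library. `sha1_of_file` and `id_from_name` are hypothetical helpers, not quivr-core functions. Unlike the excerpt above, which reads the whole file at once, this sketch hashes in chunks; that is a variation chosen here to bound memory use on large files.

```python
import hashlib
from pathlib import Path
from uuid import UUID, uuid4


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def id_from_name(path: Path) -> UUID:
    """Reuse the file name as the ID when it parses as a UUID."""
    try:
        return UUID(path.name)
    except ValueError:
        # Any other name gets a freshly generated ID.
        return uuid4()
```

A name such as `report.pdf` does not parse as a UUID, so it receives a new random ID on every call.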

Configuration Options

LocalStorage Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dir_path | `Path \| None` | None | Custom storage directory |
| copy_flag | bool | True | File storage method |

Environment Variables

| Variable | Description |
| --- | --- |
| QUIVR_LOCAL_STORAGE | Override default storage directory |

Usage Examples

Creating a LocalStorage Instance

from quivr_core.storage.local_storage import LocalStorage
from pathlib import Path

# Use default directory (~/.cache/quivr/files)
storage = LocalStorage()

# Use custom directory
storage = LocalStorage(dir_path=Path("/my/custom/storage"))

# Use symlinks instead of copying
storage = LocalStorage(dir_path=Path("/mnt/data"), copy_flag=False)

Checking Storage Information

from quivr_core.storage.local_storage import LocalStorage

storage = LocalStorage()

# Get number of files
file_count = storage.nb_files()

# Get storage info
info = storage.info()
# Returns: {"directory_path": "...", "name": "local_storage", "nb_files": N}

Uploading Files

import asyncio
from quivr_core.storage.local_storage import LocalStorage
from quivr_core.files.file import get_file_path
from pathlib import Path

async def upload_example():
    storage = LocalStorage()
    
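    # Create a QuivrFile from a path (awaited because the helper reads the file asynchronously)
    qfile = await get_file_path(Path("./document.pdf"), brain_id=None)
    
    # exists_ok=True suppresses FileExistsError if the same content was already uploaded
    await storage.upload_file(qfile, exists_ok=True)
    
    print(f"Total files: {storage.nb_files()}")

asyncio.run(upload_example())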

Listing All Files

import asyncio
from quivr_core.storage.local_storage import LocalStorage

async def list_files():
    storage = LocalStorage()
    
    files = await storage.get_files()
    
    for f in files:
        print(f"File: {f.original_filename}")
        print(f"  ID: {f.id}")
        print(f"  Size: {f.file_size} bytes")
        print(f"  SHA-1: {f.file_sha1}")

asyncio.run(list_files())

Memory Management

The QuivrFile class uses __slots__ to optimize memory usage by restricting attribute creation:

class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]

This approach:

  • Prevents assignment to attributes that are not listed in __slots__
  • Reduces memory overhead per instance, since no per-instance __dict__ is created
  • Can make attribute access slightly faster

Source: core/quivr_core/files/file.py:48-57
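
The first two effects are easy to observe on a small class. `SlottedFile` and `try_extra_attribute` are hypothetical names used only for illustration; this is a sketch of the general Python behavior, not of the QuivrFile class itself.

```python
class SlottedFile:
    """Hypothetical two-field example; not the QuivrFile class."""

    __slots__ = ["path", "file_sha1"]

    def __init__(self, path: str, file_sha1: str) -> None:
        self.path = path
        self.file_sha1 = file_sha1


def try_extra_attribute(obj: object) -> bool:
    """Return True if an undeclared attribute can be set on obj."""
    try:
        obj.note = "x"  # "note" is not listed in __slots__
        return True
    except AttributeError:
        return False
```

Instances of `SlottedFile` have no `__dict__`, which is where the per-instance memory saving comes from.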

Extending the Storage System

To create a custom storage backend, implement the StorageBase abstract class:

from quivr_core.storage.storage_base import StorageBase
from quivr_core.files.file import QuivrFile

class MyCustomStorage(StorageBase):
    name = "my_custom_storage"
    
    def __init__(self):
        self._files = []
    
    def nb_files(self) -> int:
        return len(self._files)
    
    async def get_files(self) -> list[QuivrFile]:
        return self._files
    
    async def upload_file(self, file: QuivrFile, exists_ok: bool = False) -> None:
        # Custom implementation
        pass

Best Practices

  1. Deduplication: Compute the SHA-1 hash of each file and let the storage check it before uploading, so the same content is not stored twice.
  2. Async Operations: upload_file() and get_files() are coroutines. Call them with await inside an async function, or run them through asyncio.run().
  3. Memory Efficiency: Use __slots__ for custom file classes to reduce memory footprint.
  4. Path Handling: Use pathlib.Path for cross-platform path operations.
  5. Error Handling: Handle FileExistsError when uploading files with exists_ok=False (the default).
  6. Directory Management: Ensure storage directories exist before operations using os.makedirs(path, exist_ok=True).

The storage system interacts with the following components:

| Component | Description |
| --- | --- |
| Brain | Uses Storage for file management |
| Processor System | Processes files from storage |
| RAG System | Retrieves documents from storage for query answering |

Source: https://github.com/QuivrHQ/quivr / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Security or permission risk: EU AI Act Compliance Scan Results - Sharing Findings for Feedback

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: EU AI Act Compliance Scan Results - Sharing Findings for Feedback. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3667

2. Security or permission risk: [Bug]:

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: [Bug]:. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/2004

3. Installation risk: Integration idea: Screenpipe for screen/audio context

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: Integration idea: Screenpipe for screen/audio context. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3658

4. Installation risk: [Bug]: RuntimeError: There is no current event loop in thread 'MainThread' when using Brain.from_files() in script

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: [Bug]: RuntimeError: There is no current event loop in thread 'MainThread' when using Brain.from_files() in script. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3650

5. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:640079149 | https://github.com/QuivrHQ/quivr | README/documentation is current enough for a first validation pass.

6. Project risk: The garbage collector is trying to clean up non-checked-in connection <AdaptedConnection <asyncpg.connection.Connection…

  • Severity: medium
  • Finding: Project risk is backed by a source signal: The garbage collector is trying to clean up non-checked-in connection <AdaptedConnection <asyncpg.connection.Connection…. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3654

7. Project risk: core: v0.0.25

  • Severity: medium
  • Finding: Project risk is backed by a source signal: core: v0.0.25. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.25

8. Project risk: core: v0.0.29

  • Severity: medium
  • Finding: Project risk is backed by a source signal: core: v0.0.29. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.29

9. Project risk: core: v0.0.33

  • Severity: medium
  • Finding: Project risk is backed by a source signal: core: v0.0.33. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.33

10. Maintenance risk: core: v0.0.24

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: core: v0.0.24. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.24

11. Maintenance risk: core: v0.0.26

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: core: v0.0.26. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.26

12. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:640079149 | https://github.com/QuivrHQ/quivr | last_activity_observed missing

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (the count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using quivr with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence