Doramagic Project Pack · Human Manual
quivr
Introduction to Quivr
Related topics: Getting Started, System Architecture Overview
Quivr is an open-source project that helps you build your "second brain" by leveraging the power of Generative AI. It provides a Retrieval-Augmented Generation (RAG) framework that enables users to ingest documents and ask questions about their content using natural language. Source: README.md
The core philosophy of Quivr is to handle all the complexity of RAG implementations so developers can focus on their products. It provides an opinionated, fast, and efficient RAG system that supports multiple file types, various LLM providers, and customizable workflows. Source: README.md
Key Features
Quivr offers several distinctive capabilities that make it a powerful choice for building document-aware AI applications:
| Feature | Description |
|---|---|
| Opinionated RAG | Pre-configured RAG pipeline optimized for speed and efficiency |
| Multi-LLM Support | Works with OpenAI, Anthropic, Mistral, Gemma, and local models via Ollama |
| Any File Type | Supports PDF, TXT, Markdown, and custom parsers |
| Customizable Workflows | Extend RAG with internet search, tools, and custom processing |
| Megaparse Integration | Optional integration with Megaparse for advanced document ingestion |
| Language Detection | Automatic detection of document language for better processing |
Source: README.md
Architecture Overview
Quivr follows a modular architecture with the quivr-core package as its central component. The architecture is designed around a workflow-based system where different processing nodes are connected to form a complete RAG pipeline.
graph TD
A[User Input] --> B[Brain.ask]
B --> C[RetrievalConfig]
C --> D[Workflow Engine]
D --> E[Processing Nodes]
E --> F[LLM Response]
G[Documents] --> H[Processor]
H --> I[Chunks]
I --> J[Vector Store]
J --> D
Package Structure
The main components of the Quivr architecture include:
| Component | Purpose |
|---|---|
| quivr_core.Brain | Central class for managing document collections and answering questions |
| ProcessorBase | Abstract base class for document processors |
| RetrievalConfig | Configuration for retrieval and workflow settings |
| QuivrFile | File wrapper for document handling |
| ProcessedDocument | Container for processed document chunks |
Source: core/README.md
Getting Started
Prerequisites
Before installing Quivr, ensure you have:
- Python 3.10 or newer
- An API key for your chosen LLM provider (OpenAI, Anthropic, or Mistral)
Source: README.md
Installation
Install the quivr-core package using pip:
pip install quivr-core
Source: README.md
Quick Start Example
The following example demonstrates creating a simple RAG application with Quivr in approximately 5 lines of code:
import tempfile
from quivr_core import Brain
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()
    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )
    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)
Source: examples/simple_question/simple_question.py
Core Components
Brain Class
The Brain class is the central interface for interacting with Quivr. It manages document ingestion, storage, and querying.
Key Methods:
| Method | Description |
|---|---|
| Brain.from_files() | Create a brain from one or more files |
| brain.ask() | Ask a question about the ingested documents |
| brain.print_info() | Display information about the brain |
Typical Workflow:
graph LR
A[Create Brain] --> B[from_files]
B --> C[Process Documents]
C --> D[Store Chunks]
D --> E[Ask Questions]
E --> F[Get Answers]
Source: README.md
Document Processors
Processors handle the parsing and chunking of different file types. The ProcessorBase abstract class defines the interface that all processors must implement.
class ProcessorBase(ABC):
    @abstractmethod
    async def process_file_inner(self, file: QuivrFile) -> ProcessedDocument[R]:
        raise NotImplementedError
Source: core/quivr_core/processor/processor_base.py
Processing Pipeline:
During document processing, Quivr performs the following operations:
- Parse the input file using the appropriate processor
- Split documents into chunks based on SplitterConfig
- Add metadata including chunk index, version info, and detected language
- Sanitize content by removing null characters and encoding issues
Source: core/quivr_core/processor/processor_base.py
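The sanitizing step in the list above can be pictured with a small sketch. This is not Quivr's implementation: the function name `sanitize` and the choice to silently drop unencodable code points are assumptions made for illustration.

```python
def sanitize(text: str) -> str:
    """Drop NUL characters and any code points that cannot be encoded as UTF-8.

    Hypothetical helper, not part of quivr-core.
    """
    without_nul = text.replace("\x00", "")
    # Lone surrogates cannot be encoded as UTF-8; errors="ignore" drops them.
    return without_nul.encode("utf-8", errors="ignore").decode("utf-8")


print(repr(sanitize("ab\x00c\ud800d")))
```

Cleaning text before it reaches the vector store matters because some storage backends reject strings that contain NUL bytes or invalid Unicode.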
Simple Txt Processor
The SimpleTxtProcessor is a built-in processor for handling plain text files. It uses recursive character splitting to divide documents into manageable chunks:
def recursive_character_splitter(
    doc: Document, chunk_size: int, chunk_overlap: int
) -> list[Document]:
    assert chunk_overlap < chunk_size, "chunk_overlap is greater than chunk_size"
    if len(doc.page_content) <= chunk_size:
        return [doc]
    chunk = Document(page_content=doc.page_content[:chunk_size], metadata=doc.metadata)
    remaining = Document(
        page_content=doc.page_content[chunk_size - chunk_overlap :],
        metadata=doc.metadata,
    )
    return [chunk] + recursive_character_splitter(remaining, chunk_size, chunk_overlap)
Source: core/quivr_core/processor/implementations/simple_txt_processor.py
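To see how chunk_size and chunk_overlap interact, the same slicing logic can be applied to a plain string. The helper name `split_text` is hypothetical and the sketch leaves out the LangChain Document wrapper and its metadata.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Same slicing as the splitter above, applied to a plain string (illustrative only)."""
    assert chunk_overlap < chunk_size, "chunk_overlap must be smaller than chunk_size"
    if len(text) <= chunk_size:
        return [text]
    # The next chunk starts chunk_overlap characters before the current one ends.
    rest = text[chunk_size - chunk_overlap:]
    return [text[:chunk_size]] + split_text(rest, chunk_size, chunk_overlap)


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the overlap. Because the function recurses once per chunk, very long inputs with small chunks could approach Python's default recursion limit; an iterative loop would avoid that.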
Configuration
Environment Setup
Set your API keys as environment variables before creating a brain:
import os
os.environ["OPENAI_API_KEY"] = "my_openai_apikey"
Source: README.md
Retrieval Configuration
The RetrievalConfig class allows customization of the RAG workflow. Configuration can be loaded from YAML files:
from quivr_core.config import RetrievalConfig
config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)
Source: README.md
Workflow Configuration
Workflows are defined using YAML files that specify processing nodes and their connections:
workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: ["retrieve"]
Workflow Node Types:
| Node | Function |
|---|---|
| START | Entry point for the workflow |
| filter_history | Filters conversation history |
| rewrite | Rewrites the query for better retrieval |
| retrieve | Fetches relevant document chunks |
Source: README.md
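The node-and-edge structure can be read as a small directed graph. The sketch below mirrors only the nodes listed in the YAML above as a plain dictionary and walks it from START; the dictionary form and the `linear_path` helper are assumptions for illustration, not how Quivr loads or executes workflows.

```python
# Hypothetical in-memory equivalent of the YAML above (only the nodes it lists).
workflow = {
    "START": ["filter_history"],
    "filter_history": ["rewrite"],
    "rewrite": ["retrieve"],
}


def linear_path(nodes: dict[str, list[str]], start: str = "START") -> list[str]:
    """Follow the first edge of each node until a node has no outgoing edges."""
    path = [start]
    seen = {start}
    while nodes.get(path[-1]):
        nxt = nodes[path[-1]][0]
        if nxt in seen:  # guard against cycles
            break
        path.append(nxt)
        seen.add(nxt)
    return path


print(linear_path(workflow))  # ['START', 'filter_history', 'rewrite', 'retrieve']
```

Writing the order down this way makes it easy to check that every edge points at a node that actually exists before handing the file to RetrievalConfig.from_yaml.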
Advanced Usage
Custom Workflow with Rich Console
For interactive applications, Quivr can be combined with the rich library for enhanced console output:
from quivr_core import Brain
from quivr_core.config import RetrievalConfig
from rich.console import Console
from rich.panel import Panel
from rich.prompt import Prompt
brain = Brain.from_files(
    name="my smart brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)
config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)
console = Console()
while True:
    question = Prompt.ask("[bold cyan]Question[/bold cyan]")
    if question.lower() == "exit":
        break
    answer = brain.ask(question, retrieval_config=retrieval_config)
    console.print(f"[bold green]Answer[/bold green]: {answer.answer}")
Source: README.md
Supported LLM Providers
Quivr integrates with multiple LLM providers:
| Provider | Model Support | Configuration |
|---|---|---|
| OpenAI | GPT-4, GPT-3.5 | OPENAI_API_KEY |
| Anthropic | Claude family | ANTHROPIC_API_KEY |
| Mistral | Mistral models | MISTRAL_API_KEY |
| Ollama | Local models | OLLAMA_BASE_URL |
Source: README.md
Version History
Quivr follows semantic versioning for the quivr-core package. Key releases include:
| Version | Date | Key Changes |
|---|---|---|
| 0.0.27 | 2024-12-16 | Max context tokens enforcement, megaparse SDK integration |
| 0.0.19 | 2024-10-21 | Beginning of quivr-core development |
| 0.0.13 | 2024-08-01 | Added parsers and tox tests |
| 0.0.2 | 2024-07-09 | Initial quivr-core package release |
Source: core/CHANGELOG.md
Documentation and Community
Additional resources for learning and contributing to Quivr:
| Resource | Description |
|---|---|
| Official Documentation | Comprehensive guides and API reference |
| GitHub Issues | Bug reports and feature requests |
| Discord Community | Real-time support and discussions |
| Good First Issues | Beginner-friendly contribution opportunities |
Source: README.md
License
Quivr is licensed under the Apache 2.0 License, making it freely available for commercial and personal use. Source: core/README.md
Source: https://github.com/QuivrHQ/quivr / Human Manual
Getting Started
Related topics: Introduction to Quivr, Installation
Overview
Quivr is an open-source RAG (Retrieval-Augmented Generation) framework that enables developers to build AI-powered "second brain" applications. The quivr-core package provides the core RAG functionality, allowing users to ingest documents and query them using large language models. Source: README.md:1-15
The framework is designed to be opinionated, fast, and efficient, abstracting away the complexity of document processing, vector storage, and LLM integration so developers can focus on building their products. Source: README.md:27-30
Prerequisites
Before getting started with Quivr, ensure your environment meets the following requirements:
| Requirement | Version | Description |
|---|---|---|
| Python | 3.10+ | The programming language runtime |
| pip | Latest | Package installer for Python |
| API Key | - | Required for cloud LLM providers (OpenAI, Anthropic, Mistral) |
Source: README.md:44-50
Supported LLM Providers
Quivr supports multiple LLM providers:
| Provider | API Type | Notes |
|---|---|---|
| OpenAI | API Key | Set OPENAI_API_KEY environment variable |
| Anthropic | API Key | Supports Claude models |
| Mistral | API Key | Supports Mistral AI models |
| Ollama | Local | For running models locally |
Source: README.md:58-62
Installation
Package Installation
Install the core package using pip:
pip install quivr-core
Source: README.md:53-55
Verify Installation
To verify the installation worked correctly, you can import the package:
from quivr_core import Brain
print("Quivr-core installed successfully!")
Core Concepts
Brain
The Brain is the central entity in Quivr that manages document ingestion and querying. It encapsulates:
- LLM Integration: The language model used for generating answers
- Embedder: The embedding model for vectorizing documents
- Vector Database: Storage for document embeddings and retrieval
- File Processors: Components that parse various file formats
Source: core/quivr_core/brain/brain.py:1-50
QuivrFile
Files are represented as QuivrFile objects with the following attributes:
| Attribute | Type | Description |
|---|---|---|
| id | UUID | Unique identifier for the file |
| brain_id | UUID | ID of the brain this file belongs to |
| path | Path | File system path |
| original_filename | str | Original filename |
| file_size | int | File size in bytes |
| file_extension | FileExtension | File type enum |
| file_sha1 | str | SHA1 hash for deduplication |
| additional_metadata | dict | Custom metadata |
Source: core/quivr_core/files/file.py:50-70
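The role of file_sha1 in deduplication can be sketched with the standard library. `FileRecord`, `make_record`, and `deduplicate` are hypothetical names that model only a few of the attributes above; this is not the QuivrFile class itself.

```python
import hashlib
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical stand-in for a few QuivrFile attributes."""
    id: UUID
    original_filename: str
    file_size: int
    file_sha1: str


def make_record(name: str, content: bytes) -> FileRecord:
    return FileRecord(uuid4(), name, len(content), hashlib.sha1(content).hexdigest())


def deduplicate(records: list[FileRecord]) -> list[FileRecord]:
    """Keep the first record seen for each SHA1 digest."""
    unique: dict[str, FileRecord] = {}
    for record in records:
        unique.setdefault(record.file_sha1, record)
    return list(unique.values())


records = [
    make_record("a.txt", b"same bytes"),
    make_record("copy_of_a.txt", b"same bytes"),
    make_record("b.txt", b"other bytes"),
]
kept = [r.original_filename for r in deduplicate(records)]
print(kept)  # ['a.txt', 'b.txt']
```

Hashing the content rather than comparing filenames means a renamed copy of the same file is still recognised as a duplicate.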
Processor Pipeline
Documents go through a processing pipeline that:
- Parses the file content based on file type
- Splits content into manageable chunks
- Detects language for each chunk
- Embeds chunks for vector storage
- Stores in the vector database
Source: core/quivr_core/processor/processor_base.py:40-60
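The pipeline steps above can be pictured with a toy, dependency-free sketch. Word counts stand in for real embeddings, a list stands in for the vector database, language detection is skipped, and all names (`embed`, `cosine`, `store`, `search`) are assumptions made for illustration rather than Quivr APIs.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. Real embedders return dense vectors."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# "Parse" and "split": here each sentence is already one chunk.
chunks = [
    "gold is a yellow metal",
    "quivr is a rag framework",
    "python is a programming language",
]

# "Embed" and "store": (chunk, vector) pairs in a list standing in for a vector DB.
store = [(chunk, embed(chunk)) for chunk in chunks]


def search(query: str, n_results: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:n_results]]


print(search("what is quivr"))  # ['quivr is a rag framework']
```

The retrieved chunk is what a RAG system would pass to the LLM as context for generating the answer.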
Quick Start Guide
Minimal Example (5 Lines of Code)
The fastest way to get started with Quivr:
import tempfile
from quivr_core import Brain
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()
    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )
    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)
Source: examples/simple_question/simple_question.py:1-20
Interactive Chat Example
For a more complete example with a console-based chat interface:
import os
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
from quivr_core import Brain
from quivr_core.config import RetrievalConfig
brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./document.pdf", "./notes.txt"],
)
answer = brain.ask(
    "What is the main topic of these documents?",
    retrieval_config=RetrievalConfig()
)
print(answer.answer)
Source: README.md:85-100
PDF Processing Example
For processing PDF files with custom LLM configuration:
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.rag.entities.config import LLMEndpointConfig
from quivr_core.llm.llm_endpoint import LLMEndpoint
brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)
answer = brain.ask("What is this document about?")
print(answer.answer)
Source: examples/pdf_parsing_tika.py:1-25
Workflow Architecture
Basic RAG Workflow
The following diagram illustrates the basic RAG workflow in Quivr:
graph TD
A[User Query] --> B[Filter History]
B --> C[Query Rewrite]
C --> D[Retrieval]
D --> E[LLM Generation]
E --> F[Response]
G[Document Ingestion] --> H[File Processing]
H --> I[Chunking]
I --> J[Embedding]
J --> K[Vector Storage]
D --> K
Data Flow
graph LR
A[Files] -->|Parse| B[Documents]
B -->|Chunk| C[Chunks]
C -->|Embed| D[Vectors]
D -->|Store| E[Vector DB]
F[Query] -->|Embed| G[Query Vector]
G -->|Search| E
E -->|Results| H[Context]
H -->|Generate| I[Answer]
Configuration
Environment Variables
Configure your API keys as environment variables:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key" # Optional
export MISTRAL_API_KEY="your-mistral-key" # Optional
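Before creating a brain, it can help to confirm which provider keys are actually visible to the Python process. The sketch below uses only the variable names listed in this guide; the helper `configured_providers` is hypothetical and not part of quivr-core.

```python
import os
from typing import Mapping

# Variable names as listed in this guide.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Mistral": "MISTRAL_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API key variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


# Passing an explicit mapping keeps the example independent of the real environment.
print(configured_providers({"OPENAI_API_KEY": "sk-example", "MISTRAL_API_KEY": ""}))  # ['OpenAI']
```

Calling `configured_providers()` with no argument inspects the real environment, which is a quick way to catch a key that was exported in a different shell session.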
Custom Retrieval Configuration
Create a YAML configuration file for customized retrieval strategies:
workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: ["retrieve"]
    - name: "retrieve"
      edges: ["generate_rag"]
    - name: "generate_rag"
      edges: ["END"]
llm_config:
  temperature: 0.7
Source: README.md:65-85
RetrievalConfig Usage
from quivr_core.config import RetrievalConfig
# Load from YAML file
retrieval_config = RetrievalConfig.from_yaml("./basic_rag_workflow.yaml")
# Use with brain.ask()
answer = brain.ask(
    question="Your question here",
    retrieval_config=retrieval_config
)
Supported File Types
Quivr works with various file formats through pluggable processors:
| File Type | Extension | Processor |
|---|---|---|
| Plain Text | .txt | SimpleTxtProcessor |
| PDF | .pdf | TikaProcessor |
| Markdown | .md | MarkdownProcessor |
| CSV | .csv | CSVProcessor |
| JSON | .json | JSONProcessor |
Source: README.md:31-35
Advanced Usage
Streaming Responses
For real-time response streaming:
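import asyncio
from quivr_core import Brain
async def main():
    # afrom_files is the async counterpart of from_files
    brain = await Brain.afrom_files(
        name="streaming_brain",
        file_paths=["./documents/"],
    )
    # ask_streaming is an async generator, so it is consumed with "async for"
    async for chunk in brain.ask_streaming("Explain the findings"):
        if chunk.last_chunk:
            break
        print(chunk.answer, end="", flush=True)
asyncio.run(main())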
Custom Processors
Implement your own processor by extending ProcessorBase:
from quivr_core.files.file import QuivrFile
from quivr_core.processor.processor_base import ProcessorBase, ProcessedDocument
class CustomProcessor(ProcessorBase):
    supported_extensions = [".custom"]
    async def process_file_inner(self, file: QuivrFile) -> ProcessedDocument:
        # Implement custom parsing logic here and return the processed chunks
        raise NotImplementedError
Project Structure
quivr/
├── README.md                 # Main documentation
├── core/
│   ├── README.md             # quivr-core package info
│   └── quivr_core/
│       ├── brain/
│       │   └── brain.py      # Brain class implementation
│       ├── files/
│       │   └── file.py       # QuivrFile dataclass
│       ├── processor/
│       │   ├── processor_base.py
│       │   └── implementations/
│       └── rag/
│           ├── quivr_rag.py  # RAG implementation
│           └── prompts.py    # LLM prompts
└── examples/
    ├── simple_question/      # Basic usage examples
    ├── chatbot/              # Chainlit chatbot example
    └── pdf_parsing_tika.py   # PDF processing example
Next Steps
After completing the Getting Started guide:
- Explore Examples: Check the examples/ directory for more use cases
- Read Documentation: Visit core.quivr.com for full documentation
- Join Community: Connect with other users on Discord
- Contribute: Check open issues for contribution opportunities
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| ImportError: No module named 'quivr_core' | Run pip install quivr-core |
| API Key not found | Set environment variable before running |
| File parsing fails | Verify file format and size (max 20MB for examples) |
| Slow retrieval | Adjust n_results parameter or use local embeddings |
Getting Help
- Documentation: https://core.quivr.com/
- Discord: Join our community
- GitHub Issues: Report bugs
Installation
Related topics: Getting Started, LLM Integration
This guide covers all aspects of installing and setting up Quivr, including the core package, examples, and required dependencies.
Overview
Quivr is a RAG (Retrieval-Augmented Generation) framework that enables users to create AI-powered knowledge bases from various file types. The installation process varies depending on your use case:
- Core package installation for integrating Quivr into existing Python projects
- Example applications for testing and learning purposes
- Development setup for contributing to the project
Source: core/README.md
Prerequisites
System Requirements
| Requirement | Minimum | Recommended |
|---|---|---|
| Python Version | 3.8+ | 3.10+ |
| Operating System | Linux, macOS, Windows | Linux/macOS |
| RAM | 4 GB | 8 GB+ |
| Disk Space | 500 MB | 1 GB+ |
Note: While the core package officially requires Python 3.10 or newer for full compatibility, some examples (such as chatbot_voice) support Python 3.8 or higher.
Source: core/README.md, examples/chatbot_voice/README.md
API Keys
Quivr supports multiple LLM providers. You must configure at least one API key:
| Provider | Environment Variable | Required |
|---|---|---|
| OpenAI | OPENAI_API_KEY | Yes (if using OpenAI) |
| Anthropic | ANTHROPIC_API_KEY | Yes (if using Anthropic) |
| Mistral | MISTRAL_API_KEY | Yes (if using Mistral) |
Source: core/README.md
Installing the Core Package
Using pip (Recommended)
The simplest way to install Quivr Core is via pip:
pip install quivr-core
Verify the installation by checking that the package is importable:
from quivr_core import Brain
Source: core/README.md
Package Contents
The quivr-core package includes:
- Brain class: The main interface for creating and managing knowledge bases
- RAG components: Retrieval, processing, and answer generation pipelines
- Built-in processors: Support for PDF, TXT, Markdown, and other common formats
- Default configurations: Ready-to-use settings for quick setup
Source: core/README.md, core/quivr_core/processor/processor_base.py
Installation for Examples
Chatbot Example with Chainlit
The chatbot example demonstrates file upload and Q&A capabilities using Chainlit.
#### Using rye (Recommended)
# Clone or navigate to the chatbot directory
cd examples/chatbot
# Install dependencies with rye
rye sync
# Activate the virtual environment
source .venv/bin/activate
#### Using pip
# Navigate to the chatbot directory
cd examples/chatbot
# Install from requirements
pip install -r requirements.txt
Source: examples/chatbot/README.md
Voice Chatbot Example
The voice chatbot example adds voice interaction capabilities.
# Navigate to the voice chatbot directory
cd examples/chatbot_voice
# Install dependencies
pip install -r requirements.lock
Source: examples/chatbot_voice/README.md
Flask-based Example (quivr-whisper)
This example uses Flask for a web server implementation.
# Install Flask and dependencies
pip install flask openai requests python-dotenv
Source: examples/quivr-whisper/README.md
Simple Question Example
The simplest example demonstrating basic usage:
# Save this as a Python script, then replace the file path and the question
from quivr_core import Brain
brain = Brain.from_files(
    name="test_brain",
    file_paths=["your_file.txt"],
)
answer = brain.ask("Your question here")
print(answer)
Requires python-dotenv for loading environment variables.
Source: examples/simple_question/simple_question.py
Environment Configuration
Required Environment Variables
Create a .env file in your project root:
# LLM API Keys (at least one required)
OPENAI_API_KEY=your_openai_api_key
# ANTHROPIC_API_KEY=your_anthropic_key
# MISTRAL_API_KEY=your_mistral_key
# Optional: For specific integrations
QUIVR_API_KEY=your_quivr_api_key
QUIVR_CHAT_ID=your_chat_id
QUIVR_BRAIN_ID=your_brain_id
QUIVR_URL=https://api.quivr.app
Loading Environment Variables
Use python-dotenv to load environment variables:
import dotenv
dotenv.load_dotenv()
Source: examples/quivr-whisper/README.md, examples/simple_question/simple_question.py
Installation Flow
graph TD
A[Start Installation] --> B{Choose Installation Type}
B --> C[Core Package Only]
B --> D[Examples & Demos]
B --> E[Development Setup]
C --> F[pip install quivr-core]
F --> G[Set API Keys]
G --> H[Ready to Integrate]
D --> I{Which Example?}
I --> J[Chatbot with Chainlit]
I --> K[Voice Chatbot]
I --> L[Flask App]
J --> M[rye sync or pip install]
K --> N[pip install -r requirements.lock]
L --> O[pip install flask openai requests]
M --> P[Run with chainlit run main.py]
N --> Q[Run with chainlit run main.py]
O --> R[Run with flask run]
E --> S[Clone Repository]
S --> T[Navigate to core/]
T --> U[Install from pyproject.toml]
U --> V[Run Tests]
Verifying Installation
Quick Verification
After installation, verify that Quivr is correctly installed:
from importlib.metadata import version
print(version("quivr-core"))  # Should print the installed version
Full Installation Test
Create a test script to verify all components:
import tempfile
from quivr_core import Brain
# Create a temporary test file
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
    f.write("Quivr is a RAG framework for building AI knowledge bases.")
    temp_path = f.name
# Create a brain from the file
brain = Brain.from_files(
    name="test_brain",
    file_paths=[temp_path],
)
# Test the ask function
answer = brain.ask("What is Quivr?")
print(f"Answer: {answer}")
# Print brain info
brain.print_info()
Source: core/README.md, examples/simple_question/simple_question.py
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| ImportError: No module named 'quivr_core' | Package not installed | Run pip install quivr-core |
| AuthenticationError | Invalid API key | Verify your API key is correct |
| Version mismatch | Incompatible Python version | Ensure Python 3.10+ is used |
| File not found | Incorrect file path | Check the file path exists |
Checking Installed Version
The Quivr version is automatically tracked in document metadata:
from importlib.metadata import PackageNotFoundError, version
try:
    qvr_version = version("quivr-core")
except PackageNotFoundError:
    # No installed distribution metadata was found for quivr-core
    qvr_version = "dev"
Source: core/quivr_core/processor/processor_base.py
Next Steps
After successful installation:
- Create your first Brain: Load documents and create a knowledge base
- Configure RAG: Customize retrieval strategies using YAML configuration files
- Explore Examples: Test different example applications to understand capabilities
- Read Documentation: Visit core.quivr.com for advanced usage
Source: core/README.md, examples/chatbot/README.md
System Architecture Overview
Related topics: Core Components, Brain Class, RAG Implementation
Introduction
Quivr is an open-source framework for building RAG (Retrieval Augmented Generation) applications that enables users to create personal "brains" from various document types. The system leverages generative AI to provide intelligent question-answering capabilities over user-provided documents.
The architecture is designed around three core pillars:
- Document Processing: Extracting and chunking content from files
- Vector Storage: Storing embeddings for semantic search
- LLM-powered Answer Generation: Using large language models to generate answers based on retrieved context
High-Level Architecture
graph TD
A[User Files] --> B[File Processing]
B --> C[Chunking & Embedding]
C --> D[Vector Database]
E[User Query] --> F[Embedding]
F --> G[Semantic Search]
G --> H[Context Assembly]
H --> I[LLM Generation]
I --> J[Answer Response]
D --> G
Core Components
Brain Class
The Brain class is the central orchestrator of the Quivr framework. It manages the entire lifecycle from file ingestion to query answering.
File: core/quivr_core/brain/brain.py
#### Key Responsibilities
| Responsibility | Description |
|---|---|
| File Ingestion | Processes files through various parsers |
| Vector Storage | Manages the vector database for embeddings |
| Query Processing | Handles search and retrieval operations |
| Answer Generation | Orchestrates LLM-based response generation |
#### Core Methods
| Method | Type | Purpose |
|---|---|---|
| from_files | Synchronous | Create a Brain from file paths |
| afrom_files | Async | Async creation from files |
| afrom_langchain_documents | Async | Create from LangChain Document objects |
| asearch | Async | Search for relevant documents |
| ask_streaming | Async Generator | Stream answers to questions |
#### Initialization Parameters
class Brain:
    def __init__(
        self,
        id: UUID,
        name: str,
        storage: BrainStorage | None,
        llm: LLMEndpoint,
        embedder: Embeddings,
        vector_db: QuivrVectorStore,
    ): ...
Source: core/quivr_core/brain/brain.py:1-100
File Processing Pipeline
The processor system handles extraction and transformation of various file formats.
File: core/quivr_core/processor/processor_base.py
#### Processing Flow
graph LR
A[QuivrFile] --> B[process_file]
B --> C[process_file_inner]
C --> D[ProcessedDocument]
D --> E[Metadata Enrichment]
E --> F[Chunk Output]
#### Metadata Enrichment
During processing, each chunk receives comprehensive metadata:
doc.metadata = {
    "chunk_index": idx,
    "quivr_core_version": qvr_version,
    "language": detect_language(text=...),
    **file.metadata,
    **doc.metadata,
    **self.processor_metadata,
}
Source: core/quivr_core/processor/processor_base.py:20-35
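The order of entries in that dictionary matters: in a Python dict literal, a later entry with the same key replaces an earlier one. The sketch below uses made-up values (none of them come from Quivr) to show that a key supplied by the file metadata would override the computed default listed before it.

```python
# All values below are invented for illustration.
idx = 0
file_metadata = {"original_file_name": "notes.txt", "language": "unknown"}
doc_metadata = {"page": 1}
processor_metadata = {"processor_cls": "ExampleProcessor"}

merged = {
    "chunk_index": idx,
    "quivr_core_version": "dev",
    "language": "en",          # computed default
    **file_metadata,           # overrides "language" because it comes later
    **doc_metadata,
    **processor_metadata,
}

print(merged["language"])  # unknown
```

Read this way, the three unpacked dictionaries take precedence over the computed fields, with the processor metadata having the final say on any shared key.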
QuivrFile Data Model
File: core/quivr_core/files/file.py
The QuivrFile class represents uploaded files with their associated metadata.
class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]
| Field | Type | Description |
|---|---|---|
| id | UUID | Unique identifier |
| brain_id | UUID | Associated brain identifier |
| path | Path | File system path |
| original_filename | str | Original file name |
| file_size | int | File size in bytes |
| file_extension | FileExtension | File type |
| file_sha1 | str | SHA1 hash for deduplication |
Source: core/quivr_core/files/file.py:1-50
Retrieval and Answer Generation
RAG Architecture
The RAG (Retrieval Augmented Generation) system combines semantic search with LLM-powered answer generation.
File: core/quivr_core/rag/quivr_rag_langgraph.py
#### RAG Workflow
graph TD
A[User Question] --> B[Filter History]
B --> C[Query Rewrite]
C --> D[Retrieval]
D --> E[Context Assembly]
E --> F[LLM Generation]
F --> G[Streaming Response]
H[System Prompt] --> F
I[Chat History] --> B
J[File Filters] --> D#### Configuration
The system uses RetrievalConfig for configuring the RAG pipeline:
| Parameter | Type | Default | Description |
|---|---|---|---|
| llm_config | LLMEndpointConfig | Brain's LLM | LLM model configuration |
| temperature | float | 0.7 | Generation temperature |
| n_results | int | 5 | Number of retrieval results |
Source: core/quivr_core/rag/quivr_rag.py:1-50
Streaming Answer Generation
The system supports streaming responses for real-time answer delivery:
async for response in rag_instance.answer_astream(
    run_id=run_id,
    question=question,
    system_prompt=system_prompt or None,
    history=chat_history,
    list_files=list_files,
    metadata=metadata,
):
    if not response.last_chunk:
        yield response
Source: core/quivr_core/brain/brain.py:150-170
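The pattern in that excerpt, forwarding every streamed chunk except the terminal marker, can be tried out with a self-contained sketch. `Chunk` and `fake_stream` are hypothetical stand-ins for the real response objects and the RAG stream; only the filtering logic mirrors the excerpt.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed response object."""
    answer: str
    last_chunk: bool = False


async def fake_stream() -> AsyncIterator[Chunk]:
    for piece in ["Gold ", "is ", "a metal."]:
        yield Chunk(piece)
    yield Chunk("", last_chunk=True)  # terminal marker, carries no new text here


async def ask_streaming() -> AsyncIterator[Chunk]:
    # Forward every chunk except the terminal marker, as in the excerpt above.
    async for response in fake_stream():
        if not response.last_chunk:
            yield response


async def collect() -> str:
    return "".join([chunk.answer async for chunk in ask_streaming()])


result = asyncio.run(collect())
print(result)  # Gold is a metal.
```

A caller that wants incremental output would print each chunk inside the `async for` loop instead of joining them at the end.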
Prompt Templates
File: core/quivr_core/rag/prompts.py
The system uses structured prompts with multiple context sections:
| Section | Purpose |
|---|---|
| user_metadata | User-specific context |
| ticket_metadata | Query-related metadata |
| similar_tickets | Reference information |
| ticket_history | Conversation history |
| additional_information | Extra context |
| client_query | The actual question |
Default instructions ensure consistent response quality:
- Verbosity matching similar responses
- Proper formatting with paragraphs, bold, italic
- Language consistency with the query
- No signature at end of response
LLM Integration
Default LLM Configuration
The system supports multiple LLM providers:
| Provider | Configuration |
|---|---|
| OpenAI | OPENAI_API_KEY environment variable |
| Anthropic | Anthropic API support |
| Mistral | Mistral API support |
| Ollama | Local model support |
Source: README.md - Configuration section
Embedder Abstraction
Embedders are abstracted to support multiple backends:
if embedder is None:
    embedder = default_embedder()
Vector DB initialization with embeddings:
if vector_db is None:
    vector_db = await build_default_vectordb(langchain_documents, embedder)
Source: core/quivr_core/brain/brain.py:60-70
Search Capabilities
Async Search Method
async def asearch(
    self,
    query: str | Document,
    n_results: int = 5,
    filter: Callable | Dict[str, Any] | None = None,
    fetch_n_neighbors: int = 20,
) -> list[SearchResult]: ...
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str \| Document | Required | Search query |
| n_results | int | 5 | Number of results |
| filter | Callable \| Dict | None | Custom filtering |
| fetch_n_neighbors | int | 20 | Extended fetch for re-ranking |
Source: core/quivr_core/brain/brain.py:70-85
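One plausible reading of how n_results, filter, and fetch_n_neighbors relate is "over-fetch, filter, then truncate". The sketch below illustrates that reading on a plain list of scored names; it is an assumption for illustration, since this manual does not show how Quivr or its vector store applies these parameters internally. The function `search_with_filter` is hypothetical.

```python
from typing import Callable, Optional


def search_with_filter(
    scored: list[tuple[str, float]],
    n_results: int = 5,
    keep: Optional[Callable[[str], bool]] = None,
    fetch_n_neighbors: int = 20,
) -> list[tuple[str, float]]:
    """Rank candidates, over-fetch, apply the filter, then truncate."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    candidates = ranked[:fetch_n_neighbors]
    if keep is not None:
        candidates = [pair for pair in candidates if keep(pair[0])]
    return candidates[:n_results]


scored = [("a.pdf", 0.9), ("b.txt", 0.8), ("c.pdf", 0.7), ("d.txt", 0.6)]
top = search_with_filter(scored, n_results=2, keep=lambda name: name.endswith(".txt"))
print(top)  # [('b.txt', 0.8), ('d.txt', 0.6)]
```

Fetching more neighbours than are finally returned leaves room for the filter to discard candidates without the result list coming up short.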
State Management
Brain ID Generation
Each brain receives a unique identifier:
brain_id = uuid4()
Workspace and Chat Context
The system tracks conversation context:
metadata = LangchainMetadata(
    langfuse_trace_id=str(run_id),
    langfuse_user_id=str(self.workspace_id),
    langfuse_session_id=str(self.chat_id),
)
Installation and Dependencies
Core Package
pip install quivr-core
Quick Start Example
from quivr_core import Brain
brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)
answer = brain.ask("What is the main topic?")
Summary
The Quivr architecture implements a modular RAG pipeline where:
- Files are processed and chunked with metadata enrichment
- Embeddings are generated and stored in a vector database
- Queries trigger semantic search with configurable filters
- LLMs generate contextual answers from retrieved content
- Streaming enables real-time response delivery
The design prioritizes flexibility through dependency injection, allowing custom LLM providers, embedders, and vector stores while maintaining a consistent API for brain creation and querying.
Source: https://github.com/QuivrHQ/quivr / Human Manual
Core Components
Related topics: System Architecture Overview, Brain Class, LLM Integration
Quivr is an open-source RAG (Retrieval-Augmented Generation) framework that enables users to create "brains" from various document types and query them using natural language. The core components form the architectural foundation that powers document processing, vector storage, retrieval, and LLM-based answer generation.
Architecture Overview
Quivr's architecture follows a modular design with clear separation of concerns across several key layers:
graph TD
A[User Query] --> B[Brain.ask]
B --> C[RetrievalConfig]
C --> D[QuivrQARAGLangGraph]
D --> E[Vector Store Retrieval]
E --> F[Context Chunks]
F --> G[LLM Generation]
G --> H[Streaming Response]
I[Files] --> J[Processor]
J --> K[ProcessedDocument]
K --> L[Vector DB Indexing]
L --> E
M[LLM Config] --> G
N[Embedder] --> L
The system is built around three primary abstractions:
| Component | Purpose | Key Files |
|---|---|---|
| Brain | Central orchestrator managing files, vectors, and LLM interactions | brain.py |
| RAG Engine | Handles retrieval and answer generation pipeline | quivr_rag.py |
| Processors | Parse and chunk various file formats | processor_base.py |
Brain Class
Related topics: RAG Implementation, File Processing and Parsers, Storage System
The Brain class is the central abstraction in Quivr for building Retrieval Augmented Generation (RAG) systems. It serves as the primary interface for creating knowledge bases from documents, performing semantic search, and generating AI-powered answers based on retrieved context.
Overview
The Brain class encapsulates:
- Vector storage for semantic document indexing
- LLM integration for answer generation
- Embedder configuration for document vectorization
- File processing pipeline for document ingestion
- Chat history management for conversational context
A Brain can be created from files (PDF, TXT, Markdown, etc.) or from LangChain documents, then queried to generate context-aware responses.
Architecture
graph TD
A[Files / Documents] --> B[Brain Class]
B --> C[Vector DB]
B --> D[LLM]
B --> E[Embedder]
B --> F[Storage]
G[Query] --> B
B --> H[QuivrQARAGLangGraph]
H --> I[Answer]
C --> H
Creating a Brain
From Files
The most common way to create a Brain is from a collection of files:
from quivr_core import Brain
brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./my_first_doc.pdf", "./my_second_doc.txt"],
)
Source: examples/simple_question/simple_question.py:1-14
From LangChain Documents
For programmatic document creation:
from langchain_core.documents import Document
from quivr_core import Brain
documents = [Document(page_content="Hello, world!")]
brain = await Brain.afrom_langchain_documents(name="My Brain", langchain_documents=documents)
Source: core/quivr_core/brain/brain.py:1-150
With Custom LLM and Embedder
Override default configurations with custom implementations:
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.rag.entities.config import LLMEndpointConfig
from quivr_core.llm.llm_endpoint import LLMEndpoint
brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)
Source: examples/pdf_parsing_tika.py:1-20
Core Methods
Asking Questions
#### Synchronous
answer = brain.ask(
    "what is gold? answer in french",
    retrieval_config=retrieval_config,
)
print("answer:", answer)
#### Asynchronous Streaming
async for chunk in brain.ask_streaming("What is the meaning of life?"):
    print(chunk.answer)
Source: core/quivr_core/brain/brain.py:150-250
Search
#### Async Search
results = await brain.asearch(
    query="your search query",
    n_results=5,
    filter=None,
    fetch_n_neighbors=20,
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | `str \| Document` | Required | The search query |
| n_results | int | 5 | Number of results to return |
| filter | `Callable \| Dict \| None` | None | Optional filter for results |
| fetch_n_neighbors | int | 20 | Number of neighbors to fetch |
Source: core/quivr_core/brain/brain.py:100-120
Storage and Persistence
Saving a Brain
Brains can be persisted to disk for later reuse:
save_path = await brain.save("/home/user/.local/quivr")
Source: examples/save_load_brain.py:1-22
Loading a Brain
brain_loaded = Brain.load(save_path)
brain_loaded.print_info()
Source: examples/save_load_brain.py:18
File Processing
QuivrFile Entity
The Brain processes files through the QuivrFile dataclass:
class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]
Source: core/quivr_core/files/file.py:20-35
File Metadata
During processing, each chunk receives metadata including:
| Field | Description |
|---|---|
| chunk_index | Position of the chunk in the document |
| quivr_core_version | Version of quivr-core used |
| language | Detected language of the content |
| original_file_name | Source filename |
Source: core/quivr_core/processor/processor_base.py:1-50
Retrieval Configuration
The RetrievalConfig controls the RAG pipeline behavior:
from quivr_core.config import RetrievalConfig
config_file_name = "./basic_rag_workflow.yaml"
retrieval_config = RetrievalConfig.from_yaml(config_file_name)
YAML Configuration Example
workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
      edges: [...]
Complete Usage Example
import tempfile
from quivr_core import Brain
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as temp_file:
    temp_file.write("Gold is a liquid of blue-like colour.")
    temp_file.flush()
    brain = Brain.from_files(
        name="test_brain",
        file_paths=[temp_file.name],
    )
    answer = brain.ask("what is gold? answer in french")
    print("answer:", answer)
Source: README.md
Attributes Summary
| Attribute | Type | Description |
|---|---|---|
| id | UUID | Unique identifier for the brain |
| name | str | Human-readable name |
| vector_db | VectorStore | Vector storage for embeddings |
| llm | LLMEndpoint | Language model for generation |
| embedder | Embeddings | Embedding model for vectorization |
| storage | BrainStorage | Persistence layer |
Module Export
The Brain class is exported from the main quivr_core package:
from quivr_core import Brain
RAG Implementation
Related topics: Brain Class
Overview
The RAG (Retrieval-Augmented Generation) implementation in Quivr is a flexible, opinionated framework for building RAG pipelines. It combines vector-based document retrieval with Large Language Model (LLM) generation to enable question answering over uploaded documents and files.
The implementation supports both synchronous and streaming responses, configurable retrieval strategies, and integration with multiple LLM providers including OpenAI, Anthropic, Mistral, and local models via Ollama.
Architecture
graph TD
A[User Query] --> B[Brain.ask]
B --> C[RAG Pipeline]
C --> D[Retrieval Phase]
C --> E[Generation Phase]
D --> F[Vector DB Search]
F --> G[Relevant Chunks]
E --> H[LLM Processing]
G --> H
H --> I[Streaming Response]
I --> J[ParsedRAGChunkResponse]
Core Components
Brain Class
The Brain class serves as the main entry point for RAG operations. It orchestrates document processing, retrieval, and generation.
Key Methods:
| Method | Description | Return Type |
|---|---|---|
| from_files() | Create a brain from file paths | Brain |
| afrom_langchain_documents() | Create a brain from LangChain documents | Brain |
| ask() | Synchronous question answering | RAGResponse |
| asearch() | Search for relevant documents | list[SearchResult] |
Source: core/quivr_core/brain/brain.py:1-100
RAG Pipeline Flow
sequenceDiagram
participant User
participant Brain
participant Retriever
participant VectorDB
participant LLM
participant Response
User->>Brain: ask(question)
Brain->>Retriever: retrieve(query)
Retriever->>VectorDB: similarity_search()
VectorDB-->>Retriever: relevant_chunks
Retriever-->>Brain: context_chunks
Brain->>LLM: generate(context, question)
LLM-->>Response: ParsedRAGChunkResponse
Response-->>User: Streaming Answer
Streaming Response System
ParsedRAGChunkResponse
The streaming response is built around the ParsedRAGChunkResponse data class, which provides incremental chunks of the generated answer along with metadata.
Chunk Structure:
| Field | Type | Description |
|---|---|---|
| answer | str | The partial or complete answer text |
| metadata | dict | Sources and additional context |
| last_chunk | bool | Indicates final chunk of response |
| sources | list | Retrieved document sources |
| chunk_id | int | Sequence number of the chunk |
Source: core/quivr_core/rag/quivr_rag.py:1-50
Answer Streaming Implementation
The answer_astream method implements asynchronous streaming of LLM responses:
async def answer_astream(self, query: str, ...) -> AsyncGenerator[ParsedRAGChunkResponse]:
    # Processing logic yields ParsedRAGChunkResponse chunks
    yield ParsedRAGChunkResponse(
        answer="",
        metadata=get_chunk_metadata(rolling_message, sources),
        last_chunk=True,
    )
Streaming Characteristics:
- Yields multiple ParsedRAGChunkResponse objects during generation
- Each chunk contains accumulated rolling_message content
- Uses chunk_id for tracking sequence order
- Final chunk marked with last_chunk=True
- Metadata includes retrieved sources for citation
Source: core/quivr_core/rag/quivr_rag.py:50-100
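The streaming contract described above (many partial chunks, then one final chunk flagged with last_chunk=True that carries the sources) can be sketched with a plain async generator. `Chunk` and `fake_answer_astream` are hypothetical stand-ins, not Quivr classes, and the sketch assumes each non-final chunk carries only newly generated text.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncGenerator

@dataclass
class Chunk:
    """Hypothetical stand-in for ParsedRAGChunkResponse."""
    answer: str
    last_chunk: bool = False
    metadata: dict = field(default_factory=dict)

async def fake_answer_astream(tokens: list[str]) -> AsyncGenerator[Chunk, None]:
    for token in tokens:
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield Chunk(answer=token)
    # The final chunk has an empty answer and carries the sources.
    yield Chunk(answer="", last_chunk=True, metadata={"sources": ["doc.pdf"]})

async def collect(tokens: list[str]) -> tuple[str, list[str]]:
    parts: list[str] = []
    sources: list[str] = []
    async for chunk in fake_answer_astream(tokens):
        if chunk.last_chunk:
            sources = chunk.metadata.get("sources", [])
        else:
            parts.append(chunk.answer)
    return "".join(parts), sources

text, sources = asyncio.run(collect(["Gold ", "is ", "a metal."]))
```

A consumer written this way can render text as it arrives and only attach citations once the final chunk is seen.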
Retrieval Configuration
RetrievalConfig
The retrieval behavior is controlled through RetrievalConfig which supports YAML-based configuration:
workflow_config:
  name: "standard RAG"
  nodes:
    - name: "START"
      edges: ["filter_history"]
    - name: "filter_history"
      edges: ["rewrite"]
    - name: "rewrite"
Source: README.md
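To make the node and edge layout more concrete, here is a minimal sketch that walks a linear workflow expressed as a dictionary. It does not use RetrievalConfig or LangGraph. The nodes after "rewrite" ("retrieve", "generate_rag", "END") are hypothetical names added only so the walk can terminate.

```python
def execution_order(nodes: dict[str, list[str]], start: str = "START", end: str = "END") -> list[str]:
    """Follow the first outgoing edge of each node until `end` is reached."""
    order: list[str] = []
    seen: set[str] = set()
    current = start
    while current != end:
        if current in seen:
            raise ValueError(f"cycle detected at node {current!r}")
        seen.add(current)
        order.append(current)
        edges = nodes.get(current)
        if not edges:
            raise ValueError(f"node {current!r} has no outgoing edge")
        current = edges[0]  # linear workflows have exactly one edge per node
    return order

workflow = {
    "START": ["filter_history"],
    "filter_history": ["rewrite"],
    "rewrite": ["retrieve"],        # hypothetical continuation
    "retrieve": ["generate_rag"],   # hypothetical
    "generate_rag": ["END"],        # hypothetical
}
order = execution_order(workflow)
```

The explicit cycle and dead-end checks are a design choice of this sketch: a misconfigured workflow fails fast instead of looping or stopping silently.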
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_results | int | 5 | Number of documents to retrieve |
| fetch_n_neighbors | int | 20 | Number of neighbors to fetch from vector DB |
| filter | `Callable \| Dict` | None | Optional metadata filtering |
Source: core/quivr_core/rag/entities/config.py
Search Operation
Async Search Method
async def asearch(
    self,
    query: str | Document,
    n_results: int = 5,
    filter: Callable | Dict[str, Any] | None = None,
    fetch_n_neighbors: int = 20,
) -> list[SearchResult]:
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | `str \| Document` | Yes | The search query |
| n_results | int | No | Maximum results to return |
| filter | `Callable \| Dict` | No | Metadata filter condition |
| fetch_n_neighbors | int | No | Initial fetch size before re-ranking |
Source: core/quivr_core/brain/brain.py:50-80
Document Processing Pipeline
Processor Base
The processor_base.py handles document chunking and metadata enrichment:
async def process_file(self, file: QuivrFile) -> ProcessedDocument:
    # Delegate parsing and chunking to the concrete processor
    docs = await self.process_file_inner(file)
    qvr_version = version("quivr-core")
    # Chunk indices start at 1; later keys override earlier ones
    for idx, doc in enumerate(docs.chunks, start=1):
        doc.metadata = {
            "chunk_index": idx,
            "quivr_core_version": qvr_version,
            "language": detect_language(text=...).value,
            **file.metadata,
            **doc.metadata,
        }
    return docs
Metadata Enrichment:
- chunk_index: Sequential position of chunk
- quivr_core_version: Version of quivr-core
- language: Auto-detected language of content
- Original filename embedded in content for reference
Source: core/quivr_core/processor/processor_base.py:1-50
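The merge order in the metadata dictionary matters: with dict unpacking, later entries override earlier ones. The standalone sketch below reproduces only that merging behaviour. The `enrich` function and the plain-dict chunk shape are hypothetical, and language detection is left out.

```python
def enrich(chunks: list[dict], file_metadata: dict, version: str) -> list[dict]:
    """Attach index, version, and file-level metadata to each chunk."""
    enriched = []
    for idx, chunk in enumerate(chunks, start=1):
        metadata = {
            "chunk_index": idx,
            "quivr_core_version": version,
            **file_metadata,              # file-level keys override the two above
            **chunk.get("metadata", {}),  # chunk-level keys override everything
        }
        enriched.append({"text": chunk["text"], "metadata": metadata})
    return enriched

chunks = [
    {"text": "first"},
    {"text": "second", "metadata": {"page": 2, "original_file_name": "override.txt"}},
]
result = enrich(chunks, {"original_file_name": "notes.txt"}, version="0.0.0")
```

Because chunk-level metadata is unpacked last, a processor that sets its own keys on a chunk keeps them, even when the file carries a key with the same name.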
Prompt Templates
User Prompt Template
The prompt system uses structured templates with metadata injection:
<user_metadata>
{user_metadata}
</user_metadata>
<ticket_metadata>
{ticket_metadata}
</ticket_metadata>
<similar_tickets>
{similar_tickets}
</similar_tickets>
<ticket_history>
{ticket_history}
</ticket_history>
<additional_information>
{additional_information}
</additional_information>
<client_query>
{client_query}
</client_query>
Source: core/quivr_core/rag/prompts.py
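As an illustration of how such a template could be filled, the sketch below rebuilds the six tagged sections with plain string formatting. It does not import Quivr's prompt module. `SECTIONS`, `TEMPLATE`, and `render` are hypothetical names, and rendering missing sections as empty strings is an assumption.

```python
SECTIONS = [
    "user_metadata",
    "ticket_metadata",
    "similar_tickets",
    "ticket_history",
    "additional_information",
    "client_query",
]

# Builds "<name>\n{name}\n</name>" for every section, one after another.
TEMPLATE = "\n".join(f"<{name}>\n{{{name}}}\n</{name}>" for name in SECTIONS)

def render(values: dict[str, str]) -> str:
    """Fill the template; sections without a value are left empty."""
    filled = {name: values.get(name, "") for name in SECTIONS}
    return TEMPLATE.format(**filled)

prompt = render({"client_query": "How do I reset my password?"})
```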
Default Instructions
The system includes default instructions that guide response generation:
| Instruction | Description |
|---|---|
| Conciseness | Use same level of detail as similar responses |
| Formatting | Proper paragraphs, bold, italic for readability |
| Language | Respond in same language as user query |
| Consistency | Maintain terminology consistency |
| No Signature | Signature added separately after response |
LangGraph Integration
The implementation includes a LangGraph-based RAG pipeline (quivr_rag_langgraph.py) for more complex workflows:
graph LR
A[Query] --> B[History Filter]
B --> C[Query Rewrite]
C --> D[Retrieval]
D --> E[Answer Generation]
E --> F[Response]Features:
- Multi-step processing pipelines
- Conversation history integration
- Query rewriting for better retrieval
- Customizable workflow nodes
Source: core/quivr_core/rag/quivr_rag_langgraph.py
LLM Configuration
LLMEndpointConfig
| Parameter | Type | Description |
|---|---|---|
| model | str | Model identifier |
| llm_base_url | str | API endpoint URL |
| temperature | float | Generation temperature (default: 0.7) |
Source: core/quivr_core/rag/entities/config.py
Supported Providers
- OpenAI: GPT-4, GPT-3.5-turbo
- Anthropic: Claude models
- Mistral: Mistral AI models
- Ollama: Local model support
Usage Example
from quivr_core import Brain
from quivr_core.config import RetrievalConfig
# Create brain from files
brain = Brain.from_files(
    name="my_brain",
    file_paths=["./document.pdf", "./notes.txt"],
)
# Configure retrieval
retrieval_config = RetrievalConfig.from_yaml("./workflow.yaml")
# Ask a question
answer = brain.ask(
    "What is the main topic of these documents?",
    retrieval_config=retrieval_config,
)
print(answer.answer)
Source: examples/simple_question/simple_question.py
Key Design Patterns
| Pattern | Implementation |
|---|---|
| Async/Await | All I/O operations are asynchronous |
| Streaming | Responses streamed via async generators |
| Dependency Injection | LLM and embedder are configurable |
| Configuration-driven | Workflows defined via YAML |
| Metadata Enrichment | Automatic language detection and versioning |
LLM Integration
Related topics: Core Components, System Architecture Overview
Overview
The LLM Integration module in Quivr provides a flexible, abstracted interface for interacting with Large Language Models (LLMs) from various providers. This module enables Quivr's RAG (Retrieval-Augmented Generation) pipeline to leverage different LLM backends without requiring changes to the core business logic. The integration supports OpenAI, Anthropic, Mistral, Meta (Llama), Groq, and local models via Ollama.
The primary goals of the LLM Integration are:
- Provider Abstraction: Uniform API regardless of the underlying LLM provider
- Configuration Management: Centralized configuration for model parameters, token limits, and provider-specific settings
- Runtime Flexibility: Support for both synchronous and asynchronous operations
- Embeddings Integration: Seamless integration with embedding models for semantic search
Source: core/quivr_core/llm/llm_endpoint.py
Architecture
High-Level Components
The LLM Integration consists of several key components:
| Component | Purpose | Location |
|---|---|---|
| LLMEndpoint | Main wrapper class for LLM interactions | quivr_core/llm/llm_endpoint.py |
| LLMEndpointConfig | Configuration dataclass for LLM settings | quivr_core/rag/entities/config.py |
| LLMConfig | Per-model configuration with token limits | quivr_core/rag/entities/config.py |
| DefaultModelSuppliers | Enum for supported LLM providers | quivr_core/rag/entities/config.py |
Class Diagram
classDiagram
class LLMEndpoint {
+llm: BaseChatModel
+llm_config: LLMEndpointConfig
+__init__(llm, llm_config)
+get_client() BaseChatModel
}
class LLMEndpointConfig {
+model: str
+llm_base_url: str
+temperature: float
+max_output_tokens: int
+model_type: LLMEndpointType
}
class LLMConfig {
+max_context_tokens: int
+max_output_tokens: int
+tokenizer_hub: str
}
class DefaultModelSuppliers {
<<enumeration>>
OPENAI
ANTHROPIC
MISTRAL
META
GROQ
OLLAMA
}
LLMEndpoint --> LLMEndpointConfig
LLMEndpointConfig ..> DefaultModelSuppliers
Source: core/quivr_core/llm/llm_endpoint.py
Source: core/quivr_core/rag/entities/config.py
Configuration
LLMEndpointConfig Parameters
The LLMEndpointConfig class provides configuration for LLM endpoints:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | Required | Model identifier (e.g., "gpt-4o", "claude-3-opus") |
| llm_base_url | str | "local" | Base URL for the LLM API endpoint |
| temperature | float | 0.7 | Sampling temperature for generation (0.0-2.0) |
| max_output_tokens | int | 4096 | Maximum tokens in the generated response |
| model_type | LLMEndpointType | LLMEndpointType.CHAT | Type of LLM endpoint |
Source: core/quivr_core/rag/entities/config.py
Default Model Configurations
Quivr provides pre-configured settings for various models through the DefaultModelSuppliers enum and associated LLMConfig dictionaries:
| Provider | Model | Max Context Tokens | Max Output Tokens | Tokenizer Hub |
|---|---|---|---|---|
| OpenAI | gpt-4o | 128,000 | 16,384 | Quivr/claude-tokenizer |
| OpenAI | gpt-4-turbo | 128,000 | 4,096 | Quivr/claude-tokenizer |
| Anthropic | claude-3.5-sonnet | 200,000 | 8,192 | Quivr/claude-tokenizer |
| Anthropic | claude-3-opus | 200,000 | 4,096 | Quivr/claude-tokenizer |
| Mistral | mistral-large | 32,000 | N/A | - |
| Meta | llama-3.1 | 128,000 | 4,096 | Quivr/Meta-Llama-3.1-Tokenizer |
| Meta | llama-3 | 8,192 | 2,048 | Quivr/llama3-tokenizer-new |
| Groq | llama-3.3-70b | 128,000 | 32,768 | Quivr/Meta-Llama-3.1-Tokenizer |
Source: core/quivr_core/rag/entities/config.py
Environment Variables
API keys should be set as environment variables before initializing the LLM:
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
Example usage in code:
import os
os.environ["OPENAI_API_KEY"] = "myopenai_apikey"
Source: README.md
Usage Patterns
Basic Integration with Brain
The most common pattern is to pass an LLMEndpoint instance when creating a Brain:
import os
from langchain_openai import ChatOpenAI
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
brain = Brain.from_files(
    name="my_smart_brain",
    file_paths=["./documents/*.pdf"],
    llm=LLMEndpoint(
        llm_config=LLMEndpointConfig(model="gpt-4o"),
        llm=ChatOpenAI(model="gpt-4o", api_key=str(os.getenv("OPENAI_API_KEY"))),
    ),
)
Source: examples/simple_question_megaparse.py
Using Fake LLM for Testing
For testing purposes, Quivr supports fake LLM implementations:
from langchain_core.language_models import FakeListChatModel
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
from langchain_core.embeddings import DeterministicFakeEmbedding
brain = Brain.from_files(
    name="test_brain",
    file_paths=["tests/processor/data/dummy.pdf"],
    llm=LLMEndpoint(
        llm=FakeListChatModel(responses=["good"]),
        llm_config=LLMEndpointConfig(model="fake_model", llm_base_url="local"),
    ),
    embedder=DeterministicFakeEmbedding(size=20),
)
Source: examples/pdf_parsing_tika.py
Custom Embeddings Integration
When using custom LLM configurations, you can also specify custom embedders:
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from quivr_core import Brain
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
# LLM Configuration
llm = LLMEndpoint(
    llm_config=LLMEndpointConfig(model="gpt-4o"),
    llm=ChatOpenAI(model="gpt-4o", api_key=str(os.getenv("OPENAI_API_KEY"))),
)
# Embedder Configuration
embedder = OpenAIEmbeddings(model="text-embedding-3-large")
brain = Brain.from_files(
    name="custom_brain",
    file_paths=["./data/documents/"],
    llm=llm,
    embedder=embedder,
)
Source: examples/simple_question_megaparse.py
Supported Providers
OpenAI
OpenAI models are supported through the langchain_openai package:
import os
from langchain_openai import ChatOpenAI
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
# Read the key from the environment instead of hardcoding it
api_key = os.environ["OPENAI_API_KEY"]
llm = LLMEndpoint(
    llm=ChatOpenAI(model="gpt-4o", api_key=api_key),
    llm_config=LLMEndpointConfig(
        model="gpt-4o",
        temperature=0.7,
    ),
)
Supported models: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
Anthropic
Anthropic models require the langchain-anthropic package:
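import os
from langchain_anthropic import ChatAnthropic
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
# Read the key from the environment instead of hardcoding it
api_key = os.environ["ANTHROPIC_API_KEY"]
llm = LLMEndpoint(
    llm=ChatAnthropic(model="claude-3-5-sonnet-20241022", anthropic_api_key=api_key),
    llm_config=LLMEndpointConfig(model="claude-3.5-sonnet"),
)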
Mistral
Mistral models are supported via their API:
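from langchain_mistralai import ChatMistralAI
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
# api_key must already hold your Mistral API key
llm = LLMEndpoint(
    llm=ChatMistralAI(model="mistral-large-latest", mistral_api_key=api_key),
    llm_config=LLMEndpointConfig(model="mistral-large"),
)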
Local Models (Ollama)
For local inference using Ollama:
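from langchain_ollama import ChatOllama
from quivr_core.llm.llm_endpoint import LLMEndpoint
from quivr_core.rag.entities.config import LLMEndpointConfig
# No API key is needed; the Ollama server must be running locally
llm = LLMEndpoint(
    llm=ChatOllama(model="llama3.1", base_url="http://localhost:11434"),
    llm_config=LLMEndpointConfig(model="llama-3.1", llm_base_url="http://localhost:11434"),
)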
Source: README.md
RAG Workflow Integration
The LLM Integration is a core component of Quivr's RAG pipeline. When processing a user query, the LLM is responsible for generating the final answer based on retrieved context.
graph TD
A[User Query] --> B[Brain.ask]
B --> C[Vector Search]
C --> D[Retrieve Relevant Chunks]
D --> E[LLMEndpoint]
E --> F[Generate Answer]
F --> G[RAG Response]
H[LLMEndpointConfig] --> E
I[Chat History] --> E
J[System Prompt] --> E
The LLM receives contextual information from the retrieval step and generates responses that are then formatted and returned to the user. The quivr_rag_langgraph.py module orchestrates this workflow using LangGraph for complex graph-based processing.
Source: core/quivr_core/rag/quivr_rag.py
Source: core/quivr_core/rag/quivr_rag_langgraph.py
Default LLM Fallback
If no LLM is explicitly provided when creating a Brain, Quivr automatically initializes a default LLM:
# From brain.py source code
if llm is None:
    llm = default_llm()
This fallback mechanism ensures that users can get started quickly without explicit configuration, while still allowing advanced users to customize their LLM setup.
Source: core/quivr_core/brain/brain.py
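The same "use the caller's object, otherwise build a default" pattern applies to the embedder and vector store as well. A minimal sketch of the idea follows, with hypothetical stand-ins (`StubLLM`, `default_llm`, `build_brain`) in place of Quivr's real classes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StubLLM:
    """Hypothetical stand-in for an LLM endpoint."""
    name: str

def default_llm() -> StubLLM:
    # Called only when the caller did not inject an LLM.
    return StubLLM(name="default")

def build_brain(llm: Optional[StubLLM] = None) -> dict:
    if llm is None:
        llm = default_llm()
    return {"llm": llm}

quick_start = build_brain()
customized = build_brain(StubLLM(name="custom"))
```

Using None as the sentinel, rather than constructing the default in the signature, means the default is built only when it is actually needed.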
Best Practices
API Key Security
- Store API keys in environment variables, never hardcode them
- Use .env files with proper .gitignore entries for local development
- Consider using secret management services (AWS Secrets Manager, HashiCorp Vault) for production
Model Selection
- Choose models based on your use case requirements (speed vs. quality)
- Use smaller models for simple queries to reduce costs
- Reserve larger models (e.g., GPT-4o, Claude 3.5) for complex reasoning tasks
Token Management
- Monitor max_context_tokens to avoid exceeding model limits
- Set appropriate max_output_tokens based on expected response length
- Use the RetrievalConfig to control how many chunks are fed to the LLM
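One way to reason about these limits is as a budget: the context window must hold the prompt, the retrieved chunks, and the reserved output. The sketch below is a rough illustration, not Quivr's logic. `select_chunks` is a hypothetical helper, and counting whitespace-separated words is only a crude proxy for a real tokenizer.

```python
from typing import Callable

def select_chunks(
    chunks: list[str],
    max_context_tokens: int,
    max_output_tokens: int,
    prompt_tokens: int,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> list[str]:
    """Keep chunks, in ranked order, until the remaining budget is used up."""
    budget = max_context_tokens - max_output_tokens - prompt_tokens
    selected: list[str] = []
    used = 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that does not fit
        selected.append(chunk)
        used += cost
    return selected

ranked_chunks = ["a b c", "d e", "f g h i"]
# Budget: 12 - 4 - 2 = 6 tokens for retrieved context.
kept = select_chunks(ranked_chunks, max_context_tokens=12, max_output_tokens=4, prompt_tokens=2)
```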
Temperature Settings
| Temperature | Use Case | Characteristics |
|---|---|---|
| 0.0 - 0.3 | Factual/Deterministic | More focused, consistent responses |
| 0.4 - 0.7 | General Purpose | Balanced creativity and consistency |
| 0.8 - 1.0 | Creative/Brainstorming | More varied, potentially surprising outputs |
Troubleshooting
Common Issues
- API Key Not Found
  Error: OPENAI_API_KEY environment variable not set
  Solution: Ensure the API key is set before running the script:
  export OPENAI_API_KEY="your-key"
- Model Context Limit Exceeded
  Solution: Reduce the number of retrieved chunks or adjust chunk size in processing
- Rate Limiting
  Solution: Implement retry logic or use a rate limiting middleware
Source: README.md
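For the rate-limiting case, retry logic is commonly written as exponential backoff with a little jitter. The sketch below is generic and makes no claim about what Quivr or any provider SDK ships. `RateLimitError` and `call_with_retry` are hypothetical, so in practice you would catch the exception type your client library raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""

def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 10))  # small jitter
    raise RuntimeError("unreachable")

calls = {"n": 0}

def flaky() -> str:
    """Fails twice with a rate-limit error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("slow down")
    return "ok"

# The sleep function is injected so the demo does not actually wait.
result = call_with_retry(flaky, sleep=lambda _seconds: None)
```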
File Processing and Parsers
Related topics: Storage System
Quivr's file processing system provides an extensible architecture for ingesting, parsing, and chunking various document formats for use in Retrieval-Augmented Generation (RAG) workflows. The processor subsystem sits at the core of Quivr's data ingestion pipeline, transforming raw files into structured document chunks ready for embedding and retrieval.
Architecture Overview
The file processing architecture follows a plugin-based design pattern where processors are registered by file extension and lazily loaded on demand. This separation of concerns allows new file format support to be added without modifying core processing logic.
graph TD
A[QuivrFile Input] --> B[Processor Registry]
B --> C{File Extension}
C -->|.txt| D[SimpleTxtProcessor]
C -->|.pdf| E[MegaparseProcessor]
C -->|Other| F[Default Processors]
D --> G[ProcessedDocument]
E --> G
F --> G
G --> H[Document Chunks]
H --> I[RAG Pipeline]
Core Components
| Component | Purpose | Location |
|---|---|---|
| ProcessorBase | Abstract base class for all processors | processor_base.py |
| ProcessedDocument | Generic container for processing results | processor_base.py |
| ProcessorRegistry | Maps extensions to processor classes | registry.py |
| SplitterConfig | Configuration for text chunking | splitter.py |
Source: processor_base.py:1-75, registry.py
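The extension-keyed, lazily loaded registry can be sketched in a few lines. This is an illustration of the pattern only, not Quivr's registry code. `register`, `get_processor`, and the "module:attribute" target format are assumptions, and two standard-library classes stand in for real processors.

```python
import importlib

_REGISTRY: dict[str, str] = {}   # extension -> "module:attribute" (hypothetical format)
_LOADED: dict[str, type] = {}    # cache of classes that have already been imported

def register(extension: str, target: str) -> None:
    """Record where a processor lives without importing it yet."""
    _REGISTRY[extension.lower()] = target

def get_processor(extension: str) -> type:
    """Import the processor class for an extension on first use."""
    ext = extension.lower()
    if ext not in _REGISTRY:
        raise ValueError(f"no processor registered for {extension!r}")
    if ext not in _LOADED:
        module_name, attr = _REGISTRY[ext].split(":")
        _LOADED[ext] = getattr(importlib.import_module(module_name), attr)
    return _LOADED[ext]

# Standard-library classes used as placeholder "processors".
register(".json", "json:JSONDecoder")
register(".csv", "csv:DictReader")
json_processor = get_processor(".JSON")  # lookup is case-insensitive
```

Storing import paths instead of classes keeps heavy optional dependencies from being imported until a file of that type is actually processed.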
Storage System
Related topics: File Processing and Parsers, Brain Class
Overview
The Storage System in Quivr is a modular abstraction layer responsible for managing file uploads, storage, and retrieval across different storage backends. It provides a clean separation between the RAG (Retrieval-Augmented Generation) logic and the underlying file persistence mechanisms, enabling Quivr to work with various storage implementations while maintaining a consistent API.
The system is designed to handle:
- File upload and deduplication using SHA-1 hashing
- Asynchronous file access
- Multiple storage backends (currently supporting local filesystem)
- File metadata tracking and organization by brain
Source: core/quivr_core/storage/storage_base.py:1-47
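Hash-based deduplication can be illustrated without touching the filesystem. `InMemoryStorage` below is a hypothetical toy class. In particular, silently skipping a duplicate when exists_ok=True is an assumption of this sketch and may differ from what LocalStorage does.

```python
import hashlib

class InMemoryStorage:
    """Toy storage that refuses content it has already seen."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}  # hypothetical: filename -> content
        self.hashes: set[str] = set()      # SHA-1 digests of stored content

    def upload(self, name: str, content: bytes, exists_ok: bool = False) -> bool:
        digest = hashlib.sha1(content).hexdigest()
        if digest in self.hashes:
            if not exists_ok:
                raise FileExistsError(f"{name} duplicates stored content (sha1={digest})")
            return False  # assumption: duplicates are skipped when exists_ok=True
        self.hashes.add(digest)
        self.files[name] = content
        return True

storage = InMemoryStorage()
first = storage.upload("a.txt", b"hello")
second = storage.upload("b.txt", b"hello", exists_ok=True)  # same bytes, new name
```

Because the check is on content rather than filename, renaming a file does not get it stored twice.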
Architecture
High-Level Architecture
graph TD
A[Brain] --> B[StorageBase]
B --> C[LocalStorage]
B --> D[Future: CloudStorage]
B --> E[Future: S3Storage]
C --> F[QuivrFile]
C --> G[File System]
F --> H[Metadata]
F --> I[SHA-1 Hash]
Component Hierarchy
classDiagram
class StorageBase {
<<abstract>>
+name: str
+nb_files() int
+get_files() List~QuivrFile~
+upload_file(file, exists_ok)*
}
class LocalStorage {
+name: str = "local_storage"
+files: List~QuivrFile~
+hashes: Set~str~
+copy_flag: bool
+dir_path: Path
}
class QuivrFile {
+id: UUID
+brain_id: UUID
+path: Path
+original_filename: str
+file_size: int
+file_extension: FileExtension
+file_sha1: str
+additional_metadata: dict
}
StorageBase <|-- LocalStorage
Core Components
StorageBase (Abstract Base Class)
The StorageBase is an abstract base class that defines the contract for all storage implementations. It enforces a consistent interface across different storage backends through Python's ABC (Abstract Base Class) mechanism.
Source: core/quivr_core/storage/storage_base.py:19-47
#### Required Subclass Attributes
| Attribute | Type | Description |
|---|---|---|
| name | str | Human-readable name of the storage type |
#### Abstract Methods
| Method | Return Type | Description |
|---|---|---|
| nb_files() | int | Returns the total number of files stored |
| get_files() | List[QuivrFile] | Asynchronously retrieves all files in storage |
| upload_file(file, exists_ok) | None | Uploads a file to the storage |
The base class enforces that subclasses must define the name attribute through __init_subclass__:
def __init_subclass__(cls, **kwargs):
    for required in ("name",):
        if not getattr(cls, required):
            raise TypeError(
                f"Can't instantiate abstract class {cls.__name__} without {required} attribute defined"
            )
    return super().__init_subclass__(**kwargs)
Source: core/quivr_core/storage/storage_base.py:19-27
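The effect of this hook is that a misconfigured backend fails when its class is defined, not later when it is first used. Here is a self-contained sketch of the same idea. `StorageSketch`, `MemoryStorage`, and `Nameless` are hypothetical classes, and the sketch uses `getattr` with a default so that a completely missing attribute also produces a TypeError.

```python
from abc import ABC, abstractmethod
from typing import Optional

class StorageSketch(ABC):
    name: str  # annotation only: subclasses must assign a real value

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per subclass, at class-definition time.
        if not getattr(cls, "name", None):
            raise TypeError(f"{cls.__name__} must define a non-empty 'name' attribute")

    @abstractmethod
    def nb_files(self) -> int: ...

class MemoryStorage(StorageSketch):
    name = "memory_storage"

    def nb_files(self) -> int:
        return 0

error: Optional[str] = None
try:
    class Nameless(StorageSketch):  # forgets to set `name`
        def nb_files(self) -> int:
            return 0
except TypeError as exc:
    error = str(exc)
```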
LocalStorage Implementation
LocalStorage is the concrete implementation of StorageBase that persists files to the local filesystem.
Source: core/quivr_core/storage/local_storage.py:1-50
#### Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| name | str | "local_storage" | Storage type identifier |
| files | List[QuivrFile] | [] | In-memory list of stored files |
| hashes | Set[str] | set() | Set of SHA-1 hashes for deduplication |
| copy_flag | bool | True | If True, copy files; if False, create symlinks |
| dir_path | Path | See below | Directory path for file storage |
#### Directory Resolution
The storage directory is resolved in the following order:
- Explicitly provided dir_path argument
- Environment variable QUIVR_LOCAL_STORAGE
- Default path: ~/.cache/quivr/files
if dir_path is None:
    self.dir_path = Path(
        os.getenv("QUIVR_LOCAL_STORAGE", "~/.cache/quivr/files")
    )
else:
    self.dir_path = dir_path
os.makedirs(self.dir_path, exist_ok=True)
Source: core/quivr_core/storage/local_storage.py:38-46
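The resolution order can be expressed as a small standalone function. This is a sketch under stated assumptions: `resolve_storage_dir` and `DEFAULT_DIR` are hypothetical names, and the `expanduser()` call is an addition of this sketch, since `Path("~/...")` on its own keeps the tilde as a literal path component.

```python
from __future__ import annotations

import os
from pathlib import Path

DEFAULT_DIR = "~/.cache/quivr/files"


def resolve_storage_dir(dir_path: Path | None = None) -> Path:
    """Hypothetical helper mirroring the resolution order described above."""
    # 1. An explicit argument always wins.
    if dir_path is not None:
        return dir_path
    # 2. Otherwise use the environment variable, 3. else the default path.
    raw = os.getenv("QUIVR_LOCAL_STORAGE", DEFAULT_DIR)
    # Addition of this sketch: expand "~" to the user's home directory.
    return Path(raw).expanduser()
```

Keeping the resolution in one function makes the precedence easy to test without touching the filesystem.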
QuivrFile Data Model
QuivrFile represents a file stored in the Quivr system with all associated metadata.
Source: core/quivr_core/files/file.py:48-74
#### Class Slots
The class uses __slots__ for memory-efficient attribute storage:
| Slot | Type | Description |
|---|---|---|
| id | UUID | Unique identifier for the file |
| brain_id | `UUID \| None` | ID of the brain this file belongs to |
| path | Path | Actual filesystem path to the file |
| original_filename | str | Original name of the uploaded file |
| file_size | `int \| None` | Size of the file in bytes |
| file_extension | `FileExtension \| str` | File extension/type |
| file_sha1 | str | SHA-1 hash of the file content |
| additional_metadata | dict | Custom metadata dictionary |
Source: core/quivr_core/files/file.py:49-57
#### Constructor Parameters
def __init__(
    self,
    id: UUID,
    original_filename: str,
    path: Path,
    file_sha1: str,
    file_extension: FileExtension | str,
    brain_id: UUID | None = None,
    file_size: int | None = None,
    metadata: dict[str, Any] | None = None,
) -> None:
Source: core/quivr_core/files/file.py:59-70
#### Async File Access
The QuivrFile class provides an async context manager for reading file contents:
@asynccontextmanager
async def open(self) -> AsyncGenerator[AsyncIterable[bytes], None]:
    f = await aiofiles.open(self.path, mode="rb")
    try:
        yield f
    finally:
        await f.close()
Source: core/quivr_core/files/file.py:76-83
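The same open/yield/close shape can be reproduced with the standard library alone. The sketch below is an analogue, not the QuivrFile implementation: `open_binary` and `read_all` are hypothetical names, and `asyncio.to_thread` is swapped in for aiofiles so the example has no third-party dependency.

```python
import asyncio
from contextlib import asynccontextmanager
from pathlib import Path
from typing import AsyncIterator, BinaryIO


@asynccontextmanager
async def open_binary(path: Path) -> AsyncIterator[BinaryIO]:
    """Hypothetical stdlib-only analogue of the async open() shown above."""
    # Open in a worker thread so the event loop is not blocked.
    f = await asyncio.to_thread(open, path, "rb")
    try:
        yield f
    finally:
        # The finally clause closes the handle even if the caller raises.
        await asyncio.to_thread(f.close)


async def read_all(path: Path) -> bytes:
    """Read the whole file through the async context manager."""
    async with open_binary(path) as f:
        return await asyncio.to_thread(f.read)
```

A caller would run it with `asyncio.run(read_all(Path("document.pdf")))`; the file name here is only a placeholder.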
File Upload Workflow
sequenceDiagram
    participant Client
    participant LocalStorage
    participant FileSystem
    Client->>LocalStorage: upload_file(QuivrFile)
    LocalStorage->>LocalStorage: Check SHA-1 hash
    alt file exists and exists_ok=False
        LocalStorage-->>Client: FileExistsError
    else file is new or exists_ok=True
        LocalStorage->>FileSystem: Copy or Symlink
        FileSystem-->>LocalStorage: Success
        LocalStorage->>LocalStorage: Update files list
        LocalStorage->>LocalStorage: Add hash to hashes set
    end
Upload Process Details
- File Path Construction: Files are stored at {dir_path}/{brain_id}/{file_id}.
- Deduplication: Before upload, the system checks whether a file with the same SHA-1 hash (file_sha1) already exists: ``if file.file_sha1 in self.hashes and not exists_ok: raise FileExistsError(...)``
- Storage Strategy: Based on copy_flag:
  - True: Copy file content to destination
  - False: Create symbolic link to original file
- File Registration: After successful upload, the file is added to the internal tracking list.
Source: core/quivr_core/storage/local_storage.py:48-70
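The upload steps above can be sketched with the standard library. This is a simplified, synchronous illustration and not LocalStorage itself: `TinyStorage` and its `upload` method are hypothetical, files are stored under their original name instead of the {brain_id}/{file_id} layout, and replacing an existing destination is a choice made only for this sketch.

```python
import hashlib
import os
import shutil
from pathlib import Path


class TinyStorage:
    """Hypothetical, simplified storage used only to illustrate the flow."""

    def __init__(self, dir_path: Path, copy_flag: bool = True) -> None:
        self.dir_path = dir_path
        self.copy_flag = copy_flag
        self.files: list[Path] = []
        self.hashes: set[str] = set()
        os.makedirs(self.dir_path, exist_ok=True)

    def upload(self, src: Path, exists_ok: bool = False) -> Path:
        sha1 = hashlib.sha1(src.read_bytes()).hexdigest()
        # Deduplication: refuse a known hash unless the caller opts in.
        if sha1 in self.hashes and not exists_ok:
            raise FileExistsError(f"file {src.name} already uploaded")
        dst = self.dir_path / src.name
        if dst.exists() or dst.is_symlink():
            dst.unlink()  # sketch-only choice: replace the existing entry
        # Storage strategy: copy the bytes or link back to the original.
        if self.copy_flag:
            shutil.copy(src, dst)
        else:
            os.symlink(src.resolve(), dst)
        # Registration: track each distinct hash once.
        if sha1 not in self.hashes:
            self.files.append(dst)
            self.hashes.add(sha1)
        return dst
```

Uploading the same content twice raises `FileExistsError` by default; passing `exists_ok=True` lets the second call through without registering a duplicate.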
File Creation Helper
The get_file_path() function creates a QuivrFile instance from a filesystem path:
async def get_file_path(
    path: Path,
    brain_id: UUID | None = None
) -> QuivrFile:
Processing Steps
| Step | Operation |
|---|---|
| 1 | Get file size using os.path.getsize() |
| 2 | Compute SHA-1 hash by reading file bytes |
| 3 | Extract or generate UUID for file ID |
| 4 | Determine file extension |
| 5 | Create and return QuivrFile instance |
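file_size = os.path.getsize(path)
# aiofiles handles must be entered with "async with"
async with aiofiles.open(path, mode="rb") as f:
    file_sha1 = hashlib.sha1(await f.read()).hexdigest()
try:
    # Reuse the filename as the ID when it is already a valid UUID
    id = UUID(path.name)
except ValueError:
    id = uuid4()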
Source: core/quivr_core/storage/file.py:19-43
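Steps 1 through 4 can be checked without any async machinery. The following is a minimal synchronous sketch: `describe_file` is a hypothetical helper that returns a plain dict rather than a QuivrFile, and taking the extension from `Path.suffix` is an assumption about how step 4 might be done.

```python
import hashlib
import os
from pathlib import Path
from uuid import UUID, uuid4


def describe_file(path: Path) -> dict:
    """Hypothetical synchronous helper covering steps 1-4 above."""
    file_size = os.path.getsize(path)
    file_sha1 = hashlib.sha1(path.read_bytes()).hexdigest()
    try:
        file_id = UUID(path.name)  # reuse the ID when the name is a UUID
    except ValueError:
        file_id = uuid4()  # otherwise generate a fresh random ID
    return {
        "id": file_id,
        "file_size": file_size,
        "file_sha1": file_sha1,
        "file_extension": path.suffix,  # assumption: extension from suffix
    }
```

Because the ID is taken from the filename when it parses as a UUID, a file previously stored under its ID keeps the same identity when it is loaded again.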
Configuration Options
LocalStorage Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| dir_path | `Path \| None` | None | Custom storage directory |
| copy_flag | bool | True | If True, copy files; if False, create symlinks |
Environment Variables
| Variable | Description |
|---|---|
| QUIVR_LOCAL_STORAGE | Override default storage directory |
Usage Examples
Creating a LocalStorage Instance
from quivr_core.storage.local_storage import LocalStorage
from pathlib import Path
# Use default directory (~/.cache/quivr/files)
storage = LocalStorage()
# Use custom directory
storage = LocalStorage(dir_path=Path("/my/custom/storage"))
# Use symlinks instead of copying
storage = LocalStorage(dir_path=Path("/mnt/data"), copy_flag=False)
Checking Storage Information
from quivr_core.storage.local_storage import LocalStorage
storage = LocalStorage()
# Get number of files
file_count = storage.nb_files()
# Get storage info
info = storage.info()
# Returns: {"directory_path": "...", "name": "local_storage", "nb_files": N}
Uploading Files
import asyncio
from quivr_core.storage.local_storage import LocalStorage
from quivr_core.files.file import get_file_path
from pathlib import Path
async def upload_example():
    storage = LocalStorage()
    # Create QuivrFile from path (the helper reads the file asynchronously)
    qfile = await get_file_path(Path("./document.pdf"), brain_id=None)
    # exists_ok=True suppresses FileExistsError for an already-stored hash
    await storage.upload_file(qfile, exists_ok=True)
    print(f"Total files: {storage.nb_files()}")

asyncio.run(upload_example())
Listing All Files
import asyncio
from quivr_core.storage.local_storage import LocalStorage
async def list_files():
    storage = LocalStorage()
    files = await storage.get_files()
    for f in files:
        print(f"File: {f.original_filename}")
        print(f"  ID: {f.id}")
        print(f"  Size: {f.file_size} bytes")
        print(f"  SHA-1: {f.file_sha1}")

asyncio.run(list_files())
Memory Management
The QuivrFile class uses __slots__ to optimize memory usage by restricting attribute creation:
class QuivrFile:
    __slots__ = [
        "id",
        "brain_id",
        "path",
        "original_filename",
        "file_size",
        "file_extension",
        "file_sha1",
        "additional_metadata",
    ]
This approach:
- Prevents arbitrary attribute assignment
- Reduces memory overhead per instance
- Improves attribute access speed
Source: core/quivr_core/files/file.py:48-57
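The first two effects are easy to observe on a small class. This is an illustrative sketch only: `SlottedFile` is a hypothetical two-field class, not QuivrFile, and `try_extra_attribute` is a helper invented for the demonstration.

```python
class SlottedFile:
    """Hypothetical two-field class; QuivrFile declares more slots."""

    __slots__ = ["id", "path"]

    def __init__(self, id: str, path: str) -> None:
        self.id = id
        self.path = path


def try_extra_attribute() -> bool:
    """Return True if assigning an undeclared attribute is rejected."""
    f = SlottedFile("abc", "/tmp/abc")
    try:
        f.note = "not allowed"  # "note" is not listed in __slots__
    except AttributeError:
        return True
    return False
```

Instances of a slotted class have no per-instance `__dict__`, which is where the memory saving comes from and why undeclared attributes are rejected.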
Extending the Storage System
To create a custom storage backend, implement the StorageBase abstract class:
from quivr_core.storage.storage_base import StorageBase
from quivr_core.files.file import QuivrFile
class MyCustomStorage(StorageBase):
    name = "my_custom_storage"

    def __init__(self):
        self._files = []

    def nb_files(self) -> int:
        return len(self._files)

    async def get_files(self) -> list[QuivrFile]:
        return self._files

    async def upload_file(self, file: QuivrFile, exists_ok: bool = False) -> None:
        # Custom implementation
        pass
Best Practices
- Deduplication: Always compute and check SHA-1 hashes before uploading to avoid duplicate files.
- Async Operations: File operations such as get_files() and upload_file() are asynchronous. Await them inside an async context.
- Memory Efficiency: Use __slots__ for custom file classes to reduce memory footprint.
- Path Handling: Use pathlib.Path for cross-platform path operations.
- Error Handling: Handle FileExistsError when uploading files with exists_ok=False (default).
- Directory Management: Ensure storage directories exist before operations using os.makedirs(path, exist_ok=True).
Related Components
| Component | Description |
|---|---|
| Brain | Uses Storage for file management |
| Processor System | Processes files from storage |
| RAG System | Retrieves documents from storage for query answering |
Source: https://github.com/QuivrHQ/quivr / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
The project may affect permissions, credentials, data exposure, or host boundaries.
First-time setup may fail or require extra isolation and rollback planning.
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Security or permission risk: EU AI Act Compliance Scan Results - Sharing Findings for Feedback
- Severity: high
- Finding: Security or permission risk is backed by a source signal: EU AI Act Compliance Scan Results - Sharing Findings for Feedback. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3667
2. Security or permission risk: [Bug]:
- Severity: high
- Finding: Security or permission risk is backed by a source signal: [Bug]:. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/2004
3. Installation risk: Integration idea: Screenpipe for screen/audio context
- Severity: medium
- Finding: Installation risk is backed by a source signal: Integration idea: Screenpipe for screen/audio context. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3658
4. Installation risk: [Bug]: RuntimeError: There is no current event loop in thread 'MainThread' when using Brain.from_files() in script
- Severity: medium
- Finding: Installation risk is backed by a source signal: [Bug]: RuntimeError: There is no current event loop in thread 'MainThread' when using Brain.from_files() in script. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3650
5. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:640079149 | https://github.com/QuivrHQ/quivr | README/documentation is current enough for a first validation pass.
6. Project risk: The garbage collector is trying to clean up non-checked-in connection <AdaptedConnection <asyncpg.connection.Connection…
- Severity: medium
- Finding: Project risk is backed by a source signal: The garbage collector is trying to clean up non-checked-in connection <AdaptedConnection <asyncpg.connection.Connection…. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/issues/3654
7. Project risk: core: v0.0.25
- Severity: medium
- Finding: Project risk is backed by a source signal: core: v0.0.25. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.25
8. Project risk: core: v0.0.29
- Severity: medium
- Finding: Project risk is backed by a source signal: core: v0.0.29. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.29
9. Project risk: core: v0.0.33
- Severity: medium
- Finding: Project risk is backed by a source signal: core: v0.0.33. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.33
10. Maintenance risk: core: v0.0.24
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: core: v0.0.24. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.24
11. Maintenance risk: core: v0.0.26
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: core: v0.0.26. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/QuivrHQ/quivr/releases/tag/core-0.0.26
12. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:640079149 | https://github.com/QuivrHQ/quivr | last_activity_observed missing
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using quivr with real data or production workflows.
- [[Bug]:](https://github.com/QuivrHQ/quivr/issues/2004) - github / github_issue
- Integration idea: Screenpipe for screen/audio context - github / github_issue
- [[Bug]: RuntimeError: There is no current event loop in thread 'MainThrea](https://github.com/QuivrHQ/quivr/issues/3650) - github / github_issue
- EU AI Act Compliance Scan Results - Sharing Findings for Feedback - github / github_issue
- The garbage collector is trying to clean up non-checked-in connection <A - github / github_issue
- core: v0.0.33 - github / github_release
- core: v0.0.29 - github / github_release
- core: v0.0.27 - github / github_release
- core: v0.0.26 - github / github_release
- core: v0.0.25 - github / github_release
- core: v0.0.24 - github / github_release
- README/documentation is current enough for a first validation pass. - GitHub / issue
Source: Project Pack community evidence and pitfall evidence