Doramagic Project Pack · Human Manual
haystack
Haystack follows a component-based architecture where pipelines serve as the foundational building blocks. Pipelines connect various components including document stores, retrievers, reade...
Introduction to Haystack
Related topics: Pipeline Architecture, Core Concepts
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Pipeline Architecture, Core Concepts
Introduction to Haystack
Haystack is an end-to-end LLM framework that enables developers to build applications powered by Large Language Models (LLMs), Transformer models, vector search, and more. The framework provides a flexible architecture for orchestrating state-of-the-art embedding models and LLMs into pipelines to solve real-world NLP use cases.
What is Haystack?
Haystack is designed to facilitate the development of production-ready AI applications with a focus on context engineering—giving developers explicit control over how information is retrieved, ranked, filtered, combined, structured, and routed before it reaches the language model.
Sources: README.md:1()
Core Capabilities
| Capability | Description |
|---|---|
| Retrieval-Augmented Generation (RAG) | Combine vector search with LLMs for accurate, context-grounded responses |
| Document Search | Full-featured document indexing and semantic search |
| Question Answering | Extract answers from large document collections |
| Pipeline Orchestration | Build complex workflows with customizable components |
| Agent Integration | Deploy autonomous agents with tool-use capabilities |
Sources: docker/README.md:4-6()
Architecture Overview
Haystack follows a component-based architecture where pipelines serve as the foundational building blocks. Pipelines connect various components including document stores, retrievers, readers, generators, and custom tools.
graph TD
A[User Query] --> B[Pipeline]
B --> C[Retrievers]
B --> D[Document Stores]
C --> E[Rankers]
E --> F[LLM / Generator]
F --> G[Response]
H[Documents] --> D
style F fill:#e1f5fe
style D fill:#fff3e0
style C fill:#e8f5e9Pipeline Components
Pipelines in Haystack are composed of interconnected nodes that process data sequentially or in parallel. Each component handles a specific stage of the document processing or inference workflow.
| Component Type | Function |
|---|---|
| DocumentStore | Stores and indexes documents for retrieval |
| Retriever | Finds relevant documents from the store |
| Ranker | Reorders retrieved documents by relevance |
| Reader/Generator | Extracts answers or generates responses |
| Preprocessor | Cleans and splits documents before indexing |
| Custom Nodes | User-defined processing logic |
Sources: README.md:54-58()
Key Features
Built for Context Engineering
Haystack provides fine-grained control over the entire retrieval and generation pipeline. Developers can:
- Define custom retrieval strategies
- Implement multi-stage ranking pipelines
- Route queries to specialized processing branches
- Control how context is assembled before reaching the LLM
Flexible Pipeline Design
The framework supports both declarative and programmatic pipeline construction, allowing developers to define workflows through configuration files or Python code.
graph LR
A[Query Input] --> B[Retriever Node]
B --> C[Ranker Node]
C --> D[LLM Node]
D --> E[Output]
F[Documents] --> G[Document Store]
G --> BProduction-Ready Architecture
Haystack includes enterprise features such as:
- Telemetry: Anonymous usage statistics collection for component initialization tracking (opt-out available)
- Container Support: Docker images for consistent deployment environments
- CI/CD Integration: Automated testing with GitHub Actions workflows
- Type Checking: Full MyPy type annotation support
Sources: README.md:60-62()
Installation
Package Installation
The primary method for installing Haystack is via pip:
pip install haystack-ai
For testing pre-release features:
pip install --pre haystack-ai
Sources: README.md:28-34()
Docker Installation
Haystack provides Docker images for containerized deployments:
| Image | Description |
|---|---|
haystack:base-<version> | Base image with Haystack preinstalled for derivation |
Multi-platform builds are supported for various architectures including linux/arm64 and linux/amd64.
docker buildx bake base
Sources: docker/README.md:8-14()
Documentation Structure
The Haystack documentation is hosted at docs.haystack.deepset.ai and organized into several sections:
| Section | Content |
|---|---|
| Overview/Intro | Getting started guides and project introduction |
| Get Started | Quick-start guide for building first LLM applications |
| Tutorials | Step-by-step learning paths |
| Cookbook | Pre-built recipes and example implementations |
| API Reference | Auto-generated documentation from docstrings |
| Concepts | Core architectural concepts and design patterns |
Sources: docs-website/README.md:1-8()
Documentation Versioning
The documentation site supports multiple versions:
- Next (Unreleased): Documentation for upcoming features
- Current (Stable): Documentation for the latest stable release
- Past Versions: Archived documentation for previous releases
Sources: docs-website/src/pages/versions.js:1-25()
API Reference Generation
The API reference pages are automatically generated from docstrings using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes are merged.
Sources: pydoc/README.md:1-12()
Project Structure
haystack/
├── haystack/ # Main package source code
├── docs-website/ # Docusaurus documentation site
│ ├── docs/ # Main documentation content
│ ├── reference/ # Auto-generated API reference
│ └── versioned_docs/ # Versioned documentation snapshots
├── docker/ # Docker image configurations
├── pydoc/ # PyDoc configuration files
└── examples/ # Example implementations
Note: Example implementations have been moved to the haystack-cookbook repository.
Sources: examples/README.md:1-5()
Community and Contributing
Haystack is open to contributions from developers of all skill levels. There are multiple ways to contribute:
| Contribution Area | Repository |
|---|---|
| Core Framework | deepset-ai/haystack |
| Integrations | deepset-ai/haystack-core-integrations |
| Documentation | deepset-ai/haystack/tree/main/docs-website |
Community Resources
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: General questions and community support
- Discord: Real-time community engagement
- Stack Overflow: Tagged questions at
haystack - Twitter/X: Updates and announcements
Sources: README.md:89-95()
Organizations Using Haystack
Haystack is trusted by thousands of production AI teams across industries:
| Industry | Organizations |
|---|---|
| Technology & AI | Apple, Meta, Databricks, NVIDIA, Intel |
| Public Sector | European Commission |
Sources: README.md:78-85()
Licensing and Compliance
- License: Apache 2.0
- Type Checking: MyPy validated
- Coverage: Automated test coverage tracking
- License Compliance: Automated workflow verification
Sources: README.md:10-11()
Summary
Haystack provides a comprehensive framework for building production-ready LLM applications with emphasis on retrieval-augmented generation, flexible pipeline design, and context engineering. The framework's component-based architecture enables developers to customize every stage of the document processing and inference pipeline while maintaining production-grade reliability through integrated testing, documentation, and deployment tooling.
With support for Docker containerization, comprehensive documentation, and an active open-source community, Haystack serves as a robust foundation for teams implementing enterprise AI solutions across diverse industries.
Sources: README.md:1()
Pipeline Architecture
Related topics: Introduction to Haystack, Pipeline Component Types, Core Concepts
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Introduction to Haystack, Pipeline Component Types, Core Concepts
Pipeline Architecture
Overview
The Pipeline architecture is the foundational component of the Haystack framework, enabling developers to construct flexible, modular workflows for building LLM-powered applications. Pipelines orchestrate the execution of various components—including retrievers, readers, generators, and custom processors—into cohesive data processing flows.
Pipelines in Haystack 2.x provide a declarative approach to defining application workflows, allowing developers to:
- Connect multiple components in directed acyclic graphs (DAGs)
- Route data between components with explicit connections
- Handle both synchronous and asynchronous execution models
- Debug and inspect execution through breakpoints and hooks
- Persist and share pipeline configurations through serialization
Sources: docs-website/docs/concepts/pipelines.mdx
Core Concepts
Component Connections
Components in a Haystack Pipeline are connected through named input/output connections. Each component exposes specific input and output slots that define how data flows through the pipeline.
graph LR
A[Document Store] -->|query results| B[Retriever]
B -->|retrieved docs| C[Reader]
C -->|answers| D[Output]
style A fill:#e1f5fe
style B fill:#fff3e0
style C fill:#e8f5e9
style D fill:#fce4ecThe connection model requires that:
- Output types must be compatible with target input types
- Components can have multiple inputs and outputs
- Connections form a directed graph structure
Sources: docs-website/docs/concepts/pipelines.mdx:1-20
Pipeline Types
Haystack provides multiple pipeline implementations optimized for different use cases:
| Pipeline Type | Use Case | Execution Model |
|---|---|---|
| Standard Pipeline | General-purpose workflows | Synchronous |
| AsyncPipeline | High-throughput I/O operations | Asynchronous with async/await |
| SearchPipeline | Retrieval-focused workflows | Optimized for search |
| GenerativePipeline | LLM-centric applications | Optimized for generation |
Sources: docs-website/docs/concepts/pipelines.mdx
AsyncPipeline
The AsyncPipeline extends the standard Pipeline with asynchronous execution capabilities, making it suitable for applications requiring high concurrency and non-blocking I/O operations.
Key Features
- Non-blocking execution: Components can execute concurrently when dependencies are satisfied
- Streaming support: Better handling of streaming responses from LLMs
- Resource efficiency: Improved CPU and memory utilization for I/O-bound workloads
async def run_async_pipeline(pipeline, query):
result = await pipeline.run_async(query=query)
return result
Sources: docs-website/docs/concepts/pipelines/asyncpipeline.mdx
Execution Flow
graph TD
A[Start] --> B{AsyncPipeline.run_async}
B --> C[Execute Independent Components]
C --> D{Wait for Dependencies?}
D -->|No| E[Collect Results]
D -->|Yes| F[Await Dependency]
F --> E
E --> G[Return Unified Result]
style B fill:#bbdefb
style C fill:#c8e6c9
style G fill:#ffe0b2Serialization
Pipeline configurations can be serialized to YAML format, enabling:
- Persistence of pipeline definitions
- Sharing configurations across environments
- Version control for pipeline definitions
- Reproducible deployments
Serialization Format
version: '2.0'
components:
- name: MyRetriever
type: BM25Retriever
init_parameters:
document_store: MyDocumentStore
- name: MyReader
type: FARMReader
init_parameters:
model_name_or_path: deepset/roberta-base-squad2
edges: []
Sources: docs-website/docs/concepts/pipelines/serialization.mdx
Loading Serialized Pipelines
from haystack import Pipeline
# Load from YAML
pipeline = Pipeline.load_from_yaml(path="pipeline_config.yaml")
Debugging Pipelines
Haystack provides comprehensive debugging capabilities to inspect and troubleshoot pipeline execution.
Execution Tracing
The debugging system tracks:
- Component execution order
- Input/output data at each stage
- Execution timing and performance metrics
- Error locations and stack traces
from haystack import Pipeline
pipeline = Pipeline()
pipeline.debug = True # Enable debug mode
result = pipeline.run(query="What is Haystack?")
Sources: docs-website/docs/concepts/pipelines/debugging-pipelines.mdx
Pipeline Inspector
The Pipeline Inspector provides detailed visibility into:
| Inspection Target | Information Provided |
|---|---|
| Component Graph | Node and edge relationships |
| Data Flow | Input/output shapes and types |
| Execution State | Runtime values at breakpoints |
| Performance | Timing and memory profiles |
Pipeline Breakpoints
Breakpoints allow execution to pause at specific points, enabling detailed inspection of intermediate results.
graph LR
A[Pipeline Run] --> B{Breakpoint 1?}
B -->|Yes| C[Pause & Inspect]
C --> D{Continue?}
D -->|Yes| E{Breakpoint 2?}
D -->|No| Z[Abort]
E -->|Yes| F[Pause & Inspect]
E -->|No| G[Continue to End]
B -->|No| E
style C fill:#fff9c4
style F fill:#fff9c4
style Z fill:#ffcdd2Breakpoint Configuration
Breakpoints can be configured at:
- Component level: Pause before or after specific component execution
- Connection level: Inspect data flowing through specific connections
- Condition level: Pause only when certain conditions are met
Sources: docs-website/docs/concepts/pipelines/pipeline-breakpoints.mdx
Best Practices
Pipeline Design
- Modularity: Keep components focused on single responsibilities
- Clear naming: Use descriptive names for components and connections
- Error handling: Implement proper error handling at component boundaries
- Testing: Unit test individual components before integration
Performance Optimization
| Strategy | Description |
|---|---|
| Caching | Enable caching for expensive operations |
| Batching | Use batch processing for multiple queries |
| Async execution | Prefer AsyncPipeline for I/O-bound workflows |
| Resource limits | Set appropriate timeouts and memory limits |
Architecture Summary
graph TD
subgraph "Pipeline Layer"
A[Pipeline] --> B[AsyncPipeline]
A --> C[SearchPipeline]
A --> D[GenerativePipeline]
end
subgraph "Component Layer"
E[Retrievers] --> A
F[Readers] --> A
G[Generators] --> A
H[Custom Processors] --> A
end
subgraph "Data Layer"
I[Document Stores] --> E
J[Models] --> F
J --> G
end
subgraph "Infrastructure"
K[Serialization] -.-> A
L[Debugging] -.-> A
M[Breakpoints] -.-> A
endRelated Documentation
Core Concepts
Related topics: Pipeline Architecture, Pipeline Component Types, Introduction to Haystack
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Pipeline Architecture, Pipeline Component Types, Introduction to Haystack
Core Concepts
Haystack is an end-to-end LLM (Large Language Model) framework that enables developers to build applications powered by LLMs, Transformer models, vector search, and more. The framework orchestrates state-of-the-art embedding models and LLMs into pipelines to solve use cases such as retrieval-augmented generation (RAG), document search, question answering, and answer generation.
What is Haystack?
Haystack provides a flexible architecture for designing systems with explicit control over how information is retrieved, ranked, filtered, combined, structured, and routed before it reaches the model. The framework allows developers to define pipelines and agent workflows where retrieval, memory, tools, and other components work together seamlessly.
Sources: README.md
Architecture Overview
Haystack's architecture is built around the concept of pipelines that orchestrate various components. These pipelines provide explicit control over the data flow from input to output, enabling developers to build complex LLM applications with fine-grained control.
graph TD
A[Input Query] --> B[Pipeline]
B --> C[Components]
C --> D[Retrievers]
C --> E[Rankers]
C --> F[Memory]
C --> G[Tools]
D --> H[Document Store]
E --> I[LLM]
H --> J[Context Engineering]
I --> K[Generated Response]
J --> ISources: README.md
Installation
Haystack can be installed via pip using the main package:
pip install haystack-ai
For trying newest features, install nightly pre-releases:
pip install --pre haystack-ai
Sources: README.md
Docker Support
Haystack provides Docker images for containerized deployments. The base image haystack:base-<version> contains a working Python environment with Haystack preinstalled and is designed to be derived FROM.
Images are built with BuildKit and orchestrated using bake:
docker buildx bake base
Custom images can be built by overriding variables defined in the docker-bake.hcl file:
HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache
Sources: docker/README.md
Documentation System
Haystack maintains comprehensive documentation at docs.haystack.deepset.ai. The documentation is built with Docusaurus 3 and provides guides, tutorials, API references, and best practices.
Documentation Structure
| Directory | Purpose |
|---|---|
docs/ | Main documentation (guides, tutorials, concepts) |
docs/concepts/ | Core Haystack concepts |
docs/pipeline-components/ | Component documentation |
reference/ | API reference (auto-generated) |
versioned_docs/ | Versioned copies of docs |
src/ | React components and custom code |
Sources: docs-website/README.md
Versioning
Documentation versions are released alongside Haystack releases and are fully automated through GitHub workflows. The versioning process includes:
promote_unstable_docs.yml- Automatically triggered during Haystack releasesminor_version_release.yml- Creates new version directories and updates version configuration
Sources: docs-website/README.md
API Reference
The API reference is generated from docstrings in the codebase using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes.
To add documentation for a new module:
- Create a
.ymlfile in thepydocdirectory - Configure how haystack-pydoc-tools will generate the page
- Commit to main
All API reference updates are initially deployed to unstable docs and promoted to stable docs during releases.
Sources: pydoc/README.md
Documentation Website Development
The documentation site can be run locally for development:
git clone https://github.com/deepset-ai/haystack.git
cd haystack/docs-website
npm install
npm start
The site opens at http://localhost:3000 with live reload functionality.
Common development tasks include:
- Edit a page: update files under
docs/orversioned_docs/ - Add to sidebar: update
sidebars.jswith your doc ID - Production check:
npm run build && npm run serve
Sources: docs-website/README.md
Search Functionality
The documentation website includes a custom search bar that groups results by page and sorts them by relevance score. The search system supports filtering by category and provides snippets from matching documents.
Search Architecture
graph TD
A[User Query] --> B[Search Input]
B --> C[Debounced Search]
C --> D[Search Algorithm]
D --> E{Results Found?}
E -->|Yes| F[Group by Page]
E -->|No| G[No Results State]
F --> H[Sort by Score]
H --> I[Display Results]
G --> J[Show Error/Message]Sources: docs-website/src/theme/SearchBar.js
Documentation Export Features
The documentation site provides multiple ways to export and share content:
| Feature | Description |
|---|---|
| Copy as Markdown | Copy page content in Markdown format for LLMs |
| View as Markdown | View page as plain text |
| Export as PDF | Save page as PDF file |
| Ask AI | Open page in external AI assistants |
Sources: docs-website/src/components/CopyDropdown/index.tsx
Markdown Conversion Rules
The export feature uses custom Turndown rules:
- Code blocks: Wrapped in backticks
- Admonitions: Converted to blockquotes with type labels (NOTE, TIP, WARNING, etc.)
- Navigation elements: Removed from export
- Scripts and styles: Filtered out
Sources: docs-website/src/components/CopyDropdown/index.tsx
Examples and Cookbooks
Example code and cookbooks have been moved to a dedicated repository: haystack-cookbook
This separation allows for easier maintenance and discovery of example applications.
Sources: examples/README.md
CI/CD and Quality Assurance
Haystack maintains high code quality through automated workflows:
| Workflow | Purpose |
|---|---|
| tests.yml | Run test suite |
| types (Mypy) | Type checking |
| Coverage | Code coverage tracking |
| Ruff | Linting |
| license_compliance.yml | License verification |
Sources: README.md
Contributing to Haystack
Haystack welcomes community contributions in various forms:
- Main project: Contribute to the core Haystack repository
- Integrations: Contribute on haystack-core-integrations
- Documentation: Contribute to haystack/docs-website
The project provides a full list of issues open to contributions for both new and experienced contributors.
Sources: README.md
Organizations Using Haystack
Haystack is used in production by numerous organizations across industries:
| Industry | Organizations |
|---|---|
| Technology & AI | Apple, Meta, Databricks, NVIDIA, Intel |
| Public Sector | European Commission |
| Various | Thousands of teams building production AI systems |
Sources: README.md
Sources: README.md
Pipeline Component Types
Related topics: Pipeline Architecture, Data Processing Components, LLM and Embedder Integrations
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Pipeline Architecture, Data Processing Components, LLM and Embedder Integrations
Pipeline Component Types
Pipeline components are the fundamental building blocks of Haystack pipelines. They are modular units that perform specific operations such as retrieving documents, converting file formats, generating responses, and routing data between pipeline stages. Each component follows a consistent interface that enables seamless integration into pipeline workflows, allowing developers to compose complex LLM applications from reusable, interchangeable parts.
Overview
Haystack provides a comprehensive set of built-in pipeline components that cover the full lifecycle of LLM-powered applications. These components are designed to work together through a unified API, enabling developers to build retrieval-augmented generation (RAG) systems, question-answering pipelines, document processing workflows, and agent-based applications with minimal configuration.
The architecture follows a modular pattern where each component receives inputs, performs a specific transformation or operation, and produces outputs that can be consumed by subsequent components in the pipeline. This design philosophy ensures that components remain loosely coupled and highly reusable across different use cases.
Components in Haystack are categorized based on their primary function within the data flow. Some components handle input preparation (converters, preprocessors), others manage information retrieval (retrievers, embedders), some optimize result ordering (rankers), and others control program flow (routers, joiners). Understanding these categories is essential for designing effective pipelines that balance performance, accuracy, and resource utilization.
Component Architecture
Component Lifecycle
Components in Haystack follow a standardized lifecycle that includes initialization, execution, and optional teardown phases. During initialization, components receive their configuration parameters and prepare any required resources such as model weights, API connections, or index data. The execution phase processes input data through the component's core logic, while the teardown phase releases resources when the component is no longer needed.
graph TD
A[Initialize Component] --> B[Load Resources]
B --> C[Receive Input Data]
C --> D[Process Data]
D --> E[Produce Output]
E --> F{Check Pipeline Status}
F -->|Continue| C
F -->|Complete| G[Release Resources]
G --> H[Component Lifecycle End]Data Flow Patterns
Haystack pipelines support multiple data flow patterns that determine how information moves between components. Linear flow passes output directly to the next component, while branching flow sends data to multiple paths based on conditions. Parallel flow distributes work across multiple components simultaneously, and feedback flow allows outputs to influence earlier pipeline stages.
Input Processing Components
Input processing components prepare raw data for use by downstream pipeline stages. These components handle the transformation of unstructured or heterogeneous data sources into standardized formats that can be processed consistently throughout the pipeline.
Converters
Converters transform documents from various file formats into Haystack's internal document representation. They handle the extraction of text content from source files while preserving metadata that may be useful for subsequent processing or retrieval operations.
| Converter Type | Supported Formats | Primary Use Case |
|---|---|---|
| PDF Converter | Extract text from PDF documents | |
| Text Converter | TXT, MD | Plain text and markdown files |
| DOCX Converter | DOCX | Microsoft Word documents |
| HTML Converter | HTML | Web page content extraction |
Converters are typically placed at the beginning of indexing pipelines where they process source documents before the content is split, embedded, and stored. The output of converters feeds directly into preprocessors that further refine the content.
Sources: docs-website/docs/pipeline-components/converters.mdx
Preprocessors
Preprocessors clean, normalize, and transform document content to improve retrieval quality and downstream processing. They apply transformations such as text cleaning, language detection, and content segmentation to prepare documents for embedding and storage.
graph LR
A[Raw Document] --> B[Clean Text]
B --> C[Detect Language]
C --> D[Split Document]
D --> E[Normalize Content]
E --> F[Processed Document]Key preprocessing operations include removing unnecessary whitespace, normalizing unicode characters, splitting long documents into manageable chunks, and filtering out low-quality content. These operations significantly impact the quality of retrieval results and should be configured based on the specific characteristics of your data.
Preprocessors work closely with converters to form the input preparation stage of indexing pipelines. The processed output is then passed to embedders or directly to storage depending on the pipeline configuration.
Sources: docs-website/docs/pipeline-components/preprocessors.mdx
Builders
Builders construct specialized data structures or artifacts that support pipeline operations. Unlike converters that handle file formats, builders create complex objects such as prompt templates, search indexes, or custom data representations required by other components.
Builders enable the composition of reusable building blocks that can be shared across multiple pipelines. They abstract away the complexity of constructing complex objects, allowing pipeline developers to focus on workflow design rather than implementation details.
Sources: docs-website/docs/pipeline-components/builders.mdx
Information Retrieval Components
Information retrieval components locate and retrieve relevant content from data stores. These components form the core of RAG systems and document search applications, enabling pipelines to find the most relevant information based on query semantics or keywords.
Retrievers
Retrievers search document stores to find content relevant to a given query. Haystack supports multiple retrieval strategies ranging from keyword-based sparse retrieval to semantic dense retrieval, enabling developers to choose the approach that best fits their use case.
| Retrieval Type | Description | Best For |
|---|---|---|
| Dense Retrieval | Uses neural embeddings for semantic matching | Conceptual queries, semantic similarity |
| Sparse Retrieval | Traditional keyword-based matching | Exact matches, specific terminology |
| Hybrid Retrieval | Combines dense and sparse methods | Balanced performance across query types |
Retrievers are fundamental to RAG pipelines where they identify the documents or passages most likely to contain information relevant to the user's question. The retrieved content is then passed to generators that synthesize the final response.
Sources: docs-website/docs/pipeline-components/retrievers.mdx
Embedders
Embedders convert text content into vector representations that capture semantic meaning. These vectors enable semantic similarity searches where documents are matched based on meaning rather than exact keyword occurrence.
graph TD
A[Text Input] --> B[Embedding Model]
B --> C[Vector Representation]
C --> D[Vector Store]
E[Query] --> F[Same Embedding Model]
F --> G[Query Vector]
G --> D
D --> H[Similarity Search]
H --> I[Ranked Results]Embedders are used both during indexing (to create document vectors) and at query time (to create query vectors). The choice of embedding model significantly impacts retrieval quality, and Haystack supports integration with various embedding providers including OpenAI, Hugging Face, and local models.
Sources: docs-website/docs/pipeline-components/embedders.mdx
Rankers
Rankers improve retrieval results by reordering documents based on additional relevance signals. While retrievers perform the initial candidate selection, rankers apply more sophisticated scoring models to identify the most relevant results.
Rankers typically use cross-encoder models that jointly analyze query-document pairs to produce relevance scores. This approach is computationally more expensive than bi-encoder retrieval but provides higher accuracy for tasks where precision is critical.
The typical pipeline arrangement places rankers after retrievers, with retrievers performing the broad candidate selection and rankers performing the refined reordering. This two-stage approach balances computational efficiency with result quality.
Sources: docs-website/docs/pipeline-components/rankers.mdx
Output Generation Components
Output generation components synthesize final responses or artifacts from the information retrieved and processed by earlier pipeline stages. These components transform raw retrieved content into user-facing outputs.
Generators
Generators produce final outputs such as text responses, summaries, or structured data from retrieved context and user queries. In RAG systems, generators receive relevant documents and formulate answers that incorporate information from the retrieved content.
graph TD
A[User Query] --> E[Generator]
B[Retrieved Context] --> E
E --> F[Generate Response]
F --> G[Response Output]
H[LLM Provider] <--> E
H --> |API Key| EGenerators integrate with various LLM providers including OpenAI, Anthropic, Cohere, Hugging Face, and local models. Configuration options control parameters such as temperature, max tokens, and response format to customize generator behavior for specific applications.
Sources: docs-website/docs/pipeline-components/generators.mdx
Flow Control Components
Flow control components manage how data moves through pipelines, enabling conditional logic, parallel processing, and result aggregation. These components add flexibility to pipeline design beyond simple linear data flow.
Routers
Routers direct input data to different pipeline branches based on conditions or classifications. They enable conditional execution where different components handle different types of inputs or queries.
| Router Type | Decision Basis | Use Case |
|---|---|---|
| Conditional Router | User-defined rules | Route queries to appropriate handlers |
| Semantic Router | Query classification | Direct to specialized pipelines |
| Custom Router | Any Python logic | Flexible routing strategies |
Routers are essential for building multi-stage pipelines that handle diverse input types or implement complex query routing strategies. They enable pipelines to adapt their behavior based on the specific requirements of each input.
Sources: docs-website/docs/pipeline-components/routers.mdx
Joiners
Joiners combine outputs from multiple pipeline branches into unified inputs for downstream components. They handle the aggregation of results from parallel processing paths or the merging of different data streams.
graph TD
A[Input] --> B[Branch 1]
A --> C[Branch 2]
A --> D[Branch N]
B --> E[Joiner]
C --> E
D --> E
E --> F[Combined Output]Joiners implement various combination strategies including concatenation, interleaving, and weighted merging. The appropriate strategy depends on the data types being combined and the requirements of downstream components.
Sources: docs-website/docs/pipeline-components/joiners.mdx
Component Configuration Patterns
Initialization Parameters
Components accept configuration during initialization that determines their behavior, resource connections, and operational parameters. Common configuration categories include model selection, connection settings, and behavioral parameters.
Default Parameters
Components provide sensible defaults for most parameters, enabling quick pipeline construction while allowing customization when needed. Default values are documented in each component's reference documentation.
Runtime Parameters
Some components accept parameters at runtime (during pipeline execution) in addition to initialization-time configuration. Runtime parameters enable dynamic behavior adjustment based on input characteristics or pipeline state.
Building Custom Components
Haystack's component architecture supports extension through custom implementations. Custom components follow the same interface patterns as built-in components, ensuring compatibility with existing pipeline infrastructure.
Component Interface Requirements
Custom components must implement the standard component methods including initialization, execution, and any component-specific lifecycle hooks. The exact interface depends on the component type, but all components must be serializable for pipeline persistence.
Integration with Pipeline
Custom components integrate seamlessly with built-in components through the unified pipeline interface. They can receive inputs from and produce outputs for any other component type, enabling flexible composition of custom and built-in functionality.
Best Practices
Component Selection
Choose components based on your specific use case requirements including accuracy needs, latency constraints, and resource availability. Consider the trade-offs between different retrieval strategies, embedding models, and generation approaches.
Pipeline Design
Design pipelines with clear separation of concerns between components. Input processing, retrieval, and generation should be logically separated to enable independent optimization and testing.
Performance Optimization
Optimize component ordering based on computational cost. Place computationally expensive operations later in the pipeline where they operate on reduced candidate sets. Use rankers selectively based on the required result quality.
Summary
Pipeline components form the foundation of Haystack's architecture, enabling modular construction of LLM-powered applications. The component taxonomy spans input processing (converters, preprocessors, builders), information retrieval (retrievers, embedders, rankers), output generation (generators), and flow control (routers, joiners). Each component category serves a distinct purpose in the pipeline data flow, and understanding these roles enables effective pipeline design and customization.
Sources: docs-website/docs/pipeline-components/converters.mdx
Data Processing Components
Related topics: Document Stores and Retrievers, Pipeline Component Types
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Document Stores and Retrievers, Pipeline Component Types
Data Processing Components
Data Processing Components are fundamental pipeline elements in Haystack that transform, clean, and prepare documents for downstream operations such as retrieval, indexing, and LLM processing. These components operate on Document objects, enabling structured manipulation of content while preserving metadata integrity throughout the processing pipeline.
Overview
Data Processing Components in Haystack serve as the preprocessing layer that bridges raw document ingestion with semantic retrieval and generation tasks. They are designed to handle various document formats, split long content into manageable chunks, and ensure data quality through cleaning operations.
The architecture follows a modular design pattern where each component type specializes in a specific transformation task:
- Document Splitters: Divide documents into smaller, semantically coherent chunks
- Document Cleaners: Remove noise, normalize text, and enhance readability
- Converters: Transform external file formats into Haystack
Documentobjects
Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx
Architecture and Processing Flow
graph TD
A[Raw Document Input] --> B[Converters]
B --> C[Document Objects]
C --> D[Document Cleaners]
D --> E[Document Splitters]
E --> F[Processed Chunks]
F --> G[Embedding Stores]
G --> H[Retrieval Pipelines]
B -.->|File Types| I[TXT]
B -.->|File Types| J[PDF]
B -.->|File Types| K[Markdown]
B -.->|File Types| L[HTML]
B -.->|File Types| M[Docx]
D -.->|Operations| N[Text Normalization]
D -.->|Operations| O[Whitespace Cleaning]
D -.->|Operations| P[Metadata Preservation]
E -.->|Strategies| Q[Character Split]
E -.->|Strategies| R[Recursive Split]
E -.->|Strategies| S[Hierarchical Split]Document Splitters
Document splitters are preprocessors that divide long documents into smaller, manageable chunks while attempting to preserve semantic coherence. This is critical for effective retrieval since chunk size directly impacts retrieval precision and context window utilization.
Sources: docs-website/docs/pipeline-components/preprocessors/recursivesplitter.mdx
Splitter Types
| Splitter Type | Use Case | Splitting Strategy |
|---|---|---|
DocumentSplitter | Basic character or token-based splitting | Fixed-length chunks |
RecursiveSplitter | Hierarchical splitting by delimiters | Recursive character/separator traversal |
HierarchicalDocumentSplitter | Multi-level document structure | Preserves headings and sections |
DocumentSplitter
The base DocumentSplitter provides fundamental splitting capabilities using either character count or token count as the primary division criterion.
Key Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
split_length | int | Required | Target size of each chunk |
split_overlap | int | 0 | Number of overlapping elements between chunks |
split_by | str | "word" | Splitting criterion: "word", "sentence", "passage", or "token" |
Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx
RecursiveSplitter
The RecursiveSplitter implements an intelligent multi-level splitting strategy that attempts to split documents at natural boundaries before falling back to smaller units.
from haystack.components.preprocessors import RecursiveSplitter
splitter = RecursiveSplitter(
split_by="sentence",
split_length=5,
split_overlap=2,
separators=["\n\n", "\n", ". ", " ", ""]
)
The splitter iterates through the separators list, attempting to split at each level. If a split produces chunks larger than split_length, it moves to the next (smaller) separator in the list.
Sources: docs-website/docs/pipeline-components/preprocessors/recursivesplitter.mdx
Separator Priority:
| Priority | Separator | Context |
|---|---|---|
| 1 | "\n\n" | Paragraph breaks |
| 2 | "\n" | Line breaks |
| 3 | ". " | Sentence endings |
| 4 | " " | Word boundaries |
| 5 | "" | Character-level fallback |
HierarchicalDocumentSplitter
The HierarchicalDocumentSplitter is designed for structured documents that contain hierarchical headings and section markers. It preserves document structure by splitting at heading boundaries first.
Key Features:
- Detects heading patterns (e.g.,
#,##,###in Markdown) - Splits at the highest heading level available
- Maintains hierarchical relationships between sections and subsections
- Ideal for technical documentation and Markdown-based content
from haystack.components.preprocessors import HierarchicalDocumentSplitter
splitter = HierarchicalDocumentSplitter(
split_by="sentence",
split_length=10,
split_overlap=3
)
Sources: docs-website/docs/pipeline-components/preprocessors/hierarchicaldocumentsplitter.mdx
Document Cleaners
Document cleaners are preprocessing components that normalize and sanitize text content while preserving essential structure and metadata. They remove unwanted artifacts, standardize formatting, and enhance downstream processing quality.
Sources: docs-website/docs/pipeline-components/preprocessors/documentcleaner.mdx
Core Cleaning Operations
| Operation | Description | Example |
|---|---|---|
| Whitespace normalization | Collapse multiple spaces, trim line breaks | " Hello\n\n World " → "Hello World" |
| Character removal | Strip control characters and special symbols | Removes \x00 to \x1f except \n, \t |
| Quote normalization | Standardize quote characters | Smart quotes → straight quotes |
| Heading normalization | Clean heading markers | Removes # from Markdown headings |
Common Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
remove_empty_lines | bool | True | Remove lines with no content |
remove_extra_whitespace | bool | True | Normalize whitespace between words |
remove_repeated_substrings | bool | False | Eliminate duplicate consecutive substrings |
Converters
Converters are components that transform external file formats into Haystack Document objects. They handle the ingestion pipeline by parsing various document formats and extracting both content and metadata.
Sources: docs-website/docs/pipeline-components/converters.mdx
Supported Formats
| Format | Converter Class | Features |
|---|---|---|
| Plain Text | TextConverter | Direct text extraction |
PdfToDocumentConverter | Text and table extraction | |
| Markdown | MarkdownToDocumentConverter | Preserves structure and headings |
| HTML | HtmlToDocumentConverter | Extracts text from HTML elements |
| Microsoft Word | DocxToDocumentConverter | Document and paragraph parsing |
Converter Architecture
graph LR
A[Input File] --> B[Format Detection]
B --> C[Format-Specific Parser]
C --> D[Content Extraction]
D --> E[Metadata Enrichment]
E --> F[Haystack Document]
G[File Path] -.->|Direct Input| D
H[Binary Content] -.->|Raw Data| CCommon Converter Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
encoding | str | "utf-8" | Text encoding for file reading |
encoding_errors | str | "strict" | How to handle encoding errors |
id_hash_keys | List[str] | ["content"] | Keys for document ID generation |
meta | Dict[str, Any] | {} | Additional metadata to attach |
Sources: docs-website/docs/pipeline-components/converters.mdx
Integration with Pipelines
Data Processing Components integrate seamlessly into Haystack pipelines as standard pipeline nodes. They can be composed in any order to create custom preprocessing workflows.
Typical Pipeline Configuration
from haystack import Pipeline
from haystack.components.preprocessors import DocumentCleaner, RecursiveSplitter
from haystack.components.converters import TextConverter
pipeline = Pipeline()
pipeline.add_component("converter", TextConverter())
pipeline.add_component("cleaner", DocumentCleaner())
pipeline.add_component("splitter", RecursiveSplitter(split_length=200, split_by="word"))
pipeline.connect("converter", "cleaner")
pipeline.connect("cleaner", "splitter")
Processing Order Recommendation
While components can be connected in various orders, the recommended processing sequence is:
- Convert - Transform source files into
Documentobjects - Clean - Normalize and sanitize the text content
- Split - Divide documents into retrieval-optimized chunks
This sequence ensures that cleaning operations apply to the complete document before splitting, maintaining consistency across chunks.
Metadata Preservation
All Data Processing Components preserve and propagate document metadata throughout the processing pipeline. Metadata added during conversion is carried through cleaning and splitting operations.
Automatic Metadata Fields:
| Field | Source | Description |
|---|---|---|
source | Converter | Original file path or URI |
file_type | Converter | Document format (pdf, txt, etc.) |
page_number | PDF Converter | Page number for page-level tracking |
split_id | Splitter | Unique identifier for each chunk |
split_idx_start | Splitter | Character offset where chunk begins |
Best Practices
Chunk Size Selection
| Chunk Size | Recommended Use Case |
|---|---|
| 50-100 tokens | High-precision queries, precise fact extraction |
| 200-300 tokens | Balanced retrieval, general Q&A |
| 500+ tokens | Complex reasoning, multi-document synthesis |
Cleaning Configuration
- Enable
remove_extra_whitespacefor all text-based content - Use
remove_empty_lineswhen building dense indexes - Disable cleaning for Markdown/HTML if structure preservation is critical
Overlap Strategy
When configuring split_overlap, consider:
- Low overlap (0-10%): Maximizes diversity, suitable for unique content
- Medium overlap (10-20%): Balances context preservation and diversity
- High overlap (20%+: Essential for documents with continuous context
Related Components
- Embedding Generators: Process chunks to create vector representations
- Document Stores: Store and index processed chunks for retrieval
- Rankers: Reorder retrieved chunks by relevance
- Prompt Engineers: Combine chunks for LLM context windows
Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx
LLM and Embedder Integrations
Related topics: Document Stores and Retrievers, Pipeline Component Types, Development Guide
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Document Stores and Retrievers, Pipeline Component Types, Development Guide
LLM and Embedder Integrations
Overview
LLM and Embedder Integrations in Haystack provide the core components for interfacing with Large Language Models and embedding services. These integrations enable developers to build production-ready applications powered by LLMs, Transformer models, and vector search capabilities.
Sources: README.md:1-10
Architecture
Haystack's integration architecture follows a modular pipeline design where Generators (LLMs) and Embedders serve as fundamental building blocks within the orchestration framework.
graph TD
A[Haystack Pipeline] --> B[Retrieval Components]
A --> C[Generator Components]
A --> D[Embedder Components]
C --> E[LLM Providers]
D --> F[Embedding Models]
B --> F
E --> G[API Services]
F --> GGenerator Integration
Purpose
Generators in Haystack are components that interact with Large Language Models to generate responses based on prompts and retrieved context. They serve as the core reasoning engine within RAG (Retrieval-Augmented Generation) pipelines.
Sources: docs-website/docs/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx:1-15
Supported Providers
Haystack supports multiple LLM providers through its integration system. The framework provides standardized interfaces for:
| Provider | Integration Type | API Access |
|---|---|---|
| OpenAI | Chat Completions API | API Key |
| Anthropic | Claude API | API Key |
| Azure OpenAI | Azure OpenAI Service | Azure Credentials |
| Hugging Face | Inference API / Local | API Key / Local |
| Ollama | Local Models | Local Host |
Component Configuration
Generator components in Haystack follow a consistent initialization pattern:
from haystack import Pipeline
from haystack.components.generators import OpenAIChatGenerator
generator = OpenAIChatGenerator(
api_key="your-api-key",
model="gpt-4",
streaming_callback=None,
generation_kwargs={"temperature": 0.7, "max_tokens": 500}
)
Embedder Integration
Purpose
Embedders are components that convert text into vector representations (embeddings) suitable for semantic search and similarity comparisons. They are essential for the retrieval portion of RAG pipelines.
Sources: docs-website/docs/pipeline-components/embedders/choosing-the-right-embedder.mdx:1-20
Embedder Types
| Type | Use Case | Deployment |
|---|---|---|
| Sentence Transformers | General text embeddings | Local / API |
| OpenAI Embeddings | API-based generation | Remote |
| Hugging Face | Transformer models | Local / Inference API |
| Cohere | Multi-lingual support | API |
Integration with Retrievers
Embedders work in conjunction with document stores to enable semantic search:
graph LR
A[Documents] --> B[Embedder]
B --> C[Vector Store]
C --> D[Retriever]
E[Query] --> F[Query Embedder]
F --> D
D --> G[Retrieved Docs]Function Calling
Function calling extends LLM integrations to enable structured interactions between LLMs and external tools. This feature allows Generators to produce structured outputs that can trigger specific actions.
Sources: docs-website/docs/pipeline-components/generators/guides-to-generators/function-calling.mdx:1-30
Workflow
sequenceDiagram
participant User
participant Pipeline
participant LLM
participant Tool
User->>Pipeline: Query with function definitions
Pipeline->>LLM: Send prompt + function specs
LLM->>LLM: Analyze request
LLM-->>Pipeline: Function call + parameters
Pipeline->>Tool: Execute function
Tool-->>Pipeline: Function result
Pipeline->>LLM: Send result + original context
LLM-->>Pipeline: Final response
Pipeline-->>User: Return answerIntegration Configuration
Environment Setup
Integrations in Haystack typically require API credentials which can be configured via environment variables:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export HUGGINGFACE_TOKEN="your-hf-token"
Sources: docs-website/docs/concepts/integrations.mdx:1-25
Configuration Options
| Parameter | Description | Default |
|---|---|---|
api_key | Provider API key | Environment variable |
model | Model identifier | Provider default |
timeout | Request timeout in seconds | 60 |
max_retries | Number of retry attempts | 3 |
Pipeline Integration Example
from haystack import Pipeline
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.generators import OpenAIChatGenerator
from haystack.document_stores import InMemoryDocumentStore
# Initialize components
document_store = InMemoryDocumentStore()
retriever = InMemoryBM25Retriever(document_store=document_store)
generator = OpenAIChatGenerator(model="gpt-4")
# Build pipeline
pipeline = Pipeline()
pipeline.add_component("retriever", retriever)
pipeline.add_component("generator", generator)
pipeline.connect("retriever", "generator")
Installation
To use LLM and Embedder integrations, install the appropriate Haystack packages:
# Core package
pip install haystack-ai
# For specific integrations
pip install "haystack-ai[openai]" # OpenAI models
pip install "haystack-ai[anthropic]" # Anthropic Claude
pip install "haystack-ai[transformers]" # Hugging Face
Additional Resources
Sources: README.md:1-10
Document Stores and Retrievers
Related topics: LLM and Embedder Integrations, Data Processing Components
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LLM and Embedder Integrations, Data Processing Components
Document Stores and Retrievers
Document Stores and Retrievers are fundamental components in the Haystack framework that enable efficient storage, indexing, and retrieval of documents for LLM-powered applications. These components form the backbone of retrieval-augmented generation (RAG) pipelines and semantic search systems.
Overview
Haystack provides a unified abstraction layer for document storage and retrieval, allowing developers to work with different backend technologies through a consistent interface. The framework supports multiple document store implementations, each optimized for different use cases, scales, and deployment requirements.
Document Stores in Haystack handle the persistence and indexing of documents, while Retrievers are specialized components that query these stores to find relevant documents based on user queries. This separation of concerns allows for flexible pipeline composition and easy swapping of storage backends.
Architecture
graph TD
A[User Query] --> B[Retriever]
B --> C[Document Store]
C --> D[(Vector Index)]
C --> E[(Document DB)]
F[Documents] --> C
G[Embedding Model] --> D
B --> H[Query Embedding]
H --> D
D --> I[Relevant Documents]
I --> J[RAG Pipeline]The architecture separates concerns between storage and retrieval, enabling optimized implementations for each layer.
Document Store Types
Haystack supports multiple document store implementations, each with distinct characteristics:
| Document Store | Type | Use Case | Scalability |
|---|---|---|---|
| InMemoryDocumentStore | In-memory | Development, testing, small datasets | Single machine, limited scale |
| ElasticsearchDocumentStore | Distributed search | Production, full-text search | Horizontal scaling |
| QdrantDocumentStore | Vector database | Semantic search, embeddings | High-dimensional vectors |
| PineconeDocumentStore | Managed vector DB | Cloud-native, managed infrastructure | Global distribution |
InMemoryDocumentStore
The InMemoryDocumentStore is the simplest document store implementation, storing all data in memory. It is primarily used for development, testing, and prototyping scenarios where persistence is not required.
Key Characteristics:
- No external dependencies required
- Fast read/write operations for small datasets
- Data lost on application restart
- Not suitable for production deployments with large volumes
Sources: docs-website/docs/document-stores/inmemorydocumentstore.mdx
ElasticsearchDocumentStore
Elasticsearch provides a mature, production-ready document store with powerful full-text search capabilities. It is well-suited for applications requiring sophisticated text analysis, faceted search, and scalable infrastructure.
Key Characteristics:
- Distributed architecture for high availability
- Rich query DSL for complex search operations
- BM25 ranking algorithm for relevance scoring
- Supports millions of documents
Sources: docs-website/docs/document-stores/elasticsearch-document-store.mdx
QdrantDocumentStore
Qdrant is a vector database optimized for similarity search and high-dimensional embeddings. It provides efficient nearest neighbor search operations essential for semantic retrieval.
Key Characteristics:
- Optimized for vector similarity search
- Supports payload filtering
- Hybrid sparse-dense vector search
- gRPC-based API for performance
Sources: docs-website/docs/document-stores/qdrant-document-store.mdx
PineconeDocumentStore
Pinecone is a managed vector database service that eliminates infrastructure management overhead. It provides global distribution and automatic scaling for production deployments.
Key Characteristics:
- Fully managed cloud service
- Automatic scaling and sharding
- Multi-tenancy support
- Low-latency querying at scale
Sources: docs-website/docs/document-stores/pinecone-document-store.mdx
Choosing a Document Store
Selecting the appropriate document store depends on several factors including scale, performance requirements, deployment environment, and feature needs.
Sources: docs-website/docs/concepts/document-store/choosing-a-document-store.mdx
Decision Criteria
| Factor | InMemory | Elasticsearch | Qdrant | Pinecone |
|---|---|---|---|---|
| Dataset Size | < 100K docs | Unlimited | Unlimited | Unlimited |
| Latency | Very low | Medium | Low | Low |
| Persistence | None | Full | Full | Full |
| Full-text Search | Basic | Advanced | Limited | Limited |
| Vector Search | Basic | Plugin required | Native | Native |
| Managed Service | No | Self-hosted/Cloud | Self-hosted/Cloud | Yes (managed) |
| Cost | Free | Infrastructure | Infrastructure | Usage-based |
Recommendations
Development and Testing: Use InMemoryDocumentStore for rapid prototyping and unit testing. It requires no setup and provides immediate feedback.
Production with Full-text Search: Choose ElasticsearchDocumentStore when your application requires complex text queries, aggregations, or you already have an Elasticsearch infrastructure.
Semantic Search at Scale: Select QdrantDocumentStore or PineconeDocumentStore for applications primarily relying on embedding-based similarity search. Both provide native vector operations with efficient indexing.
Document Model
Documents in Haystack follow a standardized data model that captures content, metadata, and embedding vectors.
classDiagram
class Document {
+str id
+str content
+dict meta
+List[float] embedding
+str blob
+str blob_mime_type
}Core Document Fields:
| Field | Type | Description |
|---|---|---|
id | string | Unique identifier for the document |
content | string | Main text content of the document |
meta | dict | Arbitrary metadata (source, author, date, etc.) |
embedding | list[float] | Vector representation for semantic search |
Sources: docs-website/docs/concepts/document-store.mdx
Retriever Types
Retrievers query document stores to find the most relevant documents for a given query. Haystack provides multiple retriever implementations optimized for different search strategies.
Dense Retrievers
Dense retrievers use neural network models to encode queries and documents into dense vector representations. They excel at capturing semantic meaning and handling synonyms.
Sparse Retrievers
Sparse retrievers use traditional information retrieval techniques like BM25 or TF-IDF. They are effective for exact term matching and keyword-based queries.
Hybrid Retrievers
Hybrid retrievers combine both dense and sparse approaches, leveraging the strengths of each to provide robust retrieval across different query types.
Pipeline Integration
graph LR
A[Query] --> B[Retriever]
B --> C[Document Store]
C --> D[Top-K Documents]
D --> E[Ranker]
E --> F[Reader/Generator]
F --> G[Answer]Document Stores and Retrievers integrate seamlessly into Haystack pipelines, typically appearing early in the pipeline to fetch candidate documents before passing them to downstream components like Readers or Generators.
Basic Usage Example
from haystack import Document
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever
# Initialize document store
document_store = InMemoryDocumentStore()
# Write documents
documents = [
Document(content="Haystack is an open-source NLP framework", meta={"source": "docs"}),
Document(content="It supports retrieval-augmented generation", meta={"source": "blog"}),
]
document_store.write_documents(documents)
# Initialize retriever
retriever = BM25Retriever(document_store=document_store)
# Query
results = retriever.retrieve(query="What is Haystack?", top_k=10)
Performance Considerations
Indexing Performance
| Store | Indexing Speed | Memory Usage |
|---|---|---|
| InMemory | Very Fast | Proportional to dataset |
| Elasticsearch | Medium | Distributed across nodes |
| Qdrant | Fast | Optimized for vectors |
| Pinecone | Fast | Managed externally |
Query Performance
Query latency depends on the number of documents, vector dimensions, and the complexity of filters applied. Vector databases like Qdrant and Pinecone use specialized indexing structures (HNSW, IVF) to achieve sub-millisecond query times on large datasets.
See Also
- Document Store Concepts - Detailed conceptual overview
- Choosing a Document Store - Selection guide
- Pipeline Components - How retrievers fit into pipelines
- Embedding Models - Generating document embeddings
Sources: docs-website/docs/document-stores/inmemorydocumentstore.mdx
Agent Systems
Related topics: Introduction to Haystack, Pipeline Architecture, LLM and Embedder Integrations
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Introduction to Haystack, Pipeline Architecture, LLM and Embedder Integrations
Agent Systems
Agent systems in Haystack represent a powerful paradigm for building autonomous and semi-autonomous AI applications that can perceive, reason, act, and interact with their environment. Haystack's agent framework enables developers to create sophisticated LLM-powered applications where agents can use tools, maintain state, collaborate with other agents, and incorporate human feedback into their decision-making processes.
Overview
Haystack agents are designed to extend beyond simple prompt-response interactions by providing a structured mechanism for Large Language Models to take actions, make decisions, and execute multi-step workflows. The agent system in Haystack is built with flexibility and modularity in mind, allowing developers to customize every aspect of agent behavior from the underlying model to the specific tools available and the logic governing agent decisions.
The framework supports a variety of agent types and architectures, ranging from single-agent systems that handle specific tasks to complex multi-agent ecosystems where multiple specialized agents collaborate to solve problems. This flexibility makes Haystack suitable for a wide range of use cases, from simple question-answering applications to sophisticated autonomous systems that can browse the web, execute code, and coordinate with other agents to complete complex tasks.
Core Architecture
The agent architecture in Haystack is built around a pipeline-based model that connects perception, reasoning, action selection, and execution into a cohesive workflow. At its core, an agent consists of several key components that work together to enable autonomous behavior.
Agent Components
| Component | Purpose | Description |
|---|---|---|
| LLM | Reasoning Engine | The underlying language model that drives decision-making |
| Tools | Action Interface | Capabilities that allow the agent to interact with external systems |
| Prompt Builder | Instruction Assembly | Constructs prompts that guide agent behavior |
| Output Handler | Response Processing | Interprets and executes agent decisions |
| Memory | State Management | Maintains conversation history and context |
Sources: docs-website/docs/pipeline-components/agents-1/agent.mdx
Execution Flow
graph TD
A[User Input] --> B[Agent Receives Task]
B --> C[LLM Reasoning]
C --> D{Tool Selection?}
D -->|Yes| E[Execute Tool]
E --> F[Process Result]
D -->|No| G[Generate Response]
F --> C
G --> H[Return to User]
C --> I{Human Input Needed?}
I -->|Yes| J[Pause for Human Feedback]
J --> C
I -->|No| DThe execution flow demonstrates how Haystack agents operate in a loop, continuously reasoning about the best course of action until the task is complete. The agent receives input, reasons about what to do, selects and executes tools as needed, and continues until it can provide a final response or requires additional input from the user or human overseer.
State Management
State management is a critical aspect of agent systems, enabling agents to maintain context across multiple interactions and track the progress of complex, multi-step tasks. Haystack provides a flexible state management system that allows agents to store, retrieve, and update information throughout their execution lifecycle.
State Structure
The state system in Haystack agents typically includes several key elements that together form a comprehensive view of the agent's current situation and history. These elements enable the agent to maintain awareness of what has happened previously, what actions have been taken, and what information has been gathered.
| State Element | Type | Description |
|---|---|---|
| Conversation History | List | Previous messages and interactions |
| Tool Usage Log | List | Record of tools called and results |
| Intermediate Results | Dict | Data collected during task execution |
| User Preferences | Dict | Learned user preferences and feedback |
| Task Progress | Dict | Current status of ongoing tasks |
Sources: docs-website/docs/pipeline-components/agents-1/state.mdx
State Persistence
Agents in Haystack can maintain state across sessions, enabling persistent memory and long-term learning. This is particularly valuable for applications where the agent needs to build relationships with users over time or maintain knowledge about specific domains or tasks. The state management system supports various backends for persistence, from simple in-memory storage to distributed databases for production deployments.
Multi-Agent Systems
Haystack supports the creation of sophisticated multi-agent systems where multiple specialized agents work together to solve problems. This architectural pattern enables the decomposition of complex tasks into smaller, manageable subtasks that can be handled by agents with specialized capabilities.
Agent Collaboration Patterns
graph TD
subgraph Coordinator Agent
A[Task Received] --> B{Analyze Task}
B --> C[Decompose into Subtasks]
end
subgraph Specialized Agents
D[Agent A: Research]
E[Agent B: Analysis]
F[Agent C: Synthesis]
end
C --> D
C --> E
C --> F
D --> G[Results Aggregation]
E --> G
F --> G
G --> H[Final Response]Multi-agent systems in Haystack can be configured with various collaboration patterns. In the supervisor pattern, a single coordinating agent directs the work of subordinate agents, assigning tasks and collecting results. In the collaborative pattern, agents work together as equals, sharing information and contributing their expertise to solve problems collectively.
Communication Protocols
Agents in a multi-agent system communicate through well-defined interfaces that specify how messages are passed between agents, how responses are aggregated, and how conflicts are resolved. This structured approach to agent communication ensures reliable operation even in complex agent ecosystems with many participants.
Sources: docs-website/docs/concepts/agents/multi-agent-systems.mdx
Human-in-the-Loop
Haystack agents support human-in-the-loop workflows, enabling humans to provide guidance, approval, or corrections during agent execution. This capability is essential for applications where autonomous operation must be balanced with human oversight and control.
Interaction Modes
| Mode | Description | Use Case |
|---|---|---|
| Approval | Human approves agent actions before execution | High-stakes decisions |
| Feedback | Human provides corrective feedback during execution | Fine-tuning agent behavior |
| Escalation | Agent defers to human when uncertain | Handling edge cases |
| Validation | Human validates agent outputs before completion | Quality assurance |
Sources: docs-website/docs/pipeline-components/agents-1/human-in-the-loop.mdx
Workflow Integration
graph TD
A[Agent Task] --> B{Requires Human Input?}
B -->|Yes| C[Pause Execution]
C --> D[Notify Human]
D --> E[Await Response]
E --> F{Human Action}
F -->|Approve| G[Continue Execution]
F -->|Reject| H[Abort or Retry]
F -->|Modify| I[Apply Modifications]
B -->|No| G
I --> G
G --> J[Task Complete]The human-in-the-loop system is designed to be non-intrusive, minimizing the cognitive load on human overseers while ensuring that critical decisions receive appropriate human review. Agents can be configured to automatically escalate certain types of decisions based on predefined rules, such as actions that affect sensitive data or exceed specified cost thresholds.
Tool Integration
A defining characteristic of Haystack agents is their ability to use tools to interact with external systems and perform actions beyond text generation. The tool integration system provides a standardized interface for defining, registering, and invoking tools that extend agent capabilities.
Available Tool Categories
| Category | Examples | Capabilities |
|---|---|---|
| Web Search | Google Search, Bing Search | Internet research, fact checking |
| API Clients | REST, GraphQL | External service integration |
| Code Execution | Python, Shell | Computation, automation |
| Document Processing | PDF, CSV parsers | Information extraction |
| Database | SQL, Vector DB | Data retrieval, storage |
Tools in Haystack follow a consistent interface that makes it easy to create custom tools for domain-specific applications. Each tool is defined with a name, description, input schema, and implementation, and the agent automatically learns when and how to use tools based on their descriptions.
Configuration Options
Haystack agents expose a wide range of configuration options that allow developers to customize agent behavior for specific use cases. These options control aspects ranging from the underlying model selection to detailed parameters governing agent decision-making.
Core Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
model | String | Required | The LLM to use for reasoning |
max_iterations | Integer | 10 | Maximum tool-calling loops |
tools | List | Empty | Available tools for the agent |
prompt_template | String | Default | Custom instruction template |
verbose | Boolean | False | Enable detailed logging |
Advanced configuration options allow developers to customize how the agent reasons, how it selects tools, and how it handles errors and edge cases. These options can be set at the agent level or overridden for specific use cases.
Best Practices
When building agent systems with Haystack, several best practices can help ensure reliable and maintainable applications. Careful attention to prompt design, tool definitions, and error handling will significantly improve agent performance and user experience.
Clear and specific tool descriptions are essential for guiding agent behavior. Tools should have descriptive names and comprehensive descriptions that explain not just what the tool does, but when and why an agent should consider using it. This helps the underlying LLM make informed decisions about tool selection.
State management should be designed with the target use case in mind. For simple single-turn interactions, minimal state management is appropriate. For complex multi-step tasks, comprehensive state tracking ensures the agent maintains context and can recover from errors gracefully.
Human-in-the-loop integration should be thoughtfully designed to balance autonomy with oversight. Critical decisions should require human approval, while routine operations can proceed autonomously. The escalation criteria should be clearly defined and regularly reviewed.
Summary
Haystack's agent systems provide a comprehensive framework for building LLM-powered applications that can perceive, reason, and act. The architecture supports everything from simple single-agent applications to complex multi-agent ecosystems with human oversight. Key features include flexible state management, extensive tool integration, human-in-the-loop workflows, and configurable agent behavior.
Sources: docs-website/docs/pipeline-components/agents-1/agent.mdx
Development Guide
Related topics: Deployment and Infrastructure, Introduction to Haystack
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Deployment and Infrastructure, Introduction to Haystack
Development Guide
This guide provides comprehensive information for developers who want to contribute to Haystack or extend its functionality. Haystack is an end-to-end LLM framework that enables building applications powered by Large Language Models, Transformer models, and vector search capabilities.
Overview
Haystack is an open-source framework maintained by deepset that allows developers to build production-ready AI applications. The framework supports retrieval-augmented generation (RAG), document search, question answering, and answer generation by orchestrating state-of-the-art embedding models and LLMs into pipelines.
Sources: README.md:1-10
Project Structure
The Haystack repository is organized into several main directories, each serving a specific purpose in the overall project ecosystem.
graph TD
A[haystack/ root] --> B[Main Package]
A --> C[docs-website/]
A --> D[docker/]
A --> E[pydoc/]
A --> F[examples/]
B --> G[Core Framework Code]
C --> H[Documentation Site]
D --> I[Docker Images]
E --> J[API Reference Generation]
F --> K[Example Cookbooks]Directory Breakdown
| Directory | Purpose |
|---|---|
haystack/ | Main Python package containing core framework code |
docs-website/ | Docusaurus-powered documentation site |
docker/ | Docker image definitions and build configurations |
pydoc/ | YAML configurations for API reference generation |
examples/ | Example applications and cookbooks (moved to haystack-cookbook) |
Sources: docs-website/README.md:40-55
Installation for Development
Standard Installation
To set up Haystack for development, install the package via pip:
pip install haystack-ai
Nightly Pre-releases
For trying the newest features before official releases:
pip install --pre haystack-ai
Docker-based Development
Haystack provides Docker images for development environments. The base image contains a working Python environment with Haystack preinstalled and is designed to be derived FROM.
docker buildx bake base
To build custom images with specific branches or tags:
HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache
Sources: docker/README.md:15-30
Multi-Platform Docker Builds
Haystack images support multiple architectures. To limit builds to your local architecture:
# For Apple M1 (ARM)
docker buildx bake base --set "*.platform=linux/arm64"
Sources: docker/README.md:40-45
Documentation Development
The documentation website is built with Docusaurus 3 and provides comprehensive guides, tutorials, API references, and best practices for using Haystack.
Prerequisites
- Node.js 18 or higher
- npm (included with Node.js) or Yarn
Setting Up the Documentation Site
# Clone the repository and navigate to docs-website
git clone https://github.com/deepset-ai/haystack.git
cd haystack/docs-website
# Install dependencies
npm install
# Start the development server
npm start
# The site opens at http://localhost:3000 with live reload
Common Documentation Tasks
| Task | Command | Location |
|---|---|---|
| Edit a page | Update files under docs/ or versioned_docs/ | Preview at http://localhost:3000 |
| Add to sidebar | Update sidebars.js with doc ID | docs-website/ |
| Production check | npm run build && npm run serve | docs-website/ |
Sources: docs-website/README.md:20-35
Documentation Project Structure
docs-website/
├── docs/ # Main documentation (guides, tutorials, concepts)
│ ├── _templates/ # Authoring templates (excluded from build)
│ ├── concepts/ # Core Haystack concepts
│ ├── pipeline-components/ # Component documentation
│ └── ...
├── reference/ # API reference (auto-generated, do not edit manually)
├── versioned_docs/ # Versioned copies of docs/
├── reference_versioned_docs/ # Versioned copies of reference/
├── src/ # React components and custom code
│ ├── components/ # Custom React components
│ ├── css/ # Global styles
│ ├── pages/ # Custom pages
│ ├── remark/ # Remark plugins
│ └── theme/ # Docusaurus theme customization
Sources: docs-website/README.md:45-60
API Reference Development
The API reference is generated automatically from docstrings in the code using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes.
How API Reference Works
- Create a
.ymlfile in thepydocdirectory - Configure how haystack-pydoc-tools will generate the page
- Commit the configuration to the main branch
- The GitHub workflow automatically generates the Markdown files
Version Management
All updates to API reference live in unstable docs version and are promoted to stable docs version when a new version is released.
Sources: pydoc/README.md:1-20
Contributing to Haystack
Haystack welcomes community contributions ranging from quick fixes like typo corrections to entirely new features.
Contribution Areas
| Area | Repository | Description |
|---|---|---|
| Main Haystack | deepset-ai/haystack | Core framework development |
| Integrations | deepset-ai/haystack-core-integrations | Integration components |
| Documentation | haystack/docs-website | Documentation content |
Getting Started
- Review the Contributor Guidelines in CONTRIBUTING.md
- Check the full list of open issues available for contributions
- You don't need to be a Haystack expert to provide meaningful improvements
CI/CD and Quality Standards
The project maintains high quality standards through automated checks:
| Check | Badge | Description |
|---|---|---|
| Tests | GitHub Actions | Automated test suite |
| Type Checking | Mypy | Static type analysis |
| Code Coverage | Coverage Badge | Test coverage reporting |
| Linting | Ruff | Code style enforcement |
| License Compliance | License Check | Dependency license verification |
Sources: README.md:30-55
Development Workflow
graph TD
A[Start Development] --> B[Clone Repository]
B --> C[Set Up Environment]
C --> D[Install Dependencies]
D --> E[Make Changes]
E --> F[Run Tests]
F --> G{Tests Pass?}
G -->|No| H[Fix Issues]
H --> E
G -->|Yes| I[Run Linters]
I --> J{Code Quality OK?}
J -->|No| K[Address Linter Issues]
K --> E
J -->|Yes| L[Submit Pull Request]
L --> M[Review Process]
M --> N[Merge to Main]Examples and Cookbooks
Example applications have been moved to a dedicated repository. All example cookbooks are now located at:
Repository: https://github.com/deepset-ai/haystack-cookbook/
This separation allows for more focused development and easier discovery of example applications.
Sources: examples/README.md:1-10
License and Compliance
All contributions must comply with the project's license. View license information at:
The project includes automated license compliance checking through GitHub workflows.
Sources: docker/README.md:50-60
Quick Reference Commands
| Command | Purpose |
|---|---|
pip install haystack-ai | Install Haystack |
pip install --pre haystack-ai | Install pre-release version |
npm install | Install documentation dependencies |
npm start | Start documentation dev server |
npm run build | Build documentation site |
docker buildx bake base | Build Docker base image |
Additional Resources
- Documentation Site: https://docs.haystack.deepset.ai
- GitHub Repository: https://github.com/deepset-ai/haystack
- Community: GitHub Discussions and Stack Overflow
- Discord: Join the Haystack Discord community
Sources: README.md:1-10
Deployment and Infrastructure
Related topics: Development Guide, Introduction to Haystack
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Development Guide, Introduction to Haystack
Deployment and Infrastructure
Overview
Haystack provides a comprehensive deployment infrastructure designed for production-ready LLM applications. The framework supports multiple deployment strategies including Docker containers, Kubernetes orchestration, and cloud platform integrations. This documentation covers the core deployment mechanisms, containerization approach, GPU acceleration support, and production best practices.
The deployment system is built around Docker images using BuildKit for efficient multi-platform builds, enabling deployment across x86_64 and ARM64 architectures. The infrastructure supports both development environments and production-grade deployments with high availability requirements.
Docker Containerization
Base Images
Haystack provides pre-built Docker images that serve as the foundation for custom deployments. The base images contain a working Python environment with Haystack preinstalled and are intended to be extended with application-specific configurations.
The primary image variant available is:
| Image Tag | Description | Use Case |
|---|---|---|
haystack:base-<version> | Base Python environment with Haystack | Custom image derivation |
All images are published to Docker Hub and can be pulled directly for use in production environments. The images follow semantic versioning and align with Haystack releases.
Building Custom Images
Custom images can be built using Docker BuildKit and the bake command orchestrator. This approach allows for:
- Custom Haystack versions or branches
- Pre-installed dependencies
- Application-specific configurations
- Multi-platform support
The build process uses the docker-bake.hcl configuration file which defines build targets, platforms, and variable substitutions.
#### Basic Build Command
docker buildx bake base
#### Building with Custom Variables
To build with a custom Haystack version or branch, override the HAYSTACK_VERSION variable:
HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache
This mechanism enables CI/CD pipelines to build images from specific commits, branches, or release tags without modifying the underlying Dockerfile.
Multi-Platform Builds
Haystack Docker images support multiple architectures including:
linux/amd64(x86_64)linux/arm64(ARM64)
#### Platform Limitations
Depending on the operating system and Docker environment, building all platforms locally may not be possible. If encountering the following error:
multiple platforms feature is currently not supported for docker driver. Please switch to a different driver
(eg. "docker buildx create --use")
The platform option must be overridden to match the local architecture. For example, on Apple M1 (ARM64):
docker buildx bake base --set "*.platform=linux/arm64"
#### Cross-Platform Considerations
When deploying multi-platform images, consider the following:
- CPU Compatibility: Ensure target nodes match the built architecture
- Performance: Native architecture builds perform optimally
- Registry Support: Use registries that support multi-platform manifests
GPU Acceleration
Hardware Acceleration Support
Haystack supports GPU acceleration for compute-intensive operations including:
- Model inference
- Embedding generation
- Tokenization
- Custom model operations
GPU acceleration significantly improves throughput for LLM-based pipelines and embedding-heavy workloads.
Enabling GPU Support
#### NVIDIA GPUs (CUDA)
For NVIDIA GPU support, use CUDA-enabled base images and ensure the nvidia-container-toolkit is installed on the host system.
Docker Compose Example:
services:
haystack:
image: haystack:base-latest
runtime: nvidia
environment:
- NVIDIA_VISIBLE_DEVICES=all
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
#### AMD GPUs (ROCm)
AMD GPU support requires ROCm-enabled images and appropriate runtime configuration.
GPU Memory Management
For production deployments, configure memory limits based on model size:
| Model Size | Recommended GPU Memory | Configuration |
|---|---|---|
| Small (<1B params) | 8 GB | CUDA_VISIBLE_DEVICES=0 |
| Medium (1-7B params) | 16 GB | CUDA_VISIBLE_DEVICES=0,1 |
| Large (7-70B params) | 32+ GB | Multi-GPU / quantization |
Quantization Options
To reduce GPU memory requirements, consider model quantization:
- 4-bit quantization: Reduces memory by ~75%
- 8-bit quantization: Reduces memory by ~50%
- Dynamic quantization: Trade-off between speed and accuracy
Kubernetes Deployment
Container Orchestration
Haystack can be deployed on Kubernetes for production environments requiring:
- Horizontal scaling
- High availability
- Rolling updates
- Resource management
- Service discovery
Resource Configuration
#### Resource Limits
Configure CPU and memory limits based on workload:
resources:
limits:
cpu: "4"
memory: "16Gi"
requests:
cpu: "2"
memory: "8Gi"
#### GPU Resource Allocation
For GPU workloads, define accelerator resources:
resources:
limits:
nvidia.com/gpu: "2"
requests:
nvidia.com/gpu: "1"
High Availability Configuration
For production deployments, implement:
- Replica Sets: Deploy multiple replicas for fault tolerance
- Health Checks: Configure liveness and readiness probes
- Pod Disruption Budgets: Ensure availability during updates
- Anti-Affinity Rules: Distribute pods across nodes
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
Service Configuration
Expose Haystack services using Kubernetes Services:
apiVersion: v1
kind: Service
metadata:
name: haystack-api
spec:
selector:
app: haystack
ports:
- protocol: TCP
port: 80
targetPort: 8000
type: LoadBalancer
Production Best Practices
Security Considerations
| Practice | Implementation |
|---|---|
| Non-root execution | Configure USER directive in Dockerfile |
| Secret management | Use Kubernetes Secrets or external secret stores |
| Network policies | Restrict pod-to-pod communication |
| Image scanning | Scan images for vulnerabilities before deployment |
| TLS termination | Configure ingress with TLS certificates |
Monitoring and Observability
Implement monitoring using:
- Metrics: Prometheus exporter for pipeline metrics
- Logging: Centralized logging with ELK/Graylog
- Tracing: OpenTelemetry for request tracing
- Alerts: Configure alerts for error rates and latency
Performance Optimization
- Connection Pooling: Reuse database and API connections
- Caching: Implement caching for frequently accessed data
- Batch Processing: Process multiple requests in batches
- Async Processing: Use async/await for I/O operations
CI/CD Integration
Automated Builds
Haystack supports automated Docker image builds through:
- GitHub Actions workflows
- BuildKit with bake files
- Multi-stage Docker builds
Deployment Workflows
graph TD
A[Code Change] --> B[Run Tests]
B --> C[Build Docker Image]
C --> D[Push to Registry]
D --> E[Update Deployment]
E --> F[Health Check]
F --> G{Healthy?}
G -->|Yes| H[Deployment Complete]
G -->|No| I[Rollback]Registry Configuration
Popular registry options for Haystack images:
| Registry | Use Case | Authentication |
|---|---|---|
| Docker Hub | Public deployments | Optional |
| AWS ECR | AWS infrastructure | IAM roles |
| GCR | GCP infrastructure | Service accounts |
| Azure ACR | Azure infrastructure | Service principals |
| Private Registry | Enterprise deployments | Username/password |
License and Compliance
The Haystack Docker images contain:
- Haystack framework code under the Apache 2.0 license
- Python runtime components
- Base distribution software with their respective licenses
Users are responsible for ensuring compliance with all software licenses contained within deployed images. For enterprise deployments, review the license implications of all included components.
Related Documentation
Summary
Haystack provides a flexible and production-ready deployment infrastructure supporting Docker containerization, Kubernetes orchestration, and GPU acceleration. The multi-platform Docker images enable deployment across diverse infrastructure, while Kubernetes support facilitates enterprise-grade deployments with high availability and scalability requirements. GPU acceleration support enables high-performance inference for LLM-powered applications, with quantization options for resource-constrained environments.
Source: https://github.com/deepset-ai/haystack / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
First-time setup may fail or require extra isolation and rollback planning.
First-time setup may fail or require extra isolation and rollback planning.
First-time setup may fail or require extra isolation and rollback planning.
Users cannot judge support quality until recent activity, releases, and issue response are checked.
Doramagic Pitfall Log
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: RFC: Signed receipts for Haystack pipeline component calls
- Severity: high
- Finding: Installation risk is backed by a source signal: RFC: Signed receipts for Haystack pipeline component calls. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11039
2. Installation risk: feat: Add `run_async` to `MultiQueryEmbeddingRetriever`, `MultiQueryTextRetriever`, and `TextEmbeddingRetriever`
- Severity: high
- Finding: Installation risk is backed by a source signal: feat: Add
run_asynctoMultiQueryEmbeddingRetriever,MultiQueryTextRetriever, andTextEmbeddingRetriever. Treat it as a review item until the current version is checked. - User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11358
3. Installation risk: feat: add INTERSECTION join mode to DocumentJoiner
- Severity: high
- Finding: Installation risk is backed by a source signal: feat: add INTERSECTION join mode to DocumentJoiner. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11365
4. Maintenance risk: docs: Update Ragas docs
- Severity: high
- Finding: Maintenance risk is backed by a source signal: docs: Update Ragas docs. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11178
5. Security or permission risk: EnvVarSecrets: add multi-tenant context support (ContextVar / pipeline-run context)
- Severity: high
- Finding: Security or permission risk is backed by a source signal: EnvVarSecrets: add multi-tenant context support (ContextVar / pipeline-run context). Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11366
6. Security or permission risk: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
- Severity: high
- Finding: Security or permission risk is backed by a source signal: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11311
7. Security or permission risk: feat: support token-based budget in LostInTheMiddleRanker
- Severity: high
- Finding: Security or permission risk is backed by a source signal: feat: support token-based budget in LostInTheMiddleRanker. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11351
8. Installation risk: Developers should check this installation risk before relying on the project: Proposal: Transaction Protocol for idempotent, auditable agent pipelines
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Proposal: Transaction Protocol for idempotent, auditable agent pipelines
- User impact: Developers may fail before the first successful local run: Proposal: Transaction Protocol for idempotent, auditable agent pipelines
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Proposal: Transaction Protocol for idempotent, auditable agent pipelines. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | fmev_58038e9b6373edf9376049b42d4b7bb4 | https://github.com/deepset-ai/haystack/issues/11266 | Proposal: Transaction Protocol for idempotent, auditable agent pipelines
9. Installation risk: Developers should check this installation risk before relying on the project: RFC: Signed receipts for Haystack pipeline component calls
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: RFC: Signed receipts for Haystack pipeline component calls
- User impact: Developers may fail before the first successful local run: RFC: Signed receipts for Haystack pipeline component calls
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: RFC: Signed receipts for Haystack pipeline component calls. Context: Observed when using node, python
- Evidence: failure_mode_cluster:github_issue | fmev_ce0b9c65d21126dcf11ede12120e154f | https://github.com/deepset-ai/haystack/issues/11039 | RFC: Signed receipts for Haystack pipeline component calls
10. Installation risk: Developers should check this installation risk before relying on the project: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
- User impact: Developers may fail before the first successful local run: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | fmev_4d3276b6b9938595cb2dbb864a5509da | https://github.com/deepset-ai/haystack/issues/11311 | Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
11. Installation risk: Developers should check this installation risk before relying on the project: [FEATURE] Support for code syntax-aware Document Splitters
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: [FEATURE] Support for code syntax-aware Document Splitters
- User impact: Developers may fail before the first successful local run: [FEATURE] Support for code syntax-aware Document Splitters
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [FEATURE] Support for code syntax-aware Document Splitters. Context: Observed when using python
- Evidence: failure_mode_cluster:github_issue | fmev_997b84068ae32409b1d8d55daaddd984 | https://github.com/deepset-ai/haystack/issues/11354 | [FEATURE] Support for code syntax-aware Document Splitters
12. Installation risk: MCP Server for Haystack docs
- Severity: medium
- Finding: Installation risk is backed by a source signal: MCP Server for Haystack docs. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11346
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Count of project-level external discussion links exposed on this manual page.
Open the linked issues or discussions before treating the pack as ready for your environment.
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using haystack with real data or production workflows.
- EnvVarSecrets: add multi-tenant context support (ContextVar / pipeline-r - github / github_issue
- feat: add INTERSECTION join mode to DocumentJoiner - github / github_issue
- DocumentJoiner concatenate mode incorrectly drops documents with score=0 - github / github_issue
- feat: Add
run_asynctoMultiQueryEmbeddingRetriever, `MultiQueryText - github / github_issue - MCP Server for Haystack docs - github / github_issue
- RFC: Signed receipts for Haystack pipeline component calls - github / github_issue
- [[FEATURE] Support for code syntax-aware Document Splitters](https://github.com/deepset-ai/haystack/issues/11354) - github / github_issue
- Security: OWASP Agent Memory Guard for pipeline memory poisoning defense - github / github_issue
- feat: support token-based budget in LostInTheMiddleRanker - github / github_issue
- docs: Update Ragas docs - GitHub / issue
- Developers should check this installation risk before relying on the project: Proposal: Transaction Protocol for idempotent, auditable agent pipelines - GitHub / issue
- v2.25.2 - GitHub / issue
Source: Project Pack community evidence and pitfall evidence