Doramagic Project Pack · Human Manual

haystack

Haystack follows a component-based architecture where pipelines serve as the foundational building blocks. Pipelines connect various components including document stores, retrievers, reade...

Introduction to Haystack

Related topics: Pipeline Architecture, Core Concepts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Core Capabilities

Continue reading this section for the full explanation and source context.

Section Pipeline Components

Continue reading this section for the full explanation and source context.

Section Built for Context Engineering

Continue reading this section for the full explanation and source context.

Related topics: Pipeline Architecture, Core Concepts

Introduction to Haystack

Haystack is an end-to-end LLM framework that enables developers to build applications powered by Large Language Models (LLMs), Transformer models, vector search, and more. The framework provides a flexible architecture for orchestrating state-of-the-art embedding models and LLMs into pipelines to solve real-world NLP use cases.

What is Haystack?

Haystack is designed to facilitate the development of production-ready AI applications with a focus on context engineering—giving developers explicit control over how information is retrieved, ranked, filtered, combined, structured, and routed before it reaches the language model.

Sources: README.md:1()

Core Capabilities

CapabilityDescription
Retrieval-Augmented Generation (RAG)Combine vector search with LLMs for accurate, context-grounded responses
Document SearchFull-featured document indexing and semantic search
Question AnsweringExtract answers from large document collections
Pipeline OrchestrationBuild complex workflows with customizable components
Agent IntegrationDeploy autonomous agents with tool-use capabilities

Sources: docker/README.md:4-6()

Architecture Overview

Haystack follows a component-based architecture where pipelines serve as the foundational building blocks. Pipelines connect various components including document stores, retrievers, readers, generators, and custom tools.

graph TD
    A[User Query] --> B[Pipeline]
    B --> C[Retrievers]
    B --> D[Document Stores]
    C --> E[Rankers]
    E --> F[LLM / Generator]
    F --> G[Response]
    
    H[Documents] --> D
    
    style F fill:#e1f5fe
    style D fill:#fff3e0
    style C fill:#e8f5e9

Pipeline Components

Pipelines in Haystack are composed of interconnected nodes that process data sequentially or in parallel. Each component handles a specific stage of the document processing or inference workflow.

Component TypeFunction
DocumentStoreStores and indexes documents for retrieval
RetrieverFinds relevant documents from the store
RankerReorders retrieved documents by relevance
Reader/GeneratorExtracts answers or generates responses
PreprocessorCleans and splits documents before indexing
Custom NodesUser-defined processing logic

Sources: README.md:54-58()

Key Features

Built for Context Engineering

Haystack provides fine-grained control over the entire retrieval and generation pipeline. Developers can:

  • Define custom retrieval strategies
  • Implement multi-stage ranking pipelines
  • Route queries to specialized processing branches
  • Control how context is assembled before reaching the LLM

Flexible Pipeline Design

The framework supports both declarative and programmatic pipeline construction, allowing developers to define workflows through configuration files or Python code.

graph LR
    A[Query Input] --> B[Retriever Node]
    B --> C[Ranker Node]
    C --> D[LLM Node]
    D --> E[Output]
    
    F[Documents] --> G[Document Store]
    G --> B

Production-Ready Architecture

Haystack includes enterprise features such as:

  • Telemetry: Anonymous usage statistics collection for component initialization tracking (opt-out available)
  • Container Support: Docker images for consistent deployment environments
  • CI/CD Integration: Automated testing with GitHub Actions workflows
  • Type Checking: Full MyPy type annotation support

Sources: README.md:60-62()

Installation

Package Installation

The primary method for installing Haystack is via pip:

pip install haystack-ai

For testing pre-release features:

pip install --pre haystack-ai

Sources: README.md:28-34()

Docker Installation

Haystack provides Docker images for containerized deployments:

ImageDescription
haystack:base-<version>Base image with Haystack preinstalled for derivation

Multi-platform builds are supported for various architectures including linux/arm64 and linux/amd64.

docker buildx bake base

Sources: docker/README.md:8-14()

Documentation Structure

The Haystack documentation is hosted at docs.haystack.deepset.ai and organized into several sections:

SectionContent
Overview/IntroGetting started guides and project introduction
Get StartedQuick-start guide for building first LLM applications
TutorialsStep-by-step learning paths
CookbookPre-built recipes and example implementations
API ReferenceAuto-generated documentation from docstrings
ConceptsCore architectural concepts and design patterns

Sources: docs-website/README.md:1-8()

Documentation Versioning

The documentation site supports multiple versions:

  • Next (Unreleased): Documentation for upcoming features
  • Current (Stable): Documentation for the latest stable release
  • Past Versions: Archived documentation for previous releases

Sources: docs-website/src/pages/versions.js:1-25()

API Reference Generation

The API reference pages are automatically generated from docstrings using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes are merged.

Sources: pydoc/README.md:1-12()

Project Structure

haystack/
├── haystack/                    # Main package source code
├── docs-website/                # Docusaurus documentation site
│   ├── docs/                    # Main documentation content
│   ├── reference/               # Auto-generated API reference
│   └── versioned_docs/           # Versioned documentation snapshots
├── docker/                      # Docker image configurations
├── pydoc/                       # PyDoc configuration files
└── examples/                    # Example implementations
Note: Example implementations have been moved to the haystack-cookbook repository.

Sources: examples/README.md:1-5()

Community and Contributing

Haystack is open to contributions from developers of all skill levels. There are multiple ways to contribute:

Contribution AreaRepository
Core Frameworkdeepset-ai/haystack
Integrationsdeepset-ai/haystack-core-integrations
Documentationdeepset-ai/haystack/tree/main/docs-website

Community Resources

  • GitHub Issues: Bug reports and feature requests
  • GitHub Discussions: General questions and community support
  • Discord: Real-time community engagement
  • Stack Overflow: Tagged questions at haystack
  • Twitter/X: Updates and announcements

Sources: README.md:89-95()

Organizations Using Haystack

Haystack is trusted by thousands of production AI teams across industries:

IndustryOrganizations
Technology & AIApple, Meta, Databricks, NVIDIA, Intel
Public SectorEuropean Commission

Sources: README.md:78-85()

Licensing and Compliance

  • License: Apache 2.0
  • Type Checking: MyPy validated
  • Coverage: Automated test coverage tracking
  • License Compliance: Automated workflow verification

Sources: README.md:10-11()

Summary

Haystack provides a comprehensive framework for building production-ready LLM applications with emphasis on retrieval-augmented generation, flexible pipeline design, and context engineering. The framework's component-based architecture enables developers to customize every stage of the document processing and inference pipeline while maintaining production-grade reliability through integrated testing, documentation, and deployment tooling.

With support for Docker containerization, comprehensive documentation, and an active open-source community, Haystack serves as a robust foundation for teams implementing enterprise AI solutions across diverse industries.

Sources: README.md:1()

Pipeline Architecture

Related topics: Introduction to Haystack, Pipeline Component Types, Core Concepts

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Component Connections

Continue reading this section for the full explanation and source context.

Section Pipeline Types

Continue reading this section for the full explanation and source context.

Section Key Features

Continue reading this section for the full explanation and source context.

Related topics: Introduction to Haystack, Pipeline Component Types, Core Concepts

Pipeline Architecture

Overview

The Pipeline architecture is the foundational component of the Haystack framework, enabling developers to construct flexible, modular workflows for building LLM-powered applications. Pipelines orchestrate the execution of various components—including retrievers, readers, generators, and custom processors—into cohesive data processing flows.

Pipelines in Haystack 2.x provide a declarative approach to defining application workflows, allowing developers to:

  • Connect multiple components in directed acyclic graphs (DAGs)
  • Route data between components with explicit connections
  • Handle both synchronous and asynchronous execution models
  • Debug and inspect execution through breakpoints and hooks
  • Persist and share pipeline configurations through serialization

Sources: docs-website/docs/concepts/pipelines.mdx

Core Concepts

Component Connections

Components in a Haystack Pipeline are connected through named input/output connections. Each component exposes specific input and output slots that define how data flows through the pipeline.

graph LR
    A[Document Store] -->|query results| B[Retriever]
    B -->|retrieved docs| C[Reader]
    C -->|answers| D[Output]
    
    style A fill:#e1f5fe
    style B fill:#fff3e0
    style C fill:#e8f5e9
    style D fill:#fce4ec

The connection model requires that:

  • Output types must be compatible with target input types
  • Components can have multiple inputs and outputs
  • Connections form a directed graph structure

Sources: docs-website/docs/concepts/pipelines.mdx:1-20

Pipeline Types

Haystack provides multiple pipeline implementations optimized for different use cases:

Pipeline TypeUse CaseExecution Model
Standard PipelineGeneral-purpose workflowsSynchronous
AsyncPipelineHigh-throughput I/O operationsAsynchronous with async/await
SearchPipelineRetrieval-focused workflowsOptimized for search
GenerativePipelineLLM-centric applicationsOptimized for generation

Sources: docs-website/docs/concepts/pipelines.mdx

AsyncPipeline

The AsyncPipeline extends the standard Pipeline with asynchronous execution capabilities, making it suitable for applications requiring high concurrency and non-blocking I/O operations.

Key Features

  • Non-blocking execution: Components can execute concurrently when dependencies are satisfied
  • Streaming support: Better handling of streaming responses from LLMs
  • Resource efficiency: Improved CPU and memory utilization for I/O-bound workloads
async def run_async_pipeline(pipeline, query):
    result = await pipeline.run_async(query=query)
    return result

Sources: docs-website/docs/concepts/pipelines/asyncpipeline.mdx

Execution Flow

graph TD
    A[Start] --> B{AsyncPipeline.run_async}
    B --> C[Execute Independent Components]
    C --> D{Wait for Dependencies?}
    D -->|No| E[Collect Results]
    D -->|Yes| F[Await Dependency]
    F --> E
    E --> G[Return Unified Result]
    
    style B fill:#bbdefb
    style C fill:#c8e6c9
    style G fill:#ffe0b2

Serialization

Pipeline configurations can be serialized to YAML format, enabling:

  • Persistence of pipeline definitions
  • Sharing configurations across environments
  • Version control for pipeline definitions
  • Reproducible deployments

Serialization Format

version: '2.0'
components:
  - name: MyRetriever
    type: BM25Retriever
    init_parameters:
      document_store: MyDocumentStore
  - name: MyReader
    type: FARMReader
    init_parameters:
      model_name_or_path: deepset/roberta-base-squad2
edges: []

Sources: docs-website/docs/concepts/pipelines/serialization.mdx

Loading Serialized Pipelines

from haystack import Pipeline

# Load from YAML
pipeline = Pipeline.load_from_yaml(path="pipeline_config.yaml")

Debugging Pipelines

Haystack provides comprehensive debugging capabilities to inspect and troubleshoot pipeline execution.

Execution Tracing

The debugging system tracks:

  • Component execution order
  • Input/output data at each stage
  • Execution timing and performance metrics
  • Error locations and stack traces
from haystack import Pipeline

pipeline = Pipeline()
pipeline.debug = True  # Enable debug mode
result = pipeline.run(query="What is Haystack?")

Sources: docs-website/docs/concepts/pipelines/debugging-pipelines.mdx

Pipeline Inspector

The Pipeline Inspector provides detailed visibility into:

Inspection TargetInformation Provided
Component GraphNode and edge relationships
Data FlowInput/output shapes and types
Execution StateRuntime values at breakpoints
PerformanceTiming and memory profiles

Pipeline Breakpoints

Breakpoints allow execution to pause at specific points, enabling detailed inspection of intermediate results.

graph LR
    A[Pipeline Run] --> B{Breakpoint 1?}
    B -->|Yes| C[Pause & Inspect]
    C --> D{Continue?}
    D -->|Yes| E{Breakpoint 2?}
    D -->|No| Z[Abort]
    E -->|Yes| F[Pause & Inspect]
    E -->|No| G[Continue to End]
    B -->|No| E
    
    style C fill:#fff9c4
    style F fill:#fff9c4
    style Z fill:#ffcdd2

Breakpoint Configuration

Breakpoints can be configured at:

  • Component level: Pause before or after specific component execution
  • Connection level: Inspect data flowing through specific connections
  • Condition level: Pause only when certain conditions are met

Sources: docs-website/docs/concepts/pipelines/pipeline-breakpoints.mdx

Best Practices

Pipeline Design

  1. Modularity: Keep components focused on single responsibilities
  2. Clear naming: Use descriptive names for components and connections
  3. Error handling: Implement proper error handling at component boundaries
  4. Testing: Unit test individual components before integration

Performance Optimization

StrategyDescription
CachingEnable caching for expensive operations
BatchingUse batch processing for multiple queries
Async executionPrefer AsyncPipeline for I/O-bound workflows
Resource limitsSet appropriate timeouts and memory limits

Architecture Summary

graph TD
    subgraph "Pipeline Layer"
        A[Pipeline] --> B[AsyncPipeline]
        A --> C[SearchPipeline]
        A --> D[GenerativePipeline]
    end
    
    subgraph "Component Layer"
        E[Retrievers] --> A
        F[Readers] --> A
        G[Generators] --> A
        H[Custom Processors] --> A
    end
    
    subgraph "Data Layer"
        I[Document Stores] --> E
        J[Models] --> F
        J --> G
    end
    
    subgraph "Infrastructure"
        K[Serialization] -.-> A
        L[Debugging] -.-> A
        M[Breakpoints] -.-> A
    end

Sources: docs-website/docs/concepts/pipelines.mdx

Core Concepts

Related topics: Pipeline Architecture, Pipeline Component Types, Introduction to Haystack

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Documentation Structure

Continue reading this section for the full explanation and source context.

Section Versioning

Continue reading this section for the full explanation and source context.

Section Search Architecture

Continue reading this section for the full explanation and source context.

Related topics: Pipeline Architecture, Pipeline Component Types, Introduction to Haystack

Core Concepts

Haystack is an end-to-end LLM (Large Language Model) framework that enables developers to build applications powered by LLMs, Transformer models, vector search, and more. The framework orchestrates state-of-the-art embedding models and LLMs into pipelines to solve use cases such as retrieval-augmented generation (RAG), document search, question answering, and answer generation.

What is Haystack?

Haystack provides a flexible architecture for designing systems with explicit control over how information is retrieved, ranked, filtered, combined, structured, and routed before it reaches the model. The framework allows developers to define pipelines and agent workflows where retrieval, memory, tools, and other components work together seamlessly.

Sources: README.md

Architecture Overview

Haystack's architecture is built around the concept of pipelines that orchestrate various components. These pipelines provide explicit control over the data flow from input to output, enabling developers to build complex LLM applications with fine-grained control.

graph TD
    A[Input Query] --> B[Pipeline]
    B --> C[Components]
    C --> D[Retrievers]
    C --> E[Rankers]
    C --> F[Memory]
    C --> G[Tools]
    D --> H[Document Store]
    E --> I[LLM]
    H --> J[Context Engineering]
    I --> K[Generated Response]
    J --> I

Sources: README.md

Installation

Haystack can be installed via pip using the main package:

pip install haystack-ai

For trying newest features, install nightly pre-releases:

pip install --pre haystack-ai

Sources: README.md

Docker Support

Haystack provides Docker images for containerized deployments. The base image haystack:base-<version> contains a working Python environment with Haystack preinstalled and is designed to be derived FROM.

Images are built with BuildKit and orchestrated using bake:

docker buildx bake base

Custom images can be built by overriding variables defined in the docker-bake.hcl file:

HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache

Sources: docker/README.md

Documentation System

Haystack maintains comprehensive documentation at docs.haystack.deepset.ai. The documentation is built with Docusaurus 3 and provides guides, tutorials, API references, and best practices.

Documentation Structure

DirectoryPurpose
docs/Main documentation (guides, tutorials, concepts)
docs/concepts/Core Haystack concepts
docs/pipeline-components/Component documentation
reference/API reference (auto-generated)
versioned_docs/Versioned copies of docs
src/React components and custom code

Sources: docs-website/README.md

Versioning

Documentation versions are released alongside Haystack releases and are fully automated through GitHub workflows. The versioning process includes:

Sources: docs-website/README.md

API Reference

The API reference is generated from docstrings in the codebase using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes.

To add documentation for a new module:

  1. Create a .yml file in the pydoc directory
  2. Configure how haystack-pydoc-tools will generate the page
  3. Commit to main

All API reference updates are initially deployed to unstable docs and promoted to stable docs during releases.

Sources: pydoc/README.md

Documentation Website Development

The documentation site can be run locally for development:

git clone https://github.com/deepset-ai/haystack.git
cd haystack/docs-website
npm install
npm start

The site opens at http://localhost:3000 with live reload functionality.

Common development tasks include:

  • Edit a page: update files under docs/ or versioned_docs/
  • Add to sidebar: update sidebars.js with your doc ID
  • Production check: npm run build && npm run serve

Sources: docs-website/README.md

Search Functionality

The documentation website includes a custom search bar that groups results by page and sorts them by relevance score. The search system supports filtering by category and provides snippets from matching documents.

Search Architecture

graph TD
    A[User Query] --> B[Search Input]
    B --> C[Debounced Search]
    C --> D[Search Algorithm]
    D --> E{Results Found?}
    E -->|Yes| F[Group by Page]
    E -->|No| G[No Results State]
    F --> H[Sort by Score]
    H --> I[Display Results]
    G --> J[Show Error/Message]

Sources: docs-website/src/theme/SearchBar.js

Documentation Export Features

The documentation site provides multiple ways to export and share content:

FeatureDescription
Copy as MarkdownCopy page content in Markdown format for LLMs
View as MarkdownView page as plain text
Export as PDFSave page as PDF file
Ask AIOpen page in external AI assistants

Sources: docs-website/src/components/CopyDropdown/index.tsx

Markdown Conversion Rules

The export feature uses custom Turndown rules:

  • Code blocks: Wrapped in backticks
  • Admonitions: Converted to blockquotes with type labels (NOTE, TIP, WARNING, etc.)
  • Navigation elements: Removed from export
  • Scripts and styles: Filtered out

Sources: docs-website/src/components/CopyDropdown/index.tsx

Examples and Cookbooks

Example code and cookbooks have been moved to a dedicated repository: haystack-cookbook

This separation allows for easier maintenance and discovery of example applications.

Sources: examples/README.md

CI/CD and Quality Assurance

Haystack maintains high code quality through automated workflows:

WorkflowPurpose
tests.ymlRun test suite
types (Mypy)Type checking
CoverageCode coverage tracking
RuffLinting
license_compliance.ymlLicense verification

Sources: README.md

Contributing to Haystack

Haystack welcomes community contributions in various forms:

The project provides a full list of issues open to contributions for both new and experienced contributors.

Sources: README.md

Organizations Using Haystack

Haystack is used in production by numerous organizations across industries:

IndustryOrganizations
Technology & AIApple, Meta, Databricks, NVIDIA, Intel
Public SectorEuropean Commission
VariousThousands of teams building production AI systems

Sources: README.md

Sources: README.md

Pipeline Component Types

Related topics: Pipeline Architecture, Data Processing Components, LLM and Embedder Integrations

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Component Lifecycle

Continue reading this section for the full explanation and source context.

Section Data Flow Patterns

Continue reading this section for the full explanation and source context.

Section Converters

Continue reading this section for the full explanation and source context.

Related topics: Pipeline Architecture, Data Processing Components, LLM and Embedder Integrations

Pipeline Component Types

Pipeline components are the fundamental building blocks of Haystack pipelines. They are modular units that perform specific operations such as retrieving documents, converting file formats, generating responses, and routing data between pipeline stages. Each component follows a consistent interface that enables seamless integration into pipeline workflows, allowing developers to compose complex LLM applications from reusable, interchangeable parts.

Overview

Haystack provides a comprehensive set of built-in pipeline components that cover the full lifecycle of LLM-powered applications. These components are designed to work together through a unified API, enabling developers to build retrieval-augmented generation (RAG) systems, question-answering pipelines, document processing workflows, and agent-based applications with minimal configuration.

The architecture follows a modular pattern where each component receives inputs, performs a specific transformation or operation, and produces outputs that can be consumed by subsequent components in the pipeline. This design philosophy ensures that components remain loosely coupled and highly reusable across different use cases.

Components in Haystack are categorized based on their primary function within the data flow. Some components handle input preparation (converters, preprocessors), others manage information retrieval (retrievers, embedders), some optimize result ordering (rankers), and others control program flow (routers, joiners). Understanding these categories is essential for designing effective pipelines that balance performance, accuracy, and resource utilization.

Component Architecture

Component Lifecycle

Components in Haystack follow a standardized lifecycle that includes initialization, execution, and optional teardown phases. During initialization, components receive their configuration parameters and prepare any required resources such as model weights, API connections, or index data. The execution phase processes input data through the component's core logic, while the teardown phase releases resources when the component is no longer needed.

graph TD
    A[Initialize Component] --> B[Load Resources]
    B --> C[Receive Input Data]
    C --> D[Process Data]
    D --> E[Produce Output]
    E --> F{Check Pipeline Status}
    F -->|Continue| C
    F -->|Complete| G[Release Resources]
    G --> H[Component Lifecycle End]

Data Flow Patterns

Haystack pipelines support multiple data flow patterns that determine how information moves between components. Linear flow passes output directly to the next component, while branching flow sends data to multiple paths based on conditions. Parallel flow distributes work across multiple components simultaneously, and feedback flow allows outputs to influence earlier pipeline stages.

Input Processing Components

Input processing components prepare raw data for use by downstream pipeline stages. These components handle the transformation of unstructured or heterogeneous data sources into standardized formats that can be processed consistently throughout the pipeline.

Converters

Converters transform documents from various file formats into Haystack's internal document representation. They handle the extraction of text content from source files while preserving metadata that may be useful for subsequent processing or retrieval operations.

Converter TypeSupported FormatsPrimary Use Case
PDF ConverterPDFExtract text from PDF documents
Text ConverterTXT, MDPlain text and markdown files
DOCX ConverterDOCXMicrosoft Word documents
HTML ConverterHTMLWeb page content extraction

Converters are typically placed at the beginning of indexing pipelines where they process source documents before the content is split, embedded, and stored. The output of converters feeds directly into preprocessors that further refine the content.

Sources: docs-website/docs/pipeline-components/converters.mdx

Preprocessors

Preprocessors clean, normalize, and transform document content to improve retrieval quality and downstream processing. They apply transformations such as text cleaning, language detection, and content segmentation to prepare documents for embedding and storage.

graph LR
    A[Raw Document] --> B[Clean Text]
    B --> C[Detect Language]
    C --> D[Split Document]
    D --> E[Normalize Content]
    E --> F[Processed Document]

Key preprocessing operations include removing unnecessary whitespace, normalizing unicode characters, splitting long documents into manageable chunks, and filtering out low-quality content. These operations significantly impact the quality of retrieval results and should be configured based on the specific characteristics of your data.

Preprocessors work closely with converters to form the input preparation stage of indexing pipelines. The processed output is then passed to embedders or directly to storage depending on the pipeline configuration.

Sources: docs-website/docs/pipeline-components/preprocessors.mdx

Builders

Builders construct specialized data structures or artifacts that support pipeline operations. Unlike converters that handle file formats, builders create complex objects such as prompt templates, search indexes, or custom data representations required by other components.

Builders enable the composition of reusable building blocks that can be shared across multiple pipelines. They abstract away the complexity of constructing complex objects, allowing pipeline developers to focus on workflow design rather than implementation details.

Sources: docs-website/docs/pipeline-components/builders.mdx

Information Retrieval Components

Information retrieval components locate and retrieve relevant content from data stores. These components form the core of RAG systems and document search applications, enabling pipelines to find the most relevant information based on query semantics or keywords.

Retrievers

Retrievers search document stores to find content relevant to a given query. Haystack supports multiple retrieval strategies ranging from keyword-based sparse retrieval to semantic dense retrieval, enabling developers to choose the approach that best fits their use case.

Retrieval TypeDescriptionBest For
Dense RetrievalUses neural embeddings for semantic matchingConceptual queries, semantic similarity
Sparse RetrievalTraditional keyword-based matchingExact matches, specific terminology
Hybrid RetrievalCombines dense and sparse methodsBalanced performance across query types

Retrievers are fundamental to RAG pipelines where they identify the documents or passages most likely to contain information relevant to the user's question. The retrieved content is then passed to generators that synthesize the final response.

Sources: docs-website/docs/pipeline-components/retrievers.mdx

Embedders

Embedders convert text content into vector representations that capture semantic meaning. These vectors enable semantic similarity searches where documents are matched based on meaning rather than exact keyword occurrence.

graph TD
    A[Text Input] --> B[Embedding Model]
    B --> C[Vector Representation]
    C --> D[Vector Store]
    
    E[Query] --> F[Same Embedding Model]
    F --> G[Query Vector]
    G --> D
    D --> H[Similarity Search]
    H --> I[Ranked Results]

Embedders are used both during indexing (to create document vectors) and at query time (to create query vectors). The choice of embedding model significantly impacts retrieval quality, and Haystack supports integration with various embedding providers including OpenAI, Hugging Face, and local models.

Sources: docs-website/docs/pipeline-components/embedders.mdx

Rankers

Rankers improve retrieval results by reordering documents based on additional relevance signals. While retrievers perform the initial candidate selection, rankers apply more sophisticated scoring models to identify the most relevant results.

Rankers typically use cross-encoder models that jointly analyze query-document pairs to produce relevance scores. This approach is computationally more expensive than bi-encoder retrieval but provides higher accuracy for tasks where precision is critical.

The typical pipeline arrangement places rankers after retrievers, with retrievers performing the broad candidate selection and rankers performing the refined reordering. This two-stage approach balances computational efficiency with result quality.

Sources: docs-website/docs/pipeline-components/rankers.mdx

Output Generation Components

Output generation components synthesize final responses or artifacts from the information retrieved and processed by earlier pipeline stages. These components transform raw retrieved content into user-facing outputs.

Generators

Generators produce final outputs such as text responses, summaries, or structured data from retrieved context and user queries. In RAG systems, generators receive relevant documents and formulate answers that incorporate information from the retrieved content.

graph TD
    A[User Query] --> E[Generator]
    B[Retrieved Context] --> E
    E --> F[Generate Response]
    F --> G[Response Output]
    
    H[LLM Provider] <--> E
    H --> |API Key| E

Generators integrate with various LLM providers including OpenAI, Anthropic, Cohere, Hugging Face, and local models. Configuration options control parameters such as temperature, max tokens, and response format to customize generator behavior for specific applications.

Sources: docs-website/docs/pipeline-components/generators.mdx

Flow Control Components

Flow control components manage how data moves through pipelines, enabling conditional logic, parallel processing, and result aggregation. These components add flexibility to pipeline design beyond simple linear data flow.

Routers

Routers direct input data to different pipeline branches based on conditions or classifications. They enable conditional execution where different components handle different types of inputs or queries.

Router TypeDecision BasisUse Case
Conditional RouterUser-defined rulesRoute queries to appropriate handlers
Semantic RouterQuery classificationDirect to specialized pipelines
Custom RouterAny Python logicFlexible routing strategies

Routers are essential for building multi-stage pipelines that handle diverse input types or implement complex query routing strategies. They enable pipelines to adapt their behavior based on the specific requirements of each input.

Sources: docs-website/docs/pipeline-components/routers.mdx

Joiners

Joiners combine outputs from multiple pipeline branches into unified inputs for downstream components. They handle the aggregation of results from parallel processing paths or the merging of different data streams.

graph TD
    A[Input] --> B[Branch 1]
    A --> C[Branch 2]
    A --> D[Branch N]
    B --> E[Joiner]
    C --> E
    D --> E
    E --> F[Combined Output]

Joiners implement various combination strategies including concatenation, interleaving, and weighted merging. The appropriate strategy depends on the data types being combined and the requirements of downstream components.

Sources: docs-website/docs/pipeline-components/joiners.mdx

Component Configuration Patterns

Initialization Parameters

Components accept configuration during initialization that determines their behavior, resource connections, and operational parameters. Common configuration categories include model selection, connection settings, and behavioral parameters.

Default Parameters

Components provide sensible defaults for most parameters, enabling quick pipeline construction while allowing customization when needed. Default values are documented in each component's reference documentation.

Runtime Parameters

Some components accept parameters at runtime (during pipeline execution) in addition to initialization-time configuration. Runtime parameters enable dynamic behavior adjustment based on input characteristics or pipeline state.

Building Custom Components

Haystack's component architecture supports extension through custom implementations. Custom components follow the same interface patterns as built-in components, ensuring compatibility with existing pipeline infrastructure.

Component Interface Requirements

Custom components must implement the standard component methods including initialization, execution, and any component-specific lifecycle hooks. The exact interface depends on the component type, but all components must be serializable for pipeline persistence.

Integration with Pipeline

Custom components integrate seamlessly with built-in components through the unified pipeline interface. They can receive inputs from and produce outputs for any other component type, enabling flexible composition of custom and built-in functionality.

Best Practices

Component Selection

Choose components based on your specific use case requirements including accuracy needs, latency constraints, and resource availability. Consider the trade-offs between different retrieval strategies, embedding models, and generation approaches.

Pipeline Design

Design pipelines with clear separation of concerns between components. Input processing, retrieval, and generation should be logically separated to enable independent optimization and testing.

Performance Optimization

Optimize component ordering based on computational cost. Place computationally expensive operations later in the pipeline where they operate on reduced candidate sets. Use rankers selectively based on the required result quality.

Summary

Pipeline components form the foundation of Haystack's architecture, enabling modular construction of LLM-powered applications. The component taxonomy spans input processing (converters, preprocessors, builders), information retrieval (retrievers, embedders, rankers), output generation (generators), and flow control (routers, joiners). Each component category serves a distinct purpose in the pipeline data flow, and understanding these roles enables effective pipeline design and customization.

Sources: docs-website/docs/pipeline-components/converters.mdx

Data Processing Components

Related topics: Document Stores and Retrievers, Pipeline Component Types

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Splitter Types

Continue reading this section for the full explanation and source context.

Section DocumentSplitter

Continue reading this section for the full explanation and source context.

Section RecursiveSplitter

Continue reading this section for the full explanation and source context.

Related topics: Document Stores and Retrievers, Pipeline Component Types

Data Processing Components

Data Processing Components are fundamental pipeline elements in Haystack that transform, clean, and prepare documents for downstream operations such as retrieval, indexing, and LLM processing. These components operate on Document objects, enabling structured manipulation of content while preserving metadata integrity throughout the processing pipeline.

Overview

Data Processing Components in Haystack serve as the preprocessing layer that bridges raw document ingestion with semantic retrieval and generation tasks. They are designed to handle various document formats, split long content into manageable chunks, and ensure data quality through cleaning operations.

The architecture follows a modular design pattern where each component type specializes in a specific transformation task:

  • Document Splitters: Divide documents into smaller, semantically coherent chunks
  • Document Cleaners: Remove noise, normalize text, and enhance readability
  • Converters: Transform external file formats into Haystack Document objects

Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx

Architecture and Processing Flow

graph TD
    A[Raw Document Input] --> B[Converters]
    B --> C[Document Objects]
    C --> D[Document Cleaners]
    D --> E[Document Splitters]
    E --> F[Processed Chunks]
    F --> G[Embedding Stores]
    G --> H[Retrieval Pipelines]
    
    B -.->|File Types| I[TXT]
    B -.->|File Types| J[PDF]
    B -.->|File Types| K[Markdown]
    B -.->|File Types| L[HTML]
    B -.->|File Types| M[Docx]
    
    D -.->|Operations| N[Text Normalization]
    D -.->|Operations| O[Whitespace Cleaning]
    D -.->|Operations| P[Metadata Preservation]
    
    E -.->|Strategies| Q[Character Split]
    E -.->|Strategies| R[Recursive Split]
    E -.->|Strategies| S[Hierarchical Split]

Document Splitters

Document splitters are preprocessors that divide long documents into smaller, manageable chunks while attempting to preserve semantic coherence. This is critical for effective retrieval since chunk size directly impacts retrieval precision and context window utilization.

Sources: docs-website/docs/pipeline-components/preprocessors/recursivesplitter.mdx

Splitter Types

Splitter TypeUse CaseSplitting Strategy
DocumentSplitterBasic character or token-based splittingFixed-length chunks
RecursiveSplitterHierarchical splitting by delimitersRecursive character/separator traversal
HierarchicalDocumentSplitterMulti-level document structurePreserves headings and sections

DocumentSplitter

The base DocumentSplitter provides fundamental splitting capabilities using either character count or token count as the primary division criterion.

Key Parameters:

ParameterTypeDefaultDescription
split_lengthintRequiredTarget size of each chunk
split_overlapint0Number of overlapping elements between chunks
split_bystr"word"Splitting criterion: "word", "sentence", "passage", or "token"

Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx

RecursiveSplitter

The RecursiveSplitter implements an intelligent multi-level splitting strategy that attempts to split documents at natural boundaries before falling back to smaller units.

from haystack.components.preprocessors import RecursiveSplitter

splitter = RecursiveSplitter(
    split_by="sentence",
    split_length=5,
    split_overlap=2,
    separators=["\n\n", "\n", ". ", " ", ""]
)

The splitter iterates through the separators list, attempting to split at each level. If a split produces chunks larger than split_length, it moves to the next (smaller) separator in the list.

Sources: docs-website/docs/pipeline-components/preprocessors/recursivesplitter.mdx

Separator Priority:

PrioritySeparatorContext
1"\n\n"Paragraph breaks
2"\n"Line breaks
3". "Sentence endings
4" "Word boundaries
5""Character-level fallback

HierarchicalDocumentSplitter

The HierarchicalDocumentSplitter is designed for structured documents that contain hierarchical headings and section markers. It preserves document structure by splitting at heading boundaries first.

Key Features:

  • Detects heading patterns (e.g., #, ##, ### in Markdown)
  • Splits at the highest heading level available
  • Maintains hierarchical relationships between sections and subsections
  • Ideal for technical documentation and Markdown-based content
from haystack.components.preprocessors import HierarchicalDocumentSplitter

splitter = HierarchicalDocumentSplitter(
    split_by="sentence",
    split_length=10,
    split_overlap=3
)

Sources: docs-website/docs/pipeline-components/preprocessors/hierarchicaldocumentsplitter.mdx

Document Cleaners

Document cleaners are preprocessing components that normalize and sanitize text content while preserving essential structure and metadata. They remove unwanted artifacts, standardize formatting, and enhance downstream processing quality.

Sources: docs-website/docs/pipeline-components/preprocessors/documentcleaner.mdx

Core Cleaning Operations

OperationDescriptionExample
Whitespace normalizationCollapse multiple spaces, trim line breaks" Hello\n\n World ""Hello World"
Character removalStrip control characters and special symbolsRemoves \x00 to \x1f except \n, \t
Quote normalizationStandardize quote charactersSmart quotes → straight quotes
Heading normalizationClean heading markersRemoves # from Markdown headings

Common Parameters

ParameterTypeDefaultDescription
remove_empty_linesboolTrueRemove lines with no content
remove_extra_whitespaceboolTrueNormalize whitespace between words
remove_repeated_substringsboolFalseEliminate duplicate consecutive substrings

Converters

Converters are components that transform external file formats into Haystack Document objects. They handle the ingestion pipeline by parsing various document formats and extracting both content and metadata.

Sources: docs-website/docs/pipeline-components/converters.mdx

Supported Formats

FormatConverter ClassFeatures
Plain TextTextConverterDirect text extraction
PDFPdfToDocumentConverterText and table extraction
MarkdownMarkdownToDocumentConverterPreserves structure and headings
HTMLHtmlToDocumentConverterExtracts text from HTML elements
Microsoft WordDocxToDocumentConverterDocument and paragraph parsing

Converter Architecture

graph LR
    A[Input File] --> B[Format Detection]
    B --> C[Format-Specific Parser]
    C --> D[Content Extraction]
    D --> E[Metadata Enrichment]
    E --> F[Haystack Document]
    
    G[File Path] -.->|Direct Input| D
    H[Binary Content] -.->|Raw Data| C

Common Converter Parameters

ParameterTypeDefaultDescription
encodingstr"utf-8"Text encoding for file reading
encoding_errorsstr"strict"How to handle encoding errors
id_hash_keysList[str]["content"]Keys for document ID generation
metaDict[str, Any]{}Additional metadata to attach

Sources: docs-website/docs/pipeline-components/converters.mdx

Integration with Pipelines

Data Processing Components integrate seamlessly into Haystack pipelines as standard pipeline nodes. They can be composed in any order to create custom preprocessing workflows.

Typical Pipeline Configuration

from haystack import Pipeline
from haystack.components.preprocessors import DocumentCleaner, RecursiveSplitter
from haystack.components.converters import TextConverter

pipeline = Pipeline()
pipeline.add_component("converter", TextConverter())
pipeline.add_component("cleaner", DocumentCleaner())
pipeline.add_component("splitter", RecursiveSplitter(split_length=200, split_by="word"))

pipeline.connect("converter", "cleaner")
pipeline.connect("cleaner", "splitter")

Processing Order Recommendation

While components can be connected in various orders, the recommended processing sequence is:

  1. Convert - Transform source files into Document objects
  2. Clean - Normalize and sanitize the text content
  3. Split - Divide documents into retrieval-optimized chunks

This sequence ensures that cleaning operations apply to the complete document before splitting, maintaining consistency across chunks.

Metadata Preservation

All Data Processing Components preserve and propagate document metadata throughout the processing pipeline. Metadata added during conversion is carried through cleaning and splitting operations.

Automatic Metadata Fields:

FieldSourceDescription
sourceConverterOriginal file path or URI
file_typeConverterDocument format (pdf, txt, etc.)
page_numberPDF ConverterPage number for page-level tracking
split_idSplitterUnique identifier for each chunk
split_idx_startSplitterCharacter offset where chunk begins

Best Practices

Chunk Size Selection

Chunk SizeRecommended Use Case
50-100 tokensHigh-precision queries, precise fact extraction
200-300 tokensBalanced retrieval, general Q&A
500+ tokensComplex reasoning, multi-document synthesis

Cleaning Configuration

  • Enable remove_extra_whitespace for all text-based content
  • Use remove_empty_lines when building dense indexes
  • Disable cleaning for Markdown/HTML if structure preservation is critical

Overlap Strategy

When configuring split_overlap, consider:

  • Low overlap (0-10%): Maximizes diversity, suitable for unique content
  • Medium overlap (10-20%): Balances context preservation and diversity
  • High overlap (20%+: Essential for documents with continuous context
  • Embedding Generators: Process chunks to create vector representations
  • Document Stores: Store and index processed chunks for retrieval
  • Rankers: Reorder retrieved chunks by relevance
  • Prompt Engineers: Combine chunks for LLM context windows

Sources: docs-website/docs/pipeline-components/preprocessors/documentsplitter.mdx

LLM and Embedder Integrations

Related topics: Document Stores and Retrievers, Pipeline Component Types, Development Guide

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Purpose

Continue reading this section for the full explanation and source context.

Section Supported Providers

Continue reading this section for the full explanation and source context.

Section Component Configuration

Continue reading this section for the full explanation and source context.

Related topics: Document Stores and Retrievers, Pipeline Component Types, Development Guide

LLM and Embedder Integrations

Overview

LLM and Embedder Integrations in Haystack provide the core components for interfacing with Large Language Models and embedding services. These integrations enable developers to build production-ready applications powered by LLMs, Transformer models, and vector search capabilities.

Sources: README.md:1-10

Architecture

Haystack's integration architecture follows a modular pipeline design where Generators (LLMs) and Embedders serve as fundamental building blocks within the orchestration framework.

graph TD
    A[Haystack Pipeline] --> B[Retrieval Components]
    A --> C[Generator Components]
    A --> D[Embedder Components]
    C --> E[LLM Providers]
    D --> F[Embedding Models]
    B --> F
    E --> G[API Services]
    F --> G

Generator Integration

Purpose

Generators in Haystack are components that interact with Large Language Models to generate responses based on prompts and retrieved context. They serve as the core reasoning engine within RAG (Retrieval-Augmented Generation) pipelines.

Sources: docs-website/docs/pipeline-components/generators/guides-to-generators/choosing-the-right-generator.mdx:1-15

Supported Providers

Haystack supports multiple LLM providers through its integration system. The framework provides standardized interfaces for:

ProviderIntegration TypeAPI Access
OpenAIChat Completions APIAPI Key
AnthropicClaude APIAPI Key
Azure OpenAIAzure OpenAI ServiceAzure Credentials
Hugging FaceInference API / LocalAPI Key / Local
OllamaLocal ModelsLocal Host

Component Configuration

Generator components in Haystack follow a consistent initialization pattern:

from haystack import Pipeline
from haystack.components.generators import OpenAIChatGenerator

generator = OpenAIChatGenerator(
    api_key="your-api-key",
    model="gpt-4",
    streaming_callback=None,
    generation_kwargs={"temperature": 0.7, "max_tokens": 500}
)

Embedder Integration

Purpose

Embedders are components that convert text into vector representations (embeddings) suitable for semantic search and similarity comparisons. They are essential for the retrieval portion of RAG pipelines.

Sources: docs-website/docs/pipeline-components/embedders/choosing-the-right-embedder.mdx:1-20

Embedder Types

TypeUse CaseDeployment
Sentence TransformersGeneral text embeddingsLocal / API
OpenAI EmbeddingsAPI-based generationRemote
Hugging FaceTransformer modelsLocal / Inference API
CohereMulti-lingual supportAPI

Integration with Retrievers

Embedders work in conjunction with document stores to enable semantic search:

graph LR
    A[Documents] --> B[Embedder]
    B --> C[Vector Store]
    C --> D[Retriever]
    E[Query] --> F[Query Embedder]
    F --> D
    D --> G[Retrieved Docs]

Function Calling

Function calling extends LLM integrations to enable structured interactions between LLMs and external tools. This feature allows Generators to produce structured outputs that can trigger specific actions.

Sources: docs-website/docs/pipeline-components/generators/guides-to-generators/function-calling.mdx:1-30

Workflow

sequenceDiagram
    participant User
    participant Pipeline
    participant LLM
    participant Tool
    
    User->>Pipeline: Query with function definitions
    Pipeline->>LLM: Send prompt + function specs
    LLM->>LLM: Analyze request
    LLM-->>Pipeline: Function call + parameters
    Pipeline->>Tool: Execute function
    Tool-->>Pipeline: Function result
    Pipeline->>LLM: Send result + original context
    LLM-->>Pipeline: Final response
    Pipeline-->>User: Return answer

Integration Configuration

Environment Setup

Integrations in Haystack typically require API credentials which can be configured via environment variables:

export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export HUGGINGFACE_TOKEN="your-hf-token"

Sources: docs-website/docs/concepts/integrations.mdx:1-25

Configuration Options

ParameterDescriptionDefault
api_keyProvider API keyEnvironment variable
modelModel identifierProvider default
timeoutRequest timeout in seconds60
max_retriesNumber of retry attempts3

Pipeline Integration Example

from haystack import Pipeline
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.generators import OpenAIChatGenerator
from haystack.document_stores import InMemoryDocumentStore

# Initialize components
document_store = InMemoryDocumentStore()
retriever = InMemoryBM25Retriever(document_store=document_store)
generator = OpenAIChatGenerator(model="gpt-4")

# Build pipeline
pipeline = Pipeline()
pipeline.add_component("retriever", retriever)
pipeline.add_component("generator", generator)
pipeline.connect("retriever", "generator")

Installation

To use LLM and Embedder integrations, install the appropriate Haystack packages:

# Core package
pip install haystack-ai

# For specific integrations
pip install "haystack-ai[openai]"    # OpenAI models
pip install "haystack-ai[anthropic]"  # Anthropic Claude
pip install "haystack-ai[transformers]" # Hugging Face

Additional Resources

Sources: README.md:1-10

Document Stores and Retrievers

Related topics: LLM and Embedder Integrations, Data Processing Components

Section Related Pages

Continue reading this section for the full explanation and source context.

Section InMemoryDocumentStore

Continue reading this section for the full explanation and source context.

Section ElasticsearchDocumentStore

Continue reading this section for the full explanation and source context.

Section QdrantDocumentStore

Continue reading this section for the full explanation and source context.

Related topics: LLM and Embedder Integrations, Data Processing Components

Document Stores and Retrievers

Document Stores and Retrievers are fundamental components in the Haystack framework that enable efficient storage, indexing, and retrieval of documents for LLM-powered applications. These components form the backbone of retrieval-augmented generation (RAG) pipelines and semantic search systems.

Overview

Haystack provides a unified abstraction layer for document storage and retrieval, allowing developers to work with different backend technologies through a consistent interface. The framework supports multiple document store implementations, each optimized for different use cases, scales, and deployment requirements.

Document Stores in Haystack handle the persistence and indexing of documents, while Retrievers are specialized components that query these stores to find relevant documents based on user queries. This separation of concerns allows for flexible pipeline composition and easy swapping of storage backends.

Architecture

graph TD
    A[User Query] --> B[Retriever]
    B --> C[Document Store]
    C --> D[(Vector Index)]
    C --> E[(Document DB)]
    F[Documents] --> C
    G[Embedding Model] --> D
    B --> H[Query Embedding]
    H --> D
    D --> I[Relevant Documents]
    I --> J[RAG Pipeline]

The architecture separates concerns between storage and retrieval, enabling optimized implementations for each layer.

Document Store Types

Haystack supports multiple document store implementations, each with distinct characteristics:

Document StoreTypeUse CaseScalability
InMemoryDocumentStoreIn-memoryDevelopment, testing, small datasetsSingle machine, limited scale
ElasticsearchDocumentStoreDistributed searchProduction, full-text searchHorizontal scaling
QdrantDocumentStoreVector databaseSemantic search, embeddingsHigh-dimensional vectors
PineconeDocumentStoreManaged vector DBCloud-native, managed infrastructureGlobal distribution

InMemoryDocumentStore

The InMemoryDocumentStore is the simplest document store implementation, storing all data in memory. It is primarily used for development, testing, and prototyping scenarios where persistence is not required.

Key Characteristics:

  • No external dependencies required
  • Fast read/write operations for small datasets
  • Data lost on application restart
  • Not suitable for production deployments with large volumes

Sources: docs-website/docs/document-stores/inmemorydocumentstore.mdx

ElasticsearchDocumentStore

Elasticsearch provides a mature, production-ready document store with powerful full-text search capabilities. It is well-suited for applications requiring sophisticated text analysis, faceted search, and scalable infrastructure.

Key Characteristics:

  • Distributed architecture for high availability
  • Rich query DSL for complex search operations
  • BM25 ranking algorithm for relevance scoring
  • Supports millions of documents

Sources: docs-website/docs/document-stores/elasticsearch-document-store.mdx

QdrantDocumentStore

Qdrant is a vector database optimized for similarity search and high-dimensional embeddings. It provides efficient nearest neighbor search operations essential for semantic retrieval.

Key Characteristics:

  • Optimized for vector similarity search
  • Supports payload filtering
  • Hybrid sparse-dense vector search
  • gRPC-based API for performance

Sources: docs-website/docs/document-stores/qdrant-document-store.mdx

PineconeDocumentStore

Pinecone is a managed vector database service that eliminates infrastructure management overhead. It provides global distribution and automatic scaling for production deployments.

Key Characteristics:

  • Fully managed cloud service
  • Automatic scaling and sharding
  • Multi-tenancy support
  • Low-latency querying at scale

Sources: docs-website/docs/document-stores/pinecone-document-store.mdx

Choosing a Document Store

Selecting the appropriate document store depends on several factors including scale, performance requirements, deployment environment, and feature needs.

Sources: docs-website/docs/concepts/document-store/choosing-a-document-store.mdx

Decision Criteria

FactorInMemoryElasticsearchQdrantPinecone
Dataset Size< 100K docsUnlimitedUnlimitedUnlimited
LatencyVery lowMediumLowLow
PersistenceNoneFullFullFull
Full-text SearchBasicAdvancedLimitedLimited
Vector SearchBasicPlugin requiredNativeNative
Managed ServiceNoSelf-hosted/CloudSelf-hosted/CloudYes (managed)
CostFreeInfrastructureInfrastructureUsage-based

Recommendations

Development and Testing: Use InMemoryDocumentStore for rapid prototyping and unit testing. It requires no setup and provides immediate feedback.

Production with Full-text Search: Choose ElasticsearchDocumentStore when your application requires complex text queries, aggregations, or you already have an Elasticsearch infrastructure.

Semantic Search at Scale: Select QdrantDocumentStore or PineconeDocumentStore for applications primarily relying on embedding-based similarity search. Both provide native vector operations with efficient indexing.

Document Model

Documents in Haystack follow a standardized data model that captures content, metadata, and embedding vectors.

classDiagram
    class Document {
        +str id
        +str content
        +dict meta
        +List[float] embedding
        +str blob
        +str blob_mime_type
    }

Core Document Fields:

FieldTypeDescription
idstringUnique identifier for the document
contentstringMain text content of the document
metadictArbitrary metadata (source, author, date, etc.)
embeddinglist[float]Vector representation for semantic search

Sources: docs-website/docs/concepts/document-store.mdx

Retriever Types

Retrievers query document stores to find the most relevant documents for a given query. Haystack provides multiple retriever implementations optimized for different search strategies.

Dense Retrievers

Dense retrievers use neural network models to encode queries and documents into dense vector representations. They excel at capturing semantic meaning and handling synonyms.

Sparse Retrievers

Sparse retrievers use traditional information retrieval techniques like BM25 or TF-IDF. They are effective for exact term matching and keyword-based queries.

Hybrid Retrievers

Hybrid retrievers combine both dense and sparse approaches, leveraging the strengths of each to provide robust retrieval across different query types.

Pipeline Integration

graph LR
    A[Query] --> B[Retriever]
    B --> C[Document Store]
    C --> D[Top-K Documents]
    D --> E[Ranker]
    E --> F[Reader/Generator]
    F --> G[Answer]

Document Stores and Retrievers integrate seamlessly into Haystack pipelines, typically appearing early in the pipeline to fetch candidate documents before passing them to downstream components like Readers or Generators.

Basic Usage Example

from haystack import Document
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever

# Initialize document store
document_store = InMemoryDocumentStore()

# Write documents
documents = [
    Document(content="Haystack is an open-source NLP framework", meta={"source": "docs"}),
    Document(content="It supports retrieval-augmented generation", meta={"source": "blog"}),
]
document_store.write_documents(documents)

# Initialize retriever
retriever = BM25Retriever(document_store=document_store)

# Query
results = retriever.retrieve(query="What is Haystack?", top_k=10)

Performance Considerations

Indexing Performance

StoreIndexing SpeedMemory Usage
InMemoryVery FastProportional to dataset
ElasticsearchMediumDistributed across nodes
QdrantFastOptimized for vectors
PineconeFastManaged externally

Query Performance

Query latency depends on the number of documents, vector dimensions, and the complexity of filters applied. Vector databases like Qdrant and Pinecone use specialized indexing structures (HNSW, IVF) to achieve sub-millisecond query times on large datasets.

See Also

Sources: docs-website/docs/document-stores/inmemorydocumentstore.mdx

Agent Systems

Related topics: Introduction to Haystack, Pipeline Architecture, LLM and Embedder Integrations

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Agent Components

Continue reading this section for the full explanation and source context.

Section Execution Flow

Continue reading this section for the full explanation and source context.

Section State Structure

Continue reading this section for the full explanation and source context.

Related topics: Introduction to Haystack, Pipeline Architecture, LLM and Embedder Integrations

Agent Systems

Agent systems in Haystack represent a powerful paradigm for building autonomous and semi-autonomous AI applications that can perceive, reason, act, and interact with their environment. Haystack's agent framework enables developers to create sophisticated LLM-powered applications where agents can use tools, maintain state, collaborate with other agents, and incorporate human feedback into their decision-making processes.

Overview

Haystack agents are designed to extend beyond simple prompt-response interactions by providing a structured mechanism for Large Language Models to take actions, make decisions, and execute multi-step workflows. The agent system in Haystack is built with flexibility and modularity in mind, allowing developers to customize every aspect of agent behavior from the underlying model to the specific tools available and the logic governing agent decisions.

The framework supports a variety of agent types and architectures, ranging from single-agent systems that handle specific tasks to complex multi-agent ecosystems where multiple specialized agents collaborate to solve problems. This flexibility makes Haystack suitable for a wide range of use cases, from simple question-answering applications to sophisticated autonomous systems that can browse the web, execute code, and coordinate with other agents to complete complex tasks.

Core Architecture

The agent architecture in Haystack is built around a pipeline-based model that connects perception, reasoning, action selection, and execution into a cohesive workflow. At its core, an agent consists of several key components that work together to enable autonomous behavior.

Agent Components

ComponentPurposeDescription
LLMReasoning EngineThe underlying language model that drives decision-making
ToolsAction InterfaceCapabilities that allow the agent to interact with external systems
Prompt BuilderInstruction AssemblyConstructs prompts that guide agent behavior
Output HandlerResponse ProcessingInterprets and executes agent decisions
MemoryState ManagementMaintains conversation history and context

Sources: docs-website/docs/pipeline-components/agents-1/agent.mdx

Execution Flow

graph TD
    A[User Input] --> B[Agent Receives Task]
    B --> C[LLM Reasoning]
    C --> D{Tool Selection?}
    D -->|Yes| E[Execute Tool]
    E --> F[Process Result]
    D -->|No| G[Generate Response]
    F --> C
    G --> H[Return to User]
    C --> I{Human Input Needed?}
    I -->|Yes| J[Pause for Human Feedback]
    J --> C
    I -->|No| D

The execution flow demonstrates how Haystack agents operate in a loop, continuously reasoning about the best course of action until the task is complete. The agent receives input, reasons about what to do, selects and executes tools as needed, and continues until it can provide a final response or requires additional input from the user or human overseer.

State Management

State management is a critical aspect of agent systems, enabling agents to maintain context across multiple interactions and track the progress of complex, multi-step tasks. Haystack provides a flexible state management system that allows agents to store, retrieve, and update information throughout their execution lifecycle.

State Structure

The state system in Haystack agents typically includes several key elements that together form a comprehensive view of the agent's current situation and history. These elements enable the agent to maintain awareness of what has happened previously, what actions have been taken, and what information has been gathered.

State ElementTypeDescription
Conversation HistoryListPrevious messages and interactions
Tool Usage LogListRecord of tools called and results
Intermediate ResultsDictData collected during task execution
User PreferencesDictLearned user preferences and feedback
Task ProgressDictCurrent status of ongoing tasks

Sources: docs-website/docs/pipeline-components/agents-1/state.mdx

State Persistence

Agents in Haystack can maintain state across sessions, enabling persistent memory and long-term learning. This is particularly valuable for applications where the agent needs to build relationships with users over time or maintain knowledge about specific domains or tasks. The state management system supports various backends for persistence, from simple in-memory storage to distributed databases for production deployments.

Multi-Agent Systems

Haystack supports the creation of sophisticated multi-agent systems where multiple specialized agents work together to solve problems. This architectural pattern enables the decomposition of complex tasks into smaller, manageable subtasks that can be handled by agents with specialized capabilities.

Agent Collaboration Patterns

graph TD
    subgraph Coordinator Agent
        A[Task Received] --> B{Analyze Task}
        B --> C[Decompose into Subtasks]
    end
    
    subgraph Specialized Agents
        D[Agent A: Research]
        E[Agent B: Analysis]
        F[Agent C: Synthesis]
    end
    
    C --> D
    C --> E
    C --> F
    D --> G[Results Aggregation]
    E --> G
    F --> G
    G --> H[Final Response]

Multi-agent systems in Haystack can be configured with various collaboration patterns. In the supervisor pattern, a single coordinating agent directs the work of subordinate agents, assigning tasks and collecting results. In the collaborative pattern, agents work together as equals, sharing information and contributing their expertise to solve problems collectively.

Communication Protocols

Agents in a multi-agent system communicate through well-defined interfaces that specify how messages are passed between agents, how responses are aggregated, and how conflicts are resolved. This structured approach to agent communication ensures reliable operation even in complex agent ecosystems with many participants.

Sources: docs-website/docs/concepts/agents/multi-agent-systems.mdx

Human-in-the-Loop

Haystack agents support human-in-the-loop workflows, enabling humans to provide guidance, approval, or corrections during agent execution. This capability is essential for applications where autonomous operation must be balanced with human oversight and control.

Interaction Modes

ModeDescriptionUse Case
ApprovalHuman approves agent actions before executionHigh-stakes decisions
FeedbackHuman provides corrective feedback during executionFine-tuning agent behavior
EscalationAgent defers to human when uncertainHandling edge cases
ValidationHuman validates agent outputs before completionQuality assurance

Sources: docs-website/docs/pipeline-components/agents-1/human-in-the-loop.mdx

Workflow Integration

graph TD
    A[Agent Task] --> B{Requires Human Input?}
    B -->|Yes| C[Pause Execution]
    C --> D[Notify Human]
    D --> E[Await Response]
    E --> F{Human Action}
    F -->|Approve| G[Continue Execution]
    F -->|Reject| H[Abort or Retry]
    F -->|Modify| I[Apply Modifications]
    B -->|No| G
    I --> G
    G --> J[Task Complete]

The human-in-the-loop system is designed to be non-intrusive, minimizing the cognitive load on human overseers while ensuring that critical decisions receive appropriate human review. Agents can be configured to automatically escalate certain types of decisions based on predefined rules, such as actions that affect sensitive data or exceed specified cost thresholds.

Tool Integration

A defining characteristic of Haystack agents is their ability to use tools to interact with external systems and perform actions beyond text generation. The tool integration system provides a standardized interface for defining, registering, and invoking tools that extend agent capabilities.

Available Tool Categories

CategoryExamplesCapabilities
Web SearchGoogle Search, Bing SearchInternet research, fact checking
API ClientsREST, GraphQLExternal service integration
Code ExecutionPython, ShellComputation, automation
Document ProcessingPDF, CSV parsersInformation extraction
DatabaseSQL, Vector DBData retrieval, storage

Tools in Haystack follow a consistent interface that makes it easy to create custom tools for domain-specific applications. Each tool is defined with a name, description, input schema, and implementation, and the agent automatically learns when and how to use tools based on their descriptions.

Configuration Options

Haystack agents expose a wide range of configuration options that allow developers to customize agent behavior for specific use cases. These options control aspects ranging from the underlying model selection to detailed parameters governing agent decision-making.

Core Configuration Parameters

ParameterTypeDefaultDescription
modelStringRequiredThe LLM to use for reasoning
max_iterationsInteger10Maximum tool-calling loops
toolsListEmptyAvailable tools for the agent
prompt_templateStringDefaultCustom instruction template
verboseBooleanFalseEnable detailed logging

Advanced configuration options allow developers to customize how the agent reasons, how it selects tools, and how it handles errors and edge cases. These options can be set at the agent level or overridden for specific use cases.

Best Practices

When building agent systems with Haystack, several best practices can help ensure reliable and maintainable applications. Careful attention to prompt design, tool definitions, and error handling will significantly improve agent performance and user experience.

Clear and specific tool descriptions are essential for guiding agent behavior. Tools should have descriptive names and comprehensive descriptions that explain not just what the tool does, but when and why an agent should consider using it. This helps the underlying LLM make informed decisions about tool selection.

State management should be designed with the target use case in mind. For simple single-turn interactions, minimal state management is appropriate. For complex multi-step tasks, comprehensive state tracking ensures the agent maintains context and can recover from errors gracefully.

Human-in-the-loop integration should be thoughtfully designed to balance autonomy with oversight. Critical decisions should require human approval, while routine operations can proceed autonomously. The escalation criteria should be clearly defined and regularly reviewed.

Summary

Haystack's agent systems provide a comprehensive framework for building LLM-powered applications that can perceive, reason, and act. The architecture supports everything from simple single-agent applications to complex multi-agent ecosystems with human oversight. Key features include flexible state management, extensive tool integration, human-in-the-loop workflows, and configurable agent behavior.

Sources: docs-website/docs/concepts/agents.mdx

Sources: docs-website/docs/pipeline-components/agents-1/agent.mdx

Development Guide

Related topics: Deployment and Infrastructure, Introduction to Haystack

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Directory Breakdown

Continue reading this section for the full explanation and source context.

Section Standard Installation

Continue reading this section for the full explanation and source context.

Section Nightly Pre-releases

Continue reading this section for the full explanation and source context.

Related topics: Deployment and Infrastructure, Introduction to Haystack

Development Guide

This guide provides comprehensive information for developers who want to contribute to Haystack or extend its functionality. Haystack is an end-to-end LLM framework that enables building applications powered by Large Language Models, Transformer models, and vector search capabilities.

Overview

Haystack is an open-source framework maintained by deepset that allows developers to build production-ready AI applications. The framework supports retrieval-augmented generation (RAG), document search, question answering, and answer generation by orchestrating state-of-the-art embedding models and LLMs into pipelines.

Sources: README.md:1-10

Project Structure

The Haystack repository is organized into several main directories, each serving a specific purpose in the overall project ecosystem.

graph TD
    A[haystack/ root] --> B[Main Package]
    A --> C[docs-website/]
    A --> D[docker/]
    A --> E[pydoc/]
    A --> F[examples/]
    
    B --> G[Core Framework Code]
    C --> H[Documentation Site]
    D --> I[Docker Images]
    E --> J[API Reference Generation]
    F --> K[Example Cookbooks]

Directory Breakdown

DirectoryPurpose
haystack/Main Python package containing core framework code
docs-website/Docusaurus-powered documentation site
docker/Docker image definitions and build configurations
pydoc/YAML configurations for API reference generation
examples/Example applications and cookbooks (moved to haystack-cookbook)

Sources: docs-website/README.md:40-55

Installation for Development

Standard Installation

To set up Haystack for development, install the package via pip:

pip install haystack-ai

Nightly Pre-releases

For trying the newest features before official releases:

pip install --pre haystack-ai

Docker-based Development

Haystack provides Docker images for development environments. The base image contains a working Python environment with Haystack preinstalled and is designed to be derived FROM.

docker buildx bake base

To build custom images with specific branches or tags:

HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache

Sources: docker/README.md:15-30

Multi-Platform Docker Builds

Haystack images support multiple architectures. To limit builds to your local architecture:

# For Apple M1 (ARM)
docker buildx bake base --set "*.platform=linux/arm64"

Sources: docker/README.md:40-45

Documentation Development

The documentation website is built with Docusaurus 3 and provides comprehensive guides, tutorials, API references, and best practices for using Haystack.

Prerequisites

  • Node.js 18 or higher
  • npm (included with Node.js) or Yarn

Setting Up the Documentation Site

# Clone the repository and navigate to docs-website
git clone https://github.com/deepset-ai/haystack.git
cd haystack/docs-website

# Install dependencies
npm install

# Start the development server
npm start

# The site opens at http://localhost:3000 with live reload

Common Documentation Tasks

TaskCommandLocation
Edit a pageUpdate files under docs/ or versioned_docs/Preview at http://localhost:3000
Add to sidebarUpdate sidebars.js with doc IDdocs-website/
Production checknpm run build && npm run servedocs-website/

Sources: docs-website/README.md:20-35

Documentation Project Structure

docs-website/
├── docs/                          # Main documentation (guides, tutorials, concepts)
│   ├── _templates/               # Authoring templates (excluded from build)
│   ├── concepts/                 # Core Haystack concepts
│   ├── pipeline-components/      # Component documentation
│   └── ...
├── reference/                     # API reference (auto-generated, do not edit manually)
├── versioned_docs/               # Versioned copies of docs/
├── reference_versioned_docs/     # Versioned copies of reference/
├── src/                          # React components and custom code
│   ├── components/              # Custom React components
│   ├── css/                     # Global styles
│   ├── pages/                   # Custom pages
│   ├── remark/                  # Remark plugins
│   └── theme/                   # Docusaurus theme customization

Sources: docs-website/README.md:45-60

API Reference Development

The API reference is generated automatically from docstrings in the code using haystack-pydoc-tools. A GitHub workflow regenerates the API reference when code changes.

How API Reference Works

  1. Create a .yml file in the pydoc directory
  2. Configure how haystack-pydoc-tools will generate the page
  3. Commit the configuration to the main branch
  4. The GitHub workflow automatically generates the Markdown files

Version Management

All updates to API reference live in unstable docs version and are promoted to stable docs version when a new version is released.

Sources: pydoc/README.md:1-20

Contributing to Haystack

Haystack welcomes community contributions ranging from quick fixes like typo corrections to entirely new features.

Contribution Areas

AreaRepositoryDescription
Main Haystackdeepset-ai/haystackCore framework development
Integrationsdeepset-ai/haystack-core-integrationsIntegration components
Documentationhaystack/docs-websiteDocumentation content

Getting Started

  1. Review the Contributor Guidelines in CONTRIBUTING.md
  2. Check the full list of open issues available for contributions
  3. You don't need to be a Haystack expert to provide meaningful improvements

CI/CD and Quality Standards

The project maintains high quality standards through automated checks:

CheckBadgeDescription
TestsGitHub ActionsAutomated test suite
Type CheckingMypyStatic type analysis
Code CoverageCoverage BadgeTest coverage reporting
LintingRuffCode style enforcement
License ComplianceLicense CheckDependency license verification

Sources: README.md:30-55

Development Workflow

graph TD
    A[Start Development] --> B[Clone Repository]
    B --> C[Set Up Environment]
    C --> D[Install Dependencies]
    D --> E[Make Changes]
    E --> F[Run Tests]
    F --> G{Tests Pass?}
    G -->|No| H[Fix Issues]
    H --> E
    G -->|Yes| I[Run Linters]
    I --> J{Code Quality OK?}
    J -->|No| K[Address Linter Issues]
    K --> E
    J -->|Yes| L[Submit Pull Request]
    L --> M[Review Process]
    M --> N[Merge to Main]

Examples and Cookbooks

Example applications have been moved to a dedicated repository. All example cookbooks are now located at:

Repository: https://github.com/deepset-ai/haystack-cookbook/

This separation allows for more focused development and easier discovery of example applications.

Sources: examples/README.md:1-10

License and Compliance

All contributions must comply with the project's license. View license information at:

The project includes automated license compliance checking through GitHub workflows.

Sources: docker/README.md:50-60

Quick Reference Commands

CommandPurpose
pip install haystack-aiInstall Haystack
pip install --pre haystack-aiInstall pre-release version
npm installInstall documentation dependencies
npm startStart documentation dev server
npm run buildBuild documentation site
docker buildx bake baseBuild Docker base image

Additional Resources

Sources: README.md:1-10

Deployment and Infrastructure

Related topics: Development Guide, Introduction to Haystack

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Base Images

Continue reading this section for the full explanation and source context.

Section Building Custom Images

Continue reading this section for the full explanation and source context.

Section Multi-Platform Builds

Continue reading this section for the full explanation and source context.

Related topics: Development Guide, Introduction to Haystack

Deployment and Infrastructure

Overview

Haystack provides a comprehensive deployment infrastructure designed for production-ready LLM applications. The framework supports multiple deployment strategies including Docker containers, Kubernetes orchestration, and cloud platform integrations. This documentation covers the core deployment mechanisms, containerization approach, GPU acceleration support, and production best practices.

The deployment system is built around Docker images using BuildKit for efficient multi-platform builds, enabling deployment across x86_64 and ARM64 architectures. The infrastructure supports both development environments and production-grade deployments with high availability requirements.

Docker Containerization

Base Images

Haystack provides pre-built Docker images that serve as the foundation for custom deployments. The base images contain a working Python environment with Haystack preinstalled and are intended to be extended with application-specific configurations.

The primary image variant available is:

Image TagDescriptionUse Case
haystack:base-<version>Base Python environment with HaystackCustom image derivation

All images are published to Docker Hub and can be pulled directly for use in production environments. The images follow semantic versioning and align with Haystack releases.

Building Custom Images

Custom images can be built using Docker BuildKit and the bake command orchestrator. This approach allows for:

  • Custom Haystack versions or branches
  • Pre-installed dependencies
  • Application-specific configurations
  • Multi-platform support

The build process uses the docker-bake.hcl configuration file which defines build targets, platforms, and variable substitutions.

#### Basic Build Command

docker buildx bake base

#### Building with Custom Variables

To build with a custom Haystack version or branch, override the HAYSTACK_VERSION variable:

HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake base --no-cache

This mechanism enables CI/CD pipelines to build images from specific commits, branches, or release tags without modifying the underlying Dockerfile.

Multi-Platform Builds

Haystack Docker images support multiple architectures including:

  • linux/amd64 (x86_64)
  • linux/arm64 (ARM64)

#### Platform Limitations

Depending on the operating system and Docker environment, building all platforms locally may not be possible. If encountering the following error:

multiple platforms feature is currently not supported for docker driver. Please switch to a different driver
(eg. "docker buildx create --use")

The platform option must be overridden to match the local architecture. For example, on Apple M1 (ARM64):

docker buildx bake base --set "*.platform=linux/arm64"

#### Cross-Platform Considerations

When deploying multi-platform images, consider the following:

  • CPU Compatibility: Ensure target nodes match the built architecture
  • Performance: Native architecture builds perform optimally
  • Registry Support: Use registries that support multi-platform manifests

GPU Acceleration

Hardware Acceleration Support

Haystack supports GPU acceleration for compute-intensive operations including:

  • Model inference
  • Embedding generation
  • Tokenization
  • Custom model operations

GPU acceleration significantly improves throughput for LLM-based pipelines and embedding-heavy workloads.

Enabling GPU Support

#### NVIDIA GPUs (CUDA)

For NVIDIA GPU support, use CUDA-enabled base images and ensure the nvidia-container-toolkit is installed on the host system.

Docker Compose Example:

services:
  haystack:
    image: haystack:base-latest
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

#### AMD GPUs (ROCm)

AMD GPU support requires ROCm-enabled images and appropriate runtime configuration.

GPU Memory Management

For production deployments, configure memory limits based on model size:

Model SizeRecommended GPU MemoryConfiguration
Small (<1B params)8 GBCUDA_VISIBLE_DEVICES=0
Medium (1-7B params)16 GBCUDA_VISIBLE_DEVICES=0,1
Large (7-70B params)32+ GBMulti-GPU / quantization

Quantization Options

To reduce GPU memory requirements, consider model quantization:

  • 4-bit quantization: Reduces memory by ~75%
  • 8-bit quantization: Reduces memory by ~50%
  • Dynamic quantization: Trade-off between speed and accuracy

Kubernetes Deployment

Container Orchestration

Haystack can be deployed on Kubernetes for production environments requiring:

  • Horizontal scaling
  • High availability
  • Rolling updates
  • Resource management
  • Service discovery

Resource Configuration

#### Resource Limits

Configure CPU and memory limits based on workload:

resources:
  limits:
    cpu: "4"
    memory: "16Gi"
  requests:
    cpu: "2"
    memory: "8Gi"

#### GPU Resource Allocation

For GPU workloads, define accelerator resources:

resources:
  limits:
    nvidia.com/gpu: "2"
  requests:
    nvidia.com/gpu: "1"

High Availability Configuration

For production deployments, implement:

  1. Replica Sets: Deploy multiple replicas for fault tolerance
  2. Health Checks: Configure liveness and readiness probes
  3. Pod Disruption Budgets: Ensure availability during updates
  4. Anti-Affinity Rules: Distribute pods across nodes
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0

Service Configuration

Expose Haystack services using Kubernetes Services:

apiVersion: v1
kind: Service
metadata:
  name: haystack-api
spec:
  selector:
    app: haystack
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer

Production Best Practices

Security Considerations

PracticeImplementation
Non-root executionConfigure USER directive in Dockerfile
Secret managementUse Kubernetes Secrets or external secret stores
Network policiesRestrict pod-to-pod communication
Image scanningScan images for vulnerabilities before deployment
TLS terminationConfigure ingress with TLS certificates

Monitoring and Observability

Implement monitoring using:

  • Metrics: Prometheus exporter for pipeline metrics
  • Logging: Centralized logging with ELK/Graylog
  • Tracing: OpenTelemetry for request tracing
  • Alerts: Configure alerts for error rates and latency

Performance Optimization

  1. Connection Pooling: Reuse database and API connections
  2. Caching: Implement caching for frequently accessed data
  3. Batch Processing: Process multiple requests in batches
  4. Async Processing: Use async/await for I/O operations

CI/CD Integration

Automated Builds

Haystack supports automated Docker image builds through:

  • GitHub Actions workflows
  • BuildKit with bake files
  • Multi-stage Docker builds

Deployment Workflows

graph TD
    A[Code Change] --> B[Run Tests]
    B --> C[Build Docker Image]
    C --> D[Push to Registry]
    D --> E[Update Deployment]
    E --> F[Health Check]
    F --> G{Healthy?}
    G -->|Yes| H[Deployment Complete]
    G -->|No| I[Rollback]

Registry Configuration

Popular registry options for Haystack images:

RegistryUse CaseAuthentication
Docker HubPublic deploymentsOptional
AWS ECRAWS infrastructureIAM roles
GCRGCP infrastructureService accounts
Azure ACRAzure infrastructureService principals
Private RegistryEnterprise deploymentsUsername/password

License and Compliance

The Haystack Docker images contain:

  • Haystack framework code under the Apache 2.0 license
  • Python runtime components
  • Base distribution software with their respective licenses

Users are responsible for ensuring compliance with all software licenses contained within deployed images. For enterprise deployments, review the license implications of all included components.

Summary

Haystack provides a flexible and production-ready deployment infrastructure supporting Docker containerization, Kubernetes orchestration, and GPU acceleration. The multi-platform Docker images enable deployment across diverse infrastructure, while Kubernetes support facilitates enterprise-grade deployments with high availability and scalability requirements. GPU acceleration support enables high-performance inference for LLM-powered applications, with quantization options for resource-constrained environments.

Source: https://github.com/deepset-ai/haystack / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high RFC: Signed receipts for Haystack pipeline component calls

First-time setup may fail or require extra isolation and rollback planning.

high feat: Add `run_async` to `MultiQueryEmbeddingRetriever`, `MultiQueryTextRetriever`, and `TextEmbeddingRetriever`

First-time setup may fail or require extra isolation and rollback planning.

high feat: add INTERSECTION join mode to DocumentJoiner

First-time setup may fail or require extra isolation and rollback planning.

high docs: Update Ragas docs

Users cannot judge support quality until recent activity, releases, and issue response are checked.

Doramagic Pitfall Log

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: RFC: Signed receipts for Haystack pipeline component calls

  • Severity: high
  • Finding: Installation risk is backed by a source signal: RFC: Signed receipts for Haystack pipeline component calls. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11039

2. Installation risk: feat: Add `run_async` to `MultiQueryEmbeddingRetriever`, `MultiQueryTextRetriever`, and `TextEmbeddingRetriever`

  • Severity: high
  • Finding: Installation risk is backed by a source signal: feat: Add run_async to MultiQueryEmbeddingRetriever, MultiQueryTextRetriever, and TextEmbeddingRetriever. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11358

3. Installation risk: feat: add INTERSECTION join mode to DocumentJoiner

  • Severity: high
  • Finding: Installation risk is backed by a source signal: feat: add INTERSECTION join mode to DocumentJoiner. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11365

4. Maintenance risk: docs: Update Ragas docs

  • Severity: high
  • Finding: Maintenance risk is backed by a source signal: docs: Update Ragas docs. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11178

5. Security or permission risk: EnvVarSecrets: add multi-tenant context support (ContextVar / pipeline-run context)

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: EnvVarSecrets: add multi-tenant context support (ContextVar / pipeline-run context). Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11366

6. Security or permission risk: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11311

7. Security or permission risk: feat: support token-based budget in LostInTheMiddleRanker

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: feat: support token-based budget in LostInTheMiddleRanker. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11351

8. Installation risk: Developers should check this installation risk before relying on the project: Proposal: Transaction Protocol for idempotent, auditable agent pipelines

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Proposal: Transaction Protocol for idempotent, auditable agent pipelines
  • User impact: Developers may fail before the first successful local run: Proposal: Transaction Protocol for idempotent, auditable agent pipelines
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Proposal: Transaction Protocol for idempotent, auditable agent pipelines. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_issue | fmev_58038e9b6373edf9376049b42d4b7bb4 | https://github.com/deepset-ai/haystack/issues/11266 | Proposal: Transaction Protocol for idempotent, auditable agent pipelines

9. Installation risk: Developers should check this installation risk before relying on the project: RFC: Signed receipts for Haystack pipeline component calls

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: RFC: Signed receipts for Haystack pipeline component calls
  • User impact: Developers may fail before the first successful local run: RFC: Signed receipts for Haystack pipeline component calls
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: RFC: Signed receipts for Haystack pipeline component calls. Context: Observed when using node, python
  • Evidence: failure_mode_cluster:github_issue | fmev_ce0b9c65d21126dcf11ede12120e154f | https://github.com/deepset-ai/haystack/issues/11039 | RFC: Signed receipts for Haystack pipeline component calls

10. Installation risk: Developers should check this installation risk before relying on the project: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
  • User impact: Developers may fail before the first successful local run: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Security: OWASP Agent Memory Guard for pipeline memory poisoning defense. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_issue | fmev_4d3276b6b9938595cb2dbb864a5509da | https://github.com/deepset-ai/haystack/issues/11311 | Security: OWASP Agent Memory Guard for pipeline memory poisoning defense

11. Installation risk: Developers should check this installation risk before relying on the project: [FEATURE] Support for code syntax-aware Document Splitters

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: [FEATURE] Support for code syntax-aware Document Splitters
  • User impact: Developers may fail before the first successful local run: [FEATURE] Support for code syntax-aware Document Splitters
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: [FEATURE] Support for code syntax-aware Document Splitters. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_issue | fmev_997b84068ae32409b1d8d55daaddd984 | https://github.com/deepset-ai/haystack/issues/11354 | [FEATURE] Support for code syntax-aware Document Splitters

12. Installation risk: MCP Server for Haystack docs

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: MCP Server for Haystack docs. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/deepset-ai/haystack/issues/11346

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 12

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using haystack with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence