qdrant Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

qdrant

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Qdrant Overview and System Architecture

Related topics: Vector Indexing: HNSW, Sparse, and Multivector Search, Distributed System: Sharding, Replication, and Consensus


Purpose and Scope

Qdrant (pronounced "quadrant") is a vector similarity search engine and vector database implemented primarily in Rust. It provides a production-ready service for storing, searching, and managing points — vectors with optional JSON payloads — and is tailored for extended filtering, semantic matching, faceted search, and recommendation workloads. Source: README.md:1-50.

The project ships in two primary deployment shapes. The classic Qdrant Server follows a client–server architecture suitable for running as a Docker container, a managed cloud instance, or a Kubernetes deployment, and exposes both REST and gRPC interfaces. Source: README.md:55-70. Qdrant Edge, by contrast, is a lightweight in-process vector search engine that runs inside the application process and is intended for embedded devices, autonomous systems, and mobile agents. Source: lib/edge/publish/README.md:1-10. The README also lists official client libraries for Python, JavaScript/TypeScript, Rust, .NET/C#, Java, and Go, plus community clients such as PHP. Source: README.md:35-50.

Core Architecture

Qdrant is organized as a Cargo workspace with a layered structure. The high-level layering, from the public binary down to storage primitives, is illustrated below.

graph TD
    A[Qdrant Binary / Edge Entry Point] --> B[API Layer: REST + gRPC]
    B --> C[Content Manager / TableOfContent]
    C --> D[Collection Crate: per-collection operations]
    D --> E[Shard: WAL, segments, optimizers]
    E --> F[Segment: HNSW, payload indexes, vector storage]
    F --> G[Storage Backends: Gridstore, RocksDB migration path]
    C --> H[Snapshots / Consensus / Cluster]
    C --> I[Adaptive Tokio Runtimes]

The collection crate implements all functions required to operate on a single collection of points, where points within a collection share a payload schema and vector size so that search can be performed over all points uniformly. Source: lib/collection/README.md:1-7. A collection is internally composed of one or more shards, each of which owns a write-ahead log (WAL), a set of on-disk segments, and an optimizer that merges smaller segments into larger indexed ones.

The storage layer exposes common utilities and a content manager that owns the table-of-contents (TOC) abstraction. The TOC coordinates collection creation, deletion, aliasing, snapshots, and consensus operations, and is the single point through which cluster-wide state is mutated. Source: lib/storage/src/common/mod.rs:1-5. Underneath the TOC, the runtime subsystem maintains two parallel Tokio search runtimes — high_cpu and high_io — sized for CPU-saturated and IO-bound workloads respectively, and routes spawn_blocking calls between them through an adaptive handle that observes CPU usage at runtime. Source: lib/storage/src/content_manager/toc/runtimes.rs:1-25.

Qdrant Server vs Qdrant Edge

Qdrant Edge shares the underlying segment, shard, and WAL code paths with the server, but exposes a much smaller public surface. The EdgeShard struct owns a path on disk, an on-disk EdgeConfig, a SerdeWal<CollectionUpdateOperations> for durability, and a LockedSegmentHolder for in-memory segment management. Source: lib/edge/src/lib.rs:30-55. Operations such as upsert, search, scroll, count, facet, and retrieve are implemented as methods on this struct, with search-specific code (for example, vector-dispatch over the QueryVector enum and score-threshold truncation) living in lib/edge/src/search.rs:1-50.

Because Qdrant Edge is in-process, it has no HTTP or gRPC layer, no consensus, and no multi-tenant content manager. It is designed to be embedded directly into a Python or Rust application and optionally synchronized with a Qdrant server, making it a fit for offline-capable mobile and embedded scenarios. Source: lib/edge/python/README.md:1-10; lib/edge/publish/README.md:1-10.

Key Subsystems and Features

Qdrant supports dense vectors for semantic similarity, sparse vectors for full-text search, and multivector search for late-interaction models such as ColBERT. Source: README.md:90-95. Payloads attached to points can be filtered with keyword matching, full-text, numeric ranges, geo-locations, and Boolean combinations of should, must, and must_not. Source: README.md:97-99. Hybrid search fuses the results of multiple vector queries with Reciprocal Rank Fusion (RRF) or Distribution-Based Score Fusion (DBSF). Source: README.md:101-103.
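To make the fusion step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion over ranked lists of point IDs. The function name `rrf_fuse` and the constant `k = 60` are illustrative assumptions (60 is a value commonly used in RRF write-ups); neither is taken from the Qdrant source.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of point IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every ID it contains, where
    rank starts at 1. `k` is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


dense_hits = [1, 2, 3]   # hypothetical result of a dense-vector query
sparse_hits = [3, 1, 4]  # hypothetical result of a sparse-vector query
fused = rrf_fuse([dense_hits, sparse_hits])
```

Points that rank well in both lists (here IDs 1 and 3) rise above points that appear in only one list, which is the property that makes rank fusion useful when the raw scores of the two queries are not comparable.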

Quantization is a first-class subsystem: scalar, binary, product, and the newer TurboQuant variant reduce RAM usage substantially. The v1.18.0 release added TurboQuant with 8× vector compression, tracked by community issue #8670 and discussed in #8524. Other highlighted capabilities include faceting, recommendation, discovery, Maximal Marginal Relevance (MMR), relevance feedback (added in v1.17.0), multitenancy, payload indexes for query planning, SIMD acceleration, GPU indexing, async I/O via io_uring, and write-ahead logging with ack-on-flush semantics. Source: README.md:107-135.

Snapshots are a critical subsystem for backup, restore, and shard migration. Snapshot downloads are streamed with a 60-second inactivity timeout enforced by a TimeoutReader wrapper that converts stalled streams into io::ErrorKind::TimedOut, preventing hung restores from blocking the cluster indefinitely. Source: lib/storage/src/content_manager/snapshots/download_tar.rs:10-40; lib/storage/src/content_manager/snapshots/download_tar.rs:50-70. Cluster peer roles (Follower, Candidate, Leader, PreCandidate) are mapped from the underlying raft crate in the shared types module. Source: lib/storage/src/types.rs:1-60.
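The inactivity-timeout idea can be sketched in Python as a wrapper that fails any single read that does not return in time. This is only an illustration of the pattern, not the Rust `TimeoutReader`: the class below, its thread-based implementation, and the `StalledStream` helper are all assumptions made for the example.

```python
import concurrent.futures
import io
import time


class TimeoutReader:
    """Wrap a file-like object and fail a read that makes no progress in time."""

    def __init__(self, inner, timeout_s=60.0):
        self._inner = inner
        self._timeout_s = timeout_s
        # A single worker thread performs the blocking read.
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)

    def read(self, size=-1):
        future = self._pool.submit(self._inner.read, size)
        try:
            return future.result(timeout=self._timeout_s)
        except concurrent.futures.TimeoutError:
            # Surface the stall as a timeout error instead of hanging forever.
            raise TimeoutError(
                f"no data received within {self._timeout_s}s"
            ) from None


class StalledStream:
    """Hypothetical stream that stalls before returning anything."""

    def read(self, size=-1):
        time.sleep(0.3)
        return b""


healthy = TimeoutReader(io.BytesIO(b"snapshot"), timeout_s=1.0)
stalled = TimeoutReader(StalledStream(), timeout_s=0.05)
```

Calling `healthy.read()` returns the bytes as usual, while `stalled.read()` raises `TimeoutError`, so the caller can abort the restore rather than wait indefinitely.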

Error Handling and Common Failure Modes

The unified error type StorageError lives in the storage crate and covers AlreadyExists, NotFound, ChecksumMismatch, Forbidden, Timeout, RateLimitExceeded, Service, BadInput, BadRequest, Inference, and ShardUnavailable. Source: lib/storage/src/content_manager/errors.rs:1-90. Collection errors are translated into StorageError through from_inconsistent_shard_failure, with StrictMode mapping to BadRequest and ShardUnavailable propagating to the API layer; conversions from IoError, tempfile::PathPersistError, mutex poisoning, channel closure, and oneshot receiver errors are also defined here. Source: lib/storage/src/content_manager/errors.rs:100-150; lib/storage/src/content_manager/errors.rs:150-200.

Notable operational concerns observed in the community include: the long-standing request to add a new vector field after collection creation (#1132), aggressive quantization tradeoffs for vectors below 1024 dimensions (#8524), and stale vector storage for deleted points that confuses the optimizer and query planner (#2550). The v1.16.2 release shipped a fix for a critical WAL bug that could break consensus, and v1.18.0 added the API to create and delete named vectors in an existing collection — a direct response to #1132. The live-reload shard tracking issue (#9241) is a foundational building block for the planned serverless deployment option.

Vector Indexing: HNSW, Sparse, and Multivector Search

Related topics: Quantization: Scalar, Binary, Product, and TurboQuant, Payload Indexing and Filtering


Overview and Scope

Qdrant's vector indexing subsystem is the core engine that turns embeddings, sparse term weights, and multi-vector representations (e.g. ColBERT late-interaction embeddings) into searchable structures. The repository exposes three indexing modalities that share a common query-planning layer:

Dense vectors indexed by HNSW (Hierarchical Navigable Small World) graphs.
Sparse vectors used for full-text and learned-sparse retrieval.
Multivectors, where each point carries a list of embeddings, enabling late-interaction scoring.

The server binary and the lightweight in-process EdgeShard API both implement these modalities. EdgeShard is the in-process engine used by Qdrant Edge; it mirrors the server's collection model but exposes a smaller configuration surface suitable for embedded and mobile deployments. Source: lib/edge/src/lib.rs:42-66.

flowchart LR
  A[Query / Vectors] --> B{RootPlan}
  B -->|MergePlan| C[Sources: Search, Scroll, Prefetch]
  C --> D[Per-source scoring]
  D --> E[Rescore stages]
  E --> F[Fill payload/vectors]
  F --> G[Filtered ScoredPoint list]

Dense Vector Indexing (HNSW)

HNSW is the default approximate nearest neighbor (ANN) structure for dense vectors in Qdrant. It is the only graph-based index family referenced in the user-facing configuration; both server collections and edge shards accept a global HnswConfig that can be overridden per named vector.

In the edge configuration, dense indexing is declared as EdgeVectorParams inside the vectors map of an EdgeConfig. The structure carries size, distance, hnsw_config, and an optional per-vector quantization_config. Source: lib/edge/src/config/shard.rs:8-58.

Notable configuration levers exposed to the user:

Option	Purpose	Defined in
`HnswConfig::m`, `ef_construct`, `full_scan_threshold`	Graph topology and build-time search beam	server `HnswConfig` (referenced by `EdgeConfig::hnsw_config`)
`on_disk` per vector	Keep raw vectors on disk and load pages on demand	lib/edge/src/config/shard.rs
`on_disk_payload`	Store JSON payloads on disk via Gridstore	lib/edge/src/config/shard.rs:48-52
`quantization_config`	Scalar / Binary / Product / TurboQuant compression	lib/edge/src/config/shard.rs:28-34

The optimizer pipeline that produces searchable HNSW segments is governed by EdgeOptimizersConfig. The prevent_unoptimized flag, introduced in v1.17.1, defers point updates that would land in oversized unoptimized segments and returns them only after the optimizer promotes them. Source: lib/edge/src/config/optimizers.rs:20-30.

Sparse Vector Search

Sparse vectors in Qdrant are first-class named vectors: each collection (or edge shard) can declare a sparse_vectors map alongside its dense vectors map. The edge configuration carries them as EdgeSparseVectorParams keyed by VectorNameBuf. Source: lib/edge/src/config/shard.rs:18-22.

Sparse retrieval is the foundation for full-text and learned-sparse workflows. Internally the storage layer delegates sparse indexing to a dedicated module under lib/segment/src/index/sparse_index/, which is referenced from the search facade when a query names a sparse vector. The same scoring, rescoring, and threshold-truncation code paths that apply to dense scores also apply to sparse scores: after the candidate list is produced, the engine drops anything that does not satisfy score_threshold by binary searching for the threshold breakpoint. Source: lib/edge/src/search.rs:9-22.

Multivector Search (Late Interaction / ColBERT)

Multivector points carry a list of embeddings per point, so similarity between a query and a point is computed as the aggregate (typically MaxSim) over all query-token-to-document-token pairs. The community tracks this work in the ColBERT tracking issue, which lists the prerequisite "New simple vector storage for multivector data" as merged.
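As a rough illustration of MaxSim aggregation, the Python sketch below scores a query multivector against a document multivector by taking, for each query token, the best dot product over all document tokens and summing those maxima. The helper names are hypothetical and the dot product is an assumed similarity; the real scorer lives in the Rust segment code.

```python
def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors, doc_vectors):
    """Sum, over query tokens, of the best similarity to any document token."""
    return sum(
        max(dot(q, d) for d in doc_vectors)
        for q in query_vectors
    )


query = [[1.0, 0.0], [0.0, 1.0]]     # two query-token embeddings
document = [[1.0, 0.0], [0.5, 0.5]]  # two document-token embeddings
score = max_sim(query, document)
```

The first query token matches the first document token exactly (similarity 1.0), the second is best served by the second document token (0.5), so the aggregate is 1.5.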

In the edge configuration, multivector capability is implicit in EdgeVectorParams, but the engine drives multivector scoring through the same RootPlan / MergePlan pipeline used for prefetch and rescore. The query planner recursively walks a tree of Source::SearchesIdx and Source::ScrollsIdx nodes, executing each leaf in parallel and applying rescore stages bottom-up. Source: lib/edge/src/query.rs:43-95.

The leaf search itself differentiates on QueryVector variants (RecommendBestScore, RecommendSumScores, Discover, Context, FeedbackNaive, …), each of which produces a candidate ScoredPoint list that is then merged. Source: lib/edge/src/search.rs:1-12.

Query Planning and Runtime Isolation

Vector search is CPU-bound at the segment level and IO-bound at the persistence level, so the TableOfContent builds two parallel Tokio runtimes:

high_cpu — sized to one blocking thread per CPU core, used when the process is saturated.
high_io — uses common::defaults::search_thread_count for IO-heavy segment scans so latency can be hidden behind parallel reads.

AdaptiveSearchHandle routes spawn_blocking calls between them based on observed CPU usage. Source: lib/storage/src/content_manager/toc/runtimes.rs:1-15.
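The routing decision can be pictured with a small Python sketch. It assumes a caller-supplied CPU probe and a fixed saturation threshold; the class name, the `0.9` threshold, and the `io_threads` parameter are all hypothetical and stand in for whatever the Rust handle actually measures.

```python
import os
from concurrent.futures import ThreadPoolExecutor


class AdaptiveHandle:
    """Route blocking work to one of two pools based on observed CPU usage."""

    def __init__(self, cpu_probe, io_threads, threshold=0.9):
        cores = os.cpu_count() or 1
        # One worker per core: adding more would not help a saturated CPU.
        self.high_cpu = ThreadPoolExecutor(max_workers=cores)
        # A wider pool so waiting on reads can overlap.
        self.high_io = ThreadPoolExecutor(max_workers=io_threads)
        self._cpu_probe = cpu_probe    # returns usage in [0.0, 1.0]
        self._threshold = threshold    # assumed saturation cut-off

    def pick(self):
        saturated = self._cpu_probe() >= self._threshold
        return "high_cpu" if saturated else "high_io"

    def spawn_blocking(self, fn, *args):
        pool = self.high_cpu if self.pick() == "high_cpu" else self.high_io
        return pool.submit(fn, *args)


busy = AdaptiveHandle(cpu_probe=lambda: 0.95, io_threads=8)
idle = AdaptiveHandle(cpu_probe=lambda: 0.20, io_threads=8)
```

With these probes, `busy.pick()` selects the per-core pool and `idle.pick()` selects the wider IO pool; `spawn_blocking` returns a future in both cases.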

On the Edge side, the storage layer is Gridstore: a memory-mapped, block-based store with 32 MiB pages and 128-byte blocks, compressed with LZ4, with a region bitmask tracking free blocks. The shard persists vectors, payloads, and payload indexes in Gridstore, and updates are always written as new blocks (no in-place mutation). Source: lib/gridstore/readme.md:1-19.
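The page and block sizes quoted above imply some simple arithmetic, sketched here in Python. The constant and function names are illustrative, and the bitmask size assumes one bit per block, which is an assumption rather than something stated in the readme excerpt.

```python
PAGE_SIZE = 32 * 1024 * 1024   # 32 MiB per page
BLOCK_SIZE = 128               # bytes per block

BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE
# Assumption: one bit per block in the free-block bitmask.
BITMASK_BYTES_PER_PAGE = BLOCKS_PER_PAGE // 8


def blocks_needed(value_len):
    """Number of whole blocks occupied by a (compressed) value of value_len bytes."""
    return -(-value_len // BLOCK_SIZE)  # ceiling division
```

Because updates are written as new blocks, a 300-byte compressed value occupies three fresh blocks, and the blocks held by the previous version become free.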

Authorization Touchpoints

RBAC checks every read-side vector operation, including SearchRequestInternal, RecommendRequestInternal, DiscoverRequestInternal, and ContextSearchInternal. For example, a SearchRequest with group_by and a with_lookup against another collection requires whole read access on both collections; otherwise the engine returns a forbidden result before reaching the vector index. Source: lib/storage/src/rbac/ops_checks.rs:1-45.

Quantization: Scalar, Binary, Product, and TurboQuant

Related topics: Vector Indexing: HNSW, Sparse, and Multivector Search, Storage Engine: Segments, Gridstore, and WAL


Overview and Purpose

Quantization in Qdrant is a compression mechanism that reduces the RAM and disk footprint of stored vectors by approximating full-precision (f32) embeddings with smaller, lossy representations. The top-level README frames this capability as a core selling point: built-in quantization "cuts RAM usage by up to 97% and lets you tune the trade-off between search speed and precision". Source: README.md. The implementation lives in the lib/quantization crate and exposes four sibling encoders, one per file: encoded_vectors_u8.rs, encoded_vectors_binary.rs, encoded_vectors_pq.rs, and encoded_vectors_tq.rs. A fifth file, encoded_vectors.rs, defines the abstraction they share. The u8 and binary modules implement scalar and binary quantization respectively, while pq and tq implement Product Quantization and the newer TurboQuant variant.

Each encoder produces a common EncodedVectors abstraction (defined in encoded_vectors.rs) so the rest of the engine (segment storage, HNSW graph, distance scorers, and the new io_uring async I/O path introduced in v1.18.1) can be written generically. PR #8988 in v1.18.1 explicitly "refactored quantized multi-vector scorers for io_uring support," which suggests that the trait surface is the integration point for every variant. Source: v1.18.1.

The community has been actively shaping this surface. Issue #8524 ("TurboQuant quantization (ICLR 2026)") describes the motivation as filling the gap between "scalar is solid but tops out at 4×" and "binary falls apart below 1024d" while PQ still "needs codebook training and still underperforms". TurboQuant was designed to close that gap with "8x vector compression without the recall tax", as announced in v1.18.0. Source: v1.18.0; Issue #8524.

The Four Variants

Variant	File	Compression	Codebook Training	Best For
Scalar (`int8`)	`encoded_vectors_u8.rs`	~4× (f32 → u8)	None	General-purpose, low overhead
Binary	`encoded_vectors_binary.rs`	~32× (1 bit/dim)	None	Very high-dim vectors ≥1024d
Product Quantization (PQ)	`encoded_vectors_pq.rs`	Configurable, high	Required (k-means)	Maximum compression when some recall loss is acceptable
TurboQuant (TQ)	`encoded_vectors_tq.rs`	~8×	None	Filling the gap between scalar (~4×) and binary (~32×) when recall matters

Scalar quantization stores each dimension as a u8 plus per-segment scale/bias parameters. It is the default go-to choice because it adds no training step and preserves enough precision for most recall targets.
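A minimal Python sketch of the scale/bias idea follows. It fits one shared scale and offset over a small batch of vectors (standing in for a segment) and maps every dimension to one byte. The function names and the min/max fitting rule are assumptions for illustration; the Rust encoder may choose its parameters differently.

```python
def fit_u8(vectors):
    """Derive a shared scale and offset from the value range of a batch."""
    values = [x for vector in vectors for x in vector]
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant data
    return scale, lo


def encode_u8(vector, scale, lo):
    """Map each f32-like value to a single byte (4 bytes -> 1 byte per dim)."""
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vector)


def decode_u8(codes, scale, lo):
    """Approximate reconstruction of the original values."""
    return [code * scale + lo for code in codes]


batch = [[0.0, 1.0, 0.5], [0.25, 0.75, 0.1]]
scale, lo = fit_u8(batch)
codes = encode_u8(batch[1], scale, lo)
restored = decode_u8(codes, scale, lo)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is why this variant keeps recall high at a fixed 4× saving.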

Binary quantization packs each dimension into a single bit. It is the most aggressive form, and the README and community notes flag a sharp recall cliff below ~1024 dimensions. Source: Issue #8524.
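The one-bit-per-dimension encoding can be sketched in Python as below. Thresholding at zero and comparing codes by Hamming distance are common choices and are assumed here; the helper names are hypothetical.

```python
def encode_binary(vector):
    """Pack the sign of each dimension into one bit of an integer."""
    bits = 0
    for index, value in enumerate(vector):
        if value > 0:  # assumed threshold: positive values map to 1
            bits |= 1 << index
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


code_a = encode_binary([0.3, -0.2, 0.9, -0.1])
code_b = encode_binary([0.1, 0.4, -0.5, -0.7])
distance = hamming(code_a, code_b)
```

With only four dimensions a code can take just 16 values, so many unrelated vectors collide; this is an intuition for why recall degrades on low-dimensional data.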

Product Quantization splits the vector into subspaces and replaces each with a centroid index. It offers the deepest compression but requires a k-means training pass and is sensitive to the choice of m (number of subquantizers) and nbits (bits per code, which gives 2^nbits centroids per subquantizer).
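The encoding half of Product Quantization can be sketched in Python once codebooks exist. The codebooks below are hard-coded, hypothetical values (a real deployment would obtain them from the k-means training pass), and `pq_encode` is an illustrative name.

```python
def pq_encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest centroid.

    `codebooks` holds one list of centroids per subquantizer, so
    m == len(codebooks) and each centroid has len(vector) // m dimensions.
    """
    m = len(codebooks)
    sub_dim = len(vector) // m
    codes = []
    for j, centroids in enumerate(codebooks):
        sub = vector[j * sub_dim:(j + 1) * sub_dim]
        # Squared Euclidean distance to every centroid of this subspace.
        distances = [
            sum((x - c) ** 2 for x, c in zip(sub, centroid))
            for centroid in centroids
        ]
        codes.append(distances.index(min(distances)))
    return codes


# Hypothetical codebooks: m = 2 subquantizers, 2 centroids each (nbits = 1).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
codes = pq_encode([0.9, 0.8, 0.1, 0.2], codebooks)
```

The four-dimensional vector is reduced to two small indices; poorly trained centroids would still produce valid indices, which is why degraded recall can go unnoticed without an explicit evaluation.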

TurboQuant is the newest addition, tracked in Issue #8670 and merged in the v1.18.0 milestone. It targets the compression/recall gap between scalar and binary, advertises 8× compression, and — unlike PQ — does not require a codebook training step.

Configuration and Lifecycle

Quantization is configured per collection (server) or per EdgeShard (edge). The edge configuration builder exposes a dedicated quantization_config setter and folds it into the on-disk EdgeConfig with a default fallback if unset, ensuring an explicit exhaustiveness check at build time. Source: lib/edge/src/builders/edge_config.rs. The config module's docstring reinforces the user-facing scope: "no SegmentConfig, payload_storage_type, or per-vector quantization. Use on_disk_payload, on_disk per vector, and global quantization_config / hnsw_config". Source: lib/edge/src/config/mod.rs. This means the same QuantizationConfig value is reused across all named vectors unless a finer-grained API is supplied.

The lib/edge/src/lib.rs entry point shows the surrounding lifecycle: an EdgeShard owns a SaveOnDisk<EdgeConfig>, a WAL of CollectionUpdateOperations, and a LockedSegmentHolder — quantization parameters from the config flow into segment construction so encoded vectors are produced at segment-build time, not lazily on first read.

The lib/edge/src/search.rs scorers are written against the generic EncodedVectors interface rather than any concrete variant, which is what allows the same query pipeline to dispatch to int8, binary, PQ, or TQ scorers at runtime. Community work in v1.18.1 extended this path to multi-vector (ColBERT-style) scorers under io_uring. Source: v1.18.1; Issue #3684.

Data Flow

flowchart LR
    A[Raw f32 vector] --> B{Quantizer}
    B -->|Scalar| C[encoded_vectors_u8.rs]
    B -->|Binary| D[encoded_vectors_binary.rs]
    B -->|PQ| E[encoded_vectors_pq.rs]
    B -->|TurboQuant| F[encoded_vectors_tq.rs]
    C --> G[EncodedVectors trait]
    D --> G
    E --> G
    F --> G
    G --> H[Segment / HNSW / mmap storage]
    G --> I[Distance scorers]
    I --> J[io_uring async I/O]

Common Failure Modes and Trade-offs

Binary quantization on low-dimensional vectors loses too much recall; community guidance places the safe floor at ~1024 dimensions. Source: Issue #8524.
Product Quantization requires training data; under-trained codebooks degrade recall silently. Operators should evaluate recall@K on a held-out set after building a PQ segment.
Scalar quantization tops out at ~4× compression; if the RAM budget is still tight after switching to int8, escalate to PQ or TurboQuant rather than binary.
TurboQuant is the newest variant; track Issue #8670 for residual bugs and follow-up improvements introduced after the v1.18.0 milestone merge.
Cross-cutting refactors touch every variant: v1.18.1's io_uring refactor of quantized multi-vector scorers and v1.18.2's "don't rebuild payload index if changing on_disk flag" both live above the encoder layer, so changing quantization is rarely the cause of a regression in those areas. Source: v1.18.2; v1.18.1.
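The recall@K check recommended in the list above can be computed with a few lines of Python. The function name and the sample ID lists are hypothetical; `exact_ids` is assumed to come from an exhaustive, unquantized search over the same held-out queries.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 1.0  # nothing to find, so nothing was missed
    found = truth & set(approx_ids[:k])
    return len(found) / len(truth)


exact = [1, 2, 3, 4]        # ground truth from a full-precision scan
quantized = [1, 2, 9, 4]    # result from the quantized index
recall = recall_at_k(quantized, exact, k=4)
```

Averaging this value over a few hundred held-out queries, before and after enabling a quantization variant, gives a direct measure of the recall cost of the compression.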

Payload Indexing and Filtering

Related topics: Vector Indexing: HNSW, Sparse, and Multivector Search, Distributed System: Sharding, Replication, and Consensus


Overview

Payload indexing and filtering is the subsystem that allows Qdrant to attach arbitrary JSON metadata to vectors and selectively retrieve points using rich conditions. As described in the project README, "Attach any JSON payload to your vectors and filter on it using a rich set of conditions—keyword matching, full-text, numeric ranges, geo-locations, and more—combined with should, must, and must_not clauses." Source: README.md.
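The way must, should, and must_not clauses combine can be sketched in Python with plain predicates over a payload dictionary. This is a simplified model: the `matches` helper and the lambda conditions are illustrative and do not mirror the Rust filter types.

```python
def matches(payload, must=(), should=(), must_not=()):
    """Evaluate a filter made of three clause lists against one payload."""
    # must: every condition has to hold.
    if not all(condition(payload) for condition in must):
        return False
    # should: if any conditions are given, at least one has to hold.
    if should and not any(condition(payload) for condition in should):
        return False
    # must_not: no condition may hold.
    return not any(condition(payload) for condition in must_not)


def is_london(payload):
    return payload.get("city") == "London"


def is_cheap(payload):
    return payload.get("price", 0) < 100


def is_archived(payload):
    return payload.get("archived", False)


point = {"city": "London", "price": 80, "archived": False}
accepted = matches(point, must=[is_london], should=[is_cheap], must_not=[is_archived])
```

In the engine the same Boolean structure is evaluated with the help of field indexes where they exist, rather than by scanning every payload.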

The collection crate (lib/collection) is responsible for "all functions required for operations with a single collection of points," and it relies on the lib/segment field-index layer to accelerate filter evaluation. Source: lib/collection/README.md.

Architecture

The collection crate contains shards, and each shard contains a SegmentHolder of on-disk segments. Each segment maintains its own payload indexes, which are aggregated into a per-shard PayloadIndexInfo schema. The ShardInfo structure produced by the edge runtime exposes a payload_schema map that merges point counts across segments for the same key. Source: lib/edge/src/info.rs.
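The merge of per-segment schemas can be pictured with a short Python sketch. It reduces each segment's schema to a mapping from payload key to indexed point count, which is a simplification of PayloadIndexInfo; the function name and sample data are hypothetical.

```python
from collections import Counter


def merge_payload_schema(segment_schemas):
    """Sum indexed point counts per payload key across all segments."""
    merged = Counter()
    for schema in segment_schemas:
        merged.update(schema)  # adds counts for keys seen before
    return dict(merged)


segments = [
    {"city": 10, "price": 4},  # segment 1: key -> indexed points
    {"city": 5},               # segment 2
]
shard_schema = merge_payload_schema(segments)
```

A key that is indexed in only some segments still appears in the merged view, with a count that reflects just the segments that carry it.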

flowchart TD
    A[Collection] --> B[Shard]
    B --> C[SegmentHolder]
    C --> D[Segment 1]
    C --> E[Segment 2]
    D --> F[Field Indexes<br/>numeric, keyword, full-text, geo]
    E --> F
    F --> G[Query Planner<br/>selects indexed vs. scan]
    G --> H[Filtered Result Set]
    H --> I[Vector Search Ranking]

The EdgeShard type encapsulates the on-disk layout used in the Edge build: a config, a write-ahead log (SerdeWal<CollectionUpdateOperations>), and a LockedSegmentHolder. The WAL records CollectionUpdateOperations, which include all payload mutations and are replayed on startup. Source: lib/edge/src/lib.rs.

Field Index Types and Filter Conditions

The full-text index ships with per-language stop-word lists ("common words that don't add significant meaning to search queries") to keep token tables small and queries accurate. Source: lib/segment/src/index/field_index/full_text_index/stop_words/README.md.

The following field index families are exposed in the segment crate and are the building blocks of the public filter grammar:

Index family	File path	Filter conditions
Numeric	`lib/segment/src/index/field_index/numeric_index/mod.rs`	ranges, `gte`/`lte`, equality
Map (keyword / bool)	`lib/segment/src/index/field_index/map_index/mod.rs`	exact match, `match any/except`, is null
Full-text	`lib/segment/src/index/field_index/full_text_index/mod.rs`	token match, phrase, BM25 scoring
Geo	`lib/segment/src/index/field_index/geo_index/mod.rs`	radius, polygon, bounding box

These indexes are accessed through the FieldIndex trait and the index_selector module, which picks the appropriate structure for each payload field based on the schema. Source: lib/segment/src/index/field_index/index_selector.rs.

Search and Score-Threshold Filtering

Beyond the structural filter conditions, post-retrieval filtering is applied on the result list. The edge search implementation asserts that scored points are sorted by distance, then locates the first point that falls below score_threshold and truncates the list at that index. The offset is then applied via points.drain(..cmp::min(points.len(), offset)). Source: lib/edge/src/search.rs.
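The truncate-then-offset step described above can be sketched in Python. The sketch assumes higher scores are better and that the list is already sorted best-first; the function name is hypothetical, and the binary search is done with the standard `bisect` module rather than the Rust implementation.

```python
from bisect import bisect_right


def truncate_and_offset(scores, score_threshold, offset):
    """Drop scores below the threshold, then skip the first `offset` results."""
    # The input must already be sorted from best to worst.
    assert all(a >= b for a, b in zip(scores, scores[1:]))
    # bisect works on ascending data, so search the negated scores.
    negated = [-s for s in scores]
    cut = bisect_right(negated, -score_threshold)  # first index below threshold
    kept = scores[:cut]
    return kept[min(len(kept), offset):]


page = truncate_and_offset([0.9, 0.8, 0.5, 0.3], score_threshold=0.5, offset=1)
```

Clamping the offset to the list length mirrors the `cmp::min(points.len(), offset)` guard quoted above, so an offset larger than the surviving list yields an empty page instead of an error.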

The query planner in lib/edge/src/query.rs constructs a RootPlan with a merge_plan and with_payload/with_vector settings, recurses through prefetch stages, and finally calls fill_with_payload_or_vectors to materialize the response. Source: lib/edge/src/query.rs.

Lifecycle, Configuration, and Recent Improvements

The TableOfContent schedules two search runtimes — high_cpu and high_io — to balance CPU-bound and IO-bound filter evaluation. The high_io runtime uses common::defaults::search_thread_count so that segment scans with many indexed predicates can hide latency behind more parallel workers. Source: lib/storage/src/content_manager/toc/runtimes.rs.

The CreateCollectionOperation validator walks each named vector in VectorsConfig::Multi (and each sparse-vector key) and rejects names that fail common::validation::validate_vector_name, returning StorageError::bad_input with a descriptive message. The same validation runs on PUT /collections/{name}/vectors/{vector_name}, so both creation paths reject the same set of bad names. Source: lib/storage/src/content_manager/collection_meta_ops.rs.

Payload mutations and index rebuilds are persisted through the same CollectionUpdateOperations stream that flows through the segment WAL, and storage-layer errors are mapped to a uniform StorageError enum (e.g. not_found, forbidden, timeout, rate_limit_exceeded, checksum_mismatch). Source: lib/storage/src/content_manager/errors.rs.

Access control treats filter-bearing read operations — SearchRequestInternal, RecommendRequestInternal, DiscoverRequestInternal, and GroupRequest — as whole-collection reads; the RBAC tests assert that, for example, a GroupRequest with a with_lookup into a second collection requires read access on both collections, not just the primary one. Source: lib/storage/src/rbac/ops_checks.rs.

Recent community-relevant changes

v1.18.2 — "Don't rebuild payload index if changing on_disk flag" (PR #9138): avoids an expensive full payload index rebuild when only the on-disk flag of an existing index changes.
v1.18.2 — "Clear cache of ID tracker after building a segment" (PR #9137): prevents stale cardinality estimates from being used in query plans built right after optimization.
v1.18.2 — "Log slow operations during shard WAL recovery" (PR #9282): surfaces long-running WAL replays that include payload mutations.
v1.16.1 — "Actively migrate vector, payload and payload index storage from RocksDB into Gridstore on startup" (PR #7551): moves payload index storage onto the newer Gridstore backend.

Distributed System: Sharding, Replication, and Consensus

Related topics: Storage Engine: Segments, Gridstore, and WAL, Operations, Configuration, and Common Failure Modes


Qdrant's distributed layer turns a single node's vector search engine into a clustered, fault-tolerant, horizontally scalable service. This page describes how collections are split into shards, how each shard is replicated across peers, and how state changes are committed through a Raft-based consensus protocol implemented in the lib/storage crate. The design supports per-collection replication factors, dynamic resharding, shard transfer, tenant promotion (introduced in v1.16.0), and live-reload primitives tracked in #9241.

1. Architectural Overview

At the top of the cluster sits the Table of Content (ToC), which owns the set of collections and the consensus handle. The ToC delegates cluster coordination to a ConsensusManager and routes collection-scoped mutations through Raft log entries. Peer state and the Raft role are exposed publicly via lib/storage/src/types.rs.

flowchart TB
    Client[Client / API] --> ToC[TableOfContent]
    ToC -->|consensus proposal| CM[ConsensusManager]
    CM -->|Raft log| Wal[ConsensusOpWal]
    CM -->|commit notify| ToC
    ToC --> Col[Collection Container]
    Col --> RS1[ReplicaSet shard 0]
    Col --> RS2[ReplicaSet shard 1]
    RS1 --> LS[LocalShard]
    RS1 --> R1[RemoteShard peer B]
    RS1 --> R2[RemoteShard peer C]

The ToC loads collections from disk with bounded concurrency (buffer_unordered(load_concurrency)), initializes snapshot telemetry, and opens the alias persistence file. Source: lib/storage/src/content_manager/toc/mod.rs. The ConsensusManager is the only component allowed to mutate cluster-wide metadata; it serializes a SnapshotData struct that bundles CollectionsSnapshot, address_by_id, peer metadata, and cluster metadata. Source: lib/storage/src/content_manager/consensus_manager.rs.

2. Sharding: Distribution of Data

Each collection is partitioned into a fixed number of shards. The CreateCollectionOperation struct carries an optional ShardDistributionProposal describing which peers should host which shards, alongside the collection name and CreateCollection payload. Source: lib/storage/src/content_manager/collection_meta_ops.rs.

The proposal is converted to a CollectionShardDistribution whose shards field maps each shard_id to a set of PeerIds. The distribution algorithm in lib/storage/src/content_manager/shard_distribution.rs distributes replicas as evenly as possible while keeping the minimum number of replicas per shard. The included unit test test_distribution (4 peers, 6 shards, replication factor 1) verifies that shard counts are as even as possible (with 6 shards over 4 peers, no peer holds more than 2 shards or fewer than 1) and that no shard is left without a replica.
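An even spread of this kind can be produced by a simple greedy rule, sketched below in Python: each shard goes to the peers that currently hold the fewest replicas. This is an illustrative stand-in, not the algorithm in shard_distribution.rs, and the function name and tie-breaking by peer ID are assumptions.

```python
def distribute(peers, shard_count, replication_factor):
    """Assign each shard to the least-loaded peers, one replica per peer."""
    load = {peer: 0 for peer in peers}
    distribution = {}
    for shard_id in range(shard_count):
        # Least-loaded peers first; ties broken by peer ID for determinism.
        chosen = sorted(peers, key=lambda p: (load[p], p))[:replication_factor]
        for peer in chosen:
            load[peer] += 1
        distribution[shard_id] = set(chosen)
    return distribution


# Same shape as the unit test described above: 4 peers, 6 shards, factor 1.
layout = distribute(peers=[1, 2, 3, 4], shard_count=6, replication_factor=1)
per_peer = sorted(
    sum(1 for replicas in layout.values() if peer in replicas)
    for peer in [1, 2, 3, 4]
)
```

Six placements over four peers cannot be perfectly uniform, so the best achievable outcome is a maximum difference of one shard between any two peers.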

Key behaviors observed in the source:

The Multi and sparse_vectors vector-name maps are validated for filesystem-unsafe characters at creation time, since the Validate derive on CreateCollection only inspects map *values*, not keys. Source: lib/storage/src/content_manager/collection_meta_ops.rs.
Aliases are persisted separately in AliasPersistence under data.json and read fully into memory at startup. Source: lib/storage/src/content_manager/alias_mapping.rs.
v1.18.0 added APIs to create and delete named vectors in an *existing* collection, addressing long-standing requests like #1132 where users wanted to evolve schema without recreating the collection.

3. Replication: Replica Sets and State Application

Inside a collection, every shard is materialized as a replica set (one or more replicas spread across peers). The ToC's collection container wires three callbacks when a collection is created or rehydrated: change_peer_from_state_callback, request_shard_transfer_callback, and abort_shard_transfer_callback. Source: lib/storage/src/content_manager/toc/mod.rs. These callbacks translate local replica-state transitions into ConsensusOperations::abort_transfer or finish_transfer proposals, ensuring that node-local observations drive cluster-wide agreement.

When a state is applied to a local collection, the container:

Re-creates the collection if it does not exist locally.
Compares the incoming state with the current one; if different, it dispatches an apply_state call.
If the local peer was not the sender, the container logs that transfers must be aborted ("sender was not up to date") and submits a ConsensusOperations::abort_transfer proposal through consensus_proposal_sender. Source: lib/storage/src/content_manager/toc/collection_container.rs.
If the collection was created as part of snapshot application, all of its local shards are marked Dead so the cluster will re-distribute them via shard transfer.

The peer role inside Raft (Follower, Candidate, Leader, PreCandidate) is enumerated in lib/storage/src/types.rs and re-exported to operators via telemetry. Replica-state transitions surface to users through the CollectionMetaOperations variants (SetShardReplicaState, TransferShard) defined in collection_meta_ops.rs.

4. Consensus: Raft Log and Operations

All cluster-wide mutations must pass consensus. The wire format is the ConsensusOperations enum, defined in lib/storage/src/content_manager/mod.rs:

Variant	Purpose
`CollectionMeta(Box<CollectionMetaOperations>)`	Create / update / drop collections, aliases, shards, replica state
`AddPeer { peer_id, uri }`	Register a new peer in the cluster
`RemovePeer(PeerId)`	Deregister a peer
`UpdatePeerMetadata { peer_id, metadata }`	Refresh peer-side metadata
`UpdateClusterMetadata { key, value }`	Cluster-wide key/value (used for shard transfers, etc.)
`RequestSnapshot`	Ask a peer to send a snapshot
`ReportSnapshot { peer_id, status }`	Report outcome of a snapshot transfer

Each entry is serialized as CBOR and decoded via TryFrom<&RaftEntry> lib/storage/src/content_manager/mod.rs. Helper constructors abort_transfer and finish_transfer wrap the lower-level CollectionMetaOperations::TransferShard to keep call sites readable.

The ConsensusManager owns a ConsensusOpWal and an EntryId queue lib/storage/src/content_manager/consensus_manager.rs. It throttles peer-metadata refreshes to once per CONSENSUS_PEER_METADATA_UPDATE_INTERVAL (60 seconds) to avoid flooding the Raft log. Errors are surfaced through the rich StorageError enum (e.g. AlreadyExists, NotFound, ChecksumMismatch, Forbidden, Timeout, RateLimitExceeded) with ergonomic constructors lib/storage/src/content_manager/errors.rs.
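
The throttling of peer-metadata refreshes can be pictured with a minimal sketch. It assumes a simple "at most once per interval" gate; the class name MetadataThrottle and its should_propose method are hypothetical and are not taken from consensus_manager.rs. Only the 60-second interval comes from the text above.

```python
import time

# Interval stated in the text above; everything else here is illustrative.
CONSENSUS_PEER_METADATA_UPDATE_INTERVAL = 60.0  # seconds


class MetadataThrottle:
    """Hypothetical gate that lets one metadata proposal through per interval."""

    def __init__(self, interval=CONSENSUS_PEER_METADATA_UPDATE_INTERVAL,
                 clock=time.monotonic):
        self._interval = interval
        self._clock = clock      # injectable so the gate can be tested
        self._last = None        # time of the last accepted proposal

    def should_propose(self) -> bool:
        now = self._clock()
        if self._last is None or now - self._last >= self._interval:
            self._last = now
            return True
        return False             # too soon: skip, do not append to the log
```

Injecting the clock keeps the gate testable without waiting a full minute.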

Community-tracked gaps related to consensus include:

v1.18.1 fix for "Notify pending consensus ops on snapshot apply" — operations queued before a snapshot completes now resolve correctly.
v1.18.2 fix for "Fix critical WAL bug that could break consensus" — a class of corruptions that could split a cluster was patched.
v1.18.2 improvement to log slow operations during shard WAL recovery, helping diagnose stalls in production.
#9241 tracks a "Live-reload shard" — a read-only shard that can be replaced without draining traffic, foundational for a future serverless deployment option.
A staging-feature-only TestSlowDown operation exists for chaos-testing consensus latency lib/storage/src/content_manager/staging.rs.

Common Failure Modes

Vector name validation gaps: with map-key validation absent from the Validate derive, custom collection schemas are protected only by the imperative checks in CreateCollectionOperation::new. Adding new code paths for collection updates must reproduce those checks.
Snapshot desync: applying a snapshot can leave local shards in a transient Dead state; the ToC handles re-distribution, but operators should monitor shard transfer progress telemetry.
Stale peer metadata: peer metadata updates are rate-limited to once per minute, so a rapid scale-out will queue updates through Raft rather than be reflected instantly.

Storage Engine: Segments, Gridstore, and WAL

Related topics: Distributed System: Sharding, Replication, and Consensus, Quantization: Scalar, Binary, Product, and TurboQuant

Qdrant's storage engine is a layered system designed to balance durability, search performance, and write throughput. It combines three cooperating subsystems: segments as the primary on-disk representation of point data, Gridstore as the modern mmap-backed variable-size value store, and a Write-Ahead Log (WAL) that guarantees durability of updates before they are merged into segments.

Architecture Overview

The Table of Content (TableOfContent) is the central orchestration object that holds all collections, aliases, and consensus state. It dispatches operations, manages shard distribution, and coordinates background optimizers. Collections are split across shards, and each shard owns a SegmentHolder that holds its Segment objects. New writes flow into a WAL, get applied to appendable segments, and are eventually merged into sealed, optimized segments by the optimizer pipeline.

flowchart TB
    Client[Client / API] --> TOC[TableOfContent]
    TOC --> Shard[Shard]
    Shard --> WAL[(Write-Ahead Log)]
    Shard --> SegHolder[SegmentHolder]
    SegHolder --> SegA[Appendable Segment]
    SegHolder --> SegB[Sealed Segment]
    SegA --> Grid[(Gridstore Pages)]
    SegB --> Grid
    SegA --> IDX[Payload & Vector Indexes]
    SegB --> IDX
    WAL -->|replay on recovery| SegA

The three subsystems are deliberately decoupled: WAL handles short-term durability, Gridstore provides the page-based block storage, and segments provide the searchable representation of point data.

Segments — Foundation of Searchable Data

A segment is the unit of search and storage within a shard. All points in a collection share the same payload schema and vector size, so a segment can apply a single index configuration across its contents Source: [lib/collection/README.md]. The collection crate implements every operation needed to manage a single collection, and segments are the building blocks it composes.

Segments come in two main flavors:

Appendable segments — accept new points and small updates in place, backed by the WAL. They are optimized for write throughput.
Sealed segments — read-only; built by merging or "optimizing" appendable segments. They carry fully built HNSW graphs, payload indexes, and quantization data, making searches fast.

The optimizer pipeline progressively promotes appendable segments to sealed segments in the background. v1.17.1 introduced *deferred point updates* with prevent_unoptimized=true, allowing points to be applied in place and optimized only on demand, which reduces write amplification Source: [README.md]. As of v1.18.2, the ID tracker cache is cleared after a segment is built to release memory promptly, and payload indexes are not rebuilt when only the on_disk flag toggles Source: [README.md].

Gridstore — Page-Based Variable-Size Storage

Gridstore is Qdrant's modern storage layer for variable-sized values such as vectors and payload fields. It was introduced to replace the older RocksDB-backed storage in stages — v1.16.1 added active migration from RocksDB into Gridstore on startup, and v1.17.1 made Gridstore flushes non-blocking to reduce search tail latency Source: [README.md].

The design is documented directly in the crate:

IDs are sequential integers starting at 0.
The storage is divided into file pages of fixed size (32 MB).
Data is read and written across multiple pages, all mapped into memory via mmap.
Data blocks are 128 bytes, and a value spans an integer number of contiguous blocks.
Values are compressed with LZ4 before being written.
A bitmask tracks which blocks are occupied; regions (a fixed number of contiguous blocks) track gaps of free blocks in a separate file.
Deletes mark blocks as deleted in memory and update their region; updates are *never* done in place — a new value is inserted and the old one becomes garbage.
Concurrency model: multiple readers and a single writer are supported.
On disk: one file per page, one file for the tracker, one file for the bitmask, and one file for gaps Source: [lib/gridstore/readme.md].
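
The block arithmetic in the list above can be sketched as follows. This is a toy single-page model, not Gridstore's implementation: the names ToyBlockStore, blocks_needed, and put are hypothetical, LZ4 compression and the regions/gaps file are left out, and a first-fit scan over the bitmask is assumed purely for illustration.

```python
BLOCK_SIZE_BYTES = 128                # block size stated in the list above
PAGE_SIZE_BYTES = 32 * 1024 * 1024    # page size stated in the list above
BLOCKS_PER_PAGE = PAGE_SIZE_BYTES // BLOCK_SIZE_BYTES


def blocks_needed(value_len: int) -> int:
    """Number of contiguous blocks a value of value_len bytes occupies."""
    if value_len <= 0:
        raise ValueError("this sketch assumes non-empty values")
    return -(-value_len // BLOCK_SIZE_BYTES)  # ceiling division


class ToyBlockStore:
    """Hypothetical model: a bitmask of used blocks plus an id -> location tracker."""

    def __init__(self, num_blocks: int):
        self.bitmask = [False] * num_blocks   # True means the block is occupied
        self.tracker = {}                     # point id -> (first block, block count)

    def _find_free_run(self, count: int):
        """First-fit search for `count` contiguous free blocks (an assumption)."""
        run_start, run_len = 0, 0
        for i, used in enumerate(self.bitmask):
            if used:
                run_len = 0
                continue
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
        return None

    def put(self, point_id: int, value: bytes) -> int:
        """Insert or update; an update writes a new run and frees the old one."""
        count = blocks_needed(len(value))
        start = self._find_free_run(count)
        if start is None:
            raise MemoryError("no contiguous run of free blocks")
        for i in range(start, start + count):
            self.bitmask[i] = True
        old = self.tracker.get(point_id)
        self.tracker[point_id] = (start, count)
        if old is not None:                   # never overwrite in place
            old_start, old_count = old
            for i in range(old_start, old_start + old_count):
                self.bitmask[i] = False
        return start
```

In this model an update lands in fresh blocks first and only then releases the old run, which mirrors the "never in place" rule described above.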

The Edge variant of Qdrant (qdrant-edge) reuses a subset of these building blocks. An EdgeShard owns a LockedSegmentHolder, a SerdeWal<CollectionUpdateOperations>, and a SaveOnDisk<EdgeConfig>, demonstrating how segments, the WAL, and configuration are bundled in the lightweight runtime Source: [lib/edge/src/lib.rs].

Write-Ahead Log (WAL) — Durability Layer

Qdrant uses WALs at two distinct levels:

Level	Purpose	Module
Per-shard WAL	Records every `CollectionUpdateOperations` update before it is applied to segments; replayed on startup	`lib/shard/src/wal` (consumed by EdgeShard)
Consensus WAL	Persists Raft consensus entries that drive cluster-wide state changes (collection create/drop, alias operations, shard rebalancing)	`consensus_wal.rs`
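
The "record first, apply second, replay on startup" contract of the per-shard WAL can be illustrated with a small sketch. It assumes a JSON-lines file as the log and a dict as the segment state; ToyShard and its operation format are hypothetical stand-ins, not the SerdeWal or CollectionUpdateOperations types.

```python
import json
import os


class ToyShard:
    """Hypothetical shard: append each operation to a log file, then apply it."""

    def __init__(self, wal_path: str):
        self.wal_path = wal_path
        self.points = {}          # stands in for segment contents
        self._replay()            # rebuild state from the log on startup

    def _replay(self) -> None:
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, op: dict) -> None:
        if op["kind"] == "upsert":
            self.points[op["id"]] = op["vector"]
        elif op["kind"] == "delete":
            self.points.pop(op["id"], None)

    def update(self, op: dict) -> None:
        # Durability first: the operation reaches disk before state changes.
        with open(self.wal_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(op) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(op)
```

Reopening a ToyShard on the same path replays the log and arrives at the same points as the instance that wrote it.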

The consensus WAL is opened from a dedicated directory collections_meta_wal under the storage path Source: [lib/storage/src/content_manager/consensus/consensus_wal.rs]. It tracks a compacted_until_raft_index watermark, so entries below this Raft index are considered compacted and ignored on replay. Recovery speed matters: v1.18.2 added logging of slow operations during shard WAL recovery, helping operators diagnose startup stalls Source: [README.md].
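
The effect of the compaction watermark on replay can be shown in a few lines. This is a sketch under the reading given above (entries below the watermark are skipped); the function name entries_to_replay and the (index, operation) tuple layout are assumptions, not the ConsensusOpWal API.

```python
def entries_to_replay(entries, compacted_until_raft_index):
    """Keep only entries at or above the watermark, preserving log order.

    `entries` is an iterable of (raft_index, operation) pairs.
    """
    return [
        (index, op)
        for index, op in entries
        if index >= compacted_until_raft_index
    ]
```

For example, with entries at indexes 1 to 4 and a watermark of 3, only the entries at indexes 3 and 4 would be applied again.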

Updates flow through the consensus manager, which wraps ConsensusOpWal and produces SnapshotData containing collections and aliases when snapshots are taken Source: [lib/storage/src/content_manager/consensus_manager.rs]. Aliases themselves are persisted as a small JSON file managed by AliasPersistence, providing O(1) reads from memory and durable atomic writes Source: [lib/storage/src/content_manager/alias_mapping.rs].

Errors during any of these stages — from WAL writes to segment application — are normalized into StorageError, a single error type used throughout the storage stack. It covers BadInput, NotFound, Timeout, RateLimitExceeded, ShardUnavailable, and others, providing a uniform surface for the API layer Source: [lib/storage/src/content_manager/errors.rs].

Configuration and Operational Notes

Qdrant exposes a StorageConfig consumed by the TableOfContent to control storage behavior Source: [lib/storage/src/content_manager/toc/mod.rs]. The storage stack supports:

Async I/O via io_uring to maximize disk throughput, with quantized multi-vector scorers refactored to use it in v1.18.1 Source: [README.md].
Snapshot download streams that enforce an inactivity timeout of 60 seconds, aborting stalled downloads to avoid hanging recovery Source: [lib/storage/src/content_manager/snapshots/download_tar.rs].
Edge mode with a smaller, embedded configuration that still uses segments and WALs, exposed through EdgeConfig and the EdgeOptimizersConfig builder Source: [lib/edge/src/config/mod.rs].
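
An inactivity timeout differs from a total-duration timeout: the clock restarts with every chunk received. The sketch below illustrates that idea with asyncio; the function name read_with_inactivity_timeout is hypothetical and the code is not derived from download_tar.rs. Only the 60-second default mirrors the value mentioned above.

```python
import asyncio

INACTIVITY_TIMEOUT_S = 60.0  # value mentioned in the list above


async def read_with_inactivity_timeout(chunks, timeout_s=INACTIVITY_TIMEOUT_S):
    """Collect an async stream of byte chunks, aborting if it stalls.

    The timeout applies to each wait for the next chunk, so a slow but
    steadily progressing download is not interrupted.
    """
    data = bytearray()
    iterator = chunks.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=timeout_s)
        except StopAsyncIteration:
            break                 # stream finished normally
        data.extend(chunk)
    return bytes(data)
```

A stalled stream surfaces as asyncio.TimeoutError, which a caller could translate into an aborted recovery attempt.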

A useful design property is that Gridstore *never* updates in place; this eliminates the need for crash-recovery of partial writes inside pages and keeps the on-disk format simple to reason about. Combined with the WAL's append-only model and the segment merge pipeline, this yields a storage engine that can absorb bursts of writes while still serving low-latency searches.

APIs, Clients, and Qdrant Edge

Related topics: Qdrant Overview and System Architecture, Operations, Configuration, and Common Failure Modes

Qdrant exposes its vector search engine to the outside world through a layered set of interfaces: a public REST and gRPC surface, official client SDKs in several languages, a built-in Web UI, and a separate in-process runtime called Qdrant Edge for embedded and offline scenarios. This page describes how those pieces fit together and points to the source files that implement them.

API Surface

REST API

The REST API is the primary public interface. It is defined as an OpenAPI 3.0 specification and is served by an Actix-based HTTP frontend. The OpenAPI document is generated by the tools/schema2openapi toolchain (see tools/schema2openapi/package.json) and is published alongside the repository.

The Actix entry point wires routes from the lib/api crate into the running service. Request handlers ultimately call into the lib/collection crate, which implements the operations needed to manage a single collection. Collection-level behaviour is documented in lib/collection/README.md, which describes the in-memory structure of a collection and its update sequence.

Errors that bubble out of the collection layer are converted into a unified StorageError enum so that both the REST and gRPC layers can render them consistently (see lib/storage/src/content_manager/errors.rs). This means a single error path (for example NotFound, BadInput, or RateLimitExceeded) is surfaced the same way regardless of transport.

gRPC API

For latency-sensitive production traffic, Qdrant also exposes a gRPC interface. The proto definitions live in lib/api/src/grpc/proto/qdrant.proto, and the generated service implementation is in lib/api/src/grpc/qdrant.rs. The gRPC module is registered from lib/api/src/grpc/mod.rs, and both transports are unified in lib/api/src/lib.rs.

flowchart LR
    Client[Client / SDK] -->|HTTP/JSON| REST[REST API - Actix]
    Client -->|gRPC/Protobuf| GRPC[gRPC API]
    REST --> API[lib/api]
    GRPC --> API
    API --> Collection[lib/collection]
    Collection --> Shard[Shard / Segment]

Web UI and Agent Skills

A bundled Web UI provides a visual way to explore collections, exercise the REST API, and observe cluster health (see the description in README.md). Qdrant also ships Agent Skills — installable instruction packs that teach an AI coding assistant the right defaults for tasks like quantization, sharding, and tenant isolation.

Client Libraries

Qdrant maintains official clients for the most common runtimes; the project README lists them alongside community-maintained clients:

Language / Runtime	Repository	Notes
JavaScript / TypeScript	qdrant/qdrant-js	Browser and Node.js
Python	qdrant/qdrant-client	Most widely used client
.NET / C#	qdrant/qdrant-dotnet	.NET 6+
Java	qdrant/java-client	JVM
Go	qdrant/go-client	Official client
PHP	hkulekci/qdrant-php	Community-maintained

These clients are generated from or aligned with the OpenAPI 3.0 specification, so the REST surface stays the single source of truth. Recent server releases have added new client-visible capabilities such as the API to create and delete named vectors in an existing collection (v1.18.0), Relevance Feedback queries (v1.17.0), and the API for detailed optimization progress reports (v1.17.0).

Qdrant Edge

Qdrant Edge is a separate, in-process build of the engine that runs inside the host application. Unlike the client-server deployment — which is started with docker run -p 6333:6333 qdrant/qdrant — Edge has no background service. It is described in lib/edge/publish/README.md as a lightweight vector search engine for embedded devices, autonomous systems, and mobile agents, with on-device retrieval and an optional sync path to Qdrant Cloud.

Architecture

The core type is EdgeShard, defined in lib/edge/src/lib.rs. It owns:

path: PathBuf — the on-disk location of the shard.
config: SaveOnDisk<EdgeConfig> — the edge-specific configuration persisted to EDGE_CONFIG_FILE.
wal: Mutex<SerdeWal<CollectionUpdateOperations>> — a write-ahead log stored under the wal directory, mirroring the server's durability model.
segments: LockedSegmentHolder — the in-memory segment holder reused from lib/shard.

This means Edge reuses the same segment, WAL, and storage primitives as the server; the difference is the absence of a network front-end and consensus layer.

Distribution

The Edge crate is auto-generated from the in-tree modules by publish/amalgamate.py and published to crates.io as qdrant-edge. Build and release instructions are in lib/edge/README.md. A Python binding, qdrant-edge-py, is built with maturin and documented in lib/edge/python/README.md; unlike the Rust crate, it is part of the main workspace.

The example below (from the top-level README.md) shows the Python API:

from qdrant_edge import Distance, EdgeConfig, EdgeVectorParams, EdgeShard, Point, UpdateOperation

shard = EdgeShard.create("./shard", EdgeConfig(
    vectors={"my-vector": EdgeVectorParams(size=4, distance=Distance.Cosine)}
))
shard.update(UpdateOperation.upsert_points([
    Point(id=1, vector={"my-vector": [0.1, 0.2, 0.3, 0.4]}, payload={"color": "red"})
]))

Limitations and Direction

Because Edge has no network or consensus layer, it cannot be used as a cluster node. A server-side equivalent — a read-only shard that supports live reload — is being tracked in community issue #9241 and is described as a building block for a future serverless deployment option.

Operations, Configuration, and Common Failure Modes

Related topics: Distributed System: Sharding, Replication, and Consensus, Storage Engine: Segments, Gridstore, and WAL

Overview

Qdrant is a vector search engine that ships in two deployment shapes: the full Qdrant Server (multi-shard, replicated, Raft-consensus) and Qdrant Edge (in-process EdgeShard for resource-constrained devices). Both share the same underlying storage crates but expose different operational surfaces. This page walks through the configuration entry points, the runtime model, the consensus protocol that protects the meta store, and the canonical failure modes operators encounter in production.

Source: README.md:1-120

Configuration Surfaces

Configuration is split between static YAML files and per-collection parameters resolved at runtime. The README documents the runtime capabilities (SIMD acceleration, GPU indexing, async I/O via io_uring, write-ahead logging) that operators tune through these files. The Edge package adds its own configuration file, EDGE_CONFIG_FILE, which is loaded by EdgeShard from the shard directory.

Source: lib/edge/src/lib.rs:1-60

EdgeConfig is persisted to disk via SaveOnDisk<EdgeConfig> and reloaded on shard open, mirroring the server's behavior of reading config.yaml at boot. The Edge package is built as a separate workspace and published to crates.io; the source README documents the just rs-examples workflow for validating configuration changes locally.

Source: lib/edge/README.md:1-40

Runtime and Threading Model

The server's TableOfContent (toc) maintains a set of dedicated Tokio runtimes for I/O- and CPU-bound work. Two search runtimes are constructed side by side: high_cpu keeps the traditional sizing (high_cpu_blocking_threads) suited to CPU-saturated workloads, while high_io is sized from common::defaults::search_thread_count via high_io_blocking_threads so that I/O-bound searches can hide latency behind more parallel segment scans. An AdaptiveSearchHandle then routes spawn_blocking calls between the two based on observed CPU pressure.

Source: lib/storage/src/content_manager/toc/runtimes.rs:1-40
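
The routing idea can be illustrated with two thread pools and a load signal. This is only a sketch: how AdaptiveSearchHandle actually measures CPU pressure is not described here, so the cpu_pressure callable, the 0.8 threshold, and the AdaptiveRouter class are all hypothetical. It assumes that high pressure favors the smaller high_cpu pool and low pressure favors the larger high_io pool.

```python
from concurrent.futures import ThreadPoolExecutor


class AdaptiveRouter:
    """Hypothetical router between a small CPU pool and a larger I/O pool."""

    def __init__(self, cpu_threads, io_threads, cpu_pressure, threshold=0.8):
        self.high_cpu = ThreadPoolExecutor(max_workers=cpu_threads)
        self.high_io = ThreadPoolExecutor(max_workers=io_threads)
        self._cpu_pressure = cpu_pressure   # callable returning a value in [0, 1]
        self._threshold = threshold         # illustrative cut-off, not Qdrant's

    def spawn(self, fn, *args):
        """Submit fn to one pool and report which pool was chosen."""
        if self._cpu_pressure() >= self._threshold:
            name, pool = "high_cpu", self.high_cpu  # avoid oversubscribing cores
        else:
            name, pool = "high_io", self.high_io    # extra threads hide I/O waits
        return name, pool.submit(fn, *args)

    def shutdown(self):
        self.high_cpu.shutdown()
        self.high_io.shutdown()
```

Passing the load signal in as a callable keeps the routing decision separate from however pressure is measured.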

This split allows operators to keep tail latencies stable on network-attached storage — a pattern reinforced by recent release notes (v1.18.1) that refactored quantized multi-vector scorers for io_uring support.

Source: README.md:60-90

Distributed Operations and Consensus

In multi-node deployments, every mutation that affects the cluster topology or collection meta is serialized into a Raft entry. The ConsensusOperations enum enumerates these mutations: CollectionMeta, AddPeer, RemovePeer, UpdatePeerMetadata, UpdateClusterMetadata, RequestSnapshot, and ReportSnapshot. Each is serialized with serde_cbor and pushed through the consensus WAL before being applied.

Source: lib/storage/src/content_manager/mod.rs:1-80

The consensus manager is responsible for replaying committed entries and translating them into storage changes. apply_conf_change_entry decodes ConfChangeV2 messages, advances the Raft state machine, and stops consensus if a peer removal shrinks the cluster to a single node.

Source: lib/storage/src/content_manager/consensus_manager.rs:1-40

flowchart LR
    Client[Client Request] -->|propose| Raft[Raft Log]
    Raft -->|commit| WAL[ConsensusOpWal]
    WAL -->|replay| Manager[Consensus Manager]
    Manager -->|apply| Storage[TOC / Collections]
    Storage -->|notify| Raft

The ConsensusOpWal wraps a Wal instance rooted at collections_meta_wal and tracks compacted_until_raft_index so that recovery skips already-compacted entries while still honoring Raft's monotonic indexing.

Source: lib/storage/src/content_manager/consensus/consensus_wal.rs:1-30

Common Failure Modes and Error Handling

Qdrant translates low-level failures into a typed StorageError hierarchy that the gRPC and HTTP layers can map to status codes. Operators should be familiar with the following variants:

Variant	Trigger	Operator Action
`BadInput` / `BadRequest`	Strict-mode violation or validation failure	Inspect request payload; fix client
`NotFound`	Missing collection, point, or alias	Verify resource exists; check aliases
`AlreadyExists`	Duplicate collection or alias	Choose a different name or drop existing
`ChecksumMismatch`	Corrupted segment or snapshot	Restore from snapshot; investigate disk
`Timeout`	Operation exceeded its budget	Increase timeout; check `high_io` saturation
`RateLimitExceeded`	Write or read quota exhausted	Apply backoff; raise quota
`ShardUnavailable`	Target shard not in `Active` state	Wait for transfer; check peer health
`ServiceError`	I/O, panic, or internal invariant	Capture logs; file an issue

Source: lib/storage/src/content_manager/errors.rs:1-120
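
On the client side, the operator actions in the table suggest that only some variants are worth retrying. The sketch below assumes that RateLimitExceeded ("apply backoff") and ShardUnavailable ("wait for transfer") are retryable and that everything else should fail fast. StorageErrorLike and call_with_backoff are hypothetical helpers, not part of any Qdrant client, and real clients surface these conditions as HTTP or gRPC status codes rather than variant names.

```python
import time

# Variants treated as transient in this sketch (an assumption based on the table).
RETRYABLE_VARIANTS = {"RateLimitExceeded", "ShardUnavailable"}


class StorageErrorLike(Exception):
    """Hypothetical client-side mirror of a StorageError variant."""

    def __init__(self, variant: str, message: str = ""):
        super().__init__(f"{variant}: {message}")
        self.variant = variant


def call_with_backoff(op, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run op(), retrying transient variants with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return op()
        except StorageErrorLike as err:
            last_attempt = attempt == max_attempts - 1
            if err.variant not in RETRYABLE_VARIANTS or last_attempt:
                raise                        # permanent error or out of attempts
            sleep(base_delay * 2 ** attempt)  # 0.1 s, 0.2 s, 0.4 s, ...
```

Validation failures such as BadInput are re-raised immediately, since retrying an invalid request cannot succeed.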

Several recent releases fix failure modes that bit operators in the field:

WAL bugs that broke consensus: v1.16.2 fixed a critical WAL bug that could break consensus or cause divergence; operators should run at least v1.16.2 in clustered deployments. (release notes v1.16.2)
Slow restarts on large filter updates: v1.15.5 acknowledges update-by-filter and delete-by-filter operations on flush, which prevents the very slow restarts that previously occurred. (release notes v1.15.5)
ID tracker cache staleness: v1.18.2 clears the ID tracker cache after building a segment so that search results remain consistent with on-disk state. (release notes v1.18.2)
Wasted work on index rebuilds: v1.18.2 also avoids rebuilding a payload index when only its on_disk flag changes. (release notes v1.18.2)

Edge-Specific Failure Modes

EdgeShard runs inside the host process, so its failure surface is different. Because there is no Raft layer, a single process crash can leave the WAL in an unflushed state. The EdgeShard::create and EdgeShard::open paths both initialize a SerdeWal<CollectionUpdateOperations> guarded by a parking_lot::Mutex; callers should treat the returned EdgeShard as the only owner of the on-disk directory.

Source: lib/edge/src/lib.rs:30-60

Edge search, while simpler, still validates score thresholds and offset bounds before returning — invalid combinations are debug-asserted and the offending entries truncated. Long-running deployments should monitor disk usage because the WAL is only truncated when the segment constructor finalizes a flush.

Source: lib/edge/src/search.rs:1-30

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 20 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: identity.distribution | github_repo:268163609 | https://github.com/qdrant/qdrant

2. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/qdrant/qdrant/issues/9241

3. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.15.5
User impact: Upgrade or migration may change expected behavior: v1.15.5
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.15.5. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.15.5

4. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.16.0
User impact: Upgrade or migration may change expected behavior: v1.16.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.16.0. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.16.0

5. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.16.1
User impact: Upgrade or migration may change expected behavior: v1.16.1
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.16.1. Context: Observed when using node
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.16.1

6. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.17.0
User impact: Upgrade or migration may change expected behavior: v1.17.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.17.0. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.17.0

7. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.17.1
User impact: Upgrade or migration may change expected behavior: v1.17.1
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.17.1. Context: Observed when using python, cuda
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.17.1

8. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.18.0
User impact: Upgrade or migration may change expected behavior: v1.18.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.18.0. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.18.0

9. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.18.1
User impact: Upgrade or migration may change expected behavior: v1.18.1
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.18.1. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.18.1

10. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: v1.18.2
User impact: Upgrade or migration may change expected behavior: v1.18.2
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v1.18.2. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_release | https://github.com/qdrant/qdrant/releases/tag/v1.18.2

11. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | github_repo:268163609 | https://github.com/qdrant/qdrant

12. Runtime risk: Runtime risk requires verification

Severity: medium
Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: packet_text.keyword_scan | github_repo:268163609 | https://github.com/qdrant/qdrant

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using qdrant with real data or production workflows.

[[Tracking] Live-reload shard](https://github.com/qdrant/qdrant/issues/9241) - github / github_issue
v1.18.2 - github / github_release
v1.18.1 - github / github_release
v1.18.0 - github / github_release
v1.17.1 - github / github_release
v1.17.0 - github / github_release
v1.16.3 - github / github_release
v1.16.2 - github / github_release
v1.16.1 - github / github_release
v1.16.0 - github / github_release
Installation risk requires verification - GitHub / issue
Configuration risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
