# https://github.com/qdrant/qdrant-client Project Manual

Generated at: 2026-06-28 21:40:09 UTC

## Table of Contents

- [Client Architecture & Connection Modes](#page-1)
- [HTTP REST API & gRPC Protocol Layer](#page-2)
- [Local Mode, Persistence & Inference Integration](#page-3)
- [Operations, Troubleshooting & Client Generation Tooling](#page-4)

<a id='page-1'></a>

## Client Architecture & Connection Modes

### Related Pages

Related topics: [HTTP REST API & gRPC Protocol Layer](#page-2), [Local Mode, Persistence & Inference Integration](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)
- [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py)
- [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)
- [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py)
- [qdrant_client/http/api/aliases_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py)
- [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py)
- [qdrant_client/http/api/service_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/service_api.py)
- [qdrant_client/http/api/snapshots_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/snapshots_api.py)
- [qdrant_client/http/api/beta_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py)
- [qdrant_client/http/models/models.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/models/models.py)
- [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md)
</details>

# Client Architecture & Connection Modes

The `qdrant-client` Python package is the official client library for the [Qdrant](https://github.com/qdrant/qdrant) vector search engine. Its architecture is organized around three concerns: a set of generated HTTP API endpoint wrappers, a pydantic-based data model layer, and dual sync/async entry points that share a common request builder. This page documents how those pieces fit together and how the client surfaces different connection modes to user code. Source: [README.md:1-15](https://github.com/qdrant/qdrant-client/blob/main/README.md).

## 1. HTTP API Module Layout

The HTTP transport layer is split by resource domain. Each domain lives in its own module under `qdrant_client/http/api/` and follows an identical class structure so that the public client can compose them uniformly:

| Module | Responsibility |
| --- | --- |
| `collections_api.py` | Create, delete, update, list collections; get/set cluster info per collection. Source: [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py) |
| `points_api.py` | CRUD on points and vectors, batch updates, scroll, count, recommend. Source: [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py) |
| `search_api.py` | Vector search, batch search, search groups, discover points. Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py) |
| `indexes_api.py` | Create / delete payload field indexes. Source: [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py) |
| `aliases_api.py` | List, create, rename, delete collection aliases. Source: [qdrant_client/http/api/aliases_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py) |
| `snapshots_api.py` | Create, list, download, restore and delete snapshots. Source: [qdrant_client/http/api/snapshots_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/snapshots_api.py) |
| `distributed_api.py` | Cluster status, telemetry, shard transfers, collection cluster info. Source: [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py) |
| `service_api.py` | Kubernetes-style health probes (`/healthz`, `/livez`) and Prometheus metrics. Source: [qdrant_client/http/api/service_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/service_api.py) |
| `beta_api.py` | Beta-only maintenance endpoints such as `/issues`. Source: [qdrant_client/http/api/beta_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py) |

Every module exposes three classes: a private `_XxxApi` that holds request builders, a `SyncXxxApi` that returns concrete model instances, and an `AsyncXxxApi` that returns the same models but awaits each call. For example, `search_api.py` exposes `_SearchApi`, `SyncSearchApi`, and `AsyncSearchApi` side by side. Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py).

## 2. Sync vs. Async Connection Modes

The same request shape is exposed twice — once blocking, once coroutine-based — so that user code can pick the model that matches its runtime without rewriting calls. This pattern was the subject of community issue [#157](https://github.com/qdrant/qdrant-client/issues/157) ("AsyncQdrantClient"), which asked for a high-level asynchronous client equivalent to `QdrantClient`. The HTTP modules already implement that pattern at the transport level: a single `_build_for_*` method is reused, and only the public wrapper class differs.

For example, the search endpoint is built identically in both modes. Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py):

```python
class _SearchApi:
    def _build_for_search_points(
        self,
        collection_name: str,
        consistency: m.ReadConsistency = None,
        timeout: int = None,
        search_request: m.SearchRequest = None,
    ):
        path_params = {"collection_name": str(collection_name)}
        query_params = {}
        if consistency is not None:
            query_params["consistency"] = str(consistency)
        if timeout is not None:
            query_params["timeout"] = str(timeout)
        headers = {"Content-Type": "application/json"}
        body = jsonable_encoder(search_request)
        return self.api_client.request(
            type_=m.InlineResponse20019,
            method="POST",
            url="/collections/{collection_name}/points/search",
            headers=headers if headers else None,
            path_params=path_params,
            params=query_params,
            content=body,
        )
```

`SyncSearchApi.search_points` simply invokes the builder and returns the result; `AsyncSearchApi.search_points` `await`s it. Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py). The same dual pattern is present in `points_api.py`, `collections_api.py`, `indexes_api.py`, and `aliases_api.py`.
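The delegation described above can be sketched with standard-library code only. This is an illustration of the pattern, not the generated module: `FakeSyncClient`, `FakeAsyncClient`, `_DemoApi`, `SyncDemoApi`, and `AsyncDemoApi` are hypothetical stand-ins, and the fake clients echo the request instead of sending it.

```python
import asyncio
from typing import Any, Dict, Optional


class FakeSyncClient:
    """Hypothetical blocking client: echoes the request instead of sending it."""

    def request(self, **kwargs: Any) -> Dict[str, Any]:
        return {"sent": kwargs}


class FakeAsyncClient:
    """Hypothetical async client: `request` returns an awaitable."""

    async def request(self, **kwargs: Any) -> Dict[str, Any]:
        return {"sent": kwargs}


class _DemoApi:
    """Shared base: the request is assembled once, for either client type."""

    def __init__(self, api_client: Any) -> None:
        self.api_client = api_client

    def _build_for_search_points(self, collection_name: str, timeout: Optional[int] = None):
        query_params = {}
        if timeout is not None:
            query_params["timeout"] = str(timeout)
        # Yields a dict with the sync client and a coroutine with the async one.
        return self.api_client.request(
            method="POST",
            url=f"/collections/{collection_name}/points/search",
            params=query_params,
        )


class SyncDemoApi(_DemoApi):
    def search_points(self, collection_name: str, timeout: Optional[int] = None):
        # Blocking wrapper: return the builder's result directly.
        return self._build_for_search_points(collection_name, timeout)


class AsyncDemoApi(_DemoApi):
    async def search_points(self, collection_name: str, timeout: Optional[int] = None):
        # Async wrapper: await the coroutine produced by the same builder.
        return await self._build_for_search_points(collection_name, timeout)


sync_result = SyncDemoApi(FakeSyncClient()).search_points("demo", timeout=5)
async_result = asyncio.run(AsyncDemoApi(FakeAsyncClient()).search_points("demo", timeout=5))
```

Both wrappers produce the same request because neither contains request logic of its own; only the way the result is obtained differs.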

```mermaid
flowchart LR
    A[User code<br/>sync or async] --> B[SyncXxxApi / AsyncXxxApi]
    B --> C[_XxxApi._build_for_*]
    C --> D[jsonable_encoder<br/>pydantic v1 or v2]
    D --> E[ApiClient.request]
    E --> F[Qdrant server<br/>REST API]
```

## 3. Request Building, Serialization and Models

Every `_build_for_*` method does the same four steps: assemble `path_params`, assemble `query_params` (coercing values to strings with `str(...)`), set `Content-Type: application/json` when a body is present, and serialize the body through `jsonable_encoder`. Source: [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py), [qdrant_client/http/api/aliases_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py), [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py).
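The four steps can be condensed into one small function. The sketch below is an assumption-laden illustration: `build_request` is a hypothetical helper, `json.dumps` stands in for `jsonable_encoder`, and the request is returned as a plain dict rather than being sent.

```python
import json
from typing import Any, Dict, Optional


def build_request(
    url_template: str,
    path_params: Dict[str, Any],
    optional_query: Dict[str, Any],
    body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    # Step 1: path parameters are stringified and substituted into the URL template.
    url = url_template.format(**{k: str(v) for k, v in path_params.items()})
    # Step 2: only query parameters that were actually provided are kept, as strings.
    params = {k: str(v) for k, v in optional_query.items() if v is not None}
    # Step 3: the JSON content type is set only when there is a body.
    headers = {"Content-Type": "application/json"} if body is not None else None
    # Step 4: the body is serialized (json.dumps stands in for jsonable_encoder).
    content = json.dumps(body) if body is not None else None
    return {"url": url, "params": params, "headers": headers, "content": content}


request = build_request(
    "/collections/{collection_name}/points/search",
    {"collection_name": "demo"},
    {"consistency": None, "timeout": 10},
    {"vector": [0.1, 0.2], "limit": 3},
)
```

Here `consistency` is omitted from the query string because it was left as `None`, while `timeout` is sent as the string `"10"`.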

`jsonable_encoder` is defined in each module's header and detects Pydantic v1 vs v2 at runtime, calling `model.json(...)` or `model.model_dump_json(...)` accordingly. This keeps the client compatible with both Pydantic major versions. Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py). Request bodies and responses are pydantic models defined in `qdrant_client/http/models/models.py`; request-side examples include `SearchRequest`, `CreateCollection`, `CreateFieldIndex`, and `ChangeAliasesOperation`. Source: [qdrant_client/http/models/models.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/models/models.py).
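One way to support both serialization methods is to probe the object for the v2 method first and fall back to the v1 method. The sketch below uses duck typing on two hypothetical stand-in classes (`V1StyleModel`, `V2StyleModel`) so that it runs without Pydantic installed; the library's own detection logic may differ, for example by checking the installed Pydantic version.

```python
import json
from typing import Any


class V1StyleModel:
    """Hypothetical stand-in exposing the Pydantic v1 method name."""

    def __init__(self, **fields: Any) -> None:
        self.fields = fields

    def json(self, **kwargs: Any) -> str:
        return json.dumps(self.fields)


class V2StyleModel:
    """Hypothetical stand-in exposing the Pydantic v2 method name."""

    def __init__(self, **fields: Any) -> None:
        self.fields = fields

    def model_dump_json(self, **kwargs: Any) -> str:
        return json.dumps(self.fields)


def encode_body(obj: Any) -> str:
    # Prefer the v2 method when present, since v2 keeps `.json()` only as a
    # deprecated alias.
    if hasattr(obj, "model_dump_json"):
        return obj.model_dump_json()
    if hasattr(obj, "json"):
        return obj.json()
    # Plain dicts and lists are serialized directly.
    return json.dumps(obj)


v1_payload = encode_body(V1StyleModel(limit=3))
v2_payload = encode_body(V2StyleModel(limit=3))
```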

The `ApiClient` injected into each `_XxxApi` is typed as `Union[ApiClient, AsyncApiClient]`, which is what enables the same builder to back both sync and async modes. Source: [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py).

## 4. Connection-Level Failure Modes From the Community

Several recurring community issues map directly onto the transport layer described above:

- **Read timeouts.** Issue [#380](https://github.com/qdrant/qdrant-client/issues/380) reports `ResponseHandlingException: The read operation timed out`. The exception surfaces when the HTTP transport behind `ApiClient` does not receive a response within the configured timeout. Every builder also accepts a `timeout` argument and forwards it as a query parameter (e.g. `_build_for_search_points`). Source: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py).
- **SSL handshake failures.** Issue [#770](https://github.com/qdrant/qdrant-client/issues/770) shows `SSL_ERROR_SSL: WRONG_VERSION_NUMBER` produced by gRPC's `ssl_transport_security` when the URL scheme (HTTP vs HTTPS) mismatches the server's TLS configuration. Configuring `https://` (or `grpc.ssl_channel_credentials` on the gRPC client) is required when the server is fronted by TLS.
- **Validation drift in `upload_collection`.** Issue [#17](https://github.com/qdrant/qdrant-client/issues/17) highlights that the high-level uploader does not validate that the incoming `vector_size` matches the target collection's vector configuration, so mismatched payloads are silently dropped by the server.
- **Python warnings on import.** Issue [#983](https://github.com/qdrant/qdrant-client/issues/983) reports a `SyntaxWarning: invalid escape sequence '\&'` originating in `qdrant_client/http/models/models.py:758`. The fix is to declare the docstring as a raw string. Source: [qdrant_client/http/models/models.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/models/models.py).
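The raw-string fix from the last item can be reproduced in isolation. The sketch below compiles two small source snippets and records the warnings the compiler emits; `escape_warnings` is a hypothetical helper, and the warning category depends on the interpreter (`SyntaxWarning` on Python 3.12 and later, `DeprecationWarning` on earlier 3.x releases).

```python
import warnings
from typing import List


def escape_warnings(source: str) -> List[type]:
    """Compile `source` and return the categories of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<docstring-demo>", "exec")
    return [w.category for w in caught]


# A regular docstring containing backslash + ampersand, an invalid escape.
plain = 'def f():\n    """match a \\& b"""\n'
# The same docstring declared as a raw string, where backslashes are literal.
raw = 'def f():\n    r"""match a \\& b"""\n'

plain_warnings = escape_warnings(plain)
raw_warnings = escape_warnings(raw)
```

Only the non-raw docstring triggers a warning, which is why prefixing the affected docstrings with `r` removes the message at import time.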

Release v1.18.0 added per-request custom headers for tracing and named-vector APIs. Per-request headers are the next layer above the URL/query builder, so they plug into the same `ApiClient.request` invocation that the `_build_for_*` methods call. Source: [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).

## See Also

- Search, recommend, and discover endpoints: [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)
- Point and vector operations: [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)
- Collection lifecycle and configuration: [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py)
- Payload field indexes: [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py)
- Cluster and shard transfer operations: [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py)
- Pydantic data models used by every API module: [qdrant_client/http/models/models.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/models/models.py)

---

<a id='page-2'></a>

## HTTP REST API & gRPC Protocol Layer

### Related Pages

Related topics: [Client Architecture & Connection Modes](#page-1), [Operations, Troubleshooting & Client Generation Tooling](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md)
- [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py)
- [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)
- [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)
- [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py)
- [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py)
- [qdrant_client/http/api/aliases_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py)
- [qdrant_client/http/api/service_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/service_api.py)
- [qdrant_client/http/api/beta_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py)
</details>

# HTTP REST API & gRPC Protocol Layer

## Overview

The HTTP REST API & gRPC Protocol Layer is the wire-protocol boundary of `qdrant-client`. It encapsulates every network operation the high-level `QdrantClient` performs when talking to a Qdrant server, hiding the differences between the two transports. The README positions the client as "a comprehensive client library for Qdrant" supporting "REST and gRPC", with type hints for all API methods, the ability to run the same code in local mode, and minimal dependencies ([README.md:1-20](https://github.com/qdrant/qdrant-client/blob/main/README.md)).

The protocol layer is organized into per-resource API modules under `qdrant_client/http/api/`. Each module groups related endpoints and exposes both synchronous and asynchronous variants through parallel class hierarchies.

## HTTP API Module Architecture

Every endpoint module follows the same internal layout: a private `_XxxApi` base that holds `_build_for_<endpoint>` methods, plus public `SyncXxxApi` and `AsyncXxxApi` subclasses that wrap those builders in blocking or `await`ed HTTP calls respectively. For example, in `qdrant_client/http/api/points_api.py`, the `_PointsApi` class declares `_build_for_batch_update` and `_build_for_upsert_points`, which assemble the request and hand it to `self.api_client.request(...)` ([points_api.py:1-60](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)). The async surface is provided by `AsyncPointsApi(_PointsApi)`, which wraps the same `_build_for_*` methods in `async def` wrappers returning awaited typed responses ([points_api.py:60-120](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)).

The same pattern is applied uniformly across modules — `AsyncSearchApi` / `SyncSearchApi` in [search_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py), `AsyncCollectionsApi` / `SyncCollectionsApi` in [collections_api.py:1-60](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py), and `AsyncBetaApi` / `SyncBetaApi` in [beta_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py).

```mermaid
flowchart LR
    A["High-level QdrantClient"] --> B["HTTP API modules<br/>(collections, points, search, ...)"]
    B --> C["SyncXxxApi<br/>(blocking calls)"]
    B --> D["AsyncXxxApi<br/>(awaited calls)"]
    C --> E["ApiClient.request()"]
    D --> F["AsyncApiClient.request()"]
    E --> G["REST over HTTP"]
    F --> G
    F -.-> H["gRPC channel"]
```

## Endpoint Modules

Each module maps a resource family in Qdrant's REST API:

| Module | Responsibility | Example Endpoints |
|---|---|---|
| `collections_api.py` | Collection lifecycle and named vectors | `create_collection`, `delete_collection`, `create_vector_name`, `delete_vector_name` |
| `points_api.py` | Point CRUD, payload and vector mutations | `upsert_points`, `batch_update`, `set_payload`, `update_vectors`, `scroll`, `get_point` |
| `search_api.py` | Vector search, recommend, discover, query | `search_points`, `recommend_points`, `discover_points`, `search_matrix_pairs`, `search_point_groups` |
| `indexes_api.py` | Payload field indexing | `create_field_index`, `delete_field_index` |
| `distributed_api.py` | Cluster and shard operations | `cluster_status`, `cluster_telemetry`, `remove_peer`, `recover_current_peer` |
| `aliases_api.py` | Collection aliases | `get_collection_aliases`, `get_collections_aliases`, `update_aliases` |
| `service_api.py` | Health, liveness, metrics | `healthz`, `livez`, `metrics` |
| `beta_api.py` | Diagnostics endpoints | `get_issues`, `clear_issues` |

Source: [collections_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py), [points_api.py:1-60](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py), [search_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py), [indexes_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py), [distributed_api.py:1-80](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py), [aliases_api.py:1-40](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py), [service_api.py:1-60](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/service_api.py), [beta_api.py:1-60](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py).

The v1.18.0 release note on "create/delete named vector API after creating a collection" corresponds to `_build_for_create_vector_name` and `_build_for_delete_vector_name` in [collections_api.py:60-100](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py).

## Request Construction and Common Parameters

Every `_build_for_*` method follows the same recipe: assemble `path_params`, populate `query_params` from optional inputs, optionally build a JSON `body` via `jsonable_encoder`, then call `self.api_client.request(type_=..., method=..., url=..., headers=..., path_params=..., params=query_params, content=body)`. For example, `_build_for_search_points` in [search_api.py:1-50](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py) maps `consistency` and `timeout` to query strings, encodes a `SearchRequest` body, and sets `Content-Type: application/json`.

The same parameters recur across mutating endpoints:

- `timeout` (int) — per-request deadline propagated as a query parameter, present in nearly every endpoint (e.g., [points_api.py:30-50](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py), [search_api.py:10-30](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)).
- `wait` (bool) — serialized as `"true"/"false"` lowercase via `str(wait).lower()`, e.g., [collections_api.py:70-90](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py).
- `ordering` (`WriteOrdering`) — write ordering hint serialized via `str(ordering)` for mutating calls, e.g., [indexes_api.py:30-50](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py).
- `consistency` (`ReadConsistency`) — read consistency hint used on read endpoints, e.g., [points_api.py:50-70](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py).
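The serialization rules in the list above are small but easy to get wrong, particularly for booleans. The following sketch assumes a hypothetical helper, `encode_query_params`, that covers three of the parameters; `ordering` is typed as a plain string here for simplicity rather than as `WriteOrdering`.

```python
from typing import Dict, Optional


def encode_query_params(
    wait: Optional[bool] = None,
    timeout: Optional[int] = None,
    ordering: Optional[str] = None,
) -> Dict[str, str]:
    params: Dict[str, str] = {}
    # Compare against None rather than truthiness so that wait=False is still sent.
    if wait is not None:
        # str(True) is "True"; lower-casing yields the "true"/"false" form.
        params["wait"] = str(wait).lower()
    if timeout is not None:
        params["timeout"] = str(timeout)
    if ordering is not None:
        params["ordering"] = str(ordering)
    return params


write_params = encode_query_params(wait=True, timeout=30, ordering="strong")
no_wait_params = encode_query_params(wait=False)
```

Testing with `is not None` matters: a truthiness check would drop `wait=False` and leave the server to apply its default instead of the caller's explicit choice.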

## gRPC and Async Transport Notes

The HTTP API is one half of the dual-protocol design. The `AsyncXxxApi` classes defer to an `AsyncApiClient` for REST calls, which is the transport-level counterpart of the async API requested in issue #157 ("AsyncQdrantClient"). The choice between gRPC and REST is made above the endpoint modules, when the high-level client is constructed (the README documents a `prefer_grpc` option), not in the endpoint modules themselves.

## Common Failure Modes

Several widely reported issues surface at this protocol layer:

- **Read timeouts** (issue #380, "The read operation timed out") are produced when the server's socket-level read exceeds the client's timeout — the same `timeout` query parameter surfaced by `_build_for_*` methods is the user-facing knob.
- **SSL handshake failures** (issue #770) appear when connecting to a TLS-enabled server with mismatched TLS settings; this originates from the transport used by `ApiClient`, not the endpoint modules.
- **`SyntaxWarning: invalid escape sequence` noise** (issue #983) originates from docstring/description fields in `qdrant_client/http/models/models.py` and is unrelated to request building, but the warning only surfaces when the model layer is imported by the API layer.

## See Also

- Local mode and connection options: see the client initialization patterns in [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).
- High-level `QdrantClient` orchestration that calls into these HTTP modules.
- Qdrant server REST/gRPC reference for the underlying wire contract.

---

<a id='page-3'></a>

## Local Mode, Persistence & Inference Integration

### Related Pages

Related topics: [Client Architecture & Connection Modes](#page-1), [Operations, Troubleshooting & Client Generation Tooling](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [qdrant_client/local/qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/qdrant_local.py)
- [qdrant_client/local/async_qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/async_qdrant_local.py)
- [qdrant_client/local/local_collection.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/local_collection.py)
- [qdrant_client/local/persistence.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/persistence.py)
- [qdrant_client/local/distances.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/distances.py)
- [qdrant_client/local/sparse.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/sparse.py)
- [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md)
- [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)
- [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)
</details>

# Local Mode, Persistence & Inference Integration

## Overview

`qdrant-client` provides two execution modes for a `QdrantClient` instance: a remote HTTP/gRPC client that talks to a Qdrant server, and a **local mode** that runs the Qdrant engine *in-process* so that the same Python client can be used without a running server. Local mode supports two storage strategies, fully in-memory (`:memory:`) and on-disk persistence (`path="..."`), and integrates an optional **Inference API** that embeds raw text or images before they reach the vector store. Together these features let users prototype, unit-test, and ship small workloads from a single dependency, as documented in the [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).

```mermaid
flowchart LR
    User[User Code] --> Client{QdrantClient}
    Client -- ":memory:" --> Mem[(In-Memory Engine)]
    Client -- "path=..." --> Disk[(Persistent Storage)]
    Disk -.serialize/load.-> Persistence[persistence.py]
    Client -- "Document / Image" --> Infer[Inference API]
    Infer -- "FastEmbed (local)" --> FE[ONNX Runtime]
    Infer -- "Remote" --> QC[Qdrant Cloud Models]
    FE --> Client
    QC --> Client
    Mem --> QdrantAPI[local_collection.py]
    Disk --> QdrantAPI
    QdrantAPI --> Dist[distances.py / sparse.py]
```

## Local Mode

### In-Memory and Persistent Initialization

The local engine is selected by passing a `location` argument to the client constructor. The README demonstrates both forms:

```python
from qdrant_client import QdrantClient

# In-memory, lost on process exit
client = QdrantClient(":memory:")

# On-disk persistence
client = QdrantClient(path="path/to/db")
```

Source: [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).

When `location=":memory:"` is set, the in-process engine in `qdrant_client/local/qdrant_local.py` (and the asynchronous counterpart `async_qdrant_local.py`) holds all collections in RAM. When a filesystem `path` is given, the same engine is paired with the persistence layer in `qdrant_client/local/persistence.py`, which serializes collection state so that the data survives process restarts. The `local_collection.py` module implements the per-collection data structure (point ids, vectors, payloads, and the search logic over them) that both the sync and async local clients wrap with a uniform API.

### Sync vs. Async Surface

The sync and async local clients expose the same high-level methods (`create_collection`, `upload_collection`, `query_points`, etc.) as the remote client, which is why local mode is a drop-in target for tests. The HTTP-API stubs visible in files such as `qdrant_client/http/api/points_api.py` and `qdrant_client/http/api/search_api.py` are *not* invoked in local mode — those generated request builders belong to the remote path and the in-process engine in `qdrant_client/local/` short-circuits the network layer entirely.

Community context: issue #157 ("AsyncQdrantClient") requested a first-class async surface that mirrors the sync client; in local mode, that surface is provided by `async_qdrant_local.py`. Source: [qdrant_client/local/async_qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/async_qdrant_local.py).

## Persistence Layer

### What Is Persisted

`qdrant_client/local/persistence.py` is responsible for the on-disk format used by `QdrantClient(path=...)`. The local engine treats `path` as a directory that holds each collection's stored points together with collection metadata. On startup, `QdrantClient` loads any existing data from `path`, so a later launch resumes with the same points, vectors, and payloads.

### Community Failure Modes

| Issue | Symptom | Mitigation in local mode |
|-------|---------|--------------------------|
| #17 — `upload_collection` silent mismatch | Uploads 1 M records with wrong `vector_size`; nothing is added | In-process engine validates vector dimensions per point; mismatched data raises immediately rather than being silently dropped |
| #380 — "The read operation timed out" | Remote HTTP client raises `ResponseHandlingException` | Local mode has no network call, so socket timeouts cannot occur |
| #770 — `SSL handshake failed` | gRPC transport cannot negotiate TLS | Local mode is bound to the Python process, no SSL needed |

Source: [qdrant_client/local/persistence.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/persistence.py) and [qdrant_client/local/qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/qdrant_local.py).

## Inference Integration

### FastEmbed (Local Inference)

The Inference API allows callers to pass `models.Document(text=..., model=...)` directly to upload/query methods; the client embeds them before the data ever reaches storage. The README shows the canonical setup:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(":memory:")
model_name = "sentence-transformers/all-MiniLM-L6-v2"

client.create_collection(
    "demo_collection",
    vectors_config=models.VectorParams(
        size=client.get_embedding_size(model_name),
        distance=models.Distance.COSINE,
    ),
)
```

Source: [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).

FastEmbed is installed as an extra (`pip install qdrant-client[fastembed]`) and uses ONNX Runtime so embedding generation runs on CPU or GPU locally. `get_embedding_size(model_name)` is a client helper that introspects the chosen model so the collection's `VectorParams.size` is set automatically — this addresses the dimension-mismatch problem reported in issue #17, because the same model is used both to size the collection and to embed the documents that are uploaded.
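When vectors are produced outside the client (so that `get_embedding_size` cannot guarantee the match), a pre-upload check catches the mismatch from issue #17 before any data is sent. The helper below, `check_vector_sizes`, is hypothetical and not part of `qdrant-client`; it is a minimal sketch of such a check.

```python
from typing import Iterable, List, Sequence


def check_vector_sizes(
    vectors: Iterable[Sequence[float]], expected_size: int
) -> List[Sequence[float]]:
    """Return the vectors as a list, raising ValueError on the first size mismatch."""
    checked = []
    for index, vector in enumerate(vectors):
        if len(vector) != expected_size:
            raise ValueError(
                f"vector {index} has size {len(vector)}, expected {expected_size}"
            )
        checked.append(vector)
    return checked


# All vectors match the collection's configured size.
good = check_vector_sizes([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], expected_size=3)

# The second vector is too short, so the check fails before any upload.
try:
    check_vector_sizes([[0.1, 0.2, 0.3], [0.4, 0.5]], expected_size=3)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

In practice, `expected_size` would be the same value passed as `VectorParams.size` when the collection was created.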

### Remote Inference (Qdrant Cloud)

For users who prefer server-side embedding, the same `models.Document` interface can route requests to models hosted in Qdrant Cloud. The client transparently switches backends based on the model identifier, so application code is identical between local FastEmbed and remote inference.

### Distances and Sparse Vectors

The in-process engine in `local_collection.py` relies on `qdrant_client/local/distances.py` for the actual similarity math (Cosine, Dot, Euclid) and on `qdrant_client/local/sparse.py` for sparse-vector scoring. These modules run entirely in the user's process, which means local mode is reproducible across hosts (no network jitter) and CI-friendly — both useful properties for the tests and tutorials that drove issues #17 and #157.
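For readers unfamiliar with the three metrics, the sketch below computes them in pure Python and ranks the same candidates under each one. It is not the code in `distances.py`; the function names and sample vectors are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product of the two vectors after normalizing each to unit length.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclid(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
candidates: Dict[str, List[float]] = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [3.0, 4.0]}

# Cosine and Dot are similarities (higher is better); Euclid is a distance (lower is better).
by_cosine = sorted(candidates, key=lambda k: cosine(query, candidates[k]), reverse=True)
by_dot = sorted(candidates, key=lambda k: dot(query, candidates[k]), reverse=True)
by_euclid = sorted(candidates, key=lambda k: euclid(query, candidates[k]))
```

The rankings differ: Dot rewards the long vector `c`, while Cosine ignores magnitude and Euclid penalizes it, which is why the collection's `distance` setting has to match how the embeddings were trained.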

## Common Pitfalls and Best Practices

- **Validate dimensions up front.** Even though `upload_collection` historically did not validate vector size (issue #17), pairing `client.get_embedding_size(model_name)` with `create_collection` guarantees a match. Source: [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md).
- **Prefer async local client inside async code paths.** Use `AsyncQdrantClient` rather than wrapping the sync client, as motivated by issue #157. Source: [qdrant_client/local/async_qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/async_qdrant_local.py).
- **Pin FastEmbed for reproducible embeddings.** The 1.18.0 release bumped `fastembed` to v0.8.0; lock the version in your requirements to avoid silent embedding-space drift.
- **Local mode is for small-to-medium workloads.** In-process engines are bound by host RAM; for production scale, switch to a remote Qdrant server and the HTTP/gRPC clients in [qdrant_client/http/api/](https://github.com/qdrant/qdrant-client/tree/main/qdrant_client/http/api).
- **Mind console noise from generated code.** Issue #983 noted `SyntaxWarning` messages from `qdrant_client/http/models/models.py` due to non-raw docstrings. These are cosmetic and only fail a run when warnings are promoted to errors (for example with `-W error::SyntaxWarning` in CI); they go away after upgrading to a release that has raw-string docstrings.

## See Also

- [README.md](https://github.com/qdrant/qdrant-client/blob/main/README.md) — installation options, FastEmbed quickstart, and migration notes.
- [qdrant_client/local/qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/qdrant_local.py) — synchronous in-process engine.
- [qdrant_client/local/async_qdrant_local.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/async_qdrant_local.py) — asynchronous in-process engine.
- [qdrant_client/local/persistence.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/persistence.py) — on-disk serialization for `path=` mode.
- [qdrant_client/local/distances.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/distances.py) and [qdrant_client/local/sparse.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/local/sparse.py) — similarity and sparse-vector primitives.

---

<a id='page-4'></a>

## Operations, Troubleshooting & Client Generation Tooling

### Related Pages

Related topics: [Client Architecture & Connection Modes](#page-1), [HTTP REST API & gRPC Protocol Layer](#page-2), [Local Mode, Persistence & Inference Integration](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [qdrant_client/common/version_check.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/common/version_check.py)
- [qdrant_client/common/client_warnings.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/common/client_warnings.py)
- [qdrant_client/common/client_exceptions.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/common/client_exceptions.py)
- [qdrant_client/conversions/conversion.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/conversions/conversion.py)
- [qdrant_client/conversions/common_types.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/conversions/common_types.py)
- [qdrant_client/_pydantic_compat.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/_pydantic_compat.py)
- [qdrant_client/http/api/search_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/search_api.py)
- [qdrant_client/http/api/points_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/points_api.py)
- [qdrant_client/http/api/collections_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/collections_api.py)
- [qdrant_client/http/api/aliases_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/aliases_api.py)
- [qdrant_client/http/api/indexes_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/indexes_api.py)
- [qdrant_client/http/api/snapshots_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/snapshots_api.py)
- [qdrant_client/http/api/distributed_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/distributed_api.py)
- [qdrant_client/http/api/service_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/service_api.py)
- [qdrant_client/http/api/beta_api.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/api/beta_api.py)
- [qdrant_client/http/models/models.py](https://github.com/qdrant/qdrant-client/blob/main/qdrant_client/http/models/models.py)
</details>

## Overview

The `qdrant-client` package is the Python SDK for the Qdrant vector database, and its HTTP layer is largely generated. The files in the repository's `qdrant_client/http/api/` directory are produced by an OpenAPI-driven code generator rather than written by hand. Recognizing this is the first step toward effective operations and troubleshooting, because most of the "tooling" surface is the output of a deterministic template.

Every HTTP API module follows the same top-level structure: a `# flake8: noqa E501` directive (suppressing long-line warnings for generated output), a Pydantic-aware `to_json` helper, a shared `jsonable_encoder`, and three class forms: `_XxxApi` (a request-builder base), `AsyncXxxApi` (awaitable wrappers), and `SyncXxxApi` (blocking wrappers). Source: [qdrant_client/http/api/search_api.py:1-58]().

## Generated Layer Architecture

The HTTP client is organized as a thin request-building layer on top of an `ApiClient` transport. Each `_XxxApi` constructor accepts an `api_client` typed as `Union[ApiClient, AsyncApiClient]`, and each endpoint method follows a uniform pattern: assemble `path_params`, conditionally populate `query_params` and `headers`, encode the body with `jsonable_encoder`, and dispatch through `self.api_client.request(...)`. Source: [qdrant_client/http/api/points_api.py:28-70]().
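
The request-building pattern described above can be sketched as follows. This is a minimal illustration, not the generated code: `RecordingApiClient`, `_SketchPointsApi`, and `SyncSketchPointsApi` are hypothetical names, and the transport only records calls instead of sending them.

```python
from typing import Any, Dict, List, Optional


class RecordingApiClient:
    """Hypothetical stand-in for the transport: records calls instead of sending them."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def request(self, *, method: str, url: str, path_params: Dict[str, str],
                params: Dict[str, str], headers: Dict[str, str],
                content: Optional[str] = None) -> Dict[str, Any]:
        call = {
            "method": method,
            "url": url.format(**path_params),  # fill the path template
            "params": params,
            "headers": headers,
            "content": content,
        }
        self.calls.append(call)
        return call


class _SketchPointsApi:
    """Request builder in the style described above (assumed shape, for illustration)."""

    def __init__(self, api_client: RecordingApiClient) -> None:
        self.api_client = api_client

    def _build_for_delete_points(self, collection_name: str,
                                 wait: Optional[bool] = None,
                                 body: Optional[str] = None) -> Dict[str, Any]:
        # 1. Path parameters are always present.
        path_params = {"collection_name": str(collection_name)}

        # 2. Query parameters are added only when the caller supplied them.
        query_params: Dict[str, str] = {}
        if wait is not None:
            query_params["wait"] = str(wait).lower()

        # 3. Headers are populated only when there is a body to describe.
        headers: Dict[str, str] = {}
        if body is not None:
            headers["Content-Type"] = "application/json"

        # 4. Dispatch through the shared transport.
        return self.api_client.request(
            method="POST",
            url="/collections/{collection_name}/points/delete",
            path_params=path_params,
            params=query_params,
            headers=headers,
            content=body,
        )


class SyncSketchPointsApi(_SketchPointsApi):
    """Blocking wrapper: simply forwards to the builder."""

    def delete_points(self, collection_name: str, wait: Optional[bool] = None,
                      body: Optional[str] = None) -> Dict[str, Any]:
        return self._build_for_delete_points(collection_name, wait=wait, body=body)
```

Keeping the builder separate from the sync and async wrappers is what lets one template emit both flavors: only the thin wrapper differs, while the parameter assembly is shared.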

The serialization helpers are duplicated verbatim across every generated module, which is a strong signal that they are produced from a shared mustache/jinja template:

```python
def jsonable_encoder(obj, include=None, exclude=None, by_alias=True,
                     skip_defaults=None, exclude_unset=True, exclude_none=True):
    if hasattr(obj, "json") or hasattr(obj, "model_dump_json"):
        return to_json(obj, include=include, exclude=exclude, by_alias=by_alias,
                       exclude_unset=bool(exclude_unset or skip_defaults),
                       exclude_none=exclude_none)
    return obj
```

Source: [qdrant_client/http/api/collections_api.py:20-35](). The Pydantic v1/v2 fork is resolved at import time via `PYDANTIC_V2 = PYDANTIC_VERSION.startswith("2.")`, which selects between `model_dump_json` and `json`. Source: [qdrant_client/http/api/aliases_api.py:13-17](); [qdrant_client/_pydantic_compat.py]() is the canonical compatibility shim used elsewhere in the project.
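
As a rough sketch of how such a version fork behaves, the snippet below dispatches on a version string and calls whichever serializer exists for that major version. The function names and the two `Fake*Model` classes are hypothetical stand-ins so the example runs without Pydantic installed; they are not part of the library.

```python
from typing import Any


def is_pydantic_v2(version: str) -> bool:
    """Same style of check as the import-time flag: the major version picks the branch."""
    return version.startswith("2.")


def to_json_sketch(model: Any, pydantic_version: str, **kwargs: Any) -> str:
    """Call the serializer method that exists for the given Pydantic major version."""
    if is_pydantic_v2(pydantic_version):
        return model.model_dump_json(**kwargs)
    return model.json(**kwargs)


class FakeV1Model:
    """Hypothetical stand-in exposing only the v1-style method."""

    def json(self, **kwargs: Any) -> str:
        return '{"api": "v1"}'


class FakeV2Model:
    """Hypothetical stand-in exposing only the v2-style method."""

    def model_dump_json(self, **kwargs: Any) -> str:
        return '{"api": "v2"}'


print(to_json_sketch(FakeV1Model(), "1.10.15", by_alias=True))
print(to_json_sketch(FakeV2Model(), "2.7.1", by_alias=True))
```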

The following diagram summarizes the layered structure of the generated HTTP client:

```mermaid
flowchart TD
    User[User Code] --> SyncAPI[SyncXxxApi]
    User --> AsyncAPI[AsyncXxxApi]
    SyncAPI --> Base[_XxxApi request builder]
    AsyncAPI --> Base
    Base --> Encoder[jsonable_encoder + to_json]
    Base --> Client[ApiClient / AsyncApiClient]
    Encoder --> Pydantic[Pydantic v1 or v2 model]
    Client --> Server[(Qdrant HTTP Server)]
```

## Common Operational Issues and Their Sources

### Escape-sequence warnings in generated models

Issue #983 reports `SyntaxWarning: invalid escape sequence '\&'` originating from `qdrant_client/http/models/models.py:758`. Because this file is generated from the OpenAPI specification, fixing the warning in place does not last: the next regeneration reintroduces it. The durable fix is either to patch the upstream OpenAPI description so that the backslash is escaped (`\\&`), or to add a post-generation hook to the code generator that converts the affected string literals, such as those in `Field(..., description=...)`, into raw strings. Source: [qdrant_client/http/models/models.py:758]().
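
The mechanism behind the warning can be reproduced with the standard library alone. The sketch below compiles three small source strings and collects any warnings; `escape_warnings` and the sample strings are illustrative and do not come from the repository. Older Python versions report the same condition under a different warning category, so the sketch collects messages rather than categories.

```python
import warnings
from typing import List


def escape_warnings(source: str) -> List[str]:
    """Compile `source` and return the messages of any warnings raised while compiling."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<sketch>", "exec")
    return [str(w.message) for w in caught]


# The doubled backslashes below exist only to embed the samples in this file.
plain_source = 'description = "a \\& b"'       # compiles the text: "a \& b"
raw_source = 'description = r"a \\& b"'        # compiles the text: r"a \& b"
escaped_source = 'description = "a \\\\& b"'   # compiles the text: "a \\& b"

for label, src in [("plain", plain_source), ("raw", raw_source), ("escaped", escaped_source)]:
    print(label, escape_warnings(src))
```

Both the raw-string form and the escaped-backslash form compile without a warning, which is why either remediation works as long as it is applied where the file is generated.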

### Read timeouts and socket timeouts

Issue #380 surfaces `ResponseHandlingException: The read operation timed out`. Many endpoints in the generated layer accept a `timeout` query parameter that is forwarded as a stringified integer. For example, `search_points` builds `query_params["timeout"] = str(timeout)` when the caller supplies a value. Source: [qdrant_client/http/api/search_api.py:65-68](). When `timeout` is left as `None`, the parameter is omitted entirely and the server default applies. Operators should distinguish between the client-side HTTP read timeout (configured on the `ApiClient`) and the server-side timeout passed per request; these are independent dials, and a timeout error can persist until both are tuned.

### SSL handshake failures

Issue #770 reports `SSL_ERROR_SSL: WRONG_VERSION_NUMBER` when the gRPC client attempts a TLS handshake against an endpoint that is not speaking TLS. The generated HTTP API layer performs no TLS handshake of its own; it delegates to `api_client.request(...)`. The fix therefore belongs in the connection configuration (correct scheme, port, and CA bundle), not in the generated API files. Source: [qdrant_client/http/api/service_api.py:36-50]() (representative of the request-dispatch pattern).

### Silent data corruption in `upload_collection`

Issue #17 reports that `upload_collection` does not validate vector dimensionality before streaming records, leading to silent ingestion of mismatched data. The generated layer exposes the raw batch endpoint (`POST /collections/{collection_name}/points/batch`) but does not enforce vector-shape invariants. Source: [qdrant_client/http/api/points_api.py:33-70](). The remediation is to perform a client-side check against the collection's `VectorParams.size` retrieved from `get_collection` before initiating the upload.
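
A client-side check of this kind might look like the sketch below. Both helper names are hypothetical, and `expected_size` is assumed to have been read from the collection configuration (for example, the `VectorParams.size` mentioned above) before the call.

```python
from typing import Dict, Iterable, Sequence, Tuple


def find_dimension_mismatches(
    vectors: Iterable[Tuple[int, Sequence[float]]], expected_size: int
) -> Dict[int, int]:
    """Return {point_id: actual_length} for every vector whose length differs."""
    return {
        point_id: len(vector)
        for point_id, vector in vectors
        if len(vector) != expected_size
    }


def validate_before_upload(
    vectors: Iterable[Tuple[int, Sequence[float]]], expected_size: int
) -> None:
    """Raise before any data is sent if at least one vector has the wrong size."""
    mismatches = find_dimension_mismatches(vectors, expected_size)
    if mismatches:
        raise ValueError(
            f"expected vectors of size {expected_size}, found mismatches: {mismatches}"
        )


batch = [(1, [0.1, 0.2, 0.3]), (2, [0.4, 0.5]), (3, [0.6, 0.7, 0.8])]
print(find_dimension_mismatches(batch, expected_size=3))
```

Because the check consumes the iterable, a streaming source would need to be materialized or validated batch by batch before each batch is sent.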

## Configuration Surface and Custom Headers

The v1.18.0 release introduced per-request custom headers for tracing (PR #1173). The generated `headers = {}` dictionary in each builder is the integration point — callers can inject `X-Trace-Id` or other observability headers by extending the request context before dispatch. Source: [qdrant_client/http/api/indexes_api.py:40-44]() (showing the `headers` construction site); [qdrant_client/http/api/snapshots_api.py:33-37]() (analogous pattern in the snapshots module).

Version compatibility between client and server is checked in `qdrant_client/common/version_check.py`, while deprecation and mismatch messages are emitted by `qdrant_client/common/client_warnings.py`. Raised exceptions are defined in `qdrant_client/common/client_exceptions.py`, and the high-level `QdrantClient` translates low-level gRPC and HTTP errors into these types.
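
To make the idea of a client/server version check concrete, here is a minimal sketch. The rule it applies (major versions must match and minor versions may differ by at most one) is an assumption chosen for illustration, and the function names are hypothetical; consult `version_check.py` for the rule the library actually uses.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.18.0' or 'v1.18.0'."""
    major, minor, *_ = version.lstrip("v").split(".")
    return int(major), int(minor)


def is_compatible_sketch(client_version: str, server_version: str) -> bool:
    """Assumed rule: same major version, minor versions at most one apart."""
    client_major, client_minor = parse_major_minor(client_version)
    server_major, server_minor = parse_major_minor(server_version)
    return client_major == server_major and abs(client_minor - server_minor) <= 1


print(is_compatible_sketch("1.18.0", "1.17.2"))
print(is_compatible_sketch("1.18.0", "1.16.0"))
```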

## Pydantic Compatibility and Model Regeneration

Because every generated API file calls `to_json` with a Pydantic version fork, the entire generated layer depends on `qdrant_client/_pydantic_compat.py` for correct behavior under both Pydantic v1 and v2. When upgrading Pydantic, the regeneration pipeline must be rerun so that the `PYDANTIC_V2` branch and the `jsonable_encoder` defaults (`by_alias=True, exclude_unset=True, exclude_none=True`) remain consistent with the model definitions in `qdrant_client/http/models/models.py`. Source: [qdrant_client/http/api/distributed_api.py:24-38]() (representative encoder block); [qdrant_client/conversions/conversion.py]() and [qdrant_client/conversions/common_types.py]() provide the type bridges that the generated layer relies on for gRPC ↔ HTTP translation.

## See Also

- Qdrant HTTP API reference (collections, points, search, snapshots, indexes, aliases, distributed, service, beta)
- Pydantic v1/v2 migration notes for `qdrant_client/_pydantic_compat.py`
- Release notes for v1.18.0 (per-request headers, turboquantization, named vector CRUD)

---


## Pitfall Log

Project: qdrant/qdrant-client

Summary: Found 7 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/qdrant/qdrant-client/issues/935

## 2. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/qdrant/qdrant-client

## 3. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/qdrant/qdrant-client

## 4. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/qdrant/qdrant-client

## 5. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/qdrant/qdrant-client

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/qdrant/qdrant-client

## 7. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/qdrant/qdrant-client

