# https://github.com/infiniflow/infinity Project Manual

Generated at: 2026-06-26 22:07:12 UTC

## Table of Contents

- [System Overview & Architecture](#page-1)
- [Indexing, Data Types & Hybrid Search](#page-2)
- [Query Pipeline: Parser, Planner, Optimizer & Executor](#page-3)
- [Client SDKs, Storage, Deployment & Known Issues](#page-4)

<a id='page-1'></a>

## System Overview & Architecture

### Related Pages

Related topics: [Indexing, Data Types & Hybrid Search](#page-2), [Query Pipeline: Parser, Planner, Optimizer & Executor](#page-3), [Client SDKs, Storage, Deployment & Known Issues](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md)
- [python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)
- [client/cpp/README.md](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)
- [go/README.md](https://github.com/infiniflow/infinity/blob/main/go/README.md)
- [third_party/zsv/README.md](https://github.com/infiniflow/infinity/blob/main/third_party/zsv/README.md)
- [third_party/cppjieba/README.md](https://github.com/infiniflow/infinity/blob/main/third_party/cppjieba/README.md)
- [third_party/mlas/README.md](https://github.com/infiniflow/infinity/blob/main/third_party/mlas/README.md)
</details>

# System Overview & Architecture

Infinity is an AI-native database positioned as a single-binary deployment target for retrieval-augmented generation (RAG) and large-scale embedding workloads. This page summarizes the high-level architecture, the supported client surfaces, and the structural patterns used to integrate third-party libraries, as evidenced by the repository's top-level documentation and module READMEs.

## Purpose and Scope

The repository describes Infinity as a database designed for "next-generation AI applications," with explicit emphasis on hybrid retrieval and low-latency search. The top-level README frames the system around three operational goals: extremely low query latency, powerful hybrid search, and ease of use across multiple SDKs. Source: [README.md:3-19]()

According to the top-level README, Infinity targets two headline performance figures:

- **0.1 ms query latency** and **15K+ QPS** on million-scale vector datasets.
- **1 ms latency** and **12K+ QPS** in full-text search across 33 million documents.

Source: [README.md:21-31]()

The intended user base includes developers building RAG applications, semantic search services, and pipelines that combine dense embeddings, sparse embeddings, tensors, and full-text filtering. Source: [README.md:5-11]()

## High-Level Architecture

The repository is organized as a single C++ server binary that exposes both a Python SDK and an HTTP API, with a Go SDK under active development. The structure follows a layered design: clients at the edge, protocol parsing, query planning, execution, and storage. Source: [README.md:33-67]()

```mermaid
graph TB
    subgraph Clients["Client Surfaces"]
        PySDK["Python SDK<br/>(infinity_sdk)"]
        GoSDK["Go SDK<br/>(interface layer)"]
        CppClient["C++ Client<br/>(under development)"]
        HTTP["HTTP API"]
    end

    subgraph Core["Infinity Server (single binary)"]
        Query["Query Planner"]
        Hybrid["Hybrid Search Engine"]
        Indices["Index Layer<br/>(Plaid, SMVE, Secondary)"]
        Storage["Storage Layer<br/>(WAL, MemIndex, RoaringBitmap)"]
    end

    subgraph ThirdParty["Third-Party Libraries"]
        Jieba["cppjieba<br/>(Chinese tokenization)"]
        ZSV["zsv<br/>(CSV parsing)"]
        MLAS["MLAS<br/>(GEMM kernels)"]
        Curlpp["curlpp<br/>(HTTP client)"]
    end

    PySDK --> Query
    GoSDK --> Query
    CppClient --> Query
    HTTP --> Query
    Query --> Hybrid
    Hybrid --> Indices
    Indices --> Storage
    Hybrid -.uses.-> Jieba
    Storage -.uses.-> ZSV
    Indices -.uses.-> MLAS
    HTTP -.uses.-> Curlpp
```

The Python SDK is the primary client surface, documented as "intuitive" and intended for end-user application code. Source: [README.md:51-55]() and [python/infinity_sdk/README.md:51-55](). The Go SDK currently implements only the interface layer (type definitions, error models, index configuration types, query types, and a connection pool), while the Thrift integration layer remains pending. Source: [go/README.md:55-67](). The C++ client is explicitly marked "Under development." Source: [client/cpp/README.md:3-5]()

## Component Layers

### Search and Index Subsystem

The search subsystem supports a hybrid of dense embeddings, sparse embeddings, tensors, and full-text search, alongside attribute filtering. Multiple rerankers are supported, including RRF, weighted sum, and ColBERT. Source: [README.md:33-43]()
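
To make the weighted-sum reranker named above concrete, the following is a minimal sketch in plain Python. It is not Infinity's implementation: the function names, the min-max normalization step, and the tie-breaking rule are all assumptions chosen for illustration, and the source does not say how Infinity normalizes per-channel scores.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, every document gets 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_sum_fusion(
    channels: List[Dict[str, float]],
    weights: List[float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Combine per-channel scores as sum(weight * normalized_score).

    Hypothetical helper: normalization and tie-breaking (by document id)
    are illustrative assumptions, not documented Infinity behavior.
    """
    if len(channels) != len(weights):
        raise ValueError("one weight is required per channel")
    fused: Dict[str, float] = {}
    for scores, weight in zip(channels, weights):
        for doc, score in min_max_normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weight * score
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


dense_scores = {"a": 4.0, "b": 2.0, "c": 0.0}
text_scores = {"b": 10.0, "c": 5.0}
top = weighted_sum_fusion([dense_scores, text_scores], [0.5, 0.5], top_n=2)
```

A document missing from a channel simply contributes nothing from that channel, which is one common convention; other systems assign a floor score instead.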

Indexing is modular. The roadmap for v0.7.0 lists a fast PLAID tensor index (#3264), an SMVE tensor index (#3373), mmap-backed secondary indices, and secondary functional indices for JSON members (#3360). Source: [README.md:69-79]() and the [v0.7.0 release notes](https://github.com/infiniflow/infinity/releases/tag/v0.7.0). The repository also vendors MLAS, which supplies processor-optimized GEMM kernels suited to dense vector operations. Source: [third_party/mlas/README.md:1-3]()

### Storage and Durability

The storage layer is the substrate for the in-memory index structures that queries depend on. Community-reported bugs reference `src/storage/common/roaring_bitmap.cppm` for bitmap-backed indexing (issue #3359) and a `WalManager::NewFlush()` lifecycle path that may exit before draining its queue during shutdown (issue #3375). These references indicate that the storage module is organized around memory indexes, persistent bitmap indices, and a Write-Ahead Log that runs on a dedicated flush thread.

### Third-Party Integration

The repository integrates several third-party libraries under `third_party/`:

| Library | Role |
|---------|------|
| cppjieba | Chinese word segmentation for full-text indexing ([third_party/cppjieba/README.md:1-10]()) |
| zsv | High-performance CSV parsing for bulk ingest ([third_party/zsv/README.md:1-15]()) |
| MLAS | Processor-optimized GEMM kernels for vector math ([third_party/mlas/README.md:1-3]()) |
| curlpp | C++ HTTP client wrapper ([third_party/curlpp/src/curlpp/Easy.cpp:1-10]()) |

## Common Failure Modes and Community Concerns

The community has surfaced several architectural pain points that operators should be aware of:

1. **WAL shutdown semantics.** `WalManager::NewFlush()` may exit before draining the queue, risking data loss for in-flight entries on graceful shutdown (issue #3375).
2. **Upgrade regressions.** A segfault was reported when upgrading from `v0.7.0-dev5` to `v0.7.0` in a Kubernetes deployment, indicating instability in pre-release-to-release transitions (issue #3377).
3. **Bitmap index invariants.** A `RoaringBitmap::SetTrue: row_index >= count_` error surfaces when `AppendMemIndex` processes rows beyond the current bitmap allocation, suggesting that row-count tracking must stay synchronized with block appends (issue #3359).
4. **Backward compatibility.** Issue #3371 asks whether releases after 0.7.0 will remain backward compatible, an open question that affects long-running deployments.
5. **Initialization race conditions.** Users have observed `(2008, 'Infinity is initing')` connection refusals during restarts when upstream tools (e.g., RAGFlow) reconnect before the server has finished booting (issue #2523).

These observations describe the system's current operational profile rather than design intent. Most are concentrated around the storage, persistence, and lifecycle management paths, which are the layers most sensitive to single-binary deployment choices.
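
The initialization race in item 5 is usually handled on the client side by retrying with backoff. The sketch below shows that pattern under stated assumptions: `ServerError`, `connect_with_retry`, and the `connect` callable are hypothetical names, not part of the Infinity SDK. Only the error code 2008 and its message come from the reported issue.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Code taken from the reported "(2008, 'Infinity is initing')" message.
INITING_ERROR_CODE = 2008


class ServerError(Exception):
    """Hypothetical exception type carrying a numeric server error code."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(code, message)
        self.code = code
        self.message = message


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect(), retrying with exponential backoff while the server is initializing."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except ServerError as exc:
            last_attempt = attempt == attempts - 1
            # Only the "still initializing" code is retried; anything else propagates.
            if exc.code != INITING_ERROR_CODE or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real delays. In application code, `connect` would wrap whatever call opens the connection and translate the SDK's own error into the retry decision.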

## See Also

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md)
- [ROADMAP 2025 (issue #2393)](https://github.com/infiniflow/infinity/issues/2393)
- [v0.7.0 Release Notes](https://github.com/infiniflow/infinity/releases/tag/v0.7.0)
- [Python SDK Reference](https://infiniflow.org/docs/dev/pysdk_api_reference)

---

<a id='page-2'></a>

## Indexing, Data Types & Hybrid Search

### Related Pages

Related topics: [System Overview & Architecture](#page-1), [Query Pipeline: Parser, Planner, Optimizer & Executor](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md)
- [python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)
- [src/storage/knn_index/plaid/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/plaid/README.md)
- [src/storage/invertedindex/fst/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/invertedindex/fst/README.md)
- [gui/lib/databse-interface.ts](https://github.com/infiniflow/infinity/blob/main/gui/lib/databse-interface.ts)
- [go/README.md](https://github.com/infiniflow/infinity/blob/main/go/README.md)
- [client/cpp/README.md](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)
- [src/storage/knn_index/emvb/emvb_index.cppm](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/emvb/emvb_index.cppm)
- [src/storage/knn_index/sparse/bmp_handler.cppm](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/sparse/bmp_handler.cppm)
- [third_party/mlas/README.md](https://github.com/infiniflow/infinity/blob/main/third_party/mlas/README.md)
</details>

# Indexing, Data Types & Hybrid Search

## Overview

Infinity positions itself as an AI-native database that combines vector search, sparse retrieval, tensor indexing, and full-text search in a single engine. As stated in [README.md](https://github.com/infiniflow/infinity/blob/main/README.md), it "Supports a hybrid search of dense embedding, sparse embedding, tensor, and full text, in addition to filtering." The same capabilities are re-affirmed in [python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md). The README reports sub-millisecond latency for vector workloads (0.1 ms at million-scale) and roughly 1 ms latency for full-text search over 33M documents [Source: [README.md]()].

This page documents the indexing modules, the data-type surface exposed to clients, and how these components combine to deliver hybrid search. It is bounded to features that can be substantiated from the in-tree source files and community-tracked milestones.

## Indexing Systems

### K-Nearest-Neighbor (KNN) Indexes

Infinity ships multiple KNN index families under `src/storage/knn_index/`:

| Index Family | Module | Key Traits (per source) |
|---|---|---|
| HNSW | `knn_hnsw` | Graph-based ANN, used as the default dense-vector index |
| IVF | `knn_ivf` | Inverted-file coarse quantizer + fine ranking |
| DiskANN | `knn_diskann` | SSD-friendly graph index for large datasets |
| EMVB | `emvb` | Product/OPQ quantization (8/16-bit), higher accuracy, no mmap |
| Sparse BMP | `sparse/bmp_handler` | Inverted-index for sparse embeddings via BMP postings |
| PLAID | `plaid` | Residual 2/4-bit quantization, mmap-friendly, full batch search |

[src/storage/knn_index/plaid/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/plaid/README.md) explicitly compares PLAID to EMVB and notes that PLAID adds mmap support and full batch search while sacrificing some accuracy in exchange for ~1/8–1/16 of the original memory footprint.
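
The 1/8 to 1/16 figure is consistent with simple bit counting, as the following sketch shows. It assumes the original embeddings are stored as 32-bit floats and counts only the per-dimension residual codes; centroid identifiers, codebooks, and any alignment padding are ignored, so real footprints will differ. The function names are illustrative.

```python
def bytes_per_vector(dim: int, bits_per_dim: int) -> float:
    """Storage for one vector when each dimension takes bits_per_dim bits."""
    return dim * bits_per_dim / 8


def compression_ratio(bits_per_dim: int, original_bits: int = 32) -> float:
    """Fraction of the original size kept after quantization.

    original_bits=32 assumes float32 inputs, which the source does not state.
    """
    return bits_per_dim / original_bits


dim = 128  # illustrative embedding width, not taken from the source
full_precision = bytes_per_vector(dim, 32)
two_bit = bytes_per_vector(dim, 2)
four_bit = bytes_per_vector(dim, 4)
```

Under these assumptions a 128-dimensional vector shrinks from 512 bytes to 32 bytes with 2-bit residuals or 64 bytes with 4-bit residuals.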

### Full-Text Inverted Index

The full-text search subsystem uses an FST (finite-state transducer) library documented in [src/storage/invertedindex/fst/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/invertedindex/fst/README.md). The README describes it as "a simplified C++ reimplementation of the BurntSushi/fst library," with the goal of "storing and searching *very large* sets or maps (i.e., billions)" through a compressed finite-state-machine representation that supports both prefix queries and automaton-based queries such as regular expressions or Levenshtein distance. The FST is part of the full-text path, for which the top-level README reports 12K+ QPS [Source: [README.md]()].
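
To illustrate what a prefix query returns, the sketch below runs one over a sorted term list using binary search. This is a deliberately simpler stand-in, not an FST: it shares the ordered-key property that makes prefix ranges contiguous, but it has none of the compression an FST provides. The function name and sample terms are illustrative.

```python
from bisect import bisect_left
from typing import List


def prefix_search(sorted_terms: List[str], prefix: str) -> List[str]:
    """Return every term that starts with prefix.

    Relies on sorted_terms being sorted: all matches then form one
    contiguous run starting at the first term >= prefix.
    """
    start = bisect_left(sorted_terms, prefix)
    matches: List[str] = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # past the contiguous run of matches
        matches.append(term)
    return matches


terms = sorted(["index", "infinity", "info", "inverted", "vector"])
hits = prefix_search(terms, "inf")
```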

### Hardware Acceleration

GEMM-heavy index paths use MLAS, Microsoft’s processor-optimized matrix multiplication library, vendored under [third_party/mlas/README.md](https://github.com/infiniflow/infinity/blob/main/third_party/mlas/README.md). MLAS provides "processor optimized GEMM kernels and platform specific threading code" used for distance computations and quantization steps inside the KNN indexes.

## Data Types

The README advertises "Rich data types" covering strings, numerics, and vectors [Source: [README.md]()]. Client-facing type definitions appear in two places:

- The TypeScript GUI client defines `ITableColumns`, `ITableIndex`, and `ITableSegment` in [gui/lib/databse-interface.ts](https://github.com/infiniflow/infinity/blob/main/gui/lib/databse-interface.ts), where each column carries a `name`, `type`, and `default`, and each index carries `columns`, `index_name`, and `index_type`. The response envelope `IResponseBody` pairs an `error_code` with arbitrary payload fields.
- The Go SDK exposes "Index types and configurations (index.go)" per its implementation-status table in [go/README.md](https://github.com/infiniflow/infinity/blob/main/go/README.md), alongside query types and a connection pool.

The v0.7.0 release notes and roadmap ([Issue #2393](https://github.com/infiniflow/infinity/issues/2393)) extend this surface with the **JSON data type and related functions**, plus **secondary index on JSON members** ([Issue #3360](https://github.com/infiniflow/infinity/issues/3360)). Boolean type support and low-cardinality optimization for secondary indexes also landed in v0.7.0.

## Hybrid Search Architecture

Hybrid search is the composition of multiple retrieval channels followed by a reranking stage. The channels and rerankers that are documented in-tree are:

- Dense embedding (KNN indexes: HNSW, IVF, DiskANN, EMVB, PLAID)
- Sparse embedding (BMP postings)
- Tensor (PLAID fast tensor index, SMVE tensor index planned per [Issue #3373](https://github.com/infiniflow/infinity/issues/3373))
- Full-text (FST-backed inverted index)
- Filtering (predicate pushdown applied alongside each channel)

Rerankers listed in [README.md](https://github.com/infiniflow/infinity/blob/main/README.md) include RRF, weighted sum, and ColBERT.

```mermaid
flowchart LR
    Q[Query] --> D[Dense<br/>HNSW / IVF / DiskANN<br/>EMVB / PLAID]
    Q --> S[Sparse<br/>BMP postings]
    Q --> T[Tensor<br/>PLAID / SMVE]
    Q --> F[Full-text<br/>FST inverted index]
    Q --> P[Filter<br/>predicate]
    D --> R[Reranker<br/>RRF / Weighted / ColBERT]
    S --> R
    T --> R
    F --> R
    P --> R
    R --> O[Ranked results]
```

The diagram reflects the four retrieval channels mentioned in the top-level README (dense, sparse, tensor, and full-text), the named rerankers, and the PLAID tensor index described in [src/storage/knn_index/plaid/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/plaid/README.md). Filtering is drawn as a fifth branch for completeness, but it restricts the candidates of the other channels rather than producing a ranked list of its own. SMVE is a tracked feature request ([Issue #3373](https://github.com/infiniflow/infinity/issues/3373)) and is shown for completeness rather than as a shipped capability.
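
Reciprocal rank fusion (RRF), the first reranker listed, merges ranked lists using only rank positions: each document scores the sum of `1 / (k + rank)` over the lists it appears in. The sketch below is a generic version of that formula, not Infinity's code. The constant `k = 60` is the value commonly used in the literature and is an assumption here, as are the function name and the tie-breaking rule.

```python
from typing import Dict, List, Optional, Tuple


def rrf_fuse(
    rankings: List[List[str]],
    k: int = 60,
    top_n: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Fuse ranked lists with score(d) = sum over lists of 1 / (k + rank_of_d).

    Ranks are 1-based. k=60 is a conventional default, assumed for illustration.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered if top_n is None else ordered[:top_n]


dense_ranking = ["a", "b", "c"]
text_ranking = ["b", "c", "d"]
fused = rrf_fuse([dense_ranking, text_ranking])
```

Because RRF ignores raw scores, it needs no normalization across channels whose scores live on different scales, which is one reason it is a popular default for hybrid search.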

## Client Access and Known Limitations

The Python SDK ([python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)) is the primary client. The C++ client ([client/cpp/README.md](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)) is "Under development," and the Go SDK ([go/README.md](https://github.com/infiniflow/infinity/blob/main/go/README.md)) has completed the "Interface Layer" but the "Thrift Integration Layer" remains pending.

Community-reported issues relevant to indexing and search:

- **RoaringBitmap bounds check** during `AppendMemIndex` ([Issue #3359](https://github.com/infiniflow/infinity/issues/3359)): `row_index >= count_` can be raised when a memindex append exceeds the bitmap’s pre-allocated row count.
- **WAL flush shutdown race** ([Issue #3375](https://github.com/infiniflow/infinity/issues/3375)): `WalManager::NewFlush()` may exit before draining the queue, risking lost entries on shutdown.
- **Backward compatibility after v0.7.0** ([Issue #3371](https://github.com/infiniflow/infinity/issues/3371)): whether later releases will stay compatible with deployments created on v0.7.0 is an open question raised by users.

## See Also

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md) — High-level feature list and performance claims.
- [src/storage/knn_index/plaid/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/knn_index/plaid/README.md) — Detailed PLAID index design and EMVB comparison.
- [src/storage/invertedindex/fst/README.md](https://github.com/infiniflow/infinity/blob/main/src/storage/invertedindex/fst/README.md) — FST-based full-text indexing internals.
- [Issue #2393 — ROADMAP 2025](https://github.com/infiniflow/infinity/issues/2393) — Tracked roadmap including JSON type, PLAID, SMVE, and secondary indexes.
- [Issue #3373 — SMVE tensor index](https://github.com/infiniflow/infinity/issues/3373) — Feature request for spherical-anchor multi-vector retrieval.

---

<a id='page-3'></a>

## Query Pipeline: Parser, Planner, Optimizer & Executor

### Related Pages

Related topics: [System Overview & Architecture](#page-1), [Indexing, Data Types & Hybrid Search](#page-2), [Client SDKs, Storage, Deployment & Known Issues](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/parser/sql_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/sql_parser.cppm)
- [src/parser/expr_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/expr_parser.cppm)
- [src/parser/search_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/search_parser.cppm)
- [src/planner/logical_planner.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/logical_planner.cppm)
- [src/planner/binder/query_binder.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/binder/query_binder.cppm)
- [src/planner/optimizer.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/optimizer.cppm)
</details>

# Query Pipeline: Parser, Planner, Optimizer & Executor

## Overview

Infinity processes every incoming query, whether issued through the Python SDK, the HTTP API, or the native Thrift interface, through a multi-stage pipeline that converts user-facing statements into an executable plan. The pipeline is composed of three logical layers, each implemented as a separate module:

| Stage | Module | Source |
|-------|--------|--------|
| Parsing | SQL / expression / search parsers | `src/parser/` |
| Planning & binding | Logical planner and query binder | `src/planner/` |
| Optimization & execution | Optimizer feeding the executor | `src/planner/optimizer.cppm` and downstream |

The pipeline is what enables Infinity to expose both classic SQL semantics and AI-native search operations (dense, sparse, tensor, full-text, JSON) under a unified query model. Source: [README.md](https://github.com/infiniflow/infinity/blob/main/README.md).

## Parser Layer

The parser is the entry point for any query text. It is split into three cooperative sub-parsers rather than one monolithic grammar, so that each dialect can evolve independently:

- **SQL parser** — handles DDL (CREATE/DROP database, table, index) and DML (SELECT, INSERT, DELETE, UPDATE) statements. Source: [src/parser/sql_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/sql_parser.cppm).
- **Expression parser** — parses filter expressions, computed projections, and scalar expressions that appear inside DML. Source: [src/parser/expr_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/expr_parser.cppm).
- **Search parser** — parses the AI-native search clauses such as vector similarity, full-text MATCH, sparse vector, and tensor match operators. Source: [src/parser/search_parser.cppm](https://github.com/infiniflow/infinity/blob/main/src/parser/search_parser.cppm).

This split reflects a recurring design goal in the codebase: keep the SQL grammar focused on relational structure, and let the search/expr parsers carry the AI-specific syntax (e.g. `MATCH VECTOR`, `SPARSE`, `TENSOR`, `FUSION`). The output of the parser layer is a parse tree (AST) that is consumed by the binder.

```mermaid
flowchart LR
    A[Client Request] --> B[sql_parser.cppm]
    A --> C[expr_parser.cppm]
    A --> D[search_parser.cppm]
    B --> E[AST]
    C --> E
    D --> E
    E --> F[query_binder.cppm]
    F --> G[logical_planner.cppm]
    G --> H[optimizer.cppm]
    H --> I[Executor]
```

## Binder and Logical Planner

Once the parser produces an AST, the **query binder** performs name resolution and semantic validation: it resolves table and column names against the catalog, type-checks expressions, and binds index references. Source: [src/planner/binder/query_binder.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/binder/query_binder.cppm).

After binding, the **logical planner** translates the bound AST into a tree of logical operators (e.g. `Scan`, `Filter`, `Projection`, `Join`, `Fusion`, `Match`). Logical operators are still storage- and execution-agnostic, which is what allows the same plan to describe both relational and AI search workloads. Source: [src/planner/logical_planner.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/logical_planner.cppm).

The logical plan is the contract between the front-end (parser + binder) and the back-end (optimizer + executor). Because hybrid search in Infinity is expressed as additional logical operators (such as a `Fusion` over multiple `Match` operators), the planner is the natural place where features like RRF or weighted-sum reranking get modeled. Source: [README.md](https://github.com/infiniflow/infinity/blob/main/README.md).
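
A `Fusion` over multiple `Match` operators can be pictured as a small operator tree. The sketch below models such a tree with a single hypothetical `PlanNode` class and prints it with indentation. The class, its fields, and the `hybrid_plan` helper are illustrative assumptions and do not mirror the classes in `src/planner/`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical logical operator: a name, an optional detail, and child operators."""

    name: str
    detail: str = ""
    children: List["PlanNode"] = field(default_factory=list)


def render(node: PlanNode, depth: int = 0) -> List[str]:
    """Return one indented line per operator, parents before children."""
    label = f"{node.name}({node.detail})" if node.detail else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def hybrid_plan(table: str, fusion_method: str) -> PlanNode:
    """Build Projection -> Fusion -> [Match(dense), Match(text)], each over a Scan."""
    dense = PlanNode("Match", "dense", [PlanNode("Scan", table)])
    text = PlanNode("Match", "text", [PlanNode("Scan", table)])
    fusion = PlanNode("Fusion", fusion_method, [dense, text])
    return PlanNode("Projection", "*", [fusion])


plan_lines = render(hybrid_plan("docs", "rrf"))
```

Keeping the reranking choice as an attribute of the `Fusion` node is what lets the same tree shape describe RRF or weighted-sum queries without changing the operators beneath it.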

## Optimizer

The optimizer consumes a logical plan and produces an optimized logical plan that is equivalent in semantics but cheaper to execute. Source: [src/planner/optimizer.cppm](https://github.com/infiniflow/infinity/blob/main/src/planner/optimizer.cppm).

Typical optimizations applied here include:

- **Predicate pushdown**: moving filters as close to the storage scan as possible so that the index engine can prune rows early.
- **Index selection**: choosing which secondary or full-text index to use for a given `Match` or filter clause, taking into account the index kinds listed in the v0.7.0 roadmap (HNSW, IVF, BMP, secondary on JSON, fast Plaid tensor index, SMVE tensor index). SMVE is tracked as a feature request (issue #3373) rather than a confirmed shipped capability. Source: [issue #2393 ROADMAP 2025](https://github.com/infiniflow/infinity/issues/2393).
- **Fusion reordering**: for hybrid queries, ordering the per-stream retrievals so that the most selective stream runs first.

The optimizer sits between the planner and the executor, and the executor then walks the optimized plan, dispatching each operator to the appropriate storage, index, or expression engine.
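
Predicate pushdown can be sketched as a tree rewrite that removes each `Filter` from where the planner placed it and re-attaches it directly above every `Scan` beneath. The toy below assumes a single-table plan whose predicates reference only scanned columns. The `Node` class and function names are hypothetical, and the rewrite ignores the interaction between filters and top-k cutoffs that a real optimizer has to respect.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical plan operator used only for this illustration."""

    op: str
    arg: str = ""
    children: List["Node"] = field(default_factory=list)


def push_filters_down(node: Node, pending: Optional[List[str]] = None) -> Node:
    """Return a new plan with every Filter moved to sit directly above each Scan."""
    pending = list(pending or [])
    if node.op == "Filter":
        # Drop the Filter here and carry its predicate toward the leaves.
        return push_filters_down(node.children[0], pending + [node.arg])
    if node.op == "Scan":
        result = node
        for predicate in pending:
            result = Node("Filter", predicate, [result])
        return result
    return Node(node.op, node.arg, [push_filters_down(c, pending) for c in node.children])


def show(node: Node) -> str:
    """Compact one-line rendering: Op(arg)[child, child]."""
    label = f"{node.op}({node.arg})" if node.arg else node.op
    if not node.children:
        return label
    return f"{label}[{', '.join(show(c) for c in node.children)}]"


plan = Node("Projection", "*", [
    Node("Filter", "year > 2020", [
        Node("Fusion", "rrf", [
            Node("Match", "dense", [Node("Scan", "docs")]),
            Node("Match", "text", [Node("Scan", "docs")]),
        ]),
    ]),
])
optimized = push_filters_down(plan)
```

After the rewrite, each retrieval stream sees only rows that satisfy the predicate, which is the "prune rows early" effect described in the list above.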

## End-to-End Query Lifecycle

A typical hybrid search query in Infinity traverses the pipeline as follows:

1. The user submits a SELECT statement through the Python or HTTP API that combines a vector `MATCH` with a full-text `MATCH` and a scalar filter.
2. `sql_parser.cppm` builds the SELECT skeleton, `search_parser.cppm` parses the `MATCH` clauses, and `expr_parser.cppm` parses the filter expression. The three sub-trees are merged into one AST.
3. `query_binder.cppm` resolves table, column, and index references against the catalog.
4. `logical_planner.cppm` lowers the bound AST into a logical plan: `Scan → Filter → Match(dense) → Match(text) → Fusion → Projection`.
5. `optimizer.cppm` pushes the filter below the matches, selects the right index for each match, and chooses a fusion order.
6. The executor materializes results, optionally applying a reranker (RRF, weighted sum, or ColBERT). Source: [README.md](https://github.com/infiniflow/infinity/blob/main/README.md).

## Known Failure Modes and Community-Reported Issues

Several community-reported bugs interact directly with the query pipeline and the components that surround it:

- **WAL flush dropping entries on shutdown**: `WalManager::NewFlush()` may return before the queue is fully drained, so entries still in the queue may never reach disk, which matters for any plan that relies on persisted mem-indexes. Source: [issue #3375](https://github.com/infiniflow/infinity/issues/3375).
- **Segmentation fault after upgrading to v0.7.0**: reported when RAGFlow connects to a freshly upgraded Infinity; users encountering this should check whether their deployment used `v0.7.0-dev5` as an intermediate step. Source: [issue #3377](https://github.com/infiniflow/infinity/issues/3377).
- **`RoaringBitmap::SetTrue: row_index >= count_`**: an invariant violation originating in `src/storage/common/roaring_bitmap.cppm` that can be triggered by certain append-mem-index paths, often surfacing during long-running imports that the planner eventually scans. Source: [issue #3359](https://github.com/infiniflow/infinity/issues/3359).
- **Backward-compatibility concerns after v0.7.0**: users have asked whether releases after 0.7.0 will remain backward compatible with earlier ones. If parser or binder contracts change between releases, the query pipeline is one place where such breakage could surface. Source: [issue #3371](https://github.com/infiniflow/infinity/issues/3371).

When diagnosing such issues, it is useful to identify which stage of the pipeline the failure occurred in: parser errors are surface-level syntax messages, binder errors name the offending identifier, planner/optimizer errors typically point at an operator, and runtime errors (RoaringBitmap, WAL, segfault) come from the executor and storage layers below the plan.

## See Also

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md) — high-level product overview, supported features, and benchmark claims.
- [ROADMAP 2025 (issue #2393)](https://github.com/infiniflow/infinity/issues/2393) — upcoming index types (Plaid, SMVE) and JSON support that influence planner/optimizer behavior.
- [Infinity official documentation](https://infiniflow.org/docs/dev/) — Python API, HTTP API, and quickstart guides that exercise this pipeline end-to-end.

---

<a id='page-4'></a>

## Client SDKs, Storage, Deployment & Known Issues

### Related Pages

Related topics: [System Overview & Architecture](#page-1), [Query Pipeline: Parser, Planner, Optimizer & Executor](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/infiniflow/infinity/blob/main/README.md)
- [python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)
- [client/cpp/README.md](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)
- [src/storage/common/roaring_bitmap.cppm](https://github.com/infiniflow/infinity/blob/main/src/storage/common/roaring_bitmap.cppm)
- [src/storage/wal/wal_manager.cppm](https://github.com/infiniflow/infinity/blob/main/src/storage/wal/wal_manager.cppm)
- [src/network/infinity_thrift_service.cppm](https://github.com/infiniflow/infinity/blob/main/src/network/infinity_thrift_service.cppm)
- [src/network/http_server.cppm](https://github.com/infiniflow/infinity/blob/main/src/network/http_server.cppm)
- [go/infinity.go](https://github.com/infiniflow/infinity/blob/main/go/infinity.go)
</details>

# Client SDKs, Storage, Deployment & Known Issues

This page documents how external clients connect to Infinity, how the server-side storage layer is organised, how the database is deployed in production, and the known operational issues reported by the community in the v0.6.x and v0.7.x release lines. It is intended for operators and integrators who need a stable, end-to-end view of the system boundaries before deploying at scale.

## Client SDKs

Infinity exposes three first-party client surfaces. The Python SDK is the canonical interface and ships in two flavours: a networked client that talks Thrift/HTTP to a running Infinity server, and an embedded client that links the engine directly into the host process. Both flavours share the same high-level API (`infinity.get_database(...)`, table creation, import, search) so application code can switch between them with minimal change. As described in the project overview, the database is distributed "as a single-binary" deployment that runs alongside an "intuitive Python API" (Source: [README.md](https://github.com/infiniflow/infinity/blob/main/README.md)).

```mermaid
flowchart LR
    A[Python SDK<br/>network] -->|Thrift| S(Infinity server)
    B[Python SDK<br/>embedded] -->|in-process| S
    C[Go SDK] -->|Thrift| S
    D[C++ SDK<br/>under development] -->|Thrift| S
    S --> ST[(Storage layer)]
    style D stroke-dasharray: 4 4
```

The Python client README confirms that the SDK provides a high-level wrapper for hybrid search, indexing, and metadata operations, and lists dedicated documentation entry points: the Python API reference, the HTTP API reference, and the Quickstart guide (Source: [python/infinity_sdk/README.md](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)).

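The v0.7.0 release notes list "GO SDK" as a shipped deliverable, signalling intended first-class support for Go-based integrators (Source: [v0.7.0 release notes](https://github.com/infiniflow/infinity/releases/tag/v0.7.0)). The in-tree [go/README.md](https://github.com/infiniflow/infinity/blob/main/go/README.md) nevertheless reports that only the interface layer is complete and that the Thrift integration layer is still pending, so integrators should confirm the current status before depending on it.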

The C++ client is the third official binding. Its README states that it is "Under development", so it is not yet recommended for production integrations; the Python SDK remains the primary, documented client (Source: [client/cpp/README.md](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)).

All network clients converge on two server-side entry points implemented under `src/network/`: a Thrift-defined RPC service for the language-native SDKs and an HTTP server for REST/curl-style consumers. These two surfaces share the same handler stack, so feature parity between the Python, Go, and HTTP clients is a deliberate design goal (Source: [src/network/infinity_thrift_service.cppm](https://github.com/infiniflow/infinity/blob/main/src/network/infinity_thrift_service.cppm)).

## Storage Architecture

On the server side, persistence and indexing are organised under `src/storage/`. Two subsystems are particularly relevant to operators:

- **Write-Ahead Log (WAL).** `WalManager` serialises incoming writes into a flush queue so that durability is preserved before any in-memory index is updated. The community-reported bug "WAL flush thread may drop queued entries during shutdown" indicates that `WalManager::NewFlush()` can exit before draining its queue during process termination, which is a correctness hazard for crash-recovery scenarios (Source: issue [#3375](https://github.com/infiniflow/infinity/issues/3375)).
- **In-memory secondary indexes.** Filters and boolean projections are accelerated by a `RoaringBitmap` helper under `src/storage/common/`. The `RoaringBitmap::SetTrue` routine enforces that a row index never exceeds the bitmap's allocated count; violating this invariant aborts the operation with a diagnostic (Source: issue [#3359](https://github.com/infiniflow/infinity/issues/3359), referencing [src/storage/common/roaring_bitmap.cppm:120](https://github.com/infiniflow/infinity/blob/main/src/storage/common/roaring_bitmap.cppm)).

These two layers are shared by every client SDK, so a failure in either path is observable through any of the three bindings and through the HTTP surface.
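
The WAL hazard described above comes down to one property: a flush thread must finish everything already queued before it exits. The sketch below shows one common way to get that property, a FIFO queue with a stop sentinel. It is a generic pattern, not Infinity's `WalManager`; the `FlushWorker` class and its methods are hypothetical, and appending to a list stands in for the durable write.

```python
import queue
import threading
from typing import List

_STOP = object()  # sentinel marking the end of the queue


class FlushWorker:
    """Hypothetical background flusher that drains its queue before stopping."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[object]" = queue.Queue()
        self._closed = False
        self.flushed: List[str] = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, entry: str) -> None:
        """Queue an entry; rejected once shutdown has started."""
        if self._closed:
            raise RuntimeError("worker is shut down")
        self._queue.put(entry)

    def _run(self) -> None:
        while True:
            entry = self._queue.get()
            if entry is _STOP:
                break
            # Stand-in for the durable write (for example, append plus fsync).
            self.flushed.append(str(entry))

    def shutdown(self) -> None:
        """Stop accepting entries, then wait until all queued entries are flushed."""
        self._closed = True
        # The sentinel is queued behind all earlier entries, so FIFO order
        # guarantees they are processed before the thread exits.
        self._queue.put(_STOP)
        self._thread.join()


worker = FlushWorker()
for i in range(100):
    worker.submit(f"entry-{i}")
worker.shutdown()
```

The contrast is a loop that polls a `running` flag and returns as soon as it turns false: anything still queued at that moment is silently lost, which is the behavior the issue describes. This sketch assumes a single producer thread; concurrent producers would need the closed check and the enqueue to happen under one lock.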

## Deployment Topology

Infinity is designed to ship as a single self-contained binary, which simplifies container rollouts. The project README highlights "A single-binary" deployment as a usability feature, and the v0.7.0 release adds an "ARM64 official build" so that the same image family covers both x86_64 and arm64 Kubernetes nodes (Source: [README.md](https://github.com/infiniflow/infinity/blob/main/README.md), [v0.7.0 release notes](https://github.com/infiniflow/infinity/releases/tag/v0.7.0)).

| Distribution channel | Frequency | Use case |
|---|---|---|
| `nightly` | Per-commit rolling | Pre-release validation |
| `nightly-arm64` | Per-commit rolling | ARM64 pre-release validation |
| `v0.7.0`, `v0.7.0-dev*` | Tagged | Production / staging |
| `v0.6.15` and earlier | Maintenance | Legacy clusters |

Operators who upgrade from `v0.7.0-dev5` to `v0.7.0` should review the on-disk format changes. The community request "Starting from 0.7.0, will subsequent version upgrades consider backward compatibility?" ([#3371](https://github.com/infiniflow/infinity/issues/3371)) explicitly asks for documented backward-compatibility guarantees across minor versions.

## Known Issues & Operational Hazards

The community has surfaced several real defects and friction points worth tracking:

- **Segmentation fault after `v0.7.0-dev5` → `v0.7.0` upgrade.** Reported on a Debian + Kubernetes v1.34.6 deployment; users who held a working RAGFlow stack on `v0.7.0-dev5` experienced crashes immediately after upgrading. The recommended mitigation is to pin to the dev6/dev7 tags until a fix is published, and to capture a core dump for triage (Source: issue [#3377](https://github.com/infiniflow/infinity/issues/3377)).
- **`RoaringBitmap::SetTrue` invariant violation.** The error fires when the appended `row_index` exceeds the bitmap's `count_`. Typical trigger is `AppendMemIndex` receiving a row range that the bitmap has not been pre-extended for. Workarounds until a fix ships include re-creating the affected index after a clean checkpoint (Source: issue [#3359](https://github.com/infiniflow/infinity/issues/3359)).
- **WAL flush-thread shutdown race.** `WalManager::NewFlush()` can return before the queue is empty, which may cause data loss on unclean shutdown. Perform an orderly `CHECKPOINT` and flush before stopping the process (Source: issue [#3375](https://github.com/infiniflow/infinity/issues/3375)).
- **Memory leak under sustained load.** A long-running memory growth regression has been reported; the investigation thread recommends periodic restarts as a temporary mitigation while the upstream fix is developed (Source: issue [#2502](https://github.com/infiniflow/infinity/issues/2502)).
- **Connection refused during startup.** The error `(2008, 'Infinity is initing')` with `TTransportException("Could not connect to any of [...]")` is thrown when upstream consumers (notably RAGFlow) probe the server before it has finished initialising. Clients should retry with backoff rather than treating this as a fatal error (Source: issue [#2523](https://github.com/infiniflow/infinity/issues/2523)).
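
The retry-with-backoff advice for the startup race can be sketched as follows. This is a minimal example, not part of any Infinity SDK: `connect_with_backoff` and its parameters are hypothetical, and `connect` stands for whatever zero-argument callable opens the connection in the client being used.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_backoff(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception:
            # Assumption: any exception is treated as "server still starting".
            # Real code should narrow this to the transport and init errors.
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the schedule can be tested without waiting. Capping the delay with `max_delay` keeps a slow startup from pushing the probe interval out indefinitely.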

## See Also

- [Infinity README & key features](https://github.com/infiniflow/infinity/blob/main/README.md)
- [Python SDK README](https://github.com/infiniflow/infinity/blob/main/python/infinity_sdk/README.md)
- [C++ client README](https://github.com/infiniflow/infinity/blob/main/client/cpp/README.md)
- [ROADMAP 2025 tracking issue](https://github.com/infiniflow/infinity/issues/2393)
- [ROADMAP 2024 tracking issue](https://github.com/infiniflow/infinity/issues/338)

---


## Pitfall Log

Project: infiniflow/infinity

Summary: Found 9 structured pitfall items, none of them high or blocking. Top priority: installation risk, which requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: runtime_trace
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Repro command: `docker run -d --name infinity -v /var/infinity/:/var/infinity --ulimit nofile=500000:500000 --network=host infiniflow/infinity:nightly`
- Evidence: identity.distribution | https://github.com/infiniflow/infinity

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/infiniflow/infinity/issues/3385

## 3. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/infiniflow/infinity

## 4. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: packet_text.keyword_scan | https://github.com/infiniflow/infinity

## 5. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/infiniflow/infinity

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/infiniflow/infinity

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/infiniflow/infinity

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/infiniflow/infinity

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/infiniflow/infinity
