Doramagic Project Pack · Human Manual

zvec

Introduction, Features & Quickstart

Related topics: Core Architecture, Storage & SQL Engine, Vector & Full-Text Indexing Algorithms, SDKs, Language Bindings & AI Extensions

1. What is Zvec?

Zvec is an open-source, in-process vector database that is designed to be embedded directly into applications. According to the project README, it is "lightweight, lightning-fast, and battle-tested within Alibaba Group," delivering production-grade, low-latency similarity search without requiring a separate server process. As an in-process library, it runs wherever your code runs — notebooks, servers, CLI tools, or edge devices (README.md:1-120).

Unlike client-server vector databases, Zvec links into the host application and shares its lifecycle. This design choice is reflected throughout the codebase — for example, GlobalResource is a process-wide singleton that owns shared thread pools, and resource limits (such as memory) are detected per-process from cgroup and OS APIs (src/db/common/global_resource.h:1-40, src/db/common/cgroup_util.h:1-50).

2. Core Features

The README advertises a broad feature set, several of which are directly observable in the source tree:

| Feature Area | Where it lives in the codebase | Notes |
|---|---|---|
| Dense + sparse vector storage | src/db/common/constants.h (kSparseMaxDimSize = 16384) | Sparse vector dimension is bounded. |
| Full-text & hybrid search | Surfaced via the public C++/Python API; collection-level index lifecycle | Per v0.5.0 release notes. |
| Durable write-ahead log (WAL) | WAL_FILE in file_helper.h; WLOG_* macros in typedef.h | Crash-safe persistence. |
| Concurrent access | ConcurrentRoaringBitmap32 (read-write mutex) | Multiple readers, single writer. |
| Cross-platform runtime | cgroup_util.h (Linux/macOS/Windows) | Linux cgroup, macOS mach, Windows APIs. |
| Query profiling | Profiler in profiler.h | Latency tree per query stage. |

Zvec supports several index types, including HNSW, and a range of storage backends. The release notes referenced in the community context call out a DiskANN index in v0.5.0 for memory-efficient large-scale search, and an HNSW index whose m (max upper-layer neighbors) and scaling_factor defaults were clarified in v0.1.1.

Threading model

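Zvec separates work into two thread pools owned by GlobalResource:

  • query_thread_pool() serves read and search workloads.
  • optimize_thread_pool() runs index-building and compaction tasks.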

This split lets foreground queries remain low-latency while long-running optimization work happens in the background.

Storage layer

The persistent layer is built on RocksDB, wrapped by a thin RocksdbContext that supports per-column-family merge operators and an optional hash-skiplist memtable (src/db/common/rocksdb_context.h:1-60). Index and filter bitmaps use a thread-safe Roaring bitmap implementation that wraps the C roaring_bitmap API with std::shared_mutex (src/db/common/concurrent_roaring_bitmap.h:1-50).

flowchart LR
    A[Host Application] --> B[Zvec C++ Core]
    B --> C[GlobalResource<br/>thread pools]
    B --> D[Profiler<br/>latency tree]
    B --> E[RocksDB<br/>column families]
    B --> F[Roaring Bitmaps<br/>delete / filter]
    B --> G[WAL / Segments<br/>on disk]

3. Quickstart

3.1 Installation

Zvec ships official SDKs for several languages (see the README). The Python package supports Python 3.10–3.14, which addresses an earlier community request to broaden wheel coverage (README.md:90-120).

# Python
pip install zvec

# Node.js
npm install @zvec/zvec

# Dart / Flutter
flutter pub add zvec

Go and Rust SDKs live in dedicated repositories linked from the README, and an Elixir SDK has been published by the community (README.md:90-120; community issue #403).

3.2 Concepts you need to know

Every Zvec deployment is built around three core concepts, all of which are encoded in the source:

  • Collection — a named, on-disk directory containing a single database. The collection name must match COLLECTION_NAME_REGEX = ^[a-zA-Z0-9_-]{3,64}$ (src/db/common/constants.h:1-50).
  • Field — a typed column inside a collection. Field names must match FIELD_NAME_REGEX = ^[a-zA-Z0-9_-]{1,32}$, and array fields are capped at MAX_ARRAY_FIELD_LEN = 32 elements (src/db/common/constants.h:1-50).
  • Index — an optional acceleration structure attached to a field. v0.5.0 added FTS (full-text) indexes managed via Collection::CreateIndex / DropIndex.
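The naming rules in the list above can be checked before any call reaches the engine. The following is a minimal sketch, assuming only the two regular expressions and the array cap quoted from constants.h; the helper function names are hypothetical and are not part of any Zvec SDK.

```python
import re

# Patterns and limit quoted from src/db/common/constants.h as cited above.
COLLECTION_NAME_REGEX = re.compile(r"^[a-zA-Z0-9_-]{3,64}$")
FIELD_NAME_REGEX = re.compile(r"^[a-zA-Z0-9_-]{1,32}$")
MAX_ARRAY_FIELD_LEN = 32


def is_valid_collection_name(name: str) -> bool:
    """True when the whole name satisfies COLLECTION_NAME_REGEX."""
    return COLLECTION_NAME_REGEX.fullmatch(name) is not None


def is_valid_field_name(name: str) -> bool:
    """True when the whole name satisfies FIELD_NAME_REGEX."""
    return FIELD_NAME_REGEX.fullmatch(name) is not None


def is_valid_array_value(values: list) -> bool:
    """True when an array field value stays within MAX_ARRAY_FIELD_LEN."""
    return len(values) <= MAX_ARRAY_FIELD_LEN
```

For instance, "my_docs" passes the collection check, while "ab" (too short) and "has space" (disallowed character) do not.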

3.3 Configuration knobs worth knowing

The constants in src/db/common/constants.h define several guardrails that affect every collection:

| Constant | Value | Purpose |
|---|---|---|
| DEFAULT_MEMORY_LIMIT_RATIO | 0.8f | Fraction of process memory available to caches. |
| MIN_MEMORY_LIMIT_BYTES | 100 * 1024 * 1024 | Floor for the memory budget. |
| COMPACT_DELETE_RATIO_THRESHOLD | 0.3f | When a segment's deleted-row ratio exceeds this, compaction is triggered. |
| kMaxRecordBatchNumRows | 4096 | Maximum rows per insert batch. |
| kSparseMaxDimSize | 16384 | Maximum sparse-vector dimensionality. |
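To make these guardrails concrete, here is a small sketch of the arithmetic they imply. The constant values come from the table; how the engine actually combines the ratio with the floor is an assumption (ratio first, then the floor), and the function names are hypothetical.

```python
# Values quoted from the constants table above.
DEFAULT_MEMORY_LIMIT_RATIO = 0.8
MIN_MEMORY_LIMIT_BYTES = 100 * 1024 * 1024
COMPACT_DELETE_RATIO_THRESHOLD = 0.3
K_MAX_RECORD_BATCH_NUM_ROWS = 4096


def memory_budget(detected_limit_bytes: int) -> int:
    """Assumed rule: take 80% of the detected limit, never below the floor."""
    scaled = int(detected_limit_bytes * DEFAULT_MEMORY_LIMIT_RATIO)
    return max(scaled, MIN_MEMORY_LIMIT_BYTES)


def needs_compaction(deleted_rows: int, total_rows: int) -> bool:
    """True when the deleted-row ratio strictly exceeds the threshold."""
    if total_rows == 0:
        return False
    return deleted_rows / total_rows > COMPACT_DELETE_RATIO_THRESHOLD


def split_into_batches(rows: list, max_rows: int = K_MAX_RECORD_BATCH_NUM_ROWS) -> list:
    """Chunk rows client-side so no insert batch exceeds max_rows."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```

Under these assumptions, a 1 GiB limit yields a budget of 858993459 bytes, a 64 MiB limit is raised to the 100 MiB floor, and 10,000 rows split into batches of 4096, 4096, and 1808.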

4. Operational Notes & Common Pitfalls

A few recurring issues from the community are worth surfacing on an introductory page:

  • CPU feature detection (issue #512). Zvec uses SIMD acceleration when available. A binary built on an AVX-capable host can crash when it runs on a legacy CPU without AVX. Rely on the CPU-flag detection / dispatch added in v0.1.0 (PR #3) to pick the right code path, or distribute a non-AVX build for older targets.
  • create_index after drop_index (issue #427). Dropping a scalar index and re-creating it can fail because the underlying RocksDB column family (CF) is not fully cleaned up. Until this is fixed, prefer creating a new collection, or reopen the database in a way that flushes the stale CF before recreating the index.
  • Platform support. Windows, Linux, macOS, Android, iOS, and RISC-V are all supported per the release history; cgroup_util.h abstracts Linux cgroup, macOS mach, and Windows APIs so the same code path works on all three desktop OSes (src/db/common/cgroup_util.h:1-50).
  • Logging. Both ailego's Logger and Google glog are bridged via AppendLogger and LogUtil::Init, so you can route Zvec logs into your application's existing logging pipeline (src/db/common/logger.h:1-50, src/db/common/glogger.h:1-50).
  • Error handling. All public APIs return a Status object. Error codes are centrally registered through PROXIMA_ZVEC_ERROR_CODE macros and range from generic RuntimeError / InvalidArgument to storage-specific OpenFile, ReadData, WriteData, SerializeError, and DeserializeError (src/db/common/error_code.h:1-80).
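For the CPU feature item in the list above, a deployment script can check the target host before loading a native library. This is a minimal, Linux-only sketch that parses the text of /proc/cpuinfo; it is independent of Zvec's own dispatch logic, and every function name here is hypothetical.

```python
def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Return the cpuinfo text, or an empty string where the file is absent."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return ""


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the flag names from the first 'flags' (x86) or 'Features' (ARM) line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def supports_flag(cpuinfo_text: str, flag: str) -> bool:
    """True when the named CPU flag (for example 'avx') is advertised."""
    return flag in cpu_flags(cpuinfo_text)
```

A launcher could call supports_flag(read_cpuinfo(), "avx") and fall back to a non-AVX build, or refuse to start with a clear message, when the result is False.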

Source: https://github.com/alibaba/zvec / Human Manual

Core Architecture, Storage & SQL Engine

Related topics: Introduction, Features & Quickstart, Vector & Full-Text Indexing Algorithms

Overview

Zvec is an in-process vector database that ships as an embeddable library rather than a standalone server. As stated in the README, the engine is "open-source, in-process, lightweight, lightning-fast" and runs directly inside the host application's process. Internally, the codebase under src/db/common/ and src/core/interface/utils/ exposes a layered architecture composed of:

  1. A common infrastructure layer (global resources, logging, profiling, constants).
  2. A storage layer built around RocksDB column families, memory-mapped forward files, write-ahead logs, and concurrent Roaring bitmaps.
  3. A SQL/query engine layer that validates and dispatches queries against the storage layer (referenced via the error-code system in error_code.h).

The remainder of this page walks each layer using the headers and definitions actually present in the repository.

Common Infrastructure Layer

The common layer provides the cross-cutting services that every higher layer relies on: threading, logging, profiling, naming conventions, and resource detection.

Threading and Global Resources

The singleton zvec::GlobalResource owns two ailego::ThreadPool instances that are lazily initialized on first access. As shown in global_resource.h, query_thread_pool() serves read/search workloads while optimize_thread_pool() runs index-building and compaction tasks. The lazy initialize() pattern ensures thread pools are created only when the engine actually needs them, which keeps startup cheap for embedding scenarios.
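The lazy two-pool pattern described above can be illustrated with a short sketch. It is not the C++ implementation: the class name, the worker counts, and the use of Python's ThreadPoolExecutor are all assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class GlobalResourceSketch:
    """Process-wide singleton that creates its two pools on first use."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._init_lock = threading.Lock()
        self._query_pool = None
        self._optimize_pool = None

    @classmethod
    def instance(cls):
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def _initialize(self):
        # Both pools are created together, exactly once, under a lock.
        with self._init_lock:
            if self._query_pool is None:
                # Worker counts are placeholders, not Zvec defaults.
                self._query_pool = ThreadPoolExecutor(
                    max_workers=4, thread_name_prefix="query")
                self._optimize_pool = ThreadPoolExecutor(
                    max_workers=2, thread_name_prefix="optimize")

    def query_thread_pool(self):
        self._initialize()
        return self._query_pool

    def optimize_thread_pool(self):
        self._initialize()
        return self._optimize_pool
```

Constructing the singleton allocates no threads; the first call to either accessor creates both pools, which mirrors the cheap-startup property the text attributes to the lazy initialize() pattern.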

Logging

Two cooperating loggers exist:

  • LogUtil::Init in logger.h wires the ailego::Factory logger and validates that log_dir/log_file are non-empty when the file logger type is selected.
  • AppendLogger in glogger.h wraps glog and forwards log directory/file parameters through ailego's parameter system.

The macros CLOG_DEBUG/INFO/WARN/ERROR/FATAL in typedef.h prepend every message with collection[%s], while LLOG_* adds a column and segment identifier, and WLOG_* prefixes the WAL path. This gives every log line a stable context for debugging multi-collection workloads.

Profiling

The Profiler class in profiler.h builds a JSON tree of stage latencies. Each Stage records its own elapsed time (via ailego::ElapsedTime) under a configurable JSON node, allowing per-query or per-batch latency breakdowns that can be serialized for observability.
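A rough picture of a stage latency tree, sketched under stated assumptions: the class name, the "latency_us" key, and the context-manager interface are hypothetical and only illustrate how nested stages could be recorded and serialized to JSON.

```python
import json
import time
from contextlib import contextmanager


class ProfilerSketch:
    """Records nested stage latencies into a JSON-serializable dict tree."""

    def __init__(self):
        self.root = {}
        self._stack = [self.root]  # innermost open stage is last

    @contextmanager
    def stage(self, name):
        node = {}
        self._stack[-1][name] = node  # attach under the current parent
        self._stack.append(node)
        start = time.perf_counter()
        try:
            yield node
        finally:
            elapsed = time.perf_counter() - start
            node["latency_us"] = int(elapsed * 1_000_000)
            self._stack.pop()

    def to_json(self) -> str:
        return json.dumps(self.root)
```

Wrapping a query in stage("query") with inner stage("vector_search") and stage("filter") blocks produces one JSON object in which each stage carries its own latency next to its children. A child stage literally named "latency_us" would collide with the timing key in this simplified version.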

Constants, Naming, and Platform Utilities

constants.h defines naming regexes for collections, fields, and primary keys (COLLECTION_NAME_REGEX, FIELD_NAME_REGEX), reserved internal column names (LOCAL_ROW_ID, GLOBAL_DOC_ID, USER_ID), and global thresholds such as DEFAULT_MEMORY_LIMIT_RATIO = 0.8f and MIN_MEMORY_LIMIT_BYTES. cgroup_util.h abstracts getCpuLimit(), getMemoryLimit(), and CPU/memory usage probes across Linux, macOS, and Windows so that memory budgeting is consistent across platforms.

Storage Layer

RocksDB Context

rocksdb_context.h wraps RocksDB with an Args struct that includes db_path, column_names, a default merge_op, and a per-column-family merge-operator map. This is the foundation for zvec's multi-column-family storage design: each collection's secondary indexes, term dictionaries, and metadata live in dedicated column families (CFs). This design is also relevant to community-reported issue #427, where drop_index() did not clean up the RocksDB column family, causing a subsequent create_index() to fail.

File Layout

file_helper.h enumerates the on-disk files managed by the engine via the FileID enum: ID_FILE (idmap), DELETE_FILE (del), FORWARD_FILE (data.fwd), PROXIMA_FILE (data.pxa), SEGMENT_FILE, LSN_FILE, MANIFEST_FILE, WAL_FILE, and RESHARD_STATE. The forward and segment files are designed for memory-mapped access, while WAL files provide the durability guarantee highlighted in the README.

Concurrent Bitmaps

concurrent_roaring_bitmap.h provides ConcurrentRoaringBitmap32 and a 64-bit counterpart. Both guard the underlying Roaring bitmap with a std::shared_mutex (or scoped read/write locks), so multiple readers can hold the lock concurrently while writers get exclusive access. The storage layer uses them for delete-bitmap and posting-list operations.

WAL Format

Although only the macros are visible in typedef.h, the WAL_FORMAT " wal_path_[%s] " macro and WLOG_* family indicate that every write path tags log output with the active WAL path, simplifying crash-recovery debugging.

SQL Engine and Error Model

The engine exposes a SQL-like API for collection and query operations. While the query execution source is not included in the retrieved context, the surface contract is visible through:

  • Error codes in error_code.h. Codes are declared via PROXIMA_ZVEC_ERROR_CODE_DECLARE(...) (e.g., MismatchedDimension, InvalidTopk, InvalidRadius, InvalidLinear, InvalidFieldName, UnsupportedCondition, OrderbyNotInSelectItems, PbToSqlInfoError, ExceedRateLimit, InvalidSparseValues, InvalidBatchSize). ErrorCode::What(int) returns the description string.
  • Validation helpers in core/interface/utils/utils.h, which expose extract_enum_from_json for safely parsing enum-typed JSON parameters via magic_enum. This is what allows the SQL/CLI parameter layer to map user input into typed enums before the storage layer consumes them.

flowchart LR
    A[Host Application] --> B[Common Infrastructure<br/>GlobalResource, Profiler, Logger]
    B --> C[SQL / Query Engine<br/>Validation + ErrorCode]
    C --> D[Storage Layer<br/>RocksDB CFs + MMap + WAL]
    D --> E[(Forward / Segment / WAL files)]
    D --> F[(Concurrent Roaring Bitmaps)]

Common Failure Modes

  • Invalid input dimensions or top-k: raised as MismatchedDimension / InvalidTopk from error_code.h.
  • Stale column-family state after drop_index(): tracked in issue #427; mitigated by ensuring RocksdbContext properly disposes of cf_handles_ during CF removal.
  • Cross-platform native binaries: the project publishes official Python, Node.js, Go, Rust, and Dart/Flutter packages, but as reported in issue #512, AVX-only native libraries can crash on legacy CPUs. Use a non-AVX build when targeting older hardware.
  • Naming violations: collections, fields, and primary keys must match the regexes in constants.h, otherwise the engine rejects the operation with InvalidCollectionName/InvalidFieldName.

See Also

  • README.md — project overview, installation, feature list
  • v0.5.0 release notes — Full-Text Search (FTS) and hybrid retrieval
  • v0.5.1 release notes — External vector source, zero-copy VectorViewClause, search prefetch configuration
  • Issue #23 — Windows platform support (resolved in v0.3.0)
  • Issue #131 — Python 3.13/3.14 wheel support
  • Issue #199 — Full-Text Search feature request (shipped in v0.5.0)
  • Issue #427 — RocksDB column-family cleanup bug after drop_index()
  • Issue #512 — Legacy CPU / non-AVX binary crash

Source: https://github.com/alibaba/zvec / Human Manual

Vector & Full-Text Indexing Algorithms

Related topics: Core Architecture, Storage & SQL Engine, SDKs, Language Bindings & AI Extensions

Overview

zvec provides a unified indexing subsystem that supports dense vector similarity search, sparse vector retrieval, and full-text search within a single in-process engine. The indexing layer is implemented across the src/db/common/ and src/core/interface/utils/ directories and is composed of pluggable algorithm implementations, a shared storage abstraction (RocksDB-backed), and concurrency utilities for thread-safe index access.

The indexing subsystem is designed around three principles: (1) algorithms are pluggable so that the same Collection abstraction can host an HNSW, IVF, or Flat index, (2) full-text and vector indexes share the same column-family storage so hybrid queries can be evaluated in a single pass, and (3) all hot-path operations are guarded by the lightweight primitives defined in the common utilities (memory ordering, status codes, error reporting).

Vector Indexing Algorithms

HNSW (Hierarchical Navigable Small World)

HNSW is the default dense vector index in zvec. It builds a multi-layer proximity graph in which each node represents a vector and each edge links a node to one of its approximate nearest neighbors. Search begins at the topmost layer and descends greedily, expanding the candidate set as it moves to lower layers. The graph construction and search code lives in the hnsw algorithm module (referenced via src/core/algorithm/hnsw/). The common typedefs and threading primitives used by the HNSW implementation are defined in src/db/common/typedef.h and src/db/common/concurrent_roaring_bitmap.h.

Key tuning parameters exposed to the caller include M (max neighbors per node), ef_construction (candidate list size during index build), and ef_search (candidate list size at query time). In general, larger values of these parameters raise recall at the cost of memory, build time, or query latency.
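The layered descent and the role of ef_search can be sketched in a few lines. This is a textbook-style illustration over a hand-built graph, not zvec's implementation: graph construction (M, ef_construction) is omitted, and all function names and the data layout are assumptions.

```python
import heapq


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_layer(query, vectors, neighbors, entry, ef):
    """Best-first search on one layer, keeping the ef closest nodes found."""
    d0 = squared_l2(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap: closest unexplored node first
    results = [(-d0, entry)]    # max-heap (negated): worst kept node on top
    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the nearest candidate is worse than the worst kept result.
        if len(results) >= ef and dist > -results[0][0]:
            break
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = squared_l2(query, vectors[nb])
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst
    return [n for _, n in sorted((-neg, n) for neg, n in results)]


def hnsw_search(query, vectors, layers, entry, ef_search, topk):
    """layers[0] is the bottom layer; upper layers only refine the entry point."""
    for neighbors in reversed(layers[1:]):
        entry = search_layer(query, vectors, neighbors, entry, ef=1)[0]
    return search_layer(query, vectors, layers[0], entry, ef_search)[:topk]
```

Each layer is a dict from node id to a neighbor list. Upper layers run with ef=1, a purely greedy descent, and only the bottom layer widens the candidate list to ef_search, which is where the recall and latency trade-off shows up.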

IVF (Inverted File Index)

IVF partitions the vector space into nlist Voronoi cells using k-means clustering, and at query time probes only the nprobe nearest cells. The implementation is in src/core/algorithm/ivf/. IVF is preferable when the dataset is large, recall requirements are moderate, and memory is constrained, because IVF keeps only the coarse quantizer in memory and the per-cell posting lists on disk.
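The effect of nprobe is easiest to see on a toy example. The sketch below assumes the centroids are already given (the k-means training step is skipped) and keeps everything in memory; the function names are hypothetical and the code does not reflect the on-disk layout described above.

```python
import math


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the posting list of its nearest centroid."""
    cells = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        cells[nearest].append(vid)
    return cells


def ivf_search(query, vectors, centroids, cells, nprobe, topk):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in cells[c]]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:topk]


centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(0.0, 1.0), (1.0, 0.0), (9.0, 10.0), (10.0, 9.0), (4.0, 4.0)]
cells = build_ivf(vectors, centroids)
```

With the query (6, 6), probing one cell returns vector 2 and misses the true nearest neighbor, vector 4, because (4, 4) was assigned to the other cell. Probing both cells finds it. This is the recall cost of a small nprobe in exchange for scanning fewer vectors.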

Flat (Brute-Force)

Flat computes exact distances against every vector in the collection. The implementation is in src/core/algorithm/flat/. It is the reference index for benchmarking and for small collections where recall must be 100%.
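For comparison, exact search fits in a few lines. This sketch assumes squared L2 distance (which ranks identically to L2) and uses a hypothetical function name; it illustrates the technique rather than the code under src/core/algorithm/flat/.

```python
import heapq


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query, vectors, topk):
    """Return the ids of the topk vectors closest to query, nearest first."""
    ids = range(len(vectors))
    return heapq.nsmallest(topk, ids, key=lambda i: squared_l2(query, vectors[i]))
```

Because every vector is compared against the query, the result is exact and the cost grows linearly with collection size, which is why Flat serves as the recall baseline for the approximate indexes.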

Full-Text Search (FTS) Indexing

Full-text search in zvec is supported via an inverted index over tokenized string fields. The FTS subsystem is integrated into the same Collection abstraction as vector indexes, so a single collection can simultaneously hold vector and text fields. FTS uses the Roaring Bitmap data structure for posting lists, which is wrapped by the thread-safe container in src/db/common/concurrent_roaring_bitmap.h. Constants that govern tokenization, stemming, and stop-word handling are centralized in src/db/common/constants.h.

zvec combines vector and full-text scores using a MultiQuery clause. The query planner evaluates vector and text sub-queries in parallel and merges results by a configurable fusion function (typically reciprocal rank fusion or weighted sum). The shared RocksDB column-family context used by both sub-queries is defined in src/db/common/rocksdb_context.h, which guarantees that the two sub-queries see a consistent snapshot of the collection.
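Reciprocal rank fusion, one of the fusion functions named above, can be sketched as follows. The constant k = 60 is a common convention and an assumption here, not a documented zvec default, and the function name is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Given vector hits ["a", "b", "c"] and text hits ["b", "d", "a"], the fused order is b, a, d, c: documents that appear in both lists outrank those found by only one sub-query, without any need to normalize the raw vector and text scores against each other.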

Algorithm Selection Guide

| Algorithm | Best for | Recall | Build Time | Memory | Disk-bound |
|---|---|---|---|---|---|
| HNSW | General-purpose, high recall | High | Medium | High | No |
| IVF | Large datasets, moderate recall | Medium-High | High (k-means) | Low | Yes |
| Flat | Small datasets, exact search | 1.0 | Low | High | No |
| FTS | Text/keyword search | Exact token match | Low | Low | Yes |

Configuration Parameters

Key parameters exposed by the Collection API include the algorithm type, vector dimension, distance metric (L2, IP, Cosine), and the per-algorithm knobs listed above. Status and error reporting for index operations (e.g., invalid dimension, missing field) is normalized through the error helpers in src/db/common/error_code.h. When a configuration value is invalid, callers receive a structured Status object rather than an exception, which is the convention throughout the common layer.
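The status-instead-of-exception convention can be sketched as below. The code names MismatchedDimension and InvalidTopk come from error_code.h as cited earlier; the numeric values, the Status shape, and the validate_query helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Code(Enum):
    # Numeric values are placeholders, not the registered error codes.
    OK = 0
    MISMATCHED_DIMENSION = 1
    INVALID_TOPK = 2


@dataclass(frozen=True)
class Status:
    code: Code
    message: str = ""

    def ok(self) -> bool:
        return self.code is Code.OK


def validate_query(vector, expected_dim: int, topk: int) -> Status:
    """Return a Status describing the first problem found, or OK."""
    if len(vector) != expected_dim:
        return Status(Code.MISMATCHED_DIMENSION,
                      f"expected {expected_dim} dimensions, got {len(vector)}")
    if topk <= 0:
        return Status(Code.INVALID_TOPK, f"topk must be positive, got {topk}")
    return Status(Code.OK)
```

Callers branch on status.ok() and read status.message for diagnostics, so invalid configuration never unwinds the stack across the native boundary.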

Logging and observability for index build and search are routed through the logging wrappers in src/db/common/logger.h and src/db/common/glogger.h, and latency/throughput telemetry is collected by the profiler in src/db/common/profiler.h.

Performance Characteristics

Index build cost is dominated by graph construction for HNSW and by k-means clustering for IVF, and is linear in the number of vectors for Flat. Query cost is sub-linear for HNSW (graph traversal), roughly proportional to the nprobe/nlist fraction of the dataset for IVF, and linear for Flat. FTS query cost depends mainly on the length of the posting lists for the matched terms. Resource limits (CPU, memory) used for auto-tuning are discovered through the cgroup utility in src/db/common/cgroup_util.h, and the global resource singleton in src/db/common/global_resource.h caches these values for the lifetime of the process.

See Also

  • Architecture Overview
  • Collection API Reference
  • Hybrid Query Semantics
  • Tuning and Benchmarking Guide

Source: https://github.com/alibaba/zvec / Human Manual

SDKs, Language Bindings & AI Extensions

Related topics: Introduction, Features & Quickstart, Vector & Full-Text Indexing Algorithms

Overview

Zvec is an open-source, in-process vector database built on a C++ core that exposes a stable surface to language-specific SDKs. The repository ships official bindings for Python, Node.js, Go, Rust, and Dart/Flutter, plus an AI extension framework for on-device embedding workflows. All language bindings are layered on top of a shared set of common C++ infrastructure components that govern error reporting, configuration, logging, persistence, and resource allocation.

This page documents how those layers relate to the public SDKs, what platform and CPU constraints apply, and where the AI extension framework plugs into the runtime.

Common C++ Foundation Behind Every SDK

Every language binding ultimately calls into the C++ runtime that lives under src/db/common/. The components that SDK authors most directly interact with are:

  • Error code registry: src/db/common/error_code.h defines a singleton ErrorCode map and macro helpers (PROXIMA_ZVEC_ERROR_CODE_DEFINE, PROXIMA_ZVEC_ERROR_CODE_DECLARE) used to surface Status values from native code to the host language. Bindings translate these codes into language-native exceptions.
  • Global constants and limits: src/db/common/constants.h defines defaults such as DEFAULT_MEMORY_LIMIT_RATIO = 0.8f, MIN_MEMORY_LIMIT_BYTES, MAX_ARRAY_FIELD_LEN = 32, COMPACT_DELETE_RATIO_THRESHOLD = 0.3f, and name-validation regexes (COLLECTION_NAME_REGEX, FIELD_NAME_REGEX, DOC_PK_REGEX). SDKs reuse these regexes to validate user input before crossing the FFI boundary.
  • Thread pool management: src/db/common/global_resource.h exposes GlobalResource::query_thread_pool() and GlobalResource::optimize_thread_pool() as lazily-initialized singletons, allowing SDKs to offload work without each binding re-implementing concurrency.
  • Logging: src/db/common/logger.h and src/db/common/glogger.h provide file- and append-style loggers (glog-backed) configured via LogUtil::Init. The CLOG_* / WLOG_* macros in src/db/common/typedef.h scope log lines to a collection or WAL path.
  • RocksDB context: src/db/common/rocksdb_context.h wraps rocksdb::DB and per-column-family merge operators. This is the substrate that powers column-family-aware features such as scalar term indexes (relevant to issue #427, where drop_index did not always clean up its column family before a subsequent create_index).
  • Profiler: src/db/common/profiler.h collects per-stage latency into a JSON tree, exposed through bindings to surface query timing breakdowns to SDK users.
  • Resource detection: src/db/common/cgroup_util.h reads CPU and memory limits from cgroups/sysinfo and Windows APIs, informing how SDKs size their default caches and worker pools.

src/core/interface/utils/utils.h provides a small templated utility, extract_enum_from_json, that the JSON-driven schema parsers in every binding share so that enum parameters round-trip identically across languages.
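The idea behind such a helper can be sketched in Python, assuming a by-name lookup that returns nothing on bad input instead of raising. The MetricType enum and this function's exact signature are hypothetical; the C++ original is a template built on magic_enum.

```python
import json
from enum import Enum
from typing import Optional, Type, TypeVar

E = TypeVar("E", bound=Enum)


class MetricType(Enum):
    # Hypothetical enum used only for this illustration.
    L2 = "L2"
    IP = "IP"
    COSINE = "COSINE"


def extract_enum_from_json(obj: dict, key: str, enum_type: Type[E]) -> Optional[E]:
    """Map a JSON string field to an enum member by name; None if absent or invalid."""
    raw = obj.get(key)
    if not isinstance(raw, str):
        return None
    return enum_type.__members__.get(raw)


params = json.loads('{"metric": "IP", "dimension": 128}')
metric = extract_enum_from_json(params, "metric", MetricType)
```

Here metric is MetricType.IP, while an unknown name, a missing key, or a non-string value all yield None, so each binding can reject malformed parameters in the same way.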

Language Bindings at a Glance

| Binding | Distribution | Notes |
|---|---|---|
| Python | pip install zvec | SWIG-backed; supports Python 3.10–3.14 (#131) |
| Node.js | npm install @zvec/zvec | Prebuilt native binaries; known AVX crash on old CPUs (#512) |
| Go | github.com/zvec-ai/zvec-go | Added in v0.5.0 (long-requested in #61) |
| Rust | github.com/zvec-ai/zvec-rust | Added in v0.5.0 |
| Dart / Flutter | flutter pub add zvec | Android (arm64-v8a) and iOS (arm64); prebuilt libs auto-downloaded (v0.4.0) |
| Elixir | Community SDK over the C API | Announced in #403 |
| Zvec Studio | github.com/zvec-ai/zvec-studio | Visual GUI on top of the C API |

All first-party packages surface the same primitive set: Collection, Doc, CollectionSchema, HnswIndexParam, Query, and Status. Platform support is matrixed: Linux, macOS, Windows (native since v0.3.0, see #23), Android cross-compilation (v0.3.0), and RISC-V (v0.5.0).

AI Extension Framework

Introduced in v0.2.0 (PR #88), the AI extension framework lets applications attach embedding models directly to a collection so that documents are embedded at insert time and queries are embedded at search time, all within the same in-process call. Subsequent releases have extended the framework:

  • v0.2.1 — Added Jina Embeddings v5 integration (#156) and a custom-embedding hook for plugging in user-trained models.
  • v0.5.0 — Combined AI extensions with native Full-Text Search (FTS) to support hybrid retrieval in a single MultiQuery that fuses dense vectors, sparse vectors, scalar filters, and text matches (#408).
  • v0.5.1 — Introduced an External Vector Source (#490) so embeddings can be computed outside the process and re-ingested, and a zero-copy VectorViewClause for queries (#478).

Because the framework runs in-process, the same GlobalResource thread pools and Profiler instrumentation cover the embedding path, so latency and CPU usage appear in the same per-stage JSON tree that the SDKs already expose for vector queries.

Platform Constraints and Common Failure Modes

Two recurring themes appear in community discussions and are worth noting when choosing a binding:

  1. CPU feature detection — v0.1.0 added CPU flag detection and dispatch (PR #3), and the runtime defaults to AVX-enabled code paths. On older hosts without AVX, the Node.js SDK has been reported to crash at load time (#512). Until a non-AVX Node build is published, the workaround is to deploy on hardware that supports AVX, matching the build machine.
  2. Column-family lifecycle for scalar indexes — Issue #427 reports that create_index() on a scalar field can fail after a prior drop_index() because the underlying RocksDB column family (for example, category$TERMS) is not always cleaned up. Bindings that surface a ValueError with an empty message are typically a symptom of this; the RocksdbContext wrapper in src/db/common/rocksdb_context.h is the layer to inspect when patching the lifecycle.

For deployment, the SDKs follow the same rules as the C++ core: a single process owns writes, multiple processes may read the same collection concurrently, and data is durable via the WAL defined alongside the RocksdbContext layer.

See Also

  • Core Concepts & Architecture
  • Index Types: HNSW, DiskANN, IVF, FTS
  • Storage Engine: RocksDB, MMap, and WAL
  • Query & Hybrid Retrieval API

Source: https://github.com/alibaba/zvec / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 9 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: high
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/alibaba/zvec/issues/512

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/alibaba/zvec/issues/427

3. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/alibaba/zvec

4. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: packet_text.keyword_scan | https://github.com/alibaba/zvec

5. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/alibaba/zvec

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/alibaba/zvec

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/alibaba/zvec

8. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/alibaba/zvec

9. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/alibaba/zvec

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 11 (count of project-level external discussion links exposed on this manual page).

Use: review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using zvec with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence