# https://github.com/memvid/memvid Project Manual

Generated at: 2026-06-22 20:50:29 UTC

## Table of Contents

- [Overview and System Architecture](#page-1)
- [Core Features, Search and Ingestion](#page-2)
- [Data Operations, Time-Travel and Troubleshooting](#page-3)
- [SDKs, Deployment, Docker and Provider Integration](#page-4)

<a id='page-1'></a>

## Overview and System Architecture

### Related Pages

Related topics: [Core Features, Search and Ingestion](#page-2), [Data Operations, Time-Travel and Troubleshooting](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/memvid/memvid/blob/main/README.md)
- [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)
- [src/analysis/auto_tag.rs](https://github.com/memvid/memvid/blob/main/src/analysis/auto_tag.rs)
- [src/analysis/ner.rs](https://github.com/memvid/memvid/blob/main/src/analysis/ner.rs)
- [src/search/tantivy/mod.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/mod.rs)
- [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)
- [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)
- [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)
- [src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)
- [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)
- [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)
- [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)
- [docker/README.md](https://github.com/memvid/memvid/blob/main/docker/README.md)
</details>

# Overview and System Architecture

## Purpose and Scope

Memvid is a **single-file memory layer for AI agents** that provides persistent, versioned, and portable memory without requiring a traditional database. The project is published as a Rust crate (`memvid-core`) and ships with a companion CLI, both distributed as the `memvid` project. As described in the [README.md](https://github.com/memvid/memvid/blob/main/README.md), the system is designed for "instant retrieval and long-term memory" with the key property that "everything lives in a single `.mv2` file" — no `.wal`, `.lock`, `.shm`, or sidecar files are produced.

The library's public surface revolves around the `Memvid` type (constructed via `Memvid::create` and similar factory functions), which exposes a `put`-then-`commit` ingestion model, a `search` API supporting multiple retrieval strategies, and convenience methods for transcript and embedding workflows. The system targets both local-first development (ONNX-based embeddings, Whisper transcription, CLIP image search) and cloud integration (OpenAI embeddings) through opt-in feature flags.

## Core Architecture

Memvid is organized into several cooperating subsystems, each encapsulated in its own module path:

```mermaid
graph TB
    Client[Client / CLI / SDK]
    Memvid["Memvid Core<br/>(lifecycle.rs)"]
    WAL["Embedded WAL<br/>(crash recovery)"]
    Frames["Frame Store<br/>(data segments)"]
    Lex["Lex Index<br/>(Tantivy / BM25)"]
    Vec["Vec Index<br/>(HNSW)"]
    Time["Time Index"]
    TOC["TOC Footer<br/>(segment offsets)"]
    Analysis["Analysis Pipeline<br/>(auto_tag, ner, temporal)"]

    Client -->|put / search| Memvid
    Memvid --> WAL
    Memvid --> Frames
    Memvid --> Lex
    Memvid --> Vec
    Memvid --> Time
    Memvid --> TOC
    Frames --> Analysis
    Lex -.rerank.-> Memvid
    Vec -.rerank.-> Memvid
```

- **Lifecycle and frame management** is the entry point: `Memvid::create("knowledge.mv2")` allocates a new file, while `mem.put(...)` accepts documents with `PutOptions` (title, URI, tags) and `mem.commit()` finalizes them.
- **Crash-safe write-ahead log**: an embedded WAL provides durability, and a recent fix in v2.0.140 addressed a checksum mismatch that could occur under sustained `put` + `commit` workloads that grew the WAL region (community report #230).
- **Index subsystems** are conditionally compiled: `lex` enables Tantivy-backed full-text search, `vec` enables HNSW-backed vector search, and additional features cover CLIP, Whisper, encryption, and natural-language temporal parsing.

Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md), [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)

## Data Model and File Format

All persisted state is stored in a single `.mv2` file with a fixed structural layout. The on-disk regions, in order, are:

| Region          | Size           | Purpose                                       |
| --------------- | -------------- | --------------------------------------------- |
| Header          | 4 KB           | Magic, version, capacity                      |
| Embedded WAL    | 1–64 MB        | Crash recovery                                |
| Data Segments   | Variable       | Compressed frames                             |
| Lex Index       | Variable       | Tantivy full-text index                       |
| Vec Index       | Variable       | HNSW vectors                                  |
| Time Index      | Variable       | Chronological ordering                        |
| TOC (Footer)    | Variable       | Segment offsets for random access             |

The `Frame` type is the fundamental unit of stored content. Each frame carries an ID, status (`Active`/tombstoned), optional `parent_id` for chunked documents, URI, tags, labels, track, timestamp, and metadata (including MIME type). Frame-level MIME classification is used to decide what is text-indexable — `is_frame_text_indexable` skips binary content such as video, image, and audio frames and accepts textual MIME types plus document types like PDFs, Office, and OpenDocument formats.

Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md), [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)

## Search and Retrieval System

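The search subsystem is structured around a *primary* Tantivy lexical path with a *fallback* path that walks the embedded lex index directly, plus an optional **sketch** pre-filter for fast candidate generation. Vector (HNSW) search is available when the `vec` feature is enabled.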

**Hybrid ranking pipeline** ([src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)):

1. If sketch indexes are available, a Hamming-distance sketch pre-filter narrows the candidate set before lexical/vector scoring.
2. `try_tantivy_search` ([src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)) is attempted first when the Tantivy engine is loaded; the engine is built in [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs) using a stemmed tokenizer and indexed fields for content, timestamp, frame id, tags, labels, track, and URI.
3. If Tantivy is unavailable or returns no usable result, `search_with_lex_fallback` ([src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)) performs BM25-style lexical matching directly.
4. The `builders.rs` module coordinates the construction and decoding of the `LexIndex` and `VecIndex` artifacts; `build_vec_artifact` rebuilds the vector index from active frames when needed.
5. Search hits are enriched with temporal metadata (when the `temporal_track` feature is enabled) and entity annotations from the Logic-Mesh in [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs), which propagates entities from parent documents down to child chunks.

The `analysis` module provides the supporting intelligence: `auto_tag.rs` derives tags and labels via token frequency and regex phrase extraction, and `ner.rs` runs an ONNX-based NER model to extract entities for the Logic-Mesh.

Source: [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs), [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs), [src/analysis/auto_tag.rs](https://github.com/memvid/memvid/blob/main/src/analysis/auto_tag.rs)

## Deployment and Community Context

The project ships a multi-architecture Docker image (`memvid/cli`, supporting `linux/amd64` and `linux/arm64`) that runs as a non-root user, with images automatically published to Docker Hub via GitHub Actions as documented in [docker/README.md](https://github.com/memvid/memvid/blob/main/docker/README.md). The README links to a hosted sandbox at `sandbox.memvid.com` and dedicated documentation at `docs.memvid.com`.

Community discussion has surfaced several recurring themes that frame realistic expectations of the architecture:

- **Scale and benchmark transparency** — Issue #42 requests reproducible benchmarks for "real-world" workloads beyond the bundled single-PDF example, given the system's claim of being a cloud-vector-database replacement.
- **Local LLM integration** — Issue #23 asks for `ollama`-style local model support alongside the existing cloud embedding path.
- **Compatibility with proxy/gateway endpoints** — Issue #202 proposes custom base URLs for OpenAI- and Gemini-compatible providers (LiteLLM, enterprise proxies).
- **Critical evaluation** — Issue #43 links to a third-party critique ([janekm/retrieval_comparison](https://github.com/janekm/retrieval_comparison/blob/main/memvid_critique.md)) comparing memvid against alternative retrieval stacks; readers evaluating the system are encouraged to review this alongside the official benchmarks.
- **Durability hardening** — The v2.0.140 release fixed a WAL checksum mismatch that could be triggered by workloads that grow the embedded WAL region, demonstrating the project's active maintenance of the crash-recovery path.

Source: [docker/README.md](https://github.com/memvid/memvid/blob/main/docker/README.md), [README.md](https://github.com/memvid/memvid/blob/main/README.md)

## See Also

- File Format Specification (`MV2_SPEC.md`)
- Search and Retrieval Subsystem
- Analysis Pipeline (NER, Auto-tagging, Temporal)
- Deployment and CLI Reference

---

<a id='page-2'></a>

## Core Features, Search and Ingestion

### Related Pages

Related topics: [Overview and System Architecture](#page-1), [Data Operations, Time-Travel and Troubleshooting](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/memvid/memvid/blob/main/README.md)
- [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)
- [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)
- [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)
- [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)
- [src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)
- [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)
- [src/search/tantivy/mod.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/mod.rs)
- [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)
- [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)
- [src/analysis/auto_tag.rs](https://github.com/memvid/memvid/blob/main/src/analysis/auto_tag.rs)
</details>

# Core Features, Search and Ingestion

## Overview

Memvid is a single-file memory layer for AI agents that combines persistent storage, full-text search, vector similarity, and chronological retrieval inside one `.mv2` container. The library is organized around two core workflows: **ingestion** (writing frames via `put`/`commit`) and **search** (lexical, vector, hybrid, and adaptive retrieval against the same container). Feature flags gate optional capabilities such as BM25, HNSW vectors, CLIP, Whisper transcription, OpenAI embeddings, and encryption (Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)).

The project explicitly avoids sidecar files: there is no `.wal`, `.lock`, or `.shm` file. Instead, an embedded WAL lives inside the container. Sustained workloads that grew the WAL region could previously trigger checksum-mismatch errors; release v2.0.140 fixed this in `grow_wal_region` (community report #230, see the [releases page](https://github.com/memvid/memvid/releases)).

## Single-File Container Layout

The `.mv2` file is a self-contained archive with multiple regions:

```mermaid
graph TD
    A[Header 4KB<br/>Magic, version, capacity] --> B[Embedded WAL 1-64MB<br/>Crash recovery]
    B --> C[Data Segments<br/>Compressed frames]
    C --> D[Lex Index<br/>Tantivy full-text]
    D --> E[Vec Index<br/>HNSW vectors]
    E --> F[Time Index<br/>Chronological ordering]
    F --> G[TOC Footer<br/>Segment offsets]
```

Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)

Each frame written through `put` becomes a compressed segment, and the in-memory builder layer reconstructs the Lex/Vec index artifacts on demand (Source: [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)).

## Ingestion: Put, Commit, and Auto-Tagging

`Memvid::create("knowledge.mv2")` creates a new container; `Memvid::put` adds documents with optional title, URI, tags, and metadata. Frames are batched until `commit`, which finalizes segments and flushes indices (Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)).
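The batching behavior described above can be pictured with a toy model. This is a minimal sketch, not the `memvid-core` API: `ToyCapsule` and its methods are hypothetical, and it assumes (the source does not say so) that frames only become searchable once `commit` has run.

```rust
/// Hypothetical stand-in for a capsule; NOT the memvid-core API.
#[derive(Default)]
struct ToyCapsule {
    pending: Vec<String>,   // frames accepted by put, not yet finalized
    committed: Vec<String>, // frames finalized by commit
}

impl ToyCapsule {
    /// Queue a document. In this toy model it is not searchable yet.
    fn put(&mut self, text: &str) {
        self.pending.push(text.to_string());
    }

    /// Finalize everything queued so far; returns how many frames were flushed.
    fn commit(&mut self) -> usize {
        let flushed = self.pending.len();
        self.committed.append(&mut self.pending);
        flushed
    }

    /// Case-sensitive substring search over committed frames only.
    fn search(&self, needle: &str) -> Vec<&str> {
        self.committed
            .iter()
            .filter(|frame| frame.contains(needle))
            .map(|frame| frame.as_str())
            .collect()
    }
}
```

Calling `put` and then `search` on a fresh `ToyCapsule` returns nothing until `commit` is called, which is the visibility rule this sketch assumes.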

Auto-tagging enriches ingested text by extracting the most frequent meaningful tokens. The implementation in `auto_tag.rs` lowercases the input, applies a stopword filter, drops tokens shorter than three characters, and ranks the remainder by descending frequency. It also derives capitalized phrase labels (e.g., `Meeting Notes`) using a line-anchored regex (Source: [src/analysis/auto_tag.rs](https://github.com/memvid/memvid/blob/main/src/analysis/auto_tag.rs)). The analysis module re-exports `auto_tag` alongside NER and temporal enrichment (Source: [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)).
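The frequency-ranking steps can be sketched with the standard library alone. The stopword list, the function name `top_tags`, and the alphabetical tie-break below are assumptions made for illustration; the actual `auto_tag.rs` may differ, and the regex-based label extraction is not covered here.

```rust
use std::collections::HashMap;

/// Tiny illustrative stopword list; the real set in auto_tag.rs is not reproduced here.
const STOPWORDS: &[&str] = &["the", "and", "for", "with", "that", "this"];

/// Return up to `limit` tags ranked by descending frequency.
/// Ties are broken alphabetically (an assumption, chosen for determinism).
fn top_tags(text: &str, limit: usize) -> Vec<String> {
    let lowered = text.to_lowercase();
    let mut counts: HashMap<String, usize> = HashMap::new();
    for token in lowered
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| t.chars().count() >= 3) // drop tokens shorter than three characters
        .filter(|t| !STOPWORDS.contains(t)) // drop stopwords
    {
        *counts.entry(token.to_string()).or_insert(0) += 1;
    }
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.into_iter().take(limit).map(|(tag, _)| tag).collect()
}
```

For the input `"The memory layer stores memory frames for the agent"` with `limit = 2`, this sketch yields `memory` first (two occurrences) and then `agent` (first alphabetically among the single-occurrence tokens).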

For audio/video inputs, the `whisper` feature runs Whisper transcription with configurable model sizes (`whisper-tiny-en-q8k` for fast/constrained use, `whisper-tiny-en` for balance, or the default FP32 small model for highest accuracy) (Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)). CLIP embeddings enable image search through the `clip` feature.

## Search: Lexical, Vector, Hybrid, and Adaptive

The search layer is composed in `src/memvid/search/mod.rs`. The top-level dispatcher decides which engine to use based on enabled features and query characteristics. When a sketch track is available, it generates a candidate set via Hamming distance (threshold 32, with a candidate budget of `top_k * 10` and a floor of 500), which is then intersected with any existing filters before the reduced set is passed to BM25 (Source: [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)).
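A Hamming-distance pre-filter of this kind can be sketched as follows. The 64-bit sketch width, the inclusive threshold comparison, and the names `hamming`, `max_candidates`, and `prefilter` are assumptions for illustration; the source only states the threshold (32) and the candidate budget.

```rust
/// Number of differing bits between two sketches (hypothetical 64-bit layout).
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Candidate budget: ten times top_k, but never fewer than 500.
fn max_candidates(top_k: usize) -> usize {
    (top_k * 10).max(500)
}

/// Return ids of frames whose sketch is within `threshold` bits of the query
/// sketch, closest first, truncated to the candidate budget.
/// `frames` holds (frame_id, sketch) pairs.
fn prefilter(query: u64, frames: &[(u64, u64)], threshold: u32, top_k: usize) -> Vec<u64> {
    let mut scored: Vec<(u32, u64)> = frames
        .iter()
        .map(|&(id, sketch)| (hamming(query, sketch), id))
        .filter(|&(dist, _)| dist <= threshold)
        .collect();
    scored.sort(); // by distance, then by frame id
    scored.truncate(max_candidates(top_k));
    scored.into_iter().map(|(_, id)| id).collect()
}
```

The surviving ids would then be intersected with any other filters before lexical scoring, as the paragraph above describes.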

### Tantivy BM25 Path

If the `lex` feature is enabled, `try_tantivy_search` runs the BM25/Tantivy engine. Tokens from the query are stemmed through `engine.analyse_text` so that they match the indexed forms (e.g., `technology` → `technolog`). The engine is initialised with a single-threaded writer for deterministic index generation (Source: [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)). Each indexed document carries the frame content, timestamp, frame ID, tags, labels, track, and URI as separate fields (Source: [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)). The Tantivy module re-exports `TantivyEngine`, `TantivySnapshot`, and `TantivyDocHit` for downstream use (Source: [src/search/tantivy/mod.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/mod.rs)).

### Fallback Path

If Tantivy is not available, `search_with_lex_fallback` walks a `LexIndex` directly. Matches are computed per token, then snippet windows are sliced from each document. Stale frames are skipped with `stale_skips` counted on the response (Source: [src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)).
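A rough sketch of per-token matching, snippet slicing, and stale-frame counting is shown below. All names (`Doc`, `token_matches`, `snippet`, `fallback_search`) and the ASCII-only case folding are assumptions; what makes a frame "stale" in the real code is not specified in the source, so it is modeled here as a plain flag.

```rust
/// Hypothetical document record; `stale` stands in for whatever staleness check the real code uses.
struct Doc<'a> {
    text: &'a str,
    stale: bool,
}

/// Byte offsets of every ASCII-case-insensitive occurrence of `token` in `text`.
fn token_matches(text: &str, token: &str) -> Vec<usize> {
    if token.is_empty() {
        return Vec::new();
    }
    // ASCII lowercasing preserves byte offsets, so positions map back to `text`.
    let haystack = text.to_ascii_lowercase();
    let needle = token.to_ascii_lowercase();
    haystack.match_indices(&needle).map(|(pos, _)| pos).collect()
}

/// Slice roughly `window` bytes centered on `center`, widened to valid char boundaries.
fn snippet(text: &str, center: usize, window: usize) -> &str {
    let mut start = center.saturating_sub(window / 2).min(text.len());
    let mut end = (center + window / 2).min(text.len());
    while !text.is_char_boundary(start) {
        start -= 1;
    }
    while !text.is_char_boundary(end) {
        end += 1;
    }
    &text[start..end]
}

/// Return one snippet per matching live document, plus the number of stale documents skipped.
fn fallback_search<'a>(docs: &[Doc<'a>], token: &str, window: usize) -> (Vec<&'a str>, usize) {
    let mut snippets = Vec::new();
    let mut stale_skips = 0;
    for doc in docs {
        if doc.stale {
            stale_skips += 1;
            continue;
        }
        if let Some(&pos) = token_matches(doc.text, token).first() {
            snippets.push(snippet(doc.text, pos, window));
        }
    }
    (snippets, stale_skips)
}
```

The window is centered on the first match only, which is the simplest choice; a fuller implementation might pick the window covering the most query tokens.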

### Hit Scoring and Context Assembly

After the engine returns raw hits, `reorder_hits_by_token_matches` re-ranks them by unique-token count, total occurrences, and tightest span (Source: [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)). `build_context` then groups hits by base URI (split on `#`) and assembles a diverse context capped at 24 hits to avoid overwhelming the LLM with noise (Source: [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)). When the `temporal_track` feature is enabled, `attach_temporal_metadata` annotates hits with parsed timestamps, and entity enrichment pulls NER tags from both the chunk frame and its parent document (Source: [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)).
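The three-key ordering and the base-URI split can be sketched as below. `HitMetrics`, `reorder`, and `base_uri` are hypothetical names, and the exact tie-breaking in `reorder_hits_by_token_matches` may differ; this only illustrates ranking by more unique tokens, then more occurrences, then a tighter span.

```rust
/// Hypothetical per-hit match metrics; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct HitMetrics {
    frame_id: u64,
    unique_tokens: usize,     // distinct query tokens found in the hit
    total_occurrences: usize, // all token occurrences in the hit
    span: usize,              // width of the smallest window covering the matches
}

/// Sort hits: more unique tokens first, then more occurrences, then tighter span.
fn reorder(hits: &mut [HitMetrics]) {
    hits.sort_by(|a, b| {
        b.unique_tokens
            .cmp(&a.unique_tokens)
            .then(b.total_occurrences.cmp(&a.total_occurrences))
            .then(a.span.cmp(&b.span))
    });
}

/// Strip the fragment so chunks of the same document share one grouping key.
fn base_uri(uri: &str) -> &str {
    uri.split('#').next().unwrap_or(uri)
}
```

Because `sort_by` is stable, hits that tie on all three keys keep the order the engine returned them in.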

### Adaptive Vector Search

`search_adaptive` and `search_adaptive_acl` inspect the relevance score distribution to return all relevant results while excluding noise, which matters when answers span multiple chunks or when score spread varies by query (Source: [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)). This complements the fixed `top_k` retrieval by handling multi-chunk answers that would otherwise be truncated.
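The source does not spell out the cutoff rule, so the following is only one plausible way to size a result set from the score distribution: keep every hit whose score is at least a fixed fraction of the top score. The function name, the `min_ratio` parameter, and the rule itself are assumptions, and the sketch expects non-negative scores sorted in descending order.

```rust
/// How many leading hits to keep, given scores sorted in descending order.
/// Keeps hits scoring at least `min_ratio` times the top score, up to `max_results`.
/// This relative-threshold rule is an illustrative assumption, not memvid's algorithm.
fn adaptive_count(scores: &[f32], min_ratio: f32, max_results: usize) -> usize {
    let top = match scores.first() {
        Some(&s) => s,
        None => return 0,
    };
    let floor = top * min_ratio;
    scores
        .iter()
        .take(max_results)
        .take_while(|&&s| s >= floor)
        .count()
}
```

With scores `[0.9, 0.85, 0.8, 0.3, 0.1]` and `min_ratio = 0.5`, the three closely scored hits are kept and the two low-scoring ones are dropped, whereas a fixed `top_k` of 2 would have cut off the third.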

### Indexability Gate

`is_frame_text_indexable` ensures only Active frames with text-classified MIME types are pushed to the search index. It accepts a curated list of `application/*` text types (JSON, XML, YAML, TOML, SQL, etc.), Office document subtypes, and any `+xml`/`+json` suffix while explicitly skipping binary MIME types such as video, image, and audio (Source: [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)).
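A gate of this shape might look like the sketch below. The `Status` enum, the function names, the short allow-list, and the order of the checks (binary families rejected before the suffix rule) are assumptions; the curated list in `api.rs` is longer and is not reproduced here.

```rust
/// Hypothetical frame status; only Active frames are indexable.
#[derive(Clone, Copy, PartialEq)]
enum Status {
    Active,
    Tombstoned,
}

/// Illustrative subset of text-like application types.
const TEXT_APPS: &[&str] = &[
    "application/json",
    "application/xml",
    "application/yaml",
    "application/toml",
    "application/sql",
    "application/pdf",
];

/// Classify a MIME type as text-indexable, ignoring parameters such as charset.
fn is_text_mime(mime: &str) -> bool {
    let lowered = mime.trim().to_ascii_lowercase();
    let essence = lowered.split(';').next().unwrap_or("").trim();
    if essence.starts_with("text/") {
        return true;
    }
    // Binary families are rejected before the suffix rule (ordering is an assumption).
    if ["video/", "image/", "audio/"].iter().any(|p| essence.starts_with(p)) {
        return false;
    }
    if essence.ends_with("+xml") || essence.ends_with("+json") {
        return true;
    }
    TEXT_APPS.contains(&essence)
}

/// A frame is indexable only if it is Active and its MIME type is text-like.
fn is_indexable(status: Status, mime: &str) -> bool {
    status == Status::Active && is_text_mime(mime)
}
```

Under this ordering `image/svg+xml` is skipped even though it carries a `+xml` suffix; whether the real gate treats it the same way is not stated in the source.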

## Community Notes and Known Limits

- **Real-world scale claims (issue #42)**: The default example only ships a single PDF, which has prompted users to benchmark larger corpora to validate vector-DB replacement claims.
- **OpenAI-compatible gateways (issue #202)**: Users have requested custom `base_url` support for OpenAI and Gemini embeddings to plug into LiteLLM and enterprise proxies — relevant for the `api_embed` ingestion path.
- **Local LLMs (issue #23)**: Ollama integration has been requested as an alternative to cloud embeddings.
- **WAL corruption (v2.0.140)**: A region-growth bug caused `wal record checksum mismatch` errors; the fix patches `grow_wal_region`/`ensure_wal_capacity` so the data region is shifted correctly and offsets stay aligned.

## See Also

- File format specification: `MV2_SPEC.md`
- Feature flag reference: `README.md` § Feature Flags
- Tantivy engine internals: `src/search/tantivy/engine.rs`
- Docker deployment: `docker/README.md`

---

<a id='page-3'></a>

## Data Operations, Time-Travel and Troubleshooting

### Related Pages

Related topics: [Overview and System Architecture](#page-1), [Core Features, Search and Ingestion](#page-2)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/memvid/memvid/blob/main/README.md)
- [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)
- [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)
- [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)
- [src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)
- [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)
- [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)
- [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)
- [src/analysis/ner.rs](https://github.com/memvid/memvid/blob/main/src/analysis/ner.rs)
- [src/analysis/auto_tag.rs](https://github.com/memvid/memvid/blob/main/src/analysis/auto_tag.rs)
</details>

# Data Operations, Time-Travel and Troubleshooting

## Overview

Memvid is a single-file memory layer for AI agents that bundles persistent frames, embedded WAL, lexical and vector indexes, and chronological metadata into one `.mv2` capsule [Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)]. This page covers three concerns that fall out of that design:

1. **Data operations** — the `put`, `commit`, and `search` lifecycle that adds and reads frames in the capsule.
2. **Indexing and analysis** — the Tantivy lexical engine, vector builder, NER pipeline, and auto-tagging used to enrich search hits.
3. **Troubleshooting** — observable failure modes that surface in the code, including the WAL corruption bug fixed in release v2.0.140.

The goal of this page is to describe how those three concerns interlock, what APIs and engines are available, and which failure modes are documented in the source.

## Data Operations: Put, Commit, and Search

The public surface is intentionally narrow. `Memvid::create(path)` opens a new capsule, `put_bytes_with_options(...)` appends a frame with metadata (title, URI, tags, labels), and `commit()` flushes the embedded WAL and index segments [Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)]. Reads go through `Memvid::search(SearchRequest)`, which returns a ranked list of `SearchHit` records with snippets, ranks, and metadata.

Search execution is layered. When the `lex` feature is enabled and a sketch track is available, a request first goes through a sketch-based pre-filter, which generates candidates in microseconds and then hands them off to either the Tantivy engine or the lex BM25 fallback for reranking [Source: [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)]. The pre-filter is parameterized by `hamming_threshold` and `max_candidates`; the latter is set to `top_k * 10` with a floor of 500 to keep recall high before BM25 prunes noise [Source: [src/memvid/search/mod.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/mod.rs)].

Adaptive retrieval is exposed through `search_adaptive` (and its ACL variant `search_adaptive_acl`), which sizes the result set from the score distribution rather than returning a fixed `top_k` window [Source: [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)]. This matters when answers span multiple chunks of the same document, because a hard cut-off can drop the supporting context.

```mermaid
flowchart LR
    A[Client] -->|put_bytes_with_options| B[Memvid capsule]
    B -->|commit| C[Embedded WAL + Indexes]
    A -->|search| D[Sketch pre-filter]
    D -->|candidates| E{Tantivy available?}
    E -->|yes| F[Tantivy BM25 + snippets]
    E -->|no| G[Lex BM25 fallback]
    F --> H[SearchResponse]
    G --> H[SearchResponse]
```

The `search` API also enforces ACL filtering and, when `temporal_track` is compiled in, attaches RFC3339 timestamps and date mentions to each hit's metadata [Source: [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs), [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)].

## Indexing, Analysis, and Time-Aware Enrichment

Three indexing engines can be active simultaneously, and the capsule decides which to use per query:

- **Tantivy lexical engine** — created in a temp directory with a single-writer thread for deterministic index generation; documents are tokenized, stemmed, and stored with frame id, timestamp, tags, labels, track, and URI fields [Source: [src/search/tantivy/engine.rs](https://github.com/memvid/memvid/blob/main/src/search/tantivy/engine.rs)]. Search uses stemmed tokens so that the index and query analyzers agree [Source: [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)].
- **Lex BM25 fallback** — used when Tantivy is unavailable; computes matches and snippets directly from the embedded lex index [Source: [src/memvid/search/fallback.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/fallback.rs)].
- **HNSW vector index** — built incrementally as embeddings arrive; `build_vec_artifact` reuses the prior index entries that point to active frames and adds new ones [Source: [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)].

Index artifacts are produced via `LexIndex::decode` and `VecIndex::decode` after their respective builders finish, then attached to segments in the capsule footer [Source: [src/memvid/search/builders.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/builders.rs)].

Analysis runs alongside indexing and is exposed through the `analysis` module [Source: [src/analysis/mod.rs](https://github.com/memvid/memvid/blob/main/src/analysis/mod.rs)]:

- **NER (`ner.rs`)** — loads an ONNX session and a Hugging Face tokenizer, pads to the longest in batch, and decodes logits of shape `[1, seq_len, num_labels]` into `ExtractedEntity` records with byte offsets and a configurable `min_confidence` [Source: [src/analysis/ner.rs](https://github.com/memvid/memvid/blob/main/src/analysis/ner.rs)].
- **Auto-tagging (`auto_tag.rs`)** — lowercases tokens, drops short words and a hard-coded stopword set, counts frequencies, and returns the top-N labels. It also derives title-case phrases from line-anchored `^([A-Z][A-Za-z0-9 &/-]{3,})$` patterns to seed human-readable labels.
- **Temporal enrichment** — gated behind the `temporal_track` and `temporal_enrich` features, surfaces RFC3339 timestamps and detected date mentions per hit.
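The logit-decoding step of the NER bullet can be illustrated without an ONNX runtime. This sketch assumes the batch dimension of 1 has already been dropped, so the input is a `[seq_len][num_labels]` matrix; the function names are hypothetical, and mapping tokens back to byte offsets and merging tokens into `ExtractedEntity` records are out of scope.

```rust
/// Numerically stable softmax over one token's label logits.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// For each token row, return the most likely label index and its probability,
/// or None when the best probability is below `min_confidence`.
fn decode_labels(logits: &[Vec<f32>], min_confidence: f32) -> Vec<Option<(usize, f32)>> {
    logits
        .iter()
        .map(|row| {
            softmax(row)
                .into_iter()
                .enumerate()
                .max_by(|a, b| a.1.total_cmp(&b.1))
                .filter(|&(_, p)| p >= min_confidence)
        })
        .collect()
}
```

A row with one dominant logit decodes to that label with high confidence, while a flat row (all logits equal) falls below a `min_confidence` of 0.5 and is dropped.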

Search hits are enriched after the engine returns results: entities are pulled from the Logic-Mesh for the hit's frame, and if the frame is a `DocumentChunk` the parent document frame is consulted, since NER runs on full documents [Source: [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)]. Token occurrences and match metrics are also computed to support tighter reranking by `reorder_hits_by_token_matches` [Source: [src/memvid/search/helpers.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/helpers.rs)].
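The parent lookup for chunk frames can be sketched with a plain map. Here the Logic-Mesh is modeled as a `HashMap` from frame id to entity names, and `Frame` and `entities_for_hit` are hypothetical; the real data structures are not shown in the source.

```rust
use std::collections::HashMap;

/// Hypothetical frame record: chunks carry the id of their parent document.
struct Frame {
    id: u64,
    parent_id: Option<u64>,
}

/// Collect entities for a hit: the frame's own entities first, then the
/// parent document's, without duplicates.
fn entities_for_hit<'a>(frame: &Frame, mesh: &'a HashMap<u64, Vec<String>>) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = Vec::new();
    // Visit the frame itself, then its parent if it has one.
    for id in std::iter::once(frame.id).chain(frame.parent_id) {
        if let Some(entities) = mesh.get(&id) {
            for entity in entities {
                if !out.contains(&entity.as_str()) {
                    out.push(entity.as_str());
                }
            }
        }
    }
    out
}
```

A chunk with no entities of its own still surfaces its parent's entities, which matches the rationale given above that NER runs on full documents.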

## Troubleshooting and Known Failure Modes

Several failure modes are visible in the codebase and the community thread.

**WAL corruption after region growth (fixed in v2.0.140).** Sustained `put` + `commit` workloads that grew the embedded WAL surfaced as `Embedded WAL is corrupted at offset N: wal record checksum mismatch`, sometimes paired with runaway sparse-file growth past EOF. The fault was in `grow_wal_region` / `ensure_wal_capacity`, which shift the data region by `delta` when the WAL grows; the v2.0.140 fix makes that shift keep offsets aligned. Deployments that rely on Memvid for long-running ingestion should run v2.0.140 or later [Source: community context, latest release v2.0.140].

**WAL vs. sidecar expectations.** Memvid explicitly avoids `.wal`, `.lock`, `.shm`, or sidecar files — everything lives in the `.mv2` header, embedded WAL, data segments, lex/vec/time indexes, and a TOC footer [Source: [README.md](https://github.com/memvid/memvid/blob/main/README.md)]. Operators debugging on-disk artifacts should expect a single file; missing sidecars are by design, not a bug.

**Benchmark and scale questions.** Issue #42 asks for benchmarks on encoding times and the real-world performance delta versus cloud vector databases, since the example corpus is a single PDF [Source: community context #42]. Issue #43 ("Are the claims actually what you claim?") points to an external critique of retrieval quality [Source: community context #43]. Both motivate running `cargo test --release` and the `examples/` programs (including `examples/openai_embedding.rs`) against a representative corpus before declaring parity with managed vector stores.

**Local LLM and provider routing.** Issue #23 asks about Ollama support and #202 asks for custom base URLs for OpenAI/Gemini-compatible gateways, indicating that the embedding and ask flows need to route through user-supplied endpoints for offline and enterprise setups [Source: community context #23, #202]. Until those land, callers must rely on the `api_embed` feature and the default OpenAI endpoint documented in `README.md`.

**Tantivy indexing hygiene.** Because Tantivy stems during indexing, a mismatch between the indexer and the search-time analyzer produces empty result sets; the `try_tantivy_search` path explicitly re-stems query tokens before evaluating [Source: [src/memvid/search/tantivy.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/tantivy.rs)].

## See Also

- [README.md](https://github.com/memvid/memvid/blob/main/README.md) — feature flags, quick start, file format
- [MV2_SPEC.md](https://github.com/memvid/memvid/blob/main/MV2_SPEC.md) — capsule file format specification
- [examples/openai_embedding.rs](https://github.com/memvid/memvid/blob/main/examples/openai_embedding.rs) — cloud embedding flow

---

<a id='page-4'></a>

## SDKs, Deployment, Docker and Provider Integration

### Related Pages

Related topics: [Core Features, Search and Ingestion](#page-2), [Data Operations, Time-Travel and Troubleshooting](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/memvid/memvid/blob/main/README.md)
- [docker/README.md](https://github.com/memvid/memvid/blob/main/docker/README.md)
- [docker/core/README.md](https://github.com/memvid/memvid/blob/main/docker/core/README.md)
- [docker/cli/README.md](https://github.com/memvid/memvid/blob/main/docker/cli/README.md)
- [docker/cli/Dockerfile](https://github.com/memvid/memvid/blob/main/docker/cli/Dockerfile)
- [docker/cli/TESTING.md](https://github.com/memvid/memvid/blob/main/docker/cli/TESTING.md)
- [src/analysis/ner.rs](https://github.com/memvid/memvid/blob/main/src/analysis/ner.rs)
- [src/memvid/search/api.rs](https://github.com/memvid/memvid/blob/main/src/memvid/search/api.rs)
</details>

# SDKs, Deployment, Docker and Provider Integration

## Overview

Memvid ships as a multi-surface library: a Rust crate (`memvid-core`), a command-line binary, and containerized images. The same single-file `.mv2` format is shared across all surfaces, so a memory file produced in one can be read in another without translation. This page covers the SDK shape, the Docker deployment story, and the provider integration points (cloud embeddings, local models, NER).

The crate is published on crates.io as `memvid-core` ([README.md:1-60](https://github.com/memvid/memvid/blob/main/README.md)), and the CLI image is published to Docker Hub under `memvid/cli` with multi-architecture support for `linux/amd64` and `linux/arm64` ([docker/README.md:1-50](https://github.com/memvid/memvid/blob/main/docker/README.md)).

## SDK Distribution and Feature Flags

The Rust SDK is feature-gated so consumers only pay for the capabilities they need. Feature flags are documented in the project README and are the primary knob for tuning build size, dependencies, and runtime behavior.

| Feature             | Description                                                      |
| ------------------- | ---------------------------------------------------------------- |
| `lex`               | Full-text search with BM25 ranking (Tantivy)                     |
| `pdf_extract`       | Pure Rust PDF text extraction                                    |
| `vec`               | Vector similarity search (HNSW + local text embeddings via ONNX) |
| `clip`              | CLIP visual embeddings for image search                          |
| `whisper`           | Audio transcription with Whisper                                 |
| `api_embed`         | Cloud API embeddings (OpenAI)                                    |
| `temporal_track`    | Natural language date parsing ("last Tuesday")                   |
| `parallel_segments` | Multi-threaded ingestion                                         |
| `encryption`        | Password-based encryption capsules (.mv2e)                       |
| `symspell_cleanup`  | Robust PDF text repair (fixes "emp lo yee" -> "employee")        |

Source: [README.md:120-180]()

Consumers enable features selectively in `Cargo.toml`:

```toml
[dependencies]
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track"] }
```

The high-level `Memvid` API (`Memvid::create`, `put`, `commit`, `search`) stays stable across feature combinations; gating is mostly internal ([README.md:60-110]()).
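As a rough illustration of how Cargo feature gating can leave a public signature unchanged, the sketch below uses the standard `cfg!` macro with feature names taken from the table above. The functions `compiled_features` and `search_backend`, and the backend labels they return, are hypothetical and are not part of `memvid-core`.

```rust
/// Hypothetical helper: lists which of three Cargo features were compiled in.
/// `cfg!` is resolved at compile time, so a disabled feature costs nothing at runtime.
pub fn compiled_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(feature = "lex") {
        enabled.push("lex");
    }
    if cfg!(feature = "vec") {
        enabled.push("vec");
    }
    if cfg!(feature = "temporal_track") {
        enabled.push("temporal_track");
    }
    enabled
}

/// Hypothetical entry point: the signature is the same for every feature
/// combination, only the selected backend label differs (labels are made up).
pub fn search_backend() -> &'static str {
    if cfg!(feature = "lex") {
        "tantivy-bm25"
    } else {
        "fallback-scan"
    }
}
```

Compiled outside Cargo, or with no features enabled, `compiled_features()` returns an empty list and `search_backend()` takes the fallback branch. Callers do not need to change in either case.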

## Docker Deployment

Memvid publishes two Docker images: a development image for the core crate and a CLI image for production use. Both run as a non-root user, and the CLI image is published through GitHub Actions.

### Image Layout

```mermaid
graph LR
  A[Source: memvid/memvid] --> B[core Dockerfile]
  A --> C[cli Dockerfile]
  B --> D[memvid-core:dev]
  C --> E[memvid/cli:latest]
  C --> F[memvid/cli:2.0.129]
  D --> G[cargo build / cargo test]
  E --> H[CLI usage]
  F --> H
```

### Core Development Image

The core image is intended for running examples, tests, and benchmarks against the Rust crate. It mounts the source tree and a cargo cache to keep iteration fast. The image is configured for OOM-resistance testing with explicit `--memory` limits ([docker/core/README.md:1-60]()).

```bash
# Development container
docker-compose exec dev cargo run --example basic_usage

# With features
docker-compose exec dev cargo run --example pdf_ingestion --features lex,pdf_extract

# Memory-constrained OOM test
docker run --rm --memory=150m --memory-swap=150m \
  -v "$(pwd)":/app \
  memvid-test cargo test --features encryption --test encryption_capsule
```

Source: [docker/core/README.md:1-60]()

### CLI Image

The CLI image wraps the compiled `memvid` binary and runs as a non-root user named `memvid` for improved security. Images are automatically published to Docker Hub on tag push via the workflow in `.github/workflows/docker-release.yml`. Tags follow `latest` plus version-specific values like `2.0.129` ([docker/README.md:1-50]()).

```bash
cd cli
docker build -t memvid/cli:test .
```

Source: [docker/README.md:1-50]()

## Cloud Provider Integration

### OpenAI Embeddings

The `api_embed` feature enables OpenAI cloud embeddings, with documented model choices for different quality/cost trade-offs ([README.md:140-200]()).

| Model                      | Dimensions | Best For                   |
| -------------------------- | ---------- | -------------------------- |
| `text-embedding-3-small`   | 1536       | Default, fastest, cheapest |
| `text-embedding-3-large`   | 3072       | Highest quality            |
| `text-embedding-ada-002`   | 1536       | Legacy model               |

A complete reference example is shipped at `examples/openai_embedding.rs`. The `api_embed` flag is the only feature required to enable this path, and no model files need to be downloaded.
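Because the models in the table produce vectors of different widths, it can help to check a vector's length before it is stored. The sketch below encodes the table as a lookup and adds a guard. Both `embedding_dimensions` and `check_vector` are illustrative helpers, not functions from `memvid-core`.

```rust
/// Output dimensions for the OpenAI embedding models listed in the table above.
pub fn embedding_dimensions(model: &str) -> Option<usize> {
    match model {
        "text-embedding-3-small" | "text-embedding-ada-002" => Some(1536),
        "text-embedding-3-large" => Some(3072),
        _ => None,
    }
}

/// Hypothetical guard: reject a vector whose length does not match the model.
pub fn check_vector(model: &str, vector: &[f32]) -> Result<(), String> {
    let expected = embedding_dimensions(model)
        .ok_or_else(|| format!("unknown embedding model: {model}"))?;
    if vector.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "{model} expects {expected} dimensions, got {}",
            vector.len()
        ))
    }
}
```

A check like this makes a model switch fail loudly, for example when a 1536-dimension model is swapped for the 3072-dimension one, rather than mixing incompatible vectors in one index.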

### Custom Base URLs (LiteLLM, Gateways, Proxies)

Community request #202 ("Support custom provider base URLs for OpenAI and Gemini") tracks the need to point the embedding and ask flows at non-default endpoints: LiteLLM proxies, OpenAI-compatible gateways, enterprise AI proxies, and Gemini-compatible routed hosts. Memvid's cloud path is where such routing is expected to land. Consumers who need the feature today should follow the issue for the planned API shape.

Source: [README.md:140-200](), Issue [#202](https://github.com/memvid/memvid/issues/202)

## Local Model Options

### Text Embeddings (ONNX)

The `vec` feature bundles local text embeddings via ONNX. The default model is BGE-small (384 dimensions), and it must be downloaded manually before first use ([README.md:80-140]()).

```bash
mkdir -p ~/.cache/memvid/text-models
curl -L 'https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/...'
```

Larger BGE models are supported by overriding the cache path; quantization trade-offs are exposed through the configuration layer.
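Since the model must be in place before first use, a pre-flight check can report what is missing. The sketch below builds the cache directory used in the commands above and lists absent files. The helpers `text_model_dir` and `missing_files` are hypothetical, and the caller supplies the file names because this page does not list them.

```rust
use std::path::{Path, PathBuf};

/// Builds `<home>/.cache/memvid/text-models`, the directory created above.
pub fn text_model_dir(home: &Path) -> PathBuf {
    home.join(".cache").join("memvid").join("text-models")
}

/// Hypothetical pre-flight check: returns the names in `required` that are
/// not present as regular files inside `dir`.
pub fn missing_files(dir: &Path, required: &[&str]) -> Vec<String> {
    let mut missing = Vec::new();
    for name in required {
        if !dir.join(name).is_file() {
            missing.push(name.to_string());
        }
    }
    missing
}
```

Running this at startup turns a missing download into an explicit error message instead of a failure deep inside model loading.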

### Audio Transcription (Whisper)

The `whisper` feature adds offline transcription with multiple model size/quantization presets selected via the `MEMVID_WHISPER_MODEL` environment variable ([README.md:50-120]()).

| Model                  | Size  | Speed   | Best For                              |
| ---------------------- | ----- | ------- | ------------------------------------- |
| `whisper-tiny-en`      | 75 MB | Fast    | Balanced                              |
| `whisper-tiny-en-q8k`  | 19 MB | Fastest | Quick testing, resource-constrained  |

Programmatic configuration goes through `WhisperConfig::with_quantization()` or `WhisperConfig::with_model(name)`.
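Validating the environment variable early avoids a confusing failure at transcription time. This sketch encodes only the two presets from the table above, so the crate may accept names it does not list. The constant `WHISPER_PRESETS` and both functions are illustrative and not part of the Memvid API.

```rust
/// Presets from the table above: (model name, approximate size in MB).
/// Assumption: this list may be incomplete relative to the crate itself.
pub const WHISPER_PRESETS: &[(&str, u32)] = &[
    ("whisper-tiny-en", 75),
    ("whisper-tiny-en-q8k", 19),
];

/// Hypothetical lookup: size of a known preset, or `None` if it is not in the table.
pub fn preset_size_mb(name: &str) -> Option<u32> {
    WHISPER_PRESETS
        .iter()
        .find(|(preset, _)| *preset == name)
        .map(|(_, size)| *size)
}

/// Reads `MEMVID_WHISPER_MODEL`, treating an unset or blank value as "not set".
pub fn whisper_model_from_env() -> Option<String> {
    std::env::var("MEMVID_WHISPER_MODEL")
        .ok()
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}
```

A caller could combine the two, for example by logging a warning when `whisper_model_from_env()` returns a name for which `preset_size_mb` yields `None`.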

### Named Entity Recognition

The NER subsystem loads ONNX-based models at runtime. `NerModel::load` takes a model path, a tokenizer path, and an optional minimum confidence threshold (default 0.5), and is gated on the NER feature. The tokenizer pads each batch with the `BatchLongest` strategy, on the right, using `[PAD]` as the pad token ([src/analysis/ner.rs:1-80]()).

```mermaid
graph TD
  A[Memvid::put] --> B[Text Frames]
  B --> C[NER Analysis]
  C --> D[Entity Frames]
  B --> E[Lex Index / Tantivy]
  B --> F[Vec Index / HNSW]
  C --> G[Search Enrichment]
  E --> G
  F --> G
```
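The padding and threshold behavior described for the NER subsystem can be sketched in plain Rust. This is not the code in `src/analysis/ner.rs`: the `Entity` struct and both functions are hypothetical, the pad id is passed in because the real id of `[PAD]` comes from the tokenizer vocabulary, and treating the threshold as inclusive is an assumption.

```rust
/// Right-pads every sequence to the length of the longest one in the batch
/// ("BatchLongest"). Returns the padded ids and an attention mask per sequence
/// (1 = real token, 0 = padding).
pub fn pad_batch_longest(batch: &[Vec<u32>], pad_id: u32) -> (Vec<Vec<u32>>, Vec<Vec<u8>>) {
    let longest = batch.iter().map(Vec::len).max().unwrap_or(0);
    let mut ids = Vec::with_capacity(batch.len());
    let mut masks = Vec::with_capacity(batch.len());
    for seq in batch {
        let mut padded = seq.clone();
        let mut mask = vec![1u8; seq.len()];
        padded.resize(longest, pad_id); // padding goes on the right
        mask.resize(longest, 0);
        ids.push(padded);
        masks.push(mask);
    }
    (ids, masks)
}

/// Hypothetical entity record produced by an NER pass.
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub text: String,
    pub label: String,
    pub confidence: f32,
}

/// Keeps entities at or above `min_confidence` (inclusive bound is an assumption).
pub fn filter_entities(entities: Vec<Entity>, min_confidence: f32) -> Vec<Entity> {
    entities
        .into_iter()
        .filter(|entity| entity.confidence >= min_confidence)
        .collect()
}
```

Padding to the longest sequence in the batch, rather than to a fixed maximum, keeps tensors small when most inputs are short.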

## Operational Notes and Known Issues

- **Provider routing**: Until custom base URLs ship (tracked in #202), consumers using LiteLLM or enterprise proxies must run their own local OpenAI-compatible relay and point standard environment variables at it.
- **Local LLM support**: Issue #23 requests first-class Ollama support. Today only embeddings and transcription run locally, so generation must still flow through a remote or user-managed LLM.
- **WAL durability**: v2.0.140 fixes a WAL checksum mismatch (#230) where sustained `put` + `commit` workloads could corrupt the embedded WAL during region growth. Upgrading to v2.0.140 is recommended for any deployment with heavy sustained writes.
- **Container security**: Both Docker images run as a non-root user; mounted host directories should be granted appropriate permissions for the `memvid` UID/GID ([docker/README.md:1-50]()).
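For the WAL durability note above, a deployment script might want to check whether a given version string is at least v2.0.140. The sketch below does that with plain tuple comparison. `parse_version` and `has_wal_checksum_fix` are hypothetical helpers, and the check assumes later releases keep the fix.

```rust
/// Parses "2.0.140" or "v2.0.140" into (major, minor, patch).
/// Returns `None` for anything that is not exactly three numeric parts.
pub fn parse_version(text: &str) -> Option<(u64, u64, u64)> {
    let mut parts = text.trim().trim_start_matches('v').split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// True when `version` is v2.0.140 or newer. Tuples compare lexicographically,
/// which matches major/minor/patch ordering.
pub fn has_wal_checksum_fix(version: &str) -> Option<bool> {
    parse_version(version).map(|parsed| parsed >= (2, 0, 140))
}
```

With this helper, the `2.0.129` tag shown in the image layout diagram would be reported as predating the fix.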

## See Also

- [Architecture and File Format](Architecture-and-File-Format.md)
- [Search and Retrieval](Search-and-Retrieval.md)
- [Encryption Capsules](Encryption-Capsules.md)
- [CLI Reference](CLI-Reference.md)

---


## Pitfall Log

Project: memvid/memvid

Summary: Found 16 structured pitfall items, including 1 high-severity (blocking) item. Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/218

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/230

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/234

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/225

## 5. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/215

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/210

## 7. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/100

## 8. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/131

## 9. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/memvid/memvid

## 10. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/222

## 11. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memvid/memvid

## 12. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/memvid/memvid

## 13. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/memvid/memvid

## 14. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/memvid/memvid/issues/219

## 15. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memvid/memvid

## 16. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/memvid/memvid

<!-- canonical_name: memvid/memvid; human_manual_source: deepwiki_human_wiki -->