# https://github.com/cozodb/cozo Project Manual

Generated at: 2026-06-24 00:52:12 UTC

## Table of Contents

- [Project Overview and Architecture](#page-1)
- [Query Engine and Datalog Language](#page-2)
- [Storage Backends and Persistence](#page-3)
- [Language Bindings, Server, and Advanced Search Features](#page-4)

<a id='page-1'></a>

## Project Overview and Architecture

### Related Pages

Related topics: [Query Engine and Datalog Language](#page-2), [Storage Backends and Persistence](#page-3), [Language Bindings, Server, and Advanced Search Features](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md)
- [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md)
- [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)
- [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs)
- [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs)
- [cozo-lib-c/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/README.md)
- [cozo-lib-java/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-java/src/lib.rs)
- [cozo-lib-nodejs/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/src/lib.rs)
- [cozo-lib-nodejs/package.json](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/package.json)
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md)
- [cozorocks/README.md](https://github.com/cozodb/cozo/blob/main/cozorocks/README.md)
- [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs)
- [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs)
</details>

# Project Overview and Architecture

## What is CozoDB

Cozo is a general-purpose, transactional, relational-database/graph-database hybrid that uses Datalog as its query language. The project positions Datalog as a powerful alternative to SQL or graph query languages, with the ability to express recursive queries, perform graph traversals, and integrate advanced index types such as HNSW vectors and MinHash-LSH near-duplicate search.

As described in the main [README.md](https://github.com/cozodb/cozo/blob/main/README.md), the latest version (v0.7) extends the v0.6 release by adding MinHash-LSH for near-duplicate search, full-text search, and JSON value support. The project is licensed under MPL-2.0 or later and offers language bindings for Rust, Python, Node.js, Java, C, Swift, and Android, plus a WebAssembly build.

## High-Level Architecture

Cozo is organized as a Rust core engine wrapped by language-specific bindings and an HTTP server. Storage is delegated to pluggable backends, and queries are expressed in CozoScript (a Datalog dialect) and submitted to a `DbInstance`.

```mermaid
graph TB
    A[Client Applications] -->|REST / HTTP| B[cozo-bin server]
    A2[Python / JS / Java / C / Swift] -->|FFI / JNI / N-API| C[cozo-core]
    A3[Browser] -->|WASM| C
    C --> D[CozoScript Datalog Engine]
    D --> E[Storage Abstraction]
    E --> F[(Memory)]
    E --> G[(SQLite)]
    E --> H[(RocksDB via cozorocks)]
    I[REPL] --> C
```

The layered design allows the same engine to run in-process inside any host language, as an HTTP service, or inside a browser tab.

The binary crate exposes a server with HTTP endpoints such as `POST /text-query`, `GET /export/{relations}`, `PUT /import`, `POST /backup`, and `POST /import-from-backup`, plus an experimental Server-Sent Events stream at `GET /changes/{relation}` for change notifications (Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md) and [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)).

## Core Engine and Storage

The core engine is implemented in Rust under the `cozo` crate. Query parsing, optimization, and execution all happen there, and the engine exposes a small handle-based API: language bindings request a `DbInstance` for a given engine name (e.g. `mem`, `sqlite`, `rocksdb`) and a path, then submit scripts to it.

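Storage is pluggable. The files surveyed for this page reference the three backends below; [Storage Backends and Persistence](#page-3) lists two more (Sled and TiKV):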

| Backend   | Purpose                                            | Source location |
|-----------|----------------------------------------------------|-----------------|
| `mem`     | In-memory engine, useful for tests and REPL demos  | [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs) (default for the CLI) |
| `sqlite`  | Persistent single-file storage                     | Referenced as an `engine` option in [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs) |
| `rocksdb` | High-performance embedded KV store via FFI         | [cozorocks/README.md](https://github.com/cozodb/cozo/blob/main/cozorocks/README.md) and [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs) |

The `cozorocks` crate is described as "Bindings to RocksDB's C++ API" (Source: [cozorocks/README.md](https://github.com/cozodb/cozo/blob/main/cozorocks/README.md)). It wraps RocksDB primitives such as `DbBuilder`, `RocksDb`, transactions, iterators, and SST file writers behind a Rust-friendly surface, and exposes the underlying C++ errors as a `RocksDbStatus` struct (Source: [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs) and [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs)). The DB builder exposes a fluent API for tuning parallelism, bloom filters, blob files, block cache size, and other RocksDB options (Source: [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs)).

## Language Bindings

The same engine is reused across host languages through thin FFI/JNI layers. Each binding keeps a thread-safe map of integer handles to `DbInstance` values, with a global atomic counter to mint new IDs.

- **C**: The header is shipped as a single `cozo_c.h` file; an example program is provided in `example.c`. The open function, `cozo_open_db`, accepts an `engine` string (`"mem"`, `"sqlite"`, `"rocksdb"`), a UTF-8 path, and a JSON options string. Errors are returned as heap-allocated C strings that must be released with `cozo_free_str` (Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs) and [cozo-lib-c/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/README.md)).
- **Java/Android**: A `Handles` struct mirrors the C design; `Java_org_cozodb_CozoJavaBridge_openDb` takes engine, path, and options strings and returns a database ID (Source: [cozo-lib-java/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-java/src/lib.rs)). The artifacts are published under the Maven coordinates `io.github.cozodb:cozo_java` and `io.github.cozodb:cozo_android`; `cozo-clj` is the separate Clojure wrapper (Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md)).
- **Node.js**: The package `cozo-node` (version 0.7.6) is published with prebuilt native binaries downloaded via `node-pre-gyp` (Source: [cozo-lib-nodejs/package.json](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/package.json)). Rust code converts `DataValue` and `NamedRows` to JavaScript values through a small helper module (Source: [cozo-lib-nodejs/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/src/lib.rs)).
- **WebAssembly**: The published module is built with `wasm-pack build --target web --release` and requires `CARGO_PROFILE_RELEASE_LTO=fat`; the `--target no-modules` option is suggested for cross-browser web workers (Source: [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md)).

## CLI, REPL, and HTTP Server

The `cozo-bin` crate builds the `cozo` executable, which runs either as an interactive REPL or as an HTTP server. The REPL is built on `rustyline` and supports a set of meta-commands (`%set`, `%unset`, `%clear`, `%params`, `%run`, `%import`, `%save`, `%backup`, `%restore`) described in [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md). The REPL opens a database with the chosen engine, path, and config, and installs a `ctrlc` handler so the engine shuts down cleanly on Ctrl+C (Source: [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs)).

The HTTP server uses `axum` and `tower-http` (CORS, compression, optional auth layer) and exposes a `POST /text-query` endpoint that accepts a JSON body shaped as `{"script": "<COZOSCRIPT QUERY STRING>", "params": {}}` (Source: [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)). For safety, parameterized queries are strongly preferred over string concatenation.

## Community Context and Known Concerns

Several recurring community topics are worth noting for new contributors and users:

- **Maintenance status**: Issue [#301](https://github.com/cozodb/cozo/issues/301) raised the question "Is cozo still being maintained?" because of a long gap between commits; the most recent tagged release is **v0.7.6** (Source: [cozo-lib-nodejs/package.json](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/package.json)) and primarily contained a documentation update for `cozo_run_query`.
- **Vector and LLM integration**: Issue [#238](https://github.com/cozodb/cozo/issues/238) requested a LangChain/LlamaIndex vector store backed by Cozo. This aligns with the v0.6 HNSW vector search feature highlighted in [README.md](https://github.com/cozodb/cozo/blob/main/README.md), which is designed to integrate seamlessly with Datalog joins and recursive queries.
- **Binary embeddings**: Issue [#256](https://github.com/cozodb/cozo/issues/256) requested support for binary / int8 embeddings alongside HNSW indices. The current codebase exposes HNSW as a Datalog index, but binary quantisation is not yet wired in.
- **Cross-device sync / CRDT**: Issues [#240](https://github.com/cozodb/cozo/issues/240) and [#252](https://github.com/cozodb/cozo/issues/252) discuss CRDT-based synchronization between Cozo instances, referencing `cr-sqlite` and `datalog-crdt`. As of the current source tree, no CRDT layer ships in the repository. The change-feed endpoint (`GET /changes/{relation}`) documented in [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md) is marked experimental and is the closest existing facility for replication.

## See Also

- [Query Language (CozoScript)](https://docs.cozodb.org/en/latest/index.html)
- [HTTP API Reference](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md)
- [HNSW Vector Search](https://docs.cozodb.org/en/latest/releases/v0.6.html)
- [RocksDB Engine Options](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs)

---

<a id='page-2'></a>

## Query Engine and Datalog Language

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Storage Backends and Persistence](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md)
- [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md)
- [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs)
- [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)
- [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs)
- [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md)
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md)
- [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs)
- [cozorocks/src/bridge/mod.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/mod.rs)
</details>

# Query Engine and Datalog Language

## Overview and Purpose

CozoDB is a relational, transactional, graph-oriented database whose query interface is built around **Datalog**, a declarative logic-programming language that subsumes relational queries while making recursion first-class. The query engine and its Datalog dialect are positioned as the project's main user-facing surface: every binding (Rust, C, Node.js, WebAssembly, the standalone `cozo` binary, and the embedded HTTP server) ultimately dispatches CozoScript strings to the same engine via the `run` / `run_script` entry points.

According to the project README, Cozo's Datalog is "supercharged" beyond classical Datalog by allowing "recursion through a safe subset of aggregations" and by shipping canned algorithms (e.g., PageRank) for common graph recursions. The engine also supports **time travel** at the relation level: users opt in for each relation, and past states can be queried as if the database were immutable. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

The engine is designed to be composable. Rules can be defined and reused, and the language deliberately rejects the monolithic style of nested `SELECT-FROM-WHERE` queries, instead encouraging decomposition of queries into named, reusable rules.

## Datalog Language Characteristics

Datalog in CozoDB differs from SQL along several axes that the README highlights:

| Feature | Cozo Datalog | Notes |
|---|---|---|
| Expressiveness | Full relational queries plus recursion | Recursion is a primitive, not a CTE workaround |
| Composability | Rules act like functions | Queries can be built piece by piece |
| Time travel | Per-relation opt-in | Historical views are queryable |
| Recursion | Allowed over a safe subset of aggregations | Enables algorithms like PageRank |
| Graph integration | Recursive Datalog + graph algorithms | Property graph features are layered on the relational model |

Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

The README frames the choice of Datalog over SQL as deliberate: "Recursion in Datalog is much easier to express, much more powerful, and usually runs faster than in SQL." This is particularly relevant for graph workloads, where piercing insights "come from graph structures _implicit_ several levels deep" in the data. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## Query Submission Paths

The same Datalog string can be submitted through several interfaces. Each binding converges on a `run` / `run_script` function, which means users pick a transport based on deployment context, not on language features.

### Standalone REPL and CLI

The `cozo` binary ships a REPL driven by `rustyline`. Inside the REPL, queries are entered as CozoScript, and the following meta-ops are available:

- `%set <KEY> <VALUE>` / `%unset <KEY>` / `%clear` / `%params` — manage named parameters
- `%run <FILE>` — execute a script from a file
- `%import <FILE OR URL>` — import data in JSON format
- `%save <FILE>` — write the next result to a file
- `%backup <FILE>` / `%restore <FILE>` — snapshot and restore the database

Source: [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md) and [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs).
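
The meta-op grammar listed above can be illustrated with a small parser. This is a hypothetical sketch rather than the code in `repl.rs`: the `MetaOp` enum and the `parse_meta_op` and `one_arg` functions are names invented here, values are kept as raw text, and the real REPL may tokenize or interpret its input differently.

```rust
/// Hypothetical representation of the REPL meta-ops listed in this section.
#[derive(Debug, PartialEq)]
enum MetaOp {
    Set { key: String, value: String },
    Unset { key: String },
    Clear,
    Params,
    Run { file: String },
    Import { source: String },
    Save { file: String },
    Backup { file: String },
    Restore { file: String },
}

/// Build an op that needs exactly one non-empty argument.
fn one_arg(args: &str, make: fn(String) -> MetaOp) -> Option<MetaOp> {
    if args.is_empty() {
        None
    } else {
        Some(make(args.to_string()))
    }
}

/// Parse one input line. Returns `None` when the line is not a recognized
/// meta-op, in which case a REPL would treat it as CozoScript.
fn parse_meta_op(line: &str) -> Option<MetaOp> {
    let rest = line.trim().strip_prefix('%')?;
    // Split the op name from its (optional) argument text.
    let (op, args) = match rest.split_once(char::is_whitespace) {
        Some((op, args)) => (op, args.trim()),
        None => (rest, ""),
    };
    match op {
        "set" => {
            // `%set <KEY> <VALUE>`: the value is everything after the key.
            let (key, value) = args.split_once(char::is_whitespace)?;
            Some(MetaOp::Set {
                key: key.to_string(),
                value: value.trim().to_string(),
            })
        }
        "unset" => one_arg(args, |key| MetaOp::Unset { key }),
        "clear" => Some(MetaOp::Clear),
        "params" => Some(MetaOp::Params),
        "run" => one_arg(args, |file| MetaOp::Run { file }),
        "import" => one_arg(args, |source| MetaOp::Import { source }),
        "save" => one_arg(args, |file| MetaOp::Save { file }),
        "backup" => one_arg(args, |file| MetaOp::Backup { file }),
        "restore" => one_arg(args, |file| MetaOp::Restore { file }),
        _ => None,
    }
}

fn main() {
    for line in ["%set limit 10", "%backup snapshot.db", "?[] <- [[1, 2, 3]]"] {
        println!("{:?} => {:?}", line, parse_meta_op(line));
    }
}
```

Lines that do not start with `%` fall through as `None`, which mirrors the split between meta-ops and CozoScript queries described above.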

### HTTP API

The binary can also run as an HTTP server. The default endpoint is `http://127.0.0.1:9070/text-query`, accepting JSON of the form `{"script": "<COZOSCRIPT QUERY STRING>", "params": {}}`. The server handler decodes `params` into `DataValue` instances and forwards both the script and the mutability flag to `run_script_fold_err`. Responses are always JSON; a successful response has `"ok": true`. Source: [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md) and [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs).
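
To make the wire format concrete, the following sketch formats a raw HTTP/1.1 request for the endpoint described above. The helper name `format_text_query_request` is an assumption introduced for illustration, the script in the body is only an example, and the sketch does not open a connection. A real client would more likely use an HTTP library.

```rust
/// Format a raw HTTP/1.1 POST request for the `/text-query` endpoint.
/// `body` must already be valid JSON of the shape described above.
fn format_text_query_request(host: &str, port: u16, body: &str) -> String {
    format!(
        "POST /text-query HTTP/1.1\r\n\
         Host: {host}:{port}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {len}\r\n\
         Connection: close\r\n\
         \r\n\
         {body}",
        host = host,
        port = port,
        len = body.len(), // Content-Length counts bytes, not characters
        body = body
    )
}

fn main() {
    // Default address taken from the text; the script is illustrative only.
    let body = r#"{"script": "?[] <- [[1, 2, 3]]", "params": {}}"#;
    let request = format_text_query_request("127.0.0.1", 9070, body);
    println!("{}", request);
}
```

The resulting string could be written to a `std::net::TcpStream` connected to a running server; per the text above, a successful response is JSON containing `"ok": true`.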

```mermaid
flowchart LR
    A[CozoScript string] --> B{Transport}
    B -->|CLI| C[cozo REPL / cozo-bin]
    B -->|HTTP| D[POST /text-query]
    B -->|C FFI| E[cozo_run_query]
    B -->|Node.js| F[CozoDb.run]
    B -->|WASM| G[wasm-bindgen binding]
    C --> H[DbInstance::run_script]
    D --> H
    E --> H
    F --> H
    G --> H
    H --> I[Result as JSON]
```

### C and Node.js Bindings

The C ABI exposes `cozo_run_query` and friends, identified by an integer `db_id` drawn from a process-wide handle table. The C layer threads a UTF-8 script string and a JSON params payload into the engine. Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs).

The Node.js binding wraps the same logic in a `CozoDb` class whose `run(script, params)` method is `async` and returns a parsed object. The constructor accepts an `engine` string (`"mem"`, `"sqlite"`, `"rocksdb"`, etc.) and a `path`. Source: [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md).

## Parameters and Script Mutability

A common pattern across all bindings is the use of **named parameters** rather than string concatenation. The README and HTTP API both stress this: "Always use params instead of concatenating strings when you need parametrized queries." Source: [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md).
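
The difference between params and concatenation can be sketched as follows. The script text stays fixed and refers to `$name`, while the untrusted value travels in the `params` object as JSON data. The helpers `json_escape` and `build_body` are hypothetical names, the script is only an example, and the hand-written escaping covers the common JSON cases; a production client would typically use a JSON library instead.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a `/text-query` body whose script refers to `$name` and whose
/// params object carries the untrusted value as data, never as script text.
fn build_body(script: &str, name: &str) -> String {
    format!(
        "{{\"script\": \"{}\", \"params\": {{\"name\": \"{}\"}}}}",
        json_escape(script),
        json_escape(name)
    )
}

fn main() {
    // The script never changes, whatever the user typed.
    let script = "?[x] <- [[$name]]";
    let user_input = "alice\"]] :rm users"; // hostile-looking input stays inert data
    println!("{}", build_body(script, user_input));
}
```

Because the user input only ever appears inside a JSON string value, it cannot alter the structure of the script the engine parses.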

The HTTP handler accepts an optional `immutable` flag. When the server is started in `ScriptMutability::Immutable` mode, every script is forced to be read-only; otherwise the caller may opt in to read-only execution per request. Read-only execution is what makes it reasonable to expose time-travel queries and read-replica style endpoints to clients that should not modify data. Source: [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs).
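
The interaction between the server-wide mode and the per-request flag can be sketched as a small decision function. This is an assumption-laden illustration, not the logic in `server.rs`: the `Mode` enum and `effective_mode` function are invented names, and the sketch assumes a request can only tighten the mode, never loosen it.

```rust
/// Local stand-in for the mutability modes described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Mutable,
    Immutable,
}

/// Combine the server-wide mode with the optional per-request `immutable` flag.
/// Assumption: a missing flag means "no extra restriction requested".
fn effective_mode(server: Mode, request_immutable: Option<bool>) -> Mode {
    if server == Mode::Immutable || request_immutable.unwrap_or(false) {
        Mode::Immutable
    } else {
        Mode::Mutable
    }
}

fn main() {
    let cases = [
        (Mode::Immutable, Some(false)),
        (Mode::Mutable, Some(true)),
        (Mode::Mutable, None),
    ];
    for (server, flag) in cases {
        println!("{:?} + {:?} => {:?}", server, flag, effective_mode(server, flag));
    }
}
```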

## Storage-Backed Execution

The query engine is storage-agnostic. The same Datalog program runs against in-memory, SQLite, RocksDB, Sled, or TiKV backends. For RocksDB, the C++ FFI is exposed through `cozorocks`, a thin Rust wrapper around bindings generated with the `cxx` crate. Source: [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs) and [cozorocks/src/bridge/mod.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/mod.rs). The README reports roughly 100K QPS for mixed read/write transactional queries and more than 250K QPS for read-only queries against a 1.6M-row relation on RocksDB. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## Community and Status

Community interest in CozoDB's query engine centers on a few themes: cross-device sync via CRDTs (#240, #252), integration with LangChain/LlamaIndex for vector store use (#238), and binary/reduced-precision embeddings for vector search (#256). The v0.6 release added HNSW vector search, and v0.7 added MinHash-LSH near-duplicate search, full-text search, and JSON value support, all expressed inside Datalog. The latest published release at the time of writing is **v0.7.6**, which only updates the `cozo_run_query` doc comments (PR #190). Users frequently ask about the project's maintenance status (issue #301). The most recent tagged release and the steady set of feature requests suggest the engine remains usable, though core activity is currently quiet. Source: community context.

## See Also

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md) — high-level introduction and performance numbers
- [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md) — HTTP API and REPL meta-ops
- [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md) — Node.js client surface
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md) — browser deployment notes
- [cozo-docs](https://docs.cozodb.org/) — full CozoScript language reference (external)

---

<a id='page-3'></a>

## Storage Backends and Persistence

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Query Engine and Datalog Language](#page-2)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md)
- [cozorocks/README.md](https://github.com/cozodb/cozo/blob/main/cozorocks/README.md)
- [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs)
- [cozorocks/src/bridge/mod.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/mod.rs)
- [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs)
- [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs)
- [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md)
- [cozo-lib-nodejs/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/src/lib.rs)
- [cozo-lib-java/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-java/src/lib.rs)
- [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs)
- [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)
</details>

# Storage Backends and Persistence

## Overview

CozoDB persists its relations through a pluggable storage layer. The storage engine defines a `trait` interface whose required operations are essentially the provision of a key-value store for binary data with range-scan capabilities, and the rest of the database (query engine, schema, transactions, algorithms) sits on top of it. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).
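
The contract described above, a key-value store for binary data with range scans, can be sketched with a small trait and an in-memory implementation. The `KvStore` trait and `MemStore` type are hypothetical names chosen for this illustration; they are not CozoDB's actual storage trait, which also has to cover transactions and other concerns.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical minimal storage contract: binary keys and values plus range scans.
trait KvStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// Return all pairs with `lower <= key < upper`, in ascending key order.
    fn range_scan(&self, lower: &[u8], upper: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory implementation backed by an ordered map.
#[derive(Default)]
struct MemStore {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.data.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }

    fn range_scan(&self, lower: &[u8], upper: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        if lower >= upper {
            return Vec::new(); // empty or inverted range
        }
        self.data
            .range::<[u8], _>((Bound::Included(lower), Bound::Excluded(upper)))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

fn main() {
    let mut store = MemStore::default();
    store.put(b"user/2", b"bob");
    store.put(b"order/9", b"book");
    store.put(b"user/1", b"alice");

    // "user0" is the first key after every key that starts with "user/".
    for (k, v) in store.range_scan(b"user/", b"user0") {
        println!(
            "{} => {}",
            String::from_utf8_lossy(&k),
            String::from_utf8_lossy(&v)
        );
    }
}
```

Anything that can provide these ordered byte-level operations can, in principle, sit underneath the query engine, which is why very different systems can serve as backends.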

This design lets users pick the right trade-off between durability, throughput, and deployment topology. At the time of the v0.7.x release line, the project supports five backends, which the README abbreviates with single-letter codes in its installation matrix.

## Backend Matrix

The `engine` parameter passed when opening a database selects the backend by name (for example `mem`, `sqlite`, or `rocksdb`). The single-letter codes in the table below are the README's abbreviations, not values accepted by `engine`. The same engine names are recognised by the C, Java, and Node.js bindings and by the `cozo` CLI. Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs) and [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md).

| Code | Backend | Persistence | Notes |
|------|---------|-------------|-------|
| `M`  | In-memory | None | Non-persistent; useful for tests and ephemeral workloads. |
| `Q`  | SQLite | Single-file on disk | Also used as the backup file format for interchange between backends. |
| `R`  | RocksDB | Single-node disk | Supports a custom RocksDB options file for tuning. |
| `S`  | Sled | Single-node disk | Pure-Rust embedded store. |
| `T`  | TiKV | Distributed | Designed for clustered deployments. |

Source for the codes and the per-backend descriptions: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

Not all backends may be present in a given binary release because some are gated behind compile-time features. When the database is embedded in Rust, custom backends implementing the storage trait can also be supplied. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## On-Disk Data Format

CozoDB's storage layer defines a *row-oriented* binary data format that individual backend implementations do not need to understand. The key format is an implementation of the memcomparable format used by MyRocks, which allows rows of data to be stored as binary blobs such that a lexicographic sort of the bytes yields the correct logical order. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).
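
The idea behind a memcomparable encoding can be shown with signed integers. The sketch below is a generic illustration of the technique, not CozoDB's actual byte layout: flipping the sign bit and writing the result big-endian makes byte-wise comparison agree with numeric comparison, and concatenating fixed-width fields preserves tuple order. The function names are invented for this example.

```rust
/// Encode an i64 so that lexicographic byte order matches numeric order.
/// Flipping the sign bit maps i64::MIN..=i64::MAX onto 0..=u64::MAX in order;
/// big-endian output puts the most significant byte first.
fn encode_i64(v: i64) -> [u8; 8] {
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

/// Inverse of `encode_i64`.
fn decode_i64(bytes: [u8; 8]) -> i64 {
    (u64::from_be_bytes(bytes) ^ (1u64 << 63)) as i64
}

/// Composite key: fixed-width fields concatenated in column order,
/// so byte order matches tuple order.
fn encode_pair(a: i64, b: i64) -> Vec<u8> {
    let mut key = Vec::with_capacity(16);
    key.extend_from_slice(&encode_i64(a));
    key.extend_from_slice(&encode_i64(b));
    key
}

fn main() {
    let mut values = vec![3i64, -7, 0, 42, -1];
    // Sort using only the encoded bytes, as a key-value store would.
    values.sort_by_key(|v| encode_i64(*v));
    println!("sorted by encoded bytes: {:?}", values);

    let key = encode_pair(1, -5);
    println!("composite key for (1, -5): {:02x?}", key);
    println!("round trip of -7: {}", decode_i64(encode_i64(-7)));
}
```

A plain two's-complement big-endian encoding would sort negative numbers after positive ones, which is why the sign-bit flip is needed. Variable-length values such as strings need additional framing that this sketch does not cover.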

A practical consequence is that SQLite-backed CozoDB files cannot be queried with ordinary SQL: the data must pass through CozoDB's decoder, which understands the row layout. The same encoded form is written to RocksDB, Sled, and TiKV through the key-value trait, which is what enables the SQLite file to act as a universal backup format. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## RocksDB Options and the `cozorocks` Bridge

For the RocksDB backend, the `cozorocks` crate exposes a thin FFI bridge to RocksDB's C++ API. The top-level re-exports define the surface area a caller interacts with:

```rust
pub use bridge::db::DbBuilder;
pub use bridge::db::RocksDb;
pub use bridge::ffi::{RocksDbStatus, SnapshotBridge, StatusCode, StatusSeverity, StatusSubCode};
pub use bridge::iter::{DbIter, IterBuilder};
pub use bridge::tx::{PinSlice, Tx, TxBuilder};
```

Source: [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs).

The bridge declares the operations that map onto the storage trait: `DbBuilder` constructs databases, `Tx`/`TxBuilder` model transactions with configurable read and write options, and `DbIter`/`IterBuilder` produce range cursors. The `TxBridge` interface exposes `verify_checksums`, `fill_cache`, `set_snapshot`, `clear_snapshot`, and a generic `get`/`put` API, mirroring the key-value-with-range-scan contract that the storage trait requires. Source: [cozorocks/src/bridge/mod.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/mod.rs).

`DbBuilder` exposes knobs that cover most tuning needs without writing C++:

- `path`, `options_path`: location of the database and of an external RocksDB options file
- `prepare_for_bulk_load`: pre-tune compaction for one-shot ingest
- `increase_parallelism`: set the background thread pool size
- `optimize_level_style_compaction`: tune memtable and compaction settings for level-style compaction
- `create_if_missing`: auto-create empty databases
- Bloom filter and prefix-extractor options for read amplification control
- `block_cache_size`: explicit cache sizing

Source: [cozorocks/src/bridge/db.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/db.rs).
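
The fluent style mentioned above can be illustrated with a stand-in builder. This sketch does not link against RocksDB or `cozorocks`: `BuilderSketch` is a hypothetical type that only records settings, and although its method names echo the knobs listed above, the parameter types and semantics of the real `DbBuilder` may differ.

```rust
/// Hypothetical stand-in for a fluent database builder; it only records settings.
#[derive(Debug, Clone, Default, PartialEq)]
struct BuilderSketch {
    path: String,
    options_path: Option<String>,
    create_if_missing: bool,
    parallelism: Option<usize>,
    bulk_load: bool,
}

impl BuilderSketch {
    /// Each setter takes `self` by value and returns it, which enables chaining.
    fn path(mut self, path: &str) -> Self {
        self.path = path.to_string();
        self
    }

    fn options_path(mut self, path: &str) -> Self {
        self.options_path = Some(path.to_string());
        self
    }

    fn create_if_missing(mut self, yes: bool) -> Self {
        self.create_if_missing = yes;
        self
    }

    fn increase_parallelism(mut self, threads: usize) -> Self {
        self.parallelism = Some(threads);
        self
    }

    fn prepare_for_bulk_load(mut self, yes: bool) -> Self {
        self.bulk_load = yes;
        self
    }
}

fn main() {
    // Illustrative values only; the path is a placeholder.
    let builder = BuilderSketch::default()
        .path("/tmp/demo-db")
        .create_if_missing(true)
        .increase_parallelism(4)
        .prepare_for_bulk_load(false);
    println!("{:?}", builder);
}
```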

For full control, dropping a file named `options` into the database directory causes CozoDB to load it as a [RocksDB options file](https://github.com/facebook/rocksdb/wiki/RocksDB-Options-File). The standalone `cozo` executable emits a log message when this path is taken, so a DBA can confirm the file is being picked up. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## Backup, Restore, and Cross-Backend Interchange

Backups are written in the SQLite file format, regardless of which backend is in use. The C bridge exposes the lifecycle:

- `cozo_backup` writes a backup of the current database to a file.
- `cozo_restore` reads a backup into the target database, which must be empty.
- `cozo_import_from_backup` imports selected relations from a backup into a live database; triggers defined on the destination are *not* fired — if triggers must run, replay the data through a parameterised query instead.

Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs).

This is the canonical way to migrate data between backends, for example RocksDB → TiKV, since the row format is uniform across implementations. The CLI REPL mirrors the same operations with the meta-commands `%backup <FILE>` and `%restore <FILE>`; `%restore` requires the current database to be empty. Source: [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs) and [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## Selecting a Backend in Client Bindings

Every language binding surfaces the storage selection through an `engine` string. The Node.js binding, for example, accepts `engine`, `path`, and `options` in its constructor and exposes the same `run`, `exportRelations`, and `importRelations` surface regardless of backend. Source: [cozo-lib-nodejs/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/README.md) and [cozo-lib-nodejs/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/src/lib.rs).

The Java/JNI bridge uses the same identifiers in `Java_org_cozodb_CozoJavaBridge_openDb` and parks each opened instance behind a handle map so multiple databases can coexist. Source: [cozo-lib-java/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-java/src/lib.rs). The C bridge stores its handles in a `BTreeMap<i32, DbInstance>` guarded by a `Mutex`, and the HTTP server keeps its `DbInstance` behind an `Arc<Mutex<...>>` for concurrent request handling. Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs) and [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs).
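
The handle-table pattern described above can be sketched with the standard library alone. In this illustration `DbStub` is a placeholder for a real database instance and the `Handles` methods are hypothetical; the sketch only shows how an atomic counter and a `Mutex`-guarded `BTreeMap<i32, _>` cooperate, and is not the code from the bindings.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Mutex;

/// Stand-in for a real database instance; it only records how it was opened.
#[derive(Debug, Clone, PartialEq)]
struct DbStub {
    engine: String,
    path: String,
}

/// Handle table: an atomic counter mints ids, a mutex-guarded map stores instances.
struct Handles {
    next_id: AtomicI32,
    dbs: Mutex<BTreeMap<i32, DbStub>>,
}

impl Handles {
    fn new() -> Self {
        Handles {
            next_id: AtomicI32::new(0),
            dbs: Mutex::new(BTreeMap::new()),
        }
    }

    /// Register a new instance and return its integer handle.
    fn open(&self, engine: &str, path: &str) -> i32 {
        let id = self.next_id.fetch_add(1, Ordering::SeqCst);
        let db = DbStub {
            engine: engine.to_string(),
            path: path.to_string(),
        };
        self.dbs.lock().unwrap().insert(id, db);
        id
    }

    /// Look up an instance by handle; cloning lets the lock be released quickly.
    fn get(&self, id: i32) -> Option<DbStub> {
        self.dbs.lock().unwrap().get(&id).cloned()
    }

    /// Remove an instance; returns false if the handle was unknown.
    fn close(&self, id: i32) -> bool {
        self.dbs.lock().unwrap().remove(&id).is_some()
    }
}

fn main() {
    let handles = Handles::new();
    let a = handles.open("mem", "");
    let b = handles.open("sqlite", "data.db");
    println!("opened handles {} and {}", a, b);
    println!("lookup {}: {:?}", b, handles.get(b));
    println!("close {}: {}", a, handles.close(a));
    println!("close {} again: {}", a, handles.close(a));
}
```

Handing out plain integers instead of pointers keeps the foreign-language side simple: a stale or invalid id results in a failed lookup rather than a dangling reference.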

## Common Failure Modes and Caveats

- **Choosing the wrong options file path.** RocksDB will refuse to open the database if the `options` file is malformed. The `cozo` CLI logs a message when the file is loaded, so check the log when tuning does not appear to take effect. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).
- **Inspecting a SQLite backup with normal SQL tools.** The file uses CozoDB's row-oriented encoding and the memcomparable key format, so SQLite clients will see opaque blobs. Re-open the file through CozoDB. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).
- **Expecting triggers to fire on backup import.** `cozo_import_from_backup` and the equivalent `importRelations` API skip triggers; drive replay through queries with parameters if trigger side effects are required. Source: [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs).
- **Cross-device synchronisation.** CozoDB does not ship a built-in CRDT/sync layer; community discussions (#240, #252) explore integration with [cr-sqlite](https://github.com/vlcn-io/cr-sqlite) and datalog CRDTs, but the storage layer in v0.7.x is a single-node key-value store per process (TiKV provides distribution, not sync). Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).
- **Distribution availability.** Not every backend ships in every binary; check the [release artefacts](https://github.com/cozodb/cozo/releases) for which engines are compiled in, and fall back to the SQLite backend for the most portable exchange format. Source: [README.md](https://github.com/cozodb/cozo/blob/main/README.md).

## See Also

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md): high-level architecture and install matrix
- [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs): HTTP server and the `POST /text-query` handler
- [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs): REPL meta-commands (`%backup`, `%restore`, `%import`, `%save`)
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md): building the WebAssembly module
- [cozo-lib-nodejs/package.json](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/package.json): published `cozo-node` artefact (v0.7.6)
- Community issue #240: CRDT-based cross-device sync (open discussion)
- Community issue #252: CRDT integration with Cozo (open discussion)

---

<a id='page-4'></a>

## Language Bindings, Server, and Advanced Search Features

### Related Pages

Related topics: [Project Overview and Architecture](#page-1), [Query Engine and Datalog Language](#page-2), [Storage Backends and Persistence](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md)
- [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md)
- [cozo-bin/src/server.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/server.rs)
- [cozo-bin/src/repl.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/repl.rs)
- [cozo-bin/src/client.rs](https://github.com/cozodb/cozo/blob/main/cozo-bin/src/client.rs)
- [cozo-lib-c/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/src/lib.rs)
- [cozo-lib-c/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/README.md)
- [cozo-lib-nodejs/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/src/lib.rs)
- [cozo-lib-nodejs/package.json](https://github.com/cozodb/cozo/blob/main/cozo-lib-nodejs/package.json)
- [cozo-lib-wasm/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/src/lib.rs)
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md)
- [cozo-lib-java/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozo-lib-java/src/lib.rs)
- [cozorocks/README.md](https://github.com/cozodb/cozo/blob/main/cozorocks/README.md)
- [cozorocks/src/lib.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/lib.rs)
- [cozorocks/src/bridge/mod.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/mod.rs)
- [cozorocks/src/bridge/tx.rs](https://github.com/cozodb/cozo/blob/main/cozorocks/src/bridge/tx.rs)
</details>

# Language Bindings, Server, and Advanced Search Features

## Overview

Cozo is a graph-relational database that uses Datalog as its query language. The project ships a Rust core engine and exposes it through multiple language bindings (C, Node.js, WebAssembly, Java), a standalone HTTP server with a REPL, and an embeddable storage engine that links RocksDB via C++ FFI. Recent versions extend the query language with advanced search primitives: HNSW vector indices, MinHash-LSH for near-duplicate detection, and full-text search.

The following diagram summarizes the major surfaces a user can interact with:

```mermaid
flowchart TB
    Core[("cozo core (Rust Datalog engine)")]
    Core --> Bin["cozo-bin (REPL + HTTP server)"]
    Core --> C["cozo-lib-c (C ABI)"]
    Core --> Node["cozo-lib-nodejs (N-API)"]
    Core --> Java["cozo-lib-java (JNI)"]
    Core --> Wasm["cozo-lib-wasm (wasm-bindgen)"]
    Core --> Rocks["cozorocks (RocksDB C++ FFI)"]
    Bin -->|HTTP / SSE| Client[External clients]
    C --> Client
    Node --> Client
    Java --> Client
    Wasm --> Browser[Browser]
    Rocks -->|storage backend| Bin
    Rocks -->|storage backend| C
    Rocks -->|storage backend| Node
```

## Language Bindings

### C Library

The C library exposes a flat C ABI. The library is built as `cozo_c` and exports a single header [`cozo_c.h`](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/cozo_c.h). Database handles are tracked in a process-global `BTreeMap<i32, DbInstance>` and looked up by integer ID ([Source: cozo-lib-c/src/lib.rs:14-26]()).

Key entry points include:

| Function | Purpose |
|---|---|
| `cozo_open_db(engine, path, options)` | Open a database; returns an integer handle |
| `cozo_close_db(id)` | Release a handle |
| `cozo_run_query(id, script, params, immutable)` | Execute a CozoScript string and return a JSON result |
| `cozo_export_relations(id, data)` | Export relations to JSON |
| `cozo_import_relations(id, data)` | Import relations from JSON |
| `cozo_free_str(s)` | Free a returned C-string |

`engine` accepts `"mem"`, `"sqlite"`, or `"rocksdb"`; `path` is required for non-memory engines ([Source: cozo-lib-c/src/lib.rs:31-42]()). Pre-built binaries (`libcozo_c`) are published on the GitHub release page. From source, the build is `cargo build --release -p cozo_c -F compact -F storage-rocksdb` ([Source: cozo-lib-c/README.md:21-23]()).
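
The integer-handle scheme described above (and reused by the Java binding) can be sketched in a few lines. This is a minimal illustration of the pattern in Python, not the library's code: `HandleRegistry` and its method names are hypothetical, and a plain dict plus a lock stands in for the Rust `BTreeMap<i32, DbInstance>`.

```python
import itertools
import threading


class HandleRegistry:
    """Hypothetical process-global map from integer handle to an open database."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = itertools.count(0)  # monotonically increasing handle source
        self._open = {}

    def open(self, db):
        """Register a database object and return its integer handle."""
        with self._lock:
            handle = next(self._next_id)
            self._open[handle] = db
            return handle

    def get(self, handle):
        """Look up a database by handle; returns None if unknown or closed."""
        with self._lock:
            return self._open.get(handle)

    def close(self, handle):
        """Drop a handle; returns False if it was not open."""
        with self._lock:
            if handle not in self._open:
                return False
            del self._open[handle]
            return True
```

Because callers only ever hold an integer, a stale or mistyped handle results in a failed lookup rather than a dangling pointer, which is one plausible reason for choosing this design at an FFI boundary.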

### Node.js / N-API Binding

The Node.js binding is published as the npm package `cozo-node` (current version `0.7.6`) ([Source: cozo-lib-nodejs/package.json:5, 25]()). It uses `napi` v6 and is distributed as a prebuilt binary fetched via `node-pre-gyp` ([Source: cozo-lib-nodejs/package.json:7-20]()). The Rust side ([Source: cozo-lib-nodejs/src/lib.rs:1-40]()) converts `DataValue` rows and `NamedRows` results to JavaScript values using the `neon` crate, supporting paged results via a `next` field that holds the next batch of rows.

### Java JNI Binding

The Java binding uses `jni` and exposes a `CozoJavaBridge` class. Like the C binding, it stores open databases in a `BTreeMap<i32, DbInstance>` keyed by a process-global `AtomicI32` counter ([Source: cozo-lib-java/src/lib.rs:11-29]()). The native method `Java_org_cozodb_CozoJavaBridge_openDb` accepts `engine`, `path`, and `options` strings and returns a handle ID ([Source: cozo-lib-java/src/lib.rs:32-50]()).

### WebAssembly Binding

The WASM binding is intentionally minimal: a single `CozoDb` class is exported with `new()`, `run(script, params, immutable)`, `export_relations(data)`, and `import_relations(data)` methods ([Source: cozo-lib-wasm/src/lib.rs:26-50]()). The default constructor opens an in-memory database, because persistent storage is not available in the browser. The published module is built with `wasm-pack build --target web --release` and the environment variable `CARGO_PROFILE_RELEASE_LTO=fat` ([Source: cozo-lib-wasm/README.md:9-14]()). To run Cozo inside a web worker across browsers, build with `--target no-modules` instead of `--target web`.

### RocksDB C++ Binding

`cozorocks` is a thin Rust wrapper over the RocksDB C++ API using the `cxx` crate. The module re-exports `DbBuilder`, `RocksDb`, `Tx`, `TxBuilder`, `DbIter`, `IterBuilder`, `PinSlice`, and the `RocksDbStatus` family of FFI types ([Source: cozorocks/src/lib.rs:13-25]()). The bridge FFI ([Source: cozorocks/src/bridge/mod.rs:1-40]()) declares opaque C++ types such as `TxBridge`, `SstFileWriterBridge`, and a full `RocksDbStatus` struct mirroring RocksDB's native error code (kOk, kNotFound, kCorruption, kIOError, kTimedOut, kBusy, kAborted, etc.). `RocksDbStatus::is_ok` and `is_not_found` provide convenient predicates ([Source: cozorocks/src/bridge/mod.rs: end of file]()). Transaction support is exposed via `TxBuilder` and `Tx`, which forward calls such as `set_snapshot`, `get`, `put`, and `start` to the underlying C++ handle ([Source: cozorocks/src/bridge/tx.rs:34-60]()).
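
The role of the `is_ok` and `is_not_found` predicates can be illustrated with a small sketch. It is written in Python for brevity and is an assumption-laden model, not the `cozorocks` API: `StatusCode`, `Status`, `StorageError`, and `unwrap_get` are hypothetical names, and the enum values are arbitrary rather than RocksDB's numeric codes.

```python
from dataclasses import dataclass
from enum import Enum, auto


class StatusCode(Enum):
    # Names follow the codes listed above; numeric values here are arbitrary.
    OK = auto()
    NOT_FOUND = auto()
    CORRUPTION = auto()
    IO_ERROR = auto()
    TIMED_OUT = auto()
    BUSY = auto()
    ABORTED = auto()


@dataclass(frozen=True)
class Status:
    code: StatusCode
    message: str = ""

    def is_ok(self):
        return self.code is StatusCode.OK

    def is_not_found(self):
        return self.code is StatusCode.NOT_FOUND


class StorageError(Exception):
    """Raised for any status that is neither OK nor NOT_FOUND."""


def unwrap_get(status, value):
    """Turn a (status, value) pair from a point lookup into value, None, or an error."""
    if status.is_ok():
        return value
    if status.is_not_found():
        return None  # a missing key is an expected outcome, not a failure
    raise StorageError(f"{status.code.name}: {status.message}")
```

Separating "not found" from genuine failures lets a caller treat a missing key as an ordinary empty result while still surfacing corruption, I/O, or contention errors.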

## HTTP Server and REPL

`cozo-bin` produces an executable that combines a REPL, a client, and an HTTP server. The server ([Source: cozo-bin/src/server.rs:1-45]()) is built on `axum` and `tower-http`, exposing the following routes:

| Method | Path | Purpose |
|---|---|---|
| `POST` | `/text-query` | Run a CozoScript query; body `{"script": "...", "params": {}}` |
| `GET` | `/export/{relations}` | Export a comma-separated list of relations |
| `PUT` | `/import` | Import relations from a JSON body |
| `POST` | `/backup` | Write a backup to a path |
| `POST` | `/import-from-backup` | Restore specific relations from a backup |
| `GET (SSE)` | `/changes/{relation}` | Stream change events for a relation |
| `GET` | `/` | A minimal browser-based client |

Authentication is implemented with `AsyncRequireAuthorizationLayer` and supports a bearer token in the `auth` header or query parameter ([Source: cozo-bin/README.md: backend API section]()). The default listen address is `http://127.0.0.1:9070` ([Source: cozo-bin/README.md: query API section]()). The same source also documents that `import` and `import-from-backup` do **not** fire triggers, and recommends using parameterized queries if trigger activation is required.
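
As a rough illustration of the routes above, the following sketch builds (but does not send) a `/text-query` request and an `/export/{relations}` URL using only the Python standard library. The helper names are hypothetical. The request body shape, the default address, and the `auth` query parameter are taken from the description above and should be checked against the server README before use.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:9070"  # default listen address quoted above


def text_query_request(script, params=None, auth_token=None, base_url=BASE_URL):
    """Build a POST /text-query request without sending it."""
    url = base_url + "/text-query"
    if auth_token is not None:
        # Assumption: the token is passed as the `auth` query parameter.
        url += "?" + urllib.parse.urlencode({"auth": auth_token})
    body = json.dumps({"script": script, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def export_url(relations, base_url=BASE_URL):
    """URL for GET /export/{relations} with a comma-separated relation list."""
    joined = ",".join(urllib.parse.quote(name, safe="") for name in relations)
    return f"{base_url}/export/{joined}"


req = text_query_request("?[] <- [['hello', 'world', 'Cozo!']]")
# Sending is left to the caller, for example urllib.request.urlopen(req).
```

Keeping request construction separate from transport makes the payload easy to inspect and test without a running server.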

The REPL ([Source: cozo-bin/src/repl.rs:1-50]()) supports meta commands such as `%set`, `%unset`, `%clear`, `%params`, `%run`, `%import`, `%save`, `%backup`, and `%restore` ([Source: cozo-bin/README.md: REPL section]()). The client mode ([Source: cozo-bin/src/client.rs:1-10]()) shares the same underlying `DbInstance` API and is useful for batch scripts or remote invocation.

## Advanced Search Features

Recent releases add search primitives to the Datalog surface itself, so the bindings above automatically inherit them through `run_query`/`run_script_str`.

### HNSW Vector Search (v0.6+)

HNSW (hierarchical navigable small world) indices can be created on relations that contain vectors. Multiple HNSW indices can be attached to the same relation, each governed by a filter that selects which rows and which vectors are indexed ([Source: README.md: HNSW section]()). Vector search participates in Datalog unification: a query vector, whether given explicitly or taken from another relation, acts as the pivot that joins against an indexed relation within a single query.
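
HNSW returns approximate nearest neighbours. When validating results from any such index, it can help to compare against the exact search it approximates. The sketch below is that brute-force baseline in Python, not HNSW itself and not Cozo's implementation; `exact_knn`, its `keep` predicate (loosely mirroring the per-index filter described above), and the sample data are all hypothetical.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(rows, query, k, keep=lambda row: True):
    """Return the k rows nearest to `query` as (distance, key) pairs.

    rows: iterable of (key, vector) pairs.
    keep: predicate deciding which rows are searchable at all.
    """
    candidates = (
        (l2_distance(vec, query), key)
        for key, vec in rows
        if keep((key, vec))
    )
    return heapq.nsmallest(k, candidates)


docs = [("a", [0.0, 0.0]), ("b", [1.0, 0.0]), ("c", [0.0, 3.0]), ("d", [5.0, 5.0])]
nearest = exact_knn(docs, query=[0.9, 0.1], k=2, keep=lambda row: row[0] != "d")
```

The exact search scans every row, so its cost grows linearly with the relation size; an HNSW index trades a small amount of recall for much faster lookups on large collections.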

Community issue #238 ("Cozodb as a vector store for llama index or langchain") asks about exactly this use case: Cozo can already serve as a vector backend through its standard query API.

### MinHash-LSH Near-Duplicate Search (v0.7+)

MinHash-LSH is exposed inside Datalog for detecting near-duplicates. Like HNSW, it is integrated with the rest of the query engine, so MinHash-based similarity can be combined with relational filters, joins, and recursive rules.
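
For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of MinHash signatures and LSH banding in Python. It illustrates the general algorithm only and makes no claim about how Cozo implements or parameterizes it; the function names, shingle size, number of hash functions, and band count are all assumptions chosen for the example.

```python
import hashlib


def shingles(text, size=3):
    """Set of overlapping word n-grams used as the document's features."""
    words = text.lower().split()
    if len(words) < size:
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def _hash(seed, item):
    """One member of a family of 64-bit hash functions, selected by `seed`."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_perm=128):
    """For each seeded hash function, keep the minimum hash over the set."""
    return [min(_hash(seed, item) for item in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def lsh_bands(signature, bands=32):
    """Split a signature into bands; sets sharing any band become candidates."""
    rows = len(signature) // bands
    return [tuple(signature[i * rows:(i + 1) * rows]) for i in range(bands)]
```

Two documents become near-duplicate candidates when at least one band of their signatures is identical, which avoids comparing every pair of documents directly.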

### Full-Text Search and JSON Values (v0.7+)

Version 0.7 also introduced full-text search and JSON value support ([Source: README.md: New versions section]()). Because these are core query features, every binding (C, Node.js, Java, WASM) can use them once it is built against v0.7 or later of the core engine.

## Community Topics and Caveats

- **Vector store integrations (issue #238):** Cozo's HNSW support is sufficient to act as a backend for LLM frameworks. Integrators typically call `cozo_run_query` (or the equivalent in their binding) with a parametrized vector query.
- **Binary / quantized embeddings (issue #256):** Vector storage is currently f32-based. Reduced-precision formats would require core engine changes, since the `cozorocks` storage layer preserves whatever the engine serializes.
- **CRDT and cross-device sync (issues #240, #252):** No CRDT layer ships in this repository. The standalone server, the export/import endpoints, and the per-relation SSE stream (`/changes/{relation}`) provide the building blocks, but a full sync protocol is left to integrators.
- **Maintenance status (issue #301):** A one-year gap in commits was reported. The most recent release referenced on this page is v0.7.6; check the release page to confirm which prebuilt binaries are currently available for each binding.
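
An integrator building sync on top of the `/changes/{relation}` stream mentioned above would first need to read server-sent events. The sketch below parses the standard `text/event-stream` framing in Python. It is a minimal illustration: `parse_sse` is a hypothetical helper, and the event payloads in any real stream are whatever the server emits, which this page does not specify.

```python
def parse_sse(lines):
    """Yield one dict per event from an iterable of text/event-stream lines."""
    event, data = {}, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the accumulated event, if it has data.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one optional space after the colon is dropped
        if field == "data":
            data.append(value)  # multiple data lines are joined with newlines
        elif field in ("event", "id"):
            event[field] = value
```

The parser treats each `data` payload as an opaque string, so decoding it (for example as JSON) and applying it to a local replica remain the integrator's responsibility.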

## See Also

- [README.md](https://github.com/cozodb/cozo/blob/main/README.md) — project-level overview and feature highlights
- [cozo-bin/README.md](https://github.com/cozodb/cozo/blob/main/cozo-bin/README.md) — full server API reference
- [cozo-lib-c/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-c/README.md) — C ABI build and usage
- [cozo-lib-wasm/README.md](https://github.com/cozodb/cozo/blob/main/cozo-lib-wasm/README.md) — WASM build targets

---


## Pitfall Log

Project: cozodb/cozo

Summary: Found 14 structured pitfall items, none of them high or blocking. Top priority: installation risk, which requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/202

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/287

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/298

## 4. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/cozodb/cozo

## 5. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/307

## 6. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/289

## 7. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/306

## 8. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: packet_text.keyword_scan | https://github.com/cozodb/cozo

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/cozodb/cozo

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/cozodb/cozo

## 11. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/cozodb/cozo

## 12. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/cozodb/cozo/issues/291

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/cozodb/cozo

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/cozodb/cozo

