Doramagic Project Pack · Human Manual

tinysearch

🔍 Tiny, full-text search engine for static websites built with Rust and Wasm

Overview and Architecture

Related topics: Configuration with tinysearch.toml, Static Site Generator Integration, Rust Library API and Programmatic Usage

Purpose and Scope

tinysearch is a lightweight, fast, full-text search engine purpose-built for static websites. The project positions itself as a dependency-free alternative to heavier JavaScript search libraries such as lunr.js and elasticlunr (Source: README.md). It is implemented in Rust and compiled to WebAssembly (WASM) so that the entire search index and engine can run client-side in the browser, without a backend.

The core value proposition is size: a test index of approximately 40 posts produces a WASM payload of 99kB (49kB gzipped, 40kB brotli), which is smaller than the project's own demo image (Source: README.md). This makes tinysearch attractive to sites where JavaScript bundle weight matters.

The scope of the project covers three primary use cases:

Community interest in extending the engine has surfaced in feature requests covering: configurable fields via tinysearch.toml (resolved in v0.10.0), filterable numeric/boolean fields, and library API documentation (Sources: README.md, issue #183, issue #116).

System Architecture

The tinysearch architecture follows a build-time index generation and runtime browser-side query model. The pipeline consists of three stages:

  1. Index Build (offline): A static site generator produces a JSON array describing each page. The tinysearch CLI consumes that JSON and emits either a storage file (binary index) or a fully linked WASM module.
  2. Distribution: The WASM module is shipped as a static asset next to the site, typically inside a path such as static/wasm_output/.
  3. Runtime Query: A small JavaScript glue layer (provided by the project) loads the WASM module, calls a search function, and renders results.

The data flow is illustrated below:

flowchart LR
    A[Static Site<br>Content] --> B[SSG Template<br>e.g. Zola, Pelican]
    B --> C[JSON Index<br>index.json]
    C --> D[tinysearch CLI<br>-m storage | wasm]
    D --> E[Binary Storage<br>files]
    D --> F[WASM Module<br>+ JS glue]
    F --> G[Static Site<br>/wasm_output/]
    G --> H[Browser<br>search.js]
    H --> I[User Query<br>Results]

Under the hood, the engine is a Rust/WASM port of the Python code from the article "Writing a full-text search engine using Bloom filters". Internally it uses a Xor Filter, a space-efficient probabilistic data structure for fast set membership (Source: README.md). The library depends on xorf for the filter implementation, bincode for binary serialization, and serde for JSON deserialization (Source: src/lib.rs).
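The idea of one membership structure per document can be sketched in a few lines. This is a minimal illustration, not the engine's code: a HashSet stands in for the Xor filter (so it is exact, whereas a real filter admits rare false positives), and the names IndexedDoc, tokenize, build_index, and search are hypothetical local helpers rather than the tinysearch API.

```rust
use std::collections::HashSet;

/// One indexed document: its title plus the set of words it contains.
/// Assumption: a HashSet stands in for the per-document Xor filter.
struct IndexedDoc {
    title: String,
    words: HashSet<String>,
}

/// Lowercase the text and split it into alphanumeric word tokens.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Build one word set per document from (title, body) pairs.
fn build_index(docs: &[(&str, &str)]) -> Vec<IndexedDoc> {
    docs.iter()
        .map(|(title, body)| IndexedDoc {
            title: title.to_string(),
            words: tokenize(title).into_iter().chain(tokenize(body)).collect(),
        })
        .collect()
}

/// Rank documents by how many query words they contain; drop non-matching ones.
fn search<'a>(index: &'a [IndexedDoc], query: &str, limit: usize) -> Vec<&'a str> {
    let terms = tokenize(query);
    let mut scored: Vec<(usize, &str)> = index
        .iter()
        .map(|doc| {
            let hits = terms.iter().filter(|t| doc.words.contains(*t)).count();
            (hits, doc.title.as_str())
        })
        .filter(|(hits, _)| *hits > 0)
        .collect();
    // Highest score first; the stable sort keeps input order for ties.
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(limit).map(|(_, title)| title).collect()
}
```

Because membership is tested per whole token, a query such as "rus" finds nothing even when "rust" is indexed, which matches the whole-word behaviour the README describes.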

Core Components

Library API

src/lib.rs re-exports a public api module and provides a library-level entry point. The core types documented in the module doc-comment are:

  • BasicPost: a default post struct with title, url, optional body, and a HashMap for arbitrary metadata.
  • TinySearch: the engine, constructed via TinySearch::new() and optionally configured with a custom stopword list (.with_stopwords(...)).
  • SearchIndex: the built index, produced by search.build_index(&posts).

Two methods drive usage: build_index(&posts) to compile posts into filters, and search(&index, query, limit) to query the compiled index (Source: src/lib.rs). The library is marked experimental and the API may change (Source: README.md).

Command-Line Interface

The CLI exposes three operating modes referenced in the example READMEs:

| Mode | Purpose | Example invocation |
|---|---|---|
| storage | Build the binary index only | tinysearch -m storage -p ./output docs.json |
| search | Run a one-shot query against a storage dir | tinysearch -m search -S "rust" -N 3 ./output/storage |
| wasm | Emit a complete WASM module + JS glue | tinysearch --release -m wasm -p ./wasm_output docs.json |

(Source: examples/blog/README.md, examples/documentation/README.md, README.md)

The --optimize / -o flag enables wasm-opt compression from binaryen, which typically reduces output size by 20-30% (Source: README.md). Users on Windows or non-Rust environments can run the same workflow through nightly-built Docker images (Source: README.md).

Configuration via `tinysearch.toml`

Introduced in v0.10.0, a TOML configuration file lets users declare which JSON fields are indexed for full-text search, which are stored as display-only metadata, and which field provides the result URL (Source: README.md). Example schemas for blogs, documentation, and e-commerce catalogs are provided in the repository. When the file is absent, the default schema indexes title and body and uses url as the link field (Source: README.md).

Integration Patterns

Static Site Generators

Each supported SSG has a small "index template" that emits a JSON array of pages.

After the generator's build step (zola build or pelican content) produces the JSON, a typical call, shown here for Zola, is:

tinysearch --optimize --path static public/tinysearch.json/index.html

(Source: examples/zola/README.md)

Library Use from Rust

Custom post types can be indexed without running the executable. The advanced example defines a BlogPost struct, constructs a TinySearch with custom stopwords ("the", "with"), and runs a series of queries over the resulting SearchIndex (Source: examples/library_advanced/main.rs). This addresses the long-standing community request for a public Rust API (Source: issue #183).

Sample Data Layout

The examples folder ships ready-to-use JSON samples illustrating the schema flexibility: blog posts with tags and authors (examples/blog/posts.json), documentation pages with versioning and difficulty (examples/documentation/docs.json), and product catalogs with prices and ratings (examples/ecommerce/products.json). A minimal three-post sample lives at examples/index.json.

Known Operational Considerations

  • WASM hosting: Production deployments must serve .wasm with the correct MIME type (application/wasm); gzipped content must be advertised via Content-Encoding (Source: README.md).
  • Browser WASM loading: Changes to how browsers fetch and instantiate WASM have caused some glue-script implementations to break, particularly when cross-origin isolation or streaming compilation is involved (Source: issue #175).
  • CJK text: The engine indexes whole tokens; queries against Chinese or Japanese text may miss matches when the input is split in ways the tokenizer does not understand (Source: issue #179).
  • Build directory output: The default invocation emits several files alongside the WASM binary; community members have requested a flag to copy only the final .wasm (Source: issue #169).
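The CJK caveat in the list above can be illustrated with a small sketch. It assumes a tokenizer that splits on whitespace only, which is an assumption made for illustration; the engine's real tokenizer may behave differently. The helper names whitespace_tokens and matches_whole_word are hypothetical.

```rust
/// Split text on whitespace, the way a simple whole-word tokenizer would.
/// Assumption for this sketch; the engine's real tokenizer may differ.
fn whitespace_tokens(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

/// A whole-word match succeeds only if the query equals one of the tokens.
fn matches_whole_word(text: &str, query: &str) -> bool {
    whitespace_tokens(text).iter().any(|token| *token == query)
}
```

Under that assumption, an unsegmented sentence such as "我喜欢搜索引擎" is a single token, so a query for "搜索" cannot equal it. Pre-segmenting the text as "我 喜欢 搜索 引擎" before indexing gives the query a token to match.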

See Also

  • Configuration Reference (schema options in tinysearch.toml)
  • Library API Guide (BasicPost, TinySearch, SearchIndex)
  • Static Site Generator integration recipes (Jekyll, Hugo, Zola, Cobalt, Pelican)
  • WebAssembly deployment checklist (MIME types, hosting)

Source: https://github.com/tinysearch/tinysearch / Human Manual

Configuration with tinysearch.toml

Related topics: Overview and Architecture, Static Site Generator Integration

Overview

The tinysearch.toml configuration file lets users customize which JSON fields are indexed for full-text search versus which fields are stored as metadata for display. It was introduced in v0.10.0 (PR #181) to address a long-standing roadmap request: making it possible to add arbitrary fields such as product images, descriptions, dates, authors, and categories without modifying the engine itself (Issue #116).

The file is optional. When absent, tinysearch falls back to a default schema that indexes title and body, with url as the URL field. Source: README.md, "tinysearch will use the default schema (indexing title and body fields with url as the URL field)."

Schema Structure

A tinysearch.toml file contains a single [schema] table with three keys:

[schema]
indexed_fields  = ["title", "content", "tags"]
metadata_fields = ["author", "date", "category"]
url_field       = "permalink"

| Key | Purpose | Required |
|---|---|---|
| indexed_fields | List of JSON fields whose tokenized text is placed into the search index. | No (defaults to ["title", "body"]) |
| metadata_fields | List of fields passed through verbatim to the output and returned with each search hit. | No (defaults to empty) |
| url_field | The single field whose value becomes the clickable link of a result. | No (defaults to "url") |

Sources: README.md (example e-commerce configuration); examples/blog/tinysearch.toml and examples/ecommerce/tinysearch.toml, which are among the three concrete configurations that ship with the repository.

How the engine consumes the file

The tinysearch binary looks for tinysearch.toml in the working directory by default, and can be pointed at an alternate path with the --config CLI flag. It then walks the input JSON array and:

  1. Concatenates the string values of all indexed_fields for each document and feeds them into the tokenizer that builds the Xor filter.
  2. Copies each entry in metadata_fields into the output struct so the WASM module can return it alongside the URL when a query matches.
  3. Reads the value of url_field and uses it as the link target.
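The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: Schema and Prepared are hypothetical structs that mirror the [schema] table, records are simplified to flat string maps, and fields are joined with a single space.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory mirror of the [schema] table.
struct Schema {
    indexed_fields: Vec<String>,
    metadata_fields: Vec<String>,
    url_field: String,
}

impl Default for Schema {
    /// The documented fallback: index title and body, use url as the link.
    fn default() -> Self {
        Schema {
            indexed_fields: vec!["title".to_string(), "body".to_string()],
            metadata_fields: Vec::new(),
            url_field: "url".to_string(),
        }
    }
}

/// What one input record turns into after the schema is applied.
struct Prepared {
    searchable_text: String,
    metadata: HashMap<String, String>,
    url: Option<String>,
}

/// Apply the three steps to one flat record of string fields.
fn prepare(record: &HashMap<String, String>, schema: &Schema) -> Prepared {
    // Step 1: concatenate the values of all indexed fields that are present.
    let searchable_text = schema
        .indexed_fields
        .iter()
        .filter_map(|f| record.get(f).map(String::as_str))
        .collect::<Vec<_>>()
        .join(" ");
    // Step 2: copy metadata fields through unchanged.
    let metadata: HashMap<String, String> = schema
        .metadata_fields
        .iter()
        .filter_map(|f| record.get(f).map(|v| (f.clone(), v.clone())))
        .collect();
    // Step 3: look up the link target.
    let url = record.get(&schema.url_field).cloned();
    Prepared { searchable_text, metadata, url }
}
```

With Schema::default(), a record containing title, body, url, and author yields searchable text built from title and body only; author is dropped unless it is listed in metadata_fields.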

Source: src/lib.rs. The library exposes TinySearch and SearchIndex types and is built on top of the xorf crate, which is consistent with the indexed text ending up in the filter that backs every query.

Real-World Schemas from the Bundled Examples

The examples/ directory ships three ready-made schemas that demonstrate how the same [schema] table adapts to very different content shapes.

flowchart LR
    A[JSON input file] --> B{Read tinysearch.toml}
    B -- indexed_fields --> C[Tokenizer + Xor filter]
    B -- metadata_fields --> D[Pass-through struct]
    B -- url_field --> E[Result links]
    C --> F[WASM / search output]
    D --> F
    E --> F

| Example | indexed_fields | metadata_fields | url_field |
|---|---|---|---|
| E-commerce | product_name, description, category, tags | price, image_url, brand, availability, rating, reviews_count | product_url |
| Blog | title, content, excerpt, tags | author, publish_date, category, reading_time, featured_image | permalink |
| Documentation | title, content, section, keywords | version, last_updated, contributor, difficulty, type | doc_url |

Sources: examples/ecommerce/README.md, examples/blog/README.md, examples/documentation/README.md, and the comparison table in examples/README.md.

These three configurations illustrate the typical patterns:

  • E-commerce combines human-readable text (description) with categorical data (category, tags) for matching, while pricing and availability are surfaced as metadata. This pattern directly answers the "is there a way to return the page description or body in the results?" question in Issue #159.
  • Blog adds excerpt so the body field remains optional and snippet-friendly, addressing part of the roadmap in Issue #116.
  • Documentation uses section and keywords to bias results toward navigational structure, with version and difficulty available for filtered UI.

Usage Patterns and Common Pitfalls

Pointing the CLI at the right config

Running tinysearch from the example directory picks up the adjacent tinysearch.toml automatically. When integrating with a static site generator, the file is usually placed at the project root; the Zola guide, for instance, assumes the file is present at the workspace level so all tinysearch invocations resolve the same schema. Source: examples/zola/README.md, "Run tinysearch: tinysearch --optimize --path static …".

Backward compatibility

Projects that upgraded from v0.8.x or v0.9.x relied on hard-coded title, body, and url keys. Because v0.10.0 keeps the same default schema, legacy JSON input files continue to work without modification. The CI error reported in Issue #182 (failed to select a version for the requirement tinysearch = "^0.9.0") is unrelated to the config file but illustrates why pinning to v0.10.0 is important when adopting the new schema.

Library usage and custom fields

The tinysearch crate can also be driven programmatically, and any custom field exposed via the BasicPost.meta HashMap is preserved through the same code path the TOML file controls at the CLI level. Source: README.md โ€” *"Add tinysearch to your Cargo.toml: cargo add tinysearch"*. This addresses the request in Issue #183 for a public API usable from Rust without invoking the binary.

Common failure modes

  1. Mismatched field names: the schema is case-sensitive, so a field named Title in the TOML does not match a JSON key named title and will silently produce an empty index. Verify by running the storage mode and inspecting the generated index.html or by performing a sanity check with tinysearch -m search -S "..." ./output/storage.
  2. CORS / MIME issues at the WASM layer: the schema is processed before WASM emission, but the resulting module still must be served with application/wasm. See the "Changes in the way browsers work with wasm" discussion in Issue #175.
  3. Non-Latin tokenization: because the engine matches whole words, partial Chinese substrings may miss, as reported in Issue #179. Splitting indexed text into whitespace-delimited tokens in the input JSON mitigates this regardless of schema.
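The first failure mode in the list above is cheap to catch before building. The following is a hedged sketch of a pre-flight check, assuming records have already been loaded into flat string maps; missing_fields is a hypothetical helper, not part of tinysearch.

```rust
use std::collections::HashMap;

/// Return every schema field name that is missing from the record.
/// Comparison is exact, so "Title" does not match a key named "title".
fn missing_fields<'a>(record: &HashMap<String, String>, wanted: &[&'a str]) -> Vec<&'a str> {
    wanted
        .iter()
        .copied()
        .filter(|name| !record.contains_key(*name))
        .collect()
}
```

Running such a check over the first record of the input and failing the build when the result is non-empty turns a silently empty index into an explicit error.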

Static Site Generator Integration

Related topics: Overview and Architecture, Configuration with tinysearch.toml

Purpose and Scope

tinysearch is a lightweight, fast, full-text search engine written in Rust and compiled to WebAssembly. It is specifically designed for static websites, making Static Site Generator (SSG) integration its primary use case. The README states: "It can be used together with static site generators such as Jekyll, Hugo, Zola, Cobalt, or Pelican." (README.md)

Integration with an SSG follows a two-phase model: the SSG generates a JSON index from site content, and tinysearch consumes that JSON to produce a WASM payload that runs entirely client-side in the browser. The approach is generator-agnostic: any SSG capable of emitting a flat JSON array of {title, body, url, ...} records can be wired in.

The v0.10.0 release further strengthened integration by introducing a tinysearch.toml configuration file for declaring indexed vs. metadata fields, and by exposing tinysearch as a Rust library for programmatic index construction (README.md).

Integration Architecture

The end-to-end pipeline can be visualised as a data flow from source markdown content to a browser-resident search engine:

flowchart LR
    A[Markdown Pages<br/>in SSG] --> B[SSG Template<br/>Renders JSON]
    B --> C[posts.json<br/>index.json]
    C --> D[tinysearch CLI<br/>or Library]
    D --> E[storage/<br/>binary index]
    E --> F[WASM Module<br/>+ JS glue]
    F --> G[Browser<br/>Client-Side Search]

    style D fill:#f9f,stroke:#333
    style F fill:#bbf,stroke:#333

Each post is internally converted into an Xor Filter, "a datastructure for fast approximation of set membership that is smaller than bloom and cuckoo filters", and then serialised with bincode into a single binary blob that ships with the site (README.md). The library entry point in src/lib.rs exposes TinySearch::new() and build_index(...) so the same pipeline can be driven from Rust code instead of the CLI (src/lib.rs).

Generator-Specific Integration Patterns

Zola (Tera templates)

The Zola example iterates over section.pages in a Tera template, skips drafts, strips HTML tags, and emits a JSON array. A key gotcha, discussed in community issues #169 and #178, is that Zola emits public/tinysearch.json/index.html rather than tinysearch.json, so the CLI invocation is:

tinysearch --optimize --path static public/tinysearch.json/index.html

Source: examples/zola/README.md

The template uses json_encode and explicit replace filters to escape braces, quotes, and backslashes that survive striptags. Zola now also supports a native fuse_json index format that is conceptually compatible with tinysearch, though this route has not yet been officially adopted (see community discussion #178).

Pelican (Jinja templates)

Pelican uses a near-identical Jinja template. The pattern is the same: loop over articles, filter article.status != "draft", and emit records with article.title, article.url, and article.content (examples/pelican/README.md). The template is wired to a static page via frontmatter (Template: json) so pelican content writes the JSON to output/pages/json.html, which is then fed to tinysearch --optimize --path output output/pages/json.html.
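For a generator without a template language, the same loop (skip drafts, escape, emit records) can be written directly. The sketch below is an illustration only: the Page struct and the json_escape and render_index helpers are hypothetical, HTML tag stripping is left out, and in a real project a JSON library would normally replace the hand-written escaping.

```rust
/// A page as the site generator might see it (hypothetical shape).
struct Page {
    title: String,
    url: String,
    body: String,
    draft: bool,
}

/// Escape a string for use inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Emit the JSON array of {title, url, body} records, skipping drafts.
fn render_index(pages: &[Page]) -> String {
    let records: Vec<String> = pages
        .iter()
        .filter(|p| !p.draft)
        .map(|p| {
            format!(
                "{{\"title\":\"{}\",\"url\":\"{}\",\"body\":\"{}\"}}",
                json_escape(&p.title),
                json_escape(&p.url),
                json_escape(&p.body)
            )
        })
        .collect();
    format!("[{}]", records.join(","))
}
```

The resulting string can be written to a file and passed to the CLI in place of the template-generated JSON.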

Hugo, Jekyll, and Cobalt

The main README explicitly lists Hugo and Jekyll as supported, and the examples/ directory demonstrates the same JSON-as-bridge pattern. Hugo users can rely on its native JSON output formats, while Jekyll users can use Liquid templates to iterate over site.posts and emit records. For any homegrown SSG, community issue #183 raised the request for a Rust library API, a need fulfilled in v0.10.0 with TinySearch and BasicPost (src/lib.rs).

Configuration with `tinysearch.toml` (v0.10.0)

Prior to v0.10.0, tinysearch assumed a fixed schema of title, body, and url. The new configuration file, introduced in PR #181, lets users declare arbitrary schemas. The README example shows a documentation site with four indexed fields and four metadata fields:

[schema]
indexed_fields = ["title", "content", "section", "keywords"]
metadata_fields = ["version", "last_updated", "contributor", "difficulty"]
url_field = "doc_url"

Source: README.md

This schema is reflected in the documentation example JSON, which extends the basic {title, body, url} shape with section, keywords, doc_url, version, last_updated, contributor, difficulty, and type (examples/documentation/docs.json). The blog example adds excerpt, tags, permalink, author, publish_date, category, reading_time, and featured_image (examples/blog/posts.json).

When no tinysearch.toml is present, tinysearch falls back to the default schema (title and body indexed, url as link).

Build Commands and Optimisation

The CLI exposes three modes relevant to SSG workflows: storage (build the binary index), wasm (build the WASM bundle), and search (run a query against a built index). Common invocation patterns from the README and examples:

# Dev build with demo HTML
tinysearch -m wasm -p wasm_output posts.json

# Production build, no demo
tinysearch --release -m wasm -p wasm_output posts.json

# Optimised WASM (requires binaryen's wasm-opt)
tinysearch --release -o -m wasm -p wasm_output posts.json

Source: README.md

The --optimize flag invokes wasm-opt, typically shrinking the payload by 20-30% (mentioned in the docs example under Performance Optimization). Production deployment requires the web server to serve .wasm with the application/wasm MIME type, a common source of the "Demo broken" error reported in issue #177.
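The MIME requirement matters because browsers refuse streaming compilation (WebAssembly.instantiateStreaming) when the response is not labelled application/wasm. For a hand-rolled static file server, the mapping might look like the sketch below; content_type_for is a hypothetical helper and the list of extensions is illustrative, not exhaustive.

```rust
use std::path::Path;

/// Pick a Content-Type header value from a file extension.
/// Only a few entries relevant to a WASM search deployment are listed.
fn content_type_for(path: &str) -> &'static str {
    match Path::new(path).extension().and_then(|ext| ext.to_str()) {
        Some("wasm") => "application/wasm",
        Some("js") => "text/javascript",
        Some("json") => "application/json",
        Some("html") => "text/html; charset=utf-8",
        // Unknown or missing extension: fall back to a generic binary type.
        _ => "application/octet-stream",
    }
}
```

Most production web servers and CDNs already ship this mapping; the check is mainly useful for custom development servers.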

Library API for Programmatic Integration

For SSG authors who prefer not to shell out to the CLI, the library API introduced in PR #184 exposes the same pipeline. The advanced example demonstrates a custom BlogPost struct with title, slug, content, tags, and author, configured via .with_stopwords(...) (examples/library_advanced/main.rs):

let search = TinySearch::new().with_stopwords(vec!["the".to_string(), "with".to_string()]);
let index = search.build_index(&blog_posts)?;
let results = search.search(&index, "rust", 10);

Source: src/lib.rs

This addresses the request in issue #183 for a way to drive tinysearch from Rust without invoking an executable.

Known Limitations and Community-Reported Issues

Several recurring limitations surface in community discussions and constrain SSG integration choices:

| Limitation | Source | Impact on SSGs |
|---|---|---|
| Whole-word search only (no prefix or suggestion matching) | README.md | Chinese and other non-space-separated languages see partial matches (issue #179) |
| Recommended size: small to medium sites (~2 kB/article uncompressed) | README.md | Large doc sets may exceed browser memory |
| Search relevance can feel non-deterministic | #120 | May require query expansion in template |
| Metadata such as descriptions and images is not returned in result objects | #159 | UI must fetch page separately |
| No native keyword highlighting | #119 | Must be implemented in JS glue layer |
| 7 files emitted by default into --path | #169 | Staging step required to copy only the WASM file |

The library API and tinysearch.toml schema in v0.10.0 address some of these gaps, most directly the metadata limitation (#159), by giving SSG authors control over what gets indexed and what is returned with each hit. Whole-word matching remains a limitation of the engine itself.

See Also

  • Configuration Reference: full tinysearch.toml schema documentation
  • Library API: programmatic use of TinySearch from Rust
  • WebAssembly Output Format: what the browser actually loads
  • Performance and Size Optimisation: --optimize and brotli/gzip trade-offs

Rust Library API and Programmatic Usage

Related topics: Overview and Architecture, Configuration with tinysearch.toml

Overview

tinysearch began life as a standalone command-line tool that ingests a JSON index file and emits a WebAssembly (WASM) blob for browser-side search. Starting with release v0.10.0, the crate can also be consumed directly from Rust, letting developers build and query search indexes in-process without shelling out to the tinysearch binary. This was added in response to a long-standing community request (see issue #183) and is documented as the "Library Usage (Experimental)" section of the README.md.

The library surface is intentionally small. It exposes one trait (Post), one ready-made struct (BasicPost), one engine (TinySearch), and one index type alias (SearchIndex). The design goal (and the reason the project remains a few tens of kilobytes after compilation) is that the same Xor-Filter-based representation used on the wire is also the in-memory representation in the library. Source: src/lib.rs.

Core Types and the `Post` Trait

The library is built around a single trait, Post, defined in src/lib.rs. Anything the engine can index must implement it:

| Method | Signature | Purpose |
|---|---|---|
| title | fn title(&self) -> &str | Required. The display title of the document. |
| url | fn url(&self) -> &str | Required. The link target returned with each hit. |
| body | fn body(&self) -> Option<&str> | Optional. The main searchable text. |
| meta | fn meta(&self) -> HashMap<String, String> | Optional. Extra fields stored alongside the hit (e.g., author, category). |

BasicPost is the only concrete type shipped by the crate; it is a plain owned-struct that satisfies Post for callers who do not want to write their own implementation. Source: src/lib.rs.

The TinySearch engine is constructed with TinySearch::new() and configured fluently. The crate-level rustdoc shows the minimal path:

use tinysearch::{BasicPost, TinySearch, SearchIndex};
use std::collections::HashMap;

let posts = vec![
    BasicPost {
        title: "First Post".to_string(),
        url: "/first".to_string(),
        body: Some("This is the first post content".to_string()),
        meta: HashMap::new(),
    },
];

let search = TinySearch::new();
let index: SearchIndex = search.build_index(&posts).expect("Failed to build index");
let results = search.search(&index, "rust", 10);

Source: src/lib.rs.

Basic Library Usage

The examples/library_basic/ example in the repository demonstrates the shortest possible integration. A user supplies a Vec of BasicPost, calls build_index, and then calls search(&index, query, limit) to obtain a vector of SearchResult values. The result type carries the title, the URL, and the metadata map that was passed in, so consumers can render hits without consulting the original document. Source: examples/library_basic/main.rs and README.md.

For projects that already have JSON in the standard tinysearch shape (title, url, optional body, optional meta), the API offers a convenience parser so callers do not have to deserialize by hand. The signature is documented in the rustdoc of src/api.rs:

let json = r#"[
  {
    "title": "My Post",
    "url": "/my-post",
    "body": "Post content goes here",
    "meta": {"category": "programming", "author": "John"}
  }
]"#;

let search = TinySearch::new();
let posts = search.parse_posts(json).expect("parse error");
let index = search.build_index(&posts)?;
let results = search.search(&index, "post", 10);

Source: src/api.rs. The parse_posts helper lowers the friction for users migrating from the CLI workflow to the library workflow.

Advanced Usage with Custom Post Types

Most real sites do not have documents that look like BasicPost. The examples/library_advanced/ example defines a domain struct (BlogPost) and implements Post for it. The pattern is straightforward: map the four trait methods to whatever fields your content store exposes. Source: examples/library_advanced/main.rs.

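use std::collections::HashMap;
// Post is described above as defined in src/lib.rs, so it is imported from the crate root here.
use tinysearch::Post;

struct BlogPost {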
    title: String,
    slug: String,
    content: String,
    tags: Vec<String>,
    author: String,
}

impl Post for BlogPost {
    fn title(&self) -> &str { &self.title }
    fn url(&self) -> &str { &self.slug }
    fn body(&self) -> Option<&str> { Some(&self.content) }
    fn meta(&self) -> HashMap<String, String> {
        let mut meta = HashMap::new();
        meta.insert("author".to_string(), self.author.clone());
        meta.insert("tags".to_string(), self.tags.join(", "));
        meta
    }
}

This is the integration point that community issue #183 was specifically asking for: a way to call tinysearch from inside a home-grown static site generator without spawning the executable. The meta map is what enables richer result rendering (e.g., showing an author or a thumbnail), which is the same use case behind issue #119 ("Highlight matched keywords in search results") and issue #159 ("Is there a way to return the page description or body in the results?"). Source: examples/library_advanced/main.rs.

A Yew (WebAssembly frontend) integration example lives in examples/yew-example-crate/. It shows the same trait being implemented inside a frontend crate, which is the most direct way to keep a single content schema across build-time index generation and runtime search. Source: examples/yew-example-crate/src/main.rs.

Configuration Options

The library exposes a small but useful configuration surface, all on the TinySearch builder:

| Method | Effect | Source |
|---|---|---|
| TinySearch::new() | Construct a default engine. | src/api.rs |
| .with_stopwords(words) | Override the default stopword list with a custom collection. | src/api.rs |
| search(&index, query, limit) | Run a query and cap the number of returned hits. | src/lib.rs |
| build_index(&posts) | Convert any iterable of &impl Post into a SearchIndex. | src/lib.rs |
| parse_posts(json) | Parse the canonical CLI JSON shape into Vec<BasicPost>. | src/api.rs |

Stopword tuning is the most common customization in practice. The default list is replaced wholesale by the words you pass, which is what the advanced example relies on to drop common English words that would otherwise bloat the index. Source: src/api.rs and examples/library_advanced/main.rs.
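The "replaced wholesale" behaviour is worth seeing concretely, because it means a custom list must repeat any default words that should stay excluded. The sketch below uses a local stand-in type: Tokenizer, its three-word default list, and its tokenization rule are assumptions for illustration and do not reproduce the crate's internals. Only the method name with_stopwords mirrors the documented builder.

```rust
use std::collections::HashSet;

/// Hypothetical default list, used only when the caller supplies nothing.
const DEFAULT_STOPWORDS: [&str; 3] = ["a", "and", "the"];

/// Stand-in tokenizer with a replaceable stopword set.
struct Tokenizer {
    stopwords: HashSet<String>,
}

impl Tokenizer {
    /// Start with the default stopword list.
    fn new() -> Self {
        Tokenizer {
            stopwords: DEFAULT_STOPWORDS.iter().map(|w| w.to_string()).collect(),
        }
    }

    /// Replace the stopword set wholesale; the defaults are discarded, not merged.
    fn with_stopwords(mut self, words: Vec<String>) -> Self {
        self.stopwords = words.into_iter().map(|w| w.to_lowercase()).collect();
        self
    }

    /// Lowercase, split on non-alphanumerics, and drop stopwords.
    fn tokens(&self, text: &str) -> Vec<String> {
        text.split(|c: char| !c.is_alphanumeric())
            .filter(|w| !w.is_empty())
            .map(|w| w.to_lowercase())
            .filter(|w| !self.stopwords.contains(w))
            .collect()
    }
}
```

In this sketch, passing only "the" and "with" lets "and" back into the token stream, since it was excluded solely by the discarded default list.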

A separate SearchIndex type alias, re-exported by the library, represents the fully built, serializable form of the index. The examples/search_index_type.rs example demonstrates how to materialize an index and feed it into a custom sink (for example, a non-WASM target such as a CLI or a server). Source: examples/search_index_type.rs.

End-to-End Data Flow

The library path mirrors the CLI path; the only thing that changes is who drives each step.

flowchart LR
    A["Source documents<br/>(any Rust type)"] -->|impl Post| B["Vec&lt;BasicPost&gt;<br/>or user struct"]
    B --> C["TinySearch::new()"]
    C -->|build_index| D["SearchIndex<br/>(Xor Filters + bincode)"]
    D -->|search| E["Vec&lt;SearchResult&gt;<br/>(title, url, meta)"]
    C -.optional.-> F["with_stopwords"]
    C -.optional.-> G["parse_posts<br/>(from JSON)"]

Source: src/lib.rs, src/api.rs, and README.md.

Limitations and Caveats

The README is explicit that the library surface is experimental and may change. Source: README.md. Inherited engine limitations also apply: only full-word matches are supported (no prefix or fuzzy search), and the index for an entire site must fit in memory because it is one contiguous blob. The roadmap issue #116 tracks adding richer query features (filters, booleans, optional body field) that, once shipped, will land on the library API as well. Source: README.md.

For consumers who need WASM as the final artifact, the library path composes with the CLI path: build the index in Rust, serialize it with the same bincode format the CLI uses, and either embed it directly in a frontend crate (see the Yew example) or write it to disk and run tinysearch --optimize on it. Source: examples/yew-example-crate/src/main.rs and README.md.

See Also

  • CLI Usage and JSON Index Format
  • Static Site Generator Integration (Hugo, Zola, Pelican, Jekyll)
  • tinysearch.toml Configuration
  • WASM Deployment and MIME Types

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 16 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: identity.distribution | https://github.com/tinysearch/tinysearch

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/169

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/182

4. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/177

5. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/174

6. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/173

7. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/116

8. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/170

9. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/tinysearch/tinysearch

10. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/175

11. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/tinysearch/tinysearch

12. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/tinysearch/tinysearch

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using tinysearch with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence