Doramagic Project Pack · Human Manual
tinysearch
Tiny, full-text search engine for static websites built with Rust and Wasm
Overview and Architecture
Related topics: Configuration with tinysearch.toml, Static Site Generator Integration, Rust Library API and Programmatic Usage
Purpose and Scope
tinysearch is a lightweight, fast, full-text search engine purpose-built for static websites. The project positions itself as a dependency-free alternative to heavier JavaScript search libraries such as lunr.js and elasticlunr (Source: README.md). It is implemented in Rust and compiled to WebAssembly (WASM) so that the entire search index and engine can run client-side in the browser, without a backend.
The core value proposition is size: a test index of approximately 40 posts produces a WASM payload of 99kB (49kB gzipped, 40kB brotli), which is smaller than the project's own demo image (Source: README.md). This makes tinysearch attractive to sites where JavaScript bundle weight matters.
The scope of the project covers three primary use cases:
- Static site search integrated with generators such as Jekyll, Hugo, Zola, Cobalt, and Pelican (Source: README.md).
- Documentation and blog search through configurable schemas (Source: examples/documentation/README.md, examples/blog/README.md).
- Programmatic library usage from Rust code, introduced as an experimental API in v0.10.0 (Source: src/lib.rs).
Community interest in extending the engine has surfaced in feature requests covering: configurable fields via tinysearch.toml (resolved in v0.10.0), filterable numeric/boolean fields, and library API documentation (Sources: README.md, issue #183, issue #116).
System Architecture
The tinysearch architecture follows a build-time index generation and runtime browser-side query model. The pipeline consists of three stages:
- Index Build (offline): A static site generator produces a JSON array describing each page. The `tinysearch` CLI consumes that JSON and emits either a `storage` file (binary index) or a fully linked WASM module.
- Distribution: The WASM module is shipped as a static asset next to the site, typically inside a path such as `static/wasm_output/`.
- Runtime Query: A small JavaScript glue layer (provided by the project) loads the WASM module, calls a search function, and renders results.
The data flow is illustrated below:
flowchart LR
A[Static Site<br>Content] --> B[SSG Template<br>e.g. Zola, Pelican]
B --> C[JSON Index<br>index.json]
C --> D[tinysearch CLI<br>-m storage | wasm]
D --> E[Binary Storage<br>files]
D --> F[WASM Module<br>+ JS glue]
F --> G[Static Site<br>/wasm_output/]
G --> H[Browser<br>search.js]
H --> I[User Query<br>Results]
Under the hood, the engine is a Rust/WASM port of the Python code from the article "Writing a full-text search engine using Bloom filters". Internally it uses an Xor Filter, a space-efficient probabilistic data structure for fast set membership (Source: README.md). The library depends on xorf for the filter implementation, bincode for binary serialization, and serde for JSON deserialization (Source: src/lib.rs).
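The per-page membership idea can be sketched in a few lines. This is a simplified illustration, not the engine's code: a `HashSet` stands in for the Xor filter (so it is exact rather than probabilistic), and the names `PostFilter`, `tokenize`, `build_filters`, and `search`, as well as the "count matching query words" scoring, are assumptions made for the example.

```rust
use std::collections::HashSet;

/// One entry per page: its URL plus the set of words it contains.
/// The `HashSet` is a stand-in for the Xor filter used by the real engine.
pub struct PostFilter {
    pub url: String,
    pub words: HashSet<String>,
}

/// Lowercase the text and split it into alphanumeric words.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Build one membership set per (url, text) pair.
pub fn build_filters(pages: &[(&str, &str)]) -> Vec<PostFilter> {
    pages
        .iter()
        .map(|(url, text)| PostFilter {
            url: url.to_string(),
            words: tokenize(text).into_iter().collect(),
        })
        .collect()
}

/// Score each page by how many query words its set contains, best first.
pub fn search<'a>(filters: &'a [PostFilter], query: &str, limit: usize) -> Vec<&'a str> {
    let terms = tokenize(query);
    let mut hits: Vec<(usize, &str)> = filters
        .iter()
        .map(|f| {
            let score = terms.iter().filter(|t| f.words.contains(*t)).count();
            (score, f.url.as_str())
        })
        .filter(|(score, _)| *score > 0)
        .collect();
    // Stable sort: pages with equal scores keep their input order.
    hits.sort_by(|a, b| b.0.cmp(&a.0));
    hits.into_iter().take(limit).map(|(_, url)| url).collect()
}
```

Because membership is tested per whole word, a query such as `rus` finds nothing even when a page contains `rust`. A probabilistic filter would trade the exactness of the `HashSet` for a smaller footprint, at the cost of occasional false positives.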
Core Components
Library API
src/lib.rs re-exports a public api module and provides a library-level entry point. The core types documented in the module doc-comment are:
- `BasicPost`: a default post struct with `title`, `url`, optional `body`, and a `HashMap` for arbitrary metadata.
- `TinySearch`: the engine, constructed via `TinySearch::new()` and optionally configured with a custom stopword list (`.with_stopwords(...)`).
- `SearchIndex`: the built index, produced by `search.build_index(&posts)`.
Two methods drive usage: build_index(&posts) to compile posts into filters, and search(&index, query, limit) to query the compiled index (Source: src/lib.rs). The library is marked experimental and the API may change (Source: README.md).
Command-Line Interface
The CLI exposes three operating modes referenced in the example READMEs:
| Mode | Purpose | Example invocation |
|---|---|---|
| storage | Build the binary index only | tinysearch -m storage -p ./output docs.json |
| search | Run a one-shot query against a storage dir | tinysearch -m search -S "rust" -N 3 ./output/storage |
| wasm | Emit a complete WASM module + JS glue | tinysearch --release -m wasm -p ./wasm_output docs.json |
(Source: examples/blog/README.md, examples/documentation/README.md, README.md)
The --optimize / -o flag enables wasm-opt compression from binaryen, which typically reduces output size by 20 to 30% (Source: README.md). Users on Windows or non-Rust environments can run the same workflow through nightly-built Docker images (Source: README.md).
Configuration via `tinysearch.toml`
Introduced in v0.10.0, a TOML configuration file lets users declare which JSON fields are indexed for full-text search, which are stored as display-only metadata, and which field provides the result URL (Source: README.md). Example schemas for blogs, documentation, and e-commerce catalogs are provided in the repository. When the file is absent, the default schema indexes title and body and uses url as the link field (Source: README.md).
Integration Patterns
Static Site Generators
Each supported SSG has a small "index template" that emits a JSON array of pages:
- Zola uses a Tera template iterating `section.pages`, skipping drafts, and rendering `title`, `permalink`, and a sanitized `body` (Source: examples/zola/README.md).
- Pelican uses a Jinja template that mirrors the same structure with `tojson` filters (Source: examples/pelican/README.md).
- Blog and documentation sites follow the same pattern, with `tinysearch.toml` configuring richer schemas (Sources: examples/blog/README.md, examples/documentation/README.md).
After zola build / pelican content produces the JSON, the typical call is:
tinysearch --optimize --path static public/tinysearch.json/index.html
(Source: examples/zola/README.md)
Library Use from Rust
Custom post types can be indexed without running the executable. The advanced example defines a BlogPost struct, constructs a TinySearch with custom stopwords ("the", "with"), and runs a series of queries over the resulting SearchIndex (Source: examples/library_advanced/main.rs). This addresses the long-standing community request for a public Rust API (Source: issue #183).
Sample Data Layout
The examples folder ships ready-to-use JSON samples illustrating the schema flexibility: blog posts with tags and authors (examples/blog/posts.json), documentation pages with versioning and difficulty (examples/documentation/docs.json), and product catalogs with prices and ratings (examples/ecommerce/products.json). A minimal three-post sample lives at examples/index.json.
Known Operational Considerations
- WASM hosting: Production deployments must serve `.wasm` with the correct MIME type (`application/wasm`); gzipped content must be advertised via `Content-Encoding` (Source: README.md).
- Browser WASM loading: Changes to how browsers fetch and instantiate WASM have caused some glue-script implementations to break, particularly when cross-origin isolation or streaming compilation is involved (Source: issue #175).
- CJK text: The engine indexes whole tokens; queries against Chinese or Japanese text may miss matches when the input is split in ways the tokenizer does not understand (Source: issue #179).
- Build directory output: The default invocation emits several files alongside the WASM binary; community members have requested a flag to copy only the final `.wasm` (Source: issue #169).
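The CJK caveat is easier to see with a toy model. The sketch below assumes that tokens are whitespace-delimited, which may not match the engine's tokenizer exactly. `matches_whole_token` and `split_non_ascii` are hypothetical helpers written for this illustration and are not part of tinysearch.

```rust
/// Whole-token match: the query must equal one whitespace-delimited token.
pub fn matches_whole_token(text: &str, query: &str) -> bool {
    text.split_whitespace().any(|token| token == query)
}

/// Hypothetical preprocessing step: surround every non-ASCII character
/// with spaces so that each one becomes a token of its own.
pub fn split_non_ascii(text: &str) -> String {
    let mut out = String::new();
    for c in text.chars() {
        if c.is_ascii() {
            out.push(c);
        } else {
            out.push(' ');
            out.push(c);
            out.push(' ');
        }
    }
    out
}
```

With this model, an unsegmented phrase is a single token, so a shorter query inside it does not match. After `split_non_ascii`, single-character queries match. This per-character split is crude (it would also break accented Latin words), so a real pipeline would more likely run a proper word segmenter when generating the JSON input.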
See Also
- Configuration Reference (schema options in `tinysearch.toml`)
- Library API Guide (`BasicPost`, `TinySearch`, `SearchIndex`)
- Static Site Generator integration recipes (Jekyll, Hugo, Zola, Cobalt, Pelican)
- WebAssembly deployment checklist (MIME types, hosting)
Source: https://github.com/tinysearch/tinysearch / Human Manual
Configuration with tinysearch.toml
Related topics: Overview and Architecture, Static Site Generator Integration
Overview
The tinysearch.toml configuration file lets users customize which JSON fields are indexed for full-text search versus which fields are stored as metadata for display. It was introduced in v0.10.0 (PR #181) to address a long-standing roadmap request: making it possible to add arbitrary fields such as product images, descriptions, dates, authors, and categories without modifying the engine itself (Issue #116).
The file is optional. When absent, tinysearch falls back to a default schema that indexes title and body, with url as the URL field. Source: README.md: *"tinysearch will use the default schema (indexing title and body fields with url as the URL field)."*
Schema Structure
A tinysearch.toml file contains a single [schema] table with three keys:
[schema]
indexed_fields = ["title", "content", "tags"]
metadata_fields = ["author", "date", "category"]
url_field = "permalink"
| Key | Purpose | Required |
|---|---|---|
| indexed_fields | List of JSON fields whose tokenized text is placed into the search index. | No (defaults to ["title", "body"]) |
| metadata_fields | List of fields passed through verbatim to the output and returned with each search hit. | No (defaults to empty) |
| url_field | The single field whose value becomes the clickable link of a result. | No (defaults to "url") |
Source: README.md (example e-commerce configuration). See also examples/blog/tinysearch.toml and examples/ecommerce/tinysearch.toml for concrete configurations that ship with the repository.
How the engine consumes the file
The tinysearch binary looks for tinysearch.toml in the working directory by default, and can be pointed at an alternate path with the --config CLI flag. It then walks the input JSON array and:
- Concatenates the string values of all `indexed_fields` for each document and feeds them into the tokenizer that builds the Xor filter.
- Copies each entry in `metadata_fields` into the output struct so the WASM module can return it alongside the URL when a query matches.
- Reads the value of `url_field` and uses it as the link target.
Source: src/lib.rs. The library exposes TinySearch and SearchIndex types and is built on top of the xorf crate, confirming the schema values flow into the filter that backs every query.
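The three steps above can be sketched as a small projection function. This is an illustration under simplifying assumptions, not the engine's implementation: documents are modelled as flat string maps (real JSON values may be arrays or numbers), and the `Schema`, `Projected`, and `project` names are hypothetical.

```rust
use std::collections::HashMap;

/// Mirrors the three keys of the `[schema]` table.
pub struct Schema {
    pub indexed_fields: Vec<String>,
    pub metadata_fields: Vec<String>,
    pub url_field: String,
}

/// What an index builder would need from a single document.
#[derive(Debug, PartialEq)]
pub struct Projected {
    pub text: String,
    pub meta: HashMap<String, String>,
    pub url: Option<String>,
}

pub fn project(schema: &Schema, doc: &HashMap<String, String>) -> Projected {
    // Step 1: join the indexed fields that are present into one searchable string.
    let text = schema
        .indexed_fields
        .iter()
        .filter_map(|f| doc.get(f).map(String::as_str))
        .collect::<Vec<_>>()
        .join(" ");
    // Step 2: copy metadata fields through unchanged.
    let meta: HashMap<String, String> = schema
        .metadata_fields
        .iter()
        .filter_map(|f| doc.get(f).map(|v| (f.clone(), v.clone())))
        .collect();
    // Step 3: the configured URL field becomes the link target, if present.
    let url = doc.get(&schema.url_field).cloned();
    Projected { text, meta, url }
}
```

In this sketch a field that is missing from a document is skipped rather than treated as an error. This is a design choice of the example, and the real CLI may behave differently.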
Real-World Schemas from the Bundled Examples
The examples/ directory ships three ready-made schemas that demonstrate how the same [schema] table adapts to very different content shapes.
flowchart LR
A[JSON input file] --> B{Read tinysearch.toml}
B -- indexed_fields --> C[Tokenizer + Xor filter]
B -- metadata_fields --> D[Pass-through struct]
B -- url_field --> E[Result links]
C --> F[WASM / search output]
D --> F
E --> F
| Example | indexed_fields | metadata_fields | url_field |
|---|---|---|---|
| E-commerce | product_name, description, category, tags | price, image_url, brand, availability, rating, reviews_count | product_url |
| Blog | title, content, excerpt, tags | author, publish_date, category, reading_time, featured_image | permalink |
| Documentation | title, content, section, keywords | version, last_updated, contributor, difficulty, type | doc_url |
Sources: examples/ecommerce/README.md, examples/blog/README.md, examples/documentation/README.md, and the comparison table in examples/README.md.
These three configurations illustrate the typical patterns:
- E-commerce combines human-readable text (`description`) with categorical data (`category`, `tags`) for matching, while pricing and availability are surfaced as metadata. This pattern directly answers the "is there a way to return the page description or body in the results?" question in Issue #159.
- Blog adds `excerpt` so the body field remains optional and snippet-friendly, addressing part of the roadmap in Issue #116.
- Documentation uses `section` and `keywords` to bias results toward navigational structure, with `version` and `difficulty` available for filtered UI.
Usage Patterns and Common Pitfalls
Pointing the CLI at the right config
Running tinysearch from the example directory picks up the adjacent tinysearch.toml automatically. When integrating with a static site generator it is usually placed at the project root; the Zola guide, for instance, assumes the file is present at the workspace level so all tinysearch invocations resolve the same schema. Source: examples/zola/README.md: *"Run tinysearch: tinysearch --optimize --path static …"*.
Backward compatibility
Projects that upgraded from v0.8.x or v0.9.x relied on hard-coded title, body, and url keys. Because v0.10.0 keeps the same default schema, legacy JSON input files continue to work without modification. The CI error reported in Issue #182 (failed to select a version for the requirement tinysearch = "^0.9.0") is unrelated to the config file but illustrates why pinning to v0.10.0 is important when adopting the new schema.
Library usage and custom fields
The tinysearch crate can also be driven programmatically, and any custom field exposed via the BasicPost.meta HashMap is preserved through the same code path the TOML file controls at the CLI level. Source: README.md: *"Add tinysearch to your Cargo.toml: cargo add tinysearch"*. This addresses the request in Issue #183 for a public API usable from Rust without invoking the binary.
Common failure modes
- Mismatched field names: the schema is case-sensitive, so a field name of `Title` in the schema will silently produce an empty index when the JSON key is `title`. Verify by running the `storage` mode and inspecting the generated `index.html` or by performing a sanity check with `tinysearch -m search -S "..." ./output/storage`.
- CORS / MIME issues at the WASM layer: the schema is processed before WASM emission, but the resulting module still must be served with `application/wasm`. See the "Changes in the way browsers work with wasm" discussion in Issue #175.
- Non-Latin tokenization: because the engine matches whole words, partial Chinese substrings may miss, as reported in Issue #179. Splitting indexed text into whitespace-delimited tokens in the input JSON mitigates this regardless of schema.
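A pre-flight check can catch the field-name mismatch before an index is built. The helper below is a hypothetical sketch, not a tinysearch feature: it compares schema field names against the keys of a sample document and suggests a key that differs only by ASCII case.

```rust
use std::collections::HashSet;

/// For every schema field that is not present verbatim among the document's
/// keys, return the field plus an optional case-insensitive suggestion.
pub fn missing_fields(schema_fields: &[&str], doc_keys: &[&str]) -> Vec<(String, Option<String>)> {
    let keys: HashSet<&str> = doc_keys.iter().copied().collect();
    schema_fields
        .iter()
        .filter(|f| !keys.contains(**f))
        .map(|f| {
            // Look for a key that matches once ASCII case is ignored.
            let hint = doc_keys
                .iter()
                .find(|k| k.eq_ignore_ascii_case(*f))
                .map(|k| k.to_string());
            (f.to_string(), hint)
        })
        .collect()
}
```

Calling `missing_fields(&["Title", "body"], &["title", "body", "url"])` reports `Title` with the suggestion `title`. An empty result means every schema field was found as written.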
See Also
- Library Usage (Experimental): driving the same engine from Rust.
- examples/: full e-commerce, blog, and documentation sample projects.
- Zola integration guide: generating a JSON index from a Tera template.
- Roadmap (Issue #116): discussion of future schema features such as filters and boolean fields.
Static Site Generator Integration
Related topics: Overview and Architecture, Configuration with tinysearch.toml
Purpose and Scope
tinysearch is a lightweight, fast, full-text search engine written in Rust and compiled to WebAssembly. It is specifically designed for static websites, making Static Site Generator (SSG) integration its primary use case. The README states: "It can be used together with static site generators such as Jekyll, Hugo, Zola, Cobalt, or Pelican." (README.md)
Integration with an SSG follows a two-phase model: the SSG generates a JSON index from site content, and tinysearch consumes that JSON to produce a WASM payload that runs entirely client-side in the browser. The approach is generator-agnostic: any SSG capable of emitting a flat JSON array of {title, body, url, ...} records can be wired in.
The v0.10.0 release further strengthened integration by introducing a tinysearch.toml configuration file for declaring indexed vs. metadata fields, and by exposing tinysearch as a Rust library for programmatic index construction (README.md).
Integration Architecture
The end-to-end pipeline can be visualised as a data flow from source markdown content to a browser-resident search engine:
flowchart LR
A[Markdown Pages<br/>in SSG] --> B[SSG Template<br/>Renders JSON]
B --> C[posts.json<br/>index.json]
C --> D[tinysearch CLI<br/>or Library]
D --> E[storage/<br/>binary index]
E --> F[WASM Module<br/>+ JS glue]
F --> G[Browser<br/>Client-Side Search]
style D fill:#f9f,stroke:#333
style F fill:#bbf,stroke:#333
Each post is internally converted into an Xor Filter, "a datastructure for fast approximation of set membership that is smaller than bloom and cuckoo filters", and then serialised with bincode into a single binary blob that ships with the site (README.md). The library entry point in src/lib.rs exposes TinySearch::new() and build_index(...) so the same pipeline can be driven from Rust code instead of the CLI (src/lib.rs).
Generator-Specific Integration Patterns
Zola (Tera templates)
The Zola example iterates over section.pages in a Tera template, skips drafts, strips HTML tags, and emits a JSON array. A key gotcha, discussed in community issues #169 and #178, is that Zola emits public/tinysearch.json/index.html rather than tinysearch.json, so the CLI invocation is:
tinysearch --optimize --path static public/tinysearch.json/index.html
Source: examples/zola/README.md
The template uses json_encode and explicit replace filters to escape braces, quotes, and backslashes that survive striptags. Zola now also supports a native fuse_json index format that is conceptually compatible with tinysearch, though this route has not yet been officially adopted (see community discussion #178).
Pelican (Jinja templates)
Pelican uses a near-identical Jinja template. The pattern is the same: loop over articles, filter article.status != "draft", and emit records with article.title, article.url, and article.content (examples/pelican/README.md). The template is wired to a static page via frontmatter (Template: json) so pelican content writes the JSON to output/pages/json.html, which is then fed to tinysearch --optimize --path output output/pages/json.html.
Hugo, Jekyll, and Cobalt
The main README explicitly lists Hugo and Jekyll as supported, and the examples/ directory demonstrates the same JSON-as-bridge pattern. Hugo users can rely on its native JSON output formats, while Jekyll users can use Liquid templates to iterate over site.posts and emit records. For any homegrown SSG, community issue #183 raised the request for a Rust library API, a need fulfilled in v0.10.0 with TinySearch and BasicPost (src/lib.rs).
Configuration with `tinysearch.toml` (v0.10.0)
Prior to v0.10.0, tinysearch assumed a fixed schema of title, body, and url. The new configuration file, introduced in PR #181, lets users declare arbitrary schemas. The README example shows a documentation site with four indexed fields and four metadata fields:
[schema]
indexed_fields = ["title", "content", "section", "keywords"]
metadata_fields = ["version", "last_updated", "contributor", "difficulty"]
url_field = "doc_url"
Source: README.md
This schema is reflected in the documentation example JSON, which extends the basic {title, body, url} shape with section, keywords, doc_url, version, last_updated, contributor, difficulty, and type (examples/documentation/docs.json). The blog example adds excerpt, tags, permalink, author, publish_date, category, reading_time, and featured_image (examples/blog/posts.json).
When no tinysearch.toml is present, tinysearch falls back to the default schema (title and body indexed, url as link).
Build Commands and Optimisation
The CLI exposes three modes relevant to SSG workflows: storage (build the binary index), wasm (build the WASM bundle), and search (run a query against a built index). Common invocation patterns from the README and examples:
# Dev build with demo HTML
tinysearch -m wasm -p wasm_output posts.json
# Production build, no demo
tinysearch --release -m wasm -p wasm_output posts.json
# Optimised WASM (requires binaryen's wasm-opt)
tinysearch --release -o -m wasm -p wasm_output posts.json
Source: README.md
The --optimize flag invokes wasm-opt, typically shrinking the payload by 20 to 30% (mentioned in the docs example under Performance Optimization). Production deployment requires the web server to serve .wasm with the application/wasm MIME type; a wrong MIME type is a common source of the "Demo broken" error reported in issue #177.
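For sites served by a small custom Rust server or dev tool, the MIME requirement comes down to a lookup by file extension. The function below is a minimal sketch with a hypothetical name (`content_type_for`). The mapping for `.wasm` is the one browsers expect for streaming compilation, and the other entries are common defaults chosen for this example.

```rust
use std::path::Path;

/// Pick a Content-Type header value for files found in a build output directory.
pub fn content_type_for(path: &str) -> &'static str {
    match Path::new(path).extension().and_then(|e| e.to_str()) {
        // Browsers expect this exact type when compiling WASM while it streams in.
        Some("wasm") => "application/wasm",
        Some("js") => "text/javascript",
        Some("html") => "text/html; charset=utf-8",
        Some("json") => "application/json",
        // Fallback for anything unrecognised.
        _ => "application/octet-stream",
    }
}
```

Most production web servers and static hosts already ship an equivalent table. The sketch mainly shows which single entry has to be correct for the search module to load.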
Library API for Programmatic Integration
For SSG authors who prefer not to shell out to the CLI, the library API introduced in PR #184 exposes the same pipeline. The advanced example demonstrates a custom BlogPost struct with title, slug, content, tags, and author, configured via .with_stopwords(...) (examples/library_advanced/main.rs):
let search = TinySearch::new().with_stopwords(vec!["the".to_string(), "with".to_string()]);
let index = search.build_index(&blog_posts)?;
let results = search.search(&index, "rust", 10);
Source: src/lib.rs
This addresses the request in issue #183 for a way to drive tinysearch from Rust without invoking an executable.
Known Limitations and Community-Reported Issues
Several recurring limitations surface in community discussions and constrain SSG integration choices:
| Limitation | Source | Impact on SSGs |
|---|---|---|
| Whole-word search only, with no prefix or suggestion matching | README.md | Chinese and other non-space-separated languages may miss partial matches (issue #179) |
| Recommended size: small to medium sites (~2 kB/article uncompressed) | README.md | Large doc sets may exceed browser memory |
| Search relevance can feel non-deterministic | #120 | May require query expansion in template |
| Metadata such as descriptions and images is not returned in result objects | #159 | UI must fetch page separately |
| No native keyword highlighting | #119 | Must be implemented in JS glue layer |
| 7 files emitted by default into --path | #169 | Staging step required to copy only the WASM file |
The library API and tinysearch.toml schema in v0.10.0 directly address the metadata gap (#159) by giving SSG authors full control over what gets indexed and how posts are represented. Whole-word matching remains an engine-level limitation.
See Also
- Configuration Reference: full `tinysearch.toml` schema documentation
- Library API: programmatic use of `TinySearch` from Rust
- WebAssembly Output Format: what the browser actually loads
- Performance and Size Optimisation: `--optimize` and brotli/gzip trade-offs
Rust Library API and Programmatic Usage
Related topics: Overview and Architecture, Configuration with tinysearch.toml
Overview
tinysearch began life as a standalone command-line tool that ingests a JSON index file and emits a WebAssembly (WASM) blob for browser-side search. Starting with release v0.10.0, the crate can also be consumed directly from Rust, letting developers build and query search indexes in-process without shelling out to the tinysearch binary. This was added in response to a long-standing community request (see issue #183) and is documented as the "Library Usage (Experimental)" section of the README.md.
The library surface is intentionally small. It exposes one trait (Post), one ready-made struct (BasicPost), one engine (TinySearch), and one index type alias (SearchIndex). The design goal, and the reason the project remains a few tens of kilobytes after compilation, is that the same Xor-Filter-based representation used on the wire is also the in-memory representation in the library. Source: src/lib.rs.
Core Types and the `Post` Trait
The library is built around a single trait, Post, defined in src/lib.rs. Anything the engine can index must implement it:
| Method | Signature | Purpose |
|---|---|---|
| title | fn title(&self) -> &str | Required. The display title of the document. |
| url | fn url(&self) -> &str | Required. The link target returned with each hit. |
| body | fn body(&self) -> Option<&str> | Optional. The main searchable text. |
| meta | fn meta(&self) -> HashMap<String, String> | Optional. Extra fields stored alongside the hit (e.g., author, category). |
BasicPost is the only concrete type shipped by the crate; it is a plain owned-struct that satisfies Post for callers who do not want to write their own implementation. Source: src/lib.rs.
The TinySearch engine is constructed with TinySearch::new() and configured fluently. The crate-level rustdoc shows the minimal path:
use tinysearch::{BasicPost, TinySearch, SearchIndex};
use std::collections::HashMap;
let posts = vec![
BasicPost {
title: "First Post".to_string(),
url: "/first".to_string(),
body: Some("This is the first post content".to_string()),
meta: HashMap::new(),
},
];
let search = TinySearch::new();
let index: SearchIndex = search.build_index(&posts).expect("Failed to build index");
let results = search.search(&index, "rust", 10);
Source: src/lib.rs.
Basic Library Usage
The examples/library_basic/ example in the repository demonstrates the shortest possible integration. A user supplies a Vec of BasicPost, calls build_index, and then calls search(&index, query, limit) to obtain a vector of SearchResult values. The result type carries the title, the URL, and the metadata map that was passed in, so consumers can render hits without consulting the original document. Source: examples/library_basic/main.rs and README.md.
For projects that already have JSON in the standard tinysearch shape (title, url, optional body, optional meta), the API offers a convenience parser so callers do not have to deserialize by hand. The signature is documented in the rustdoc of src/api.rs:
let json = r#"[
{
"title": "My Post",
"url": "/my-post",
"body": "Post content goes here",
"meta": {"category": "programming", "author": "John"}
}
]"#;
let search = TinySearch::new();
let posts = search.parse_posts(json).expect("parse error");
let index = search.build_index(&posts)?;
let results = search.search(&index, "post", 10);
Source: src/api.rs. The parse_posts helper lowers the friction for users migrating from the CLI workflow to the library workflow.
Advanced Usage with Custom Post Types
Most real sites do not have documents that look like BasicPost. The examples/library_advanced/ example defines a domain struct (BlogPost) and implements Post for it. The pattern is straightforward: map the four trait methods to whatever fields your content store exposes. Source: examples/library_advanced/main.rs.
use std::collections::HashMap;
use tinysearch::Post; // assumes `Post` is re-exported at the crate root, like `BasicPost`
struct BlogPost {
title: String,
slug: String,
content: String,
tags: Vec<String>,
author: String,
}
impl Post for BlogPost {
fn title(&self) -> &str { &self.title }
fn url(&self) -> &str { &self.slug }
fn body(&self) -> Option<&str> { Some(&self.content) }
fn meta(&self) -> HashMap<String, String> {
let mut meta = HashMap::new();
meta.insert("author".to_string(), self.author.clone());
meta.insert("tags".to_string(), self.tags.join(", "));
meta
}
}
This is the integration point that community issue #183 was specifically asking for: a way to call tinysearch from inside a home-grown static site generator without spawning the executable. The meta map is what enables richer result rendering (e.g., showing an author or a thumbnail), which is the same use case behind issue #119 ("Highlight matched keywords in search results") and issue #159 ("Is there a way to return the page description or body in the results?"). Source: examples/library_advanced/main.rs.
A Yew (WebAssembly frontend) integration example lives in examples/yew-example-crate/. It shows the same trait being implemented inside a frontend crate, which is the most direct way to keep a single content schema across build-time index generation and runtime search. Source: examples/yew-example-crate/src/main.rs.
Configuration Options
The library exposes a small but useful configuration surface, all on the TinySearch builder:
| Method | Effect | Source |
|---|---|---|
| TinySearch::new() | Construct a default engine. | src/api.rs |
| .with_stopwords(words) | Override the default stopword list with a custom collection. | src/api.rs |
| search(&index, query, limit) | Run a query and cap the number of returned hits. | src/lib.rs |
| build_index(&posts) | Convert any iterable of &impl Post into a SearchIndex. | src/lib.rs |
| parse_posts(json) | Parse the canonical CLI JSON shape into Vec<BasicPost>. | src/api.rs |
Stopword tuning is the most common customization in practice. The default list is replaced wholesale by the words you pass, which is what the advanced example relies on to drop common English words that would otherwise bloat the index. Source: src/api.rs and examples/library_advanced/main.rs.
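The effect of a stopword list on index size can be illustrated with a standalone tokenizer. This is a sketch of the general technique and makes no claim about how `with_stopwords` is implemented. The function name `index_terms` and the lowercase-and-split tokenization are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Return the distinct, lowercased words of `text` in first-seen order,
/// leaving out anything in `stopwords`.
pub fn index_terms(text: &str, stopwords: &[&str]) -> Vec<String> {
    let stop: HashSet<String> = stopwords.iter().map(|w| w.to_lowercase()).collect();
    let mut seen = HashSet::new();
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        // Drop stopwords before they reach the index.
        .filter(|w| !stop.contains(w))
        // Keep only the first occurrence of each remaining word.
        .filter(|w| seen.insert(w.clone()))
        .collect()
}
```

With the stopwords `"the"` and `"with"` from the advanced example, the sentence "The engine works with the browser" reduces to three indexed terms instead of five distinct words. Every word removed this way is one less key stored per page.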
A separate SearchIndex type alias, re-exported by the library, represents the fully built, serializable form of the index. The examples/search_index_type.rs example demonstrates how to materialize an index and feed it into a custom sink (for example, a non-WASM target such as a CLI or a server). Source: examples/search_index_type.rs.
End-to-End Data Flow
The library path mirrors the CLI path; the only thing that changes is who drives each step.
flowchart LR
A["Source documents<br/>(any Rust type)"] -->|impl Post| B["Vec<BasicPost><br/>or user struct"]
B --> C["TinySearch::new()"]
C -->|build_index| D["SearchIndex<br/>(Xor Filters + bincode)"]
D -->|search| E["Vec<SearchResult><br/>(title, url, meta)"]
C -.optional.-> F["with_stopwords"]
C -.optional.-> G["parse_posts<br/>(from JSON)"]
Source: src/lib.rs, src/api.rs, and README.md.
Limitations and Caveats
The README is explicit that the library surface is experimental and may change. Source: README.md. Inherited engine limitations also apply: only full-word matches are supported (no prefix or fuzzy search), and the index for an entire site must fit in memory because it is one contiguous blob. The roadmap issue #116 tracks adding richer query features (filters, booleans, optional body field) that, once shipped, will land on the library API as well. Source: README.md.
For consumers who need WASM as the final artifact, the library path composes with the CLI path: build the index in Rust, serialize it with the same bincode format the CLI uses, and either embed it directly in a frontend crate (see the Yew example) or write it to disk and run tinysearch --optimize on it. Source: examples/yew-example-crate/src/main.rs and README.md.
See Also
- CLI Usage and JSON Index Format
- Static Site Generator Integration (Hugo, Zola, Pelican, Jekyll)
- tinysearch.toml Configuration
- WASM Deployment and MIME Types
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 16 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: identity.distribution | https://github.com/tinysearch/tinysearch
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/169
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/182
4. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/177
5. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/174
6. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/173
7. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/116
8. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/170
9. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/tinysearch/tinysearch
10. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/tinysearch/tinysearch/issues/175
11. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/tinysearch/tinysearch
12. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: No demo available (no_demo).
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/tinysearch/tinysearch
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using tinysearch with real data or production workflows.
- [[SUGGESTION] - Add docs on how to implement this in rust code](https://github.com/tinysearch/tinysearch/issues/183) - github / github_issue
- Roadmap - github / github_issue
- CI error for tinysearch v0.9.0 - github / github_issue
- Changes in the way browsers work with wasm causes issues with some js im - github / github_issue
- Demo broken - github / github_issue
- Add a switch for build dir, and copy only the resulting wasm file to the - github / github_issue
- Search use Chinese character not always return exists result. - github / github_issue
- Zola supports fuse_json index now which should be compatible with tinyse - github / github_issue
- Is there a way to return the page description or body in the results? - github / github_issue
- Highlight matched keywords in search results - github / github_issue
- No such file or directory (os error 2) - github / github_issue
- On npm - github / github_issue
Source: Project Pack community evidence and pitfall evidence