minisearch Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node

Overview and Getting Started

Related topics: Core Architecture and Data Model, Configuration and Search API

What is MiniSearch

MiniSearch is a tiny but powerful in-memory full-text search engine written in JavaScript. According to README.md, it is "respectful of resources" and can comfortably run both in Node.js and the browser. The package has zero runtime dependencies (package.json:13) and ships UMD, ESM, and CJS bundles for flexible consumption.

The library is described by its author as a "Tiny but powerful full-text search engine for browser and Node" (package.json:3). It supports BM25-based relevance scoring (src/MiniSearch.ts), fuzzy search, prefix search, auto-suggestions, result filtering, and full serialization/deserialization of indexes. The index itself is built on a radix tree, typed in src/SearchableMap/types.ts, which keeps term storage memory-efficient.

Installation

Install the package from npm:

npm install minisearch

The published artifact exposes three entry points (package.json:7-21): a CommonJS build, an ES module build, and a standalone UMD bundle served by both unpkg and jsDelivr. TypeScript type definitions are bundled at dist/es/index.d.ts. A separate entry point, ./SearchableMap, is exported for advanced use cases that need direct access to the radix-tree implementation (src/SearchableMap/types.ts).

Quick Start

The following example demonstrates the typical lifecycle: instantiate, index, search.

import MiniSearch from 'minisearch'

const documents = [
  { id: 1, title: 'Moby Dick', text: 'Call me Ishmael...' },
  { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', text: 'I can see by my watch...' },
  { id: 3, title: 'Neuromancer', text: 'The sky above the port was...' }
]

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],         // fields to index for full-text search
  storeFields: ['title']              // fields to return in search results
})

miniSearch.addAll(documents)

const results = miniSearch.search('zen art motorcycle')
// => [{ id: 2, title: 'Zen and the Art of Motorcycle Maintenance', score: 2.77258, ... }]

Every document must include a unique id field. Fields listed in fields are tokenized and added to the inverted index; fields listed in storeFields are retained verbatim and returned with each result (src/MiniSearch.ts).

How the Indexing Pipeline Works

Each document passes through a deterministic, four-stage pipeline before being searchable:

flowchart LR
    A[Document] --> B[extractField]
    B --> C[stringifyField]
    C --> D[tokenize]
    D --> E[processTerm]
    E --> F[Inverted Index]

The stages are defined and documented in src/MiniSearch.ts:

Stage	Option	Default	Purpose
Extract	`extractField`	`document[fieldName]`	Pull a raw value from the document
Stringify	`stringifyField`	`fieldValue.toString()`	Convert non-string values to strings
Tokenize	`tokenize`	`string.split(SPACE_OR_PUNCTUATION)`	Split the string into terms
Process	`processTerm`	`term.toLowerCase()`	Normalize, stem, or expand a term

Both tokenize and processTerm receive the field name as their second argument, allowing per-field customization (src/MiniSearch.test.js:25-56). The processTerm function may return a string, a falsy value (to discard the term), or an array of strings (to expand one token into many indexable terms) (src/MiniSearch.ts).
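As a rough sketch of that contract, the function below lowercases terms, discards a few stop words by returning a falsy value, and uses the field name to leave one field untouched. The stop-word list and the `sku` field are illustrative assumptions, not part of MiniSearch.

```javascript
// Illustrative stop-word list (an assumption for this example).
const STOP_WORDS = new Set(['and', 'or', 'the', 'of', 'a'])

// processTerm(term, fieldName): return a string to index it,
// or a falsy value to discard the term.
function processTerm (term, fieldName) {
  // Hypothetical per-field rule: keep product codes exactly as written.
  if (fieldName === 'sku') return term
  const normalized = term.toLowerCase()
  return STOP_WORDS.has(normalized) ? null : normalized
}

// Intended usage (not executed here):
// new MiniSearch({ fields: ['title', 'sku'], processTerm })
```

Returning `null` for stop words keeps them out of the index entirely, which is usually cheaper than filtering them from results later.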

For documents with nested fields or non-plain-object shapes, a custom extractField is the recommended approach (README.md:120-145):

const miniSearch = new MiniSearch({
  fields: ['title', 'author.name', 'pubYear'],
  extractField: (document, fieldName) => {
    if (fieldName === 'pubYear') return document.pubDate.getFullYear().toString()
    return fieldName.split('.').reduce((doc, key) => doc?.[key], document)
  }
})

Common Pitfalls and Community-Reported Issues

Several recurring themes appear in community discussions and are worth understanding before adopting MiniSearch.

Non-Latin and agglutinative languages. The default tokenizer splits on the Unicode property classes \p{Z} (separators) and \p{P} (punctuation) (src/MiniSearch.ts). This works poorly for Chinese (Issue #201) and Korean, since morphemes are not separated by spaces. A community-maintained Korean morphological tokenizer is available at garu-minisearch-tokenizer, referenced in Issue #314 and Issue #312.

Punctuation handling in the default tokenizer. Because the Unicode Punctuation class includes quotation marks and apostrophes, the default regex can split song's into song and s (Issue #309). Supply a custom tokenize function if your content contains many contractions or quoted strings.

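Not all fields are searched or returned. Only fields listed in fields are indexed, and only fields listed in storeFields are returned with each result. If a field exists in your documents but cannot be searched, verify it is included in the fields option; if it is missing from results, check storeFields (Issue #298).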

Wildcard misuse. The built-in MiniSearch.wildcard sentinel (*) is intended for programmatic use. Passing arbitrary strings containing * can throw Cannot read properties of undefined (reading 'map') (Issue #307).

Combination queries. When building nested QueryCombination trees, the filter property is applied at the top-level result set, not per sub-query (Issue #304).

Running the Example

A self-contained browser demo lives in examples/plain_js/. To launch it:

cd examples/plain_js
python3 -m http.server   # or: npx http-server -p 8000

Then open the printed URL in a browser. The example showcases search, auto-completion, and several advanced configuration options (examples/plain_js/README.md:7-15).

Core Architecture and Data Model

Related topics: Overview and Getting Started, Configuration and Search API, Extensibility, Internationalization, and Troubleshooting

1. Purpose and Scope

MiniSearch is a tiny, dependency-free, in-memory full-text search engine for JavaScript that runs in both Node and the browser. As of version 7.2.0 (package.json), it provides BM25-ranked full-text search, fuzzy matching, prefix matching, auto-suggest, filtering, document serialization, and incremental add/remove operations entirely in RAM.

The "core architecture" describes how a single MiniSearch instance is structured internally, the pipeline that turns a document into searchable terms, and the data structures that store both the inverted index and the user-facing metadata. The "data model" describes the typed shapes used for documents, queries, results, and the on-disk JSON serialization.

2. Class Layout and Internal State

The single export is the MiniSearch<T> class (src/MiniSearch.ts). Each instance owns the following protected/private fields, which together constitute the live in-memory state:

Field	Type	Role
`_options`	`OptionsWithDefaults<T>`	Resolved indexer options with defaults applied
`_index`	`SearchableMap<FieldTermData>`	Inverted index of term → postings per field
`_documentIds` / `_idToShortId`	`Map<number, any>` / `Map<any, number>`	Bidirectional lookup between user IDs and internal numeric IDs
`_fieldIds`	`{ [name: string]: number }`	Compact numeric IDs for each indexed field
`_fieldLength`	`Map<number, number[]>`	Per-document field length (one entry per field)
`_avgFieldLength`	`number[]`	Running average field length, indexed by field ID
`_storedFields`	`Map<number, Record<string, unknown>>`	Stored fields returned verbatim in search results
`_documentCount`, `_nextId`, `_dirtCount`	numbers	Counters used by the indexer and vacuum logic

The internal _index is a SearchableMap, which is built on top of a radix tree as defined in src/SearchableMap/types.ts. That tree representation enables efficient exact lookup, prefix iteration, and fuzzy neighbor traversal without a separate trie. See RadixTree<T> in types.ts for the underlying typed interface.

3. The Indexing Pipeline

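A document becomes searchable through three main user-pluggable steps, documented in README.md and the JSDoc in src/MiniSearch.ts (a fourth hook, stringifyField, converts non-string values to strings between extraction and tokenization):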

flowchart LR
  A[Document object] -->|extractField| B[String field value]
  B -->|tokenize| C[Raw terms]
  C -->|processTerm| D[Normalized terms]
  D --> E[(Inverted index in _index)]

extractField(document, fieldName) retrieves the raw value for each entry in the user-supplied fields array. The default simply reads document[fieldName], but a custom extractor can reach nested keys (e.g. author.name) or convert Date values to strings (README.md).
tokenize(text, fieldName) splits that value into terms. The default tokenizer splits on the unicode-aware regex SPACE_OR_PUNCTUATION (/[\n\r\p{Z}\p{P}]+/u). This default has been a source of community discussion: see issue #309, where users note that \p{P} also matches ASCII apostrophes, causing words like song's to be split.
processTerm(term, fieldName) normalizes each token (default: lower-case). It may return a falsy value to discard the term, a single string, or an array of expanded strings (for stemming or synonyms) (src/MiniSearch.ts).

Test coverage in src/MiniSearch.test.js verifies that both tokenize and processTerm are invoked with the field name as the second argument, and that returning an array from processTerm correctly expands one token into several indexed terms.

4. Scoring: BM25 with Field Length Normalization

Relevance is computed using BM25+ in src/MiniSearch.ts. The default parameters are:

const defaultBM25params = { k: 1.2, b: 0.7, d: 0.5 }

Here k controls term-frequency saturation, b controls length normalization, and d is the BM25+ floor that guarantees a non-zero contribution from any matching term. For each term and field, the scoring formula needs five inputs: the term frequency in the document, the number of documents matching the term, the total document count, the document's field length, and the average field length. All of these are already tracked by _index, _fieldLength, _avgFieldLength, and _documentCount (see Section 2).
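To make the roles of k, b, and d concrete, here is a sketch of a BM25+ term score written from the standard formula. The function name `bm25PlusScore` and the exact form of the inverse-document-frequency term are assumptions for illustration; src/MiniSearch.ts remains the reference for what the library actually computes.

```javascript
const defaultBM25params = { k: 1.2, b: 0.7, d: 0.5 }

// Scores one term in one field of one document.
function bm25PlusScore (termFreq, matchingCount, totalCount, fieldLength, avgFieldLength, params = defaultBM25params) {
  const { k, b, d } = params
  // Rarer terms (fewer matching documents) get a larger weight.
  const idf = Math.log(1 + (totalCount - matchingCount + 0.5) / (matchingCount + 0.5))
  // Fields longer than average are penalized in proportion to b.
  const lengthNorm = 1 - b + b * (fieldLength / avgFieldLength)
  // d is the lower bound added to the saturating term-frequency component.
  return idf * (d + (termFreq * (k + 1)) / (termFreq + k * lengthNorm))
}
```

With these defaults, doubling the term frequency raises the score by less than a factor of two (saturation through k), and a term that appears in a very long field still contributes at least `idf * d`.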

Community feedback on scoring (issues #129 and #263) indicates that the default parameters can produce surprising rankings when documents are similar; users are expected to tune k, b, and d, or to add boostDocument and per-field boost weights when domain-specific ranking is required.

5. Query and Result Data Model

Queries are typed as Query = QueryCombination | string | Wildcard (src/MiniSearch.ts). A plain string is the common case, a QueryCombination ({ combineWith, queries, ...searchOptions }) expresses nested AND / OR / AND_NOT trees, and the Wildcard is the unique Symbol('*') exposed as MiniSearch.wildcard for matching every document.

Search results are typed by SearchResult, which includes id, score, terms, queryTerms, and a match: MatchInfo object describing which terms hit which fields. Stored fields are merged in via Object.assign(result, this._storedFields.get(docId)) and the user-supplied filter predicate is applied before sorting by score (src/MiniSearch.ts). Note that filter runs on the merged result set, not on each sub-query of a QueryCombination; this distinction has been raised in community discussion (see issue #304).

For persistence, MiniSearch exposes a serialization shape defined by AsPlainObject in src/MiniSearch.ts, containing documentCount, documentIds, fieldIds, fieldLength, averageFieldLength, storedFields, dirtCount, the index array, and a serializationVersion integer. Discarded documents accumulate as "dirt"; vacuum() cleans them in configurable batches (batchSize, batchWait) and can also be triggered automatically by minDirtCount / minDirtFactor thresholds.

Configuration and Search API

Related topics: Overview and Getting Started, Core Architecture and Data Model, Extensibility, Internationalization, and Troubleshooting

MiniSearch exposes a single primary class, MiniSearch, whose surface area is split between construction options (how documents are indexed) and search options (how queries are evaluated at runtime). Both surfaces are declared in src/MiniSearch.ts, where the Options<T> and SearchOptions types live alongside their defaults. Mastering these two option bags is the central task for anyone integrating the library, since every downstream behavior — tokenization, scoring, filtering, suggestion generation — is controlled through them.

This page documents the configuration options passed to new MiniSearch(...), the search-time options accepted by miniSearch.search(...) and miniSearch.autoSuggest(...), and the query expression API (Query, QueryCombination, Wildcard).

Construction: Indexing Configuration

The MiniSearch constructor accepts an Options<T> object that controls how documents are broken into terms, stored, and ranked. The full list, with library defaults, is documented inline in src/MiniSearch.ts:1-30 and replicated in the README. The most consequential options are summarized below.

Option	Type	Default	Purpose
`idField`	`string`	`'id'`	Field used as the unique document identifier. Source: src/MiniSearch.ts
`fields`	`string[]`	`undefined` (required)	Document fields to index for full-text search.
`storeFields`	`string[]`	`[]`	Fields stored and returned with search results; they do not need to be indexed.
`extractField`	`(doc, field) => unknown`	`(doc, field) => doc[field]`	Custom field extractor (e.g. for nested documents).
`tokenize`	`(text, fieldName?) => string[]`	splits on `SPACE_OR_PUNCTUATION`	Splits raw text into terms before normalization.
`processTerm`	`(term, fieldName?) => string | string[] | falsy`	`term => term.toLowerCase()`	Normalizes or stems a term; falsy values are dropped.
`searchOptions`	`SearchOptions`	`undefined`	Default options merged into every `search()` call.
`searchOptions.bm25`	`BM25Params` (`{ k, b, d }`)	`{ k: 1.2, b: 0.7, d: 0.5 }`	BM25+ parameters for ranking, supplied as the `bm25` search option. Source: src/MiniSearch.ts

The default tokenizer is string.split(SPACE_OR_PUNCTUATION), where SPACE_OR_PUNCTUATION is the Unicode regex /[\n\r\p{Z}\p{P}]+/u. Community issue #309 shows that this regex also matches quotation marks (", '), so users whose text contains contractions such as song's may want to override tokenize to keep the apostrophe intact; processTerm runs after the split and cannot rejoin the pieces. A more invasive option is to swap in a language-aware tokenizer, as demonstrated by the community-built garu-minisearch-tokenizer for Korean (#312, #314) and discussed for Chinese in #201.

The extractField function can return any value, but only stringish content will produce useful search terms. Issue #302 requests richer return-type support for extractField; for now, numeric or array fields used purely for storage should be placed in storeFields rather than fields.

Tokenization and Term Processing Pipeline

The indexing pipeline is a four-step chain: extractField → stringifyField → tokenize → processTerm. Each step can be overridden independently, and processTerm may return an array to expand one token into multiple indexed terms (useful for hyphenated words or morphological splitters). The tokenize and processTerm steps are reused on the query side unless searchOptions.tokenize or searchOptions.processTerm override them. Source: src/MiniSearch.ts:1-40.

Search-Time Configuration

The SearchOptions object can be passed to miniSearch.search(query, options) or pre-baked into the constructor via searchOptions. Notable members (see the type definitions in src/MiniSearch.ts) include:

prefix: boolean | (term, index, terms) => boolean. When true, every query term is also matched as a prefix; the function form decides term by term (for example, only the last term). Issue #299 notes that prefix matches are weighted *lower* than fuzzy matches by default; the weights option ({ fuzzy, prefix }) can rebalance this.
fuzzy: boolean | number | function. Numeric values between 0 and 1 are interpreted as a maximum Levenshtein edit distance relative to the term length; values >= 1 are absolute edit distances. Several scoring anomalies around fuzzy search are tracked in #129 and #263.
boostDocument: (id, term, storedFields?) => number returning >= 0. Lets callers reweight or zero out results based on the document content, enabling business-logic-driven ranking.
combineWith: 'AND' | 'OR' | 'AND_NOT'. Determines how the matches for multiple query terms are combined within a single string query.
filter: (result) => boolean. Applied to the *final* result set. Issue #304 clarifies that filter is *not* applied per sub-query when using QueryCombination trees, despite the type signature suggesting it is.
fields: restricts a search call to a subset of indexed fields. Issue #298 is a common pitfall: if the desired field is missing from results, it usually means the field was not declared in either fields (to be indexed) or storeFields (to be returned).
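The options above can be combined in a single object. The following is a minimal sketch, assuming the documents carry hypothetical `category` and `popularity` values that were listed in storeFields; the threshold and boost factor are arbitrary illustrative numbers.

```javascript
// `category` and `popularity` are hypothetical stored fields.
const searchOptions = {
  // Function form: prefix-match only the last term, as in search-as-you-type.
  prefix: (term, index, terms) => index === terms.length - 1,
  // Relative fuzziness: a fraction of each term's length.
  fuzzy: 0.2,
  // Require every query term to match.
  combineWith: 'AND',
  // Keep only one category; this runs on the final result set.
  filter: (result) => result.category === 'fiction',
  // Multiply the score of popular documents; never return a negative number.
  boostDocument: (id, term, storedFields) =>
    storedFields && storedFields.popularity > 100 ? 1.5 : 1
}

// Intended usage (not executed here):
// miniSearch.search('zen moto', searchOptions)
```

The same object could instead be passed once as `searchOptions` in the constructor, so that every `search()` call inherits it.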

The search method signature is:

search(query: Query, searchOptions: SearchOptions = {}): SearchResult[]

Source: src/MiniSearch.ts.

Query Expression API

A Query is one of:

A plain string (e.g. 'zen art motorcycle').
A Wildcard — the symbol MiniSearch.wildcard (i.e. Symbol('*')). Issue #307 reports that passing the wildcard through certain wrappers can cause a Cannot read properties of undefined (reading 'map') crash in executeQuery; the safe form is to use MiniSearch.wildcard directly.
A QueryCombination object: { combineWith, queries: Query[] }, optionally with any other SearchOptions to scope the sub-query (e.g. boosting or filtering inside a clause).

miniSearch.search({
  combineWith: 'OR',
  queries: [
    { combineWith: 'AND', queries: ['apple', 'pear'] },
    'juice',
    'tree'
  ]
})

This expression-tree API lets external parsers build complex boolean queries. Issue #297 discusses how to translate a parenthesized mini-language into nested QueryCombination nodes, and #311 flags an edge case where combineWith: 'AND' misbehaves when one sub-query contributes no terms. The autoSuggest(queryString, options) method reuses the same SearchOptions, but defaults to prefix-searching the last term with combineWith: 'AND'. Source: src/MiniSearch.ts.
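For callers that assemble such trees programmatically, a few small builder functions can keep the nesting readable. This is only a sketch: the helper names `and`, `or`, and `andNot` are hypothetical and not part of MiniSearch; they simply return plain QueryCombination objects.

```javascript
// Hypothetical helpers that build QueryCombination objects.
const and = (...queries) => ({ combineWith: 'AND', queries })
const or = (...queries) => ({ combineWith: 'OR', queries })
const andNot = (...queries) => ({ combineWith: 'AND_NOT', queries })

// ((apple AND pear) OR juice), minus anything matching "concentrate"
const query = andNot(or(and('apple', 'pear'), 'juice'), 'concentrate')

// Intended usage (not executed here):
// miniSearch.search(query)
```

A parser for a parenthesized mini-language could emit calls to these helpers from its syntax tree, one call per boolean node.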

Lifecycle and Result Shape

MiniSearch.search returns SearchResult[], each entry containing id, score, terms (matched document terms), queryTerms (the original query terms that produced a hit), a match map of term → fields, and any stored fields. The full shape is declared in src/MiniSearch.ts. For sorting stability over equal-score ties, issue #301 recommends keeping the original document array in memory and breaking ties by the pre-search index, since MiniSearch does not preserve insertion order through scoring.

The library's runtime layout — single default export plus an optional SearchableMap companion — is declared in package.json ("exports": { ".": { ... }, "./SearchableMap": { ... } }), letting advanced users import just the data structure when needed. A minimal end-to-end usage example is provided in examples/plain_js/README.md, and integration scenarios (including a Rust+WebAssembly port) are referenced in issue #313.

Extensibility, Internationalization, and Troubleshooting

Related topics: Core Architecture and Data Model, Configuration and Search API

MiniSearch ships with sensible defaults — a whitespace/punctuation tokenizer, lowercased term normalization, and BM25 scoring — but real-world use cases frequently demand more. The library exposes a small but powerful set of extension hooks that allow developers to adapt indexing and searching to non-English text, domain-specific data, and unusual retrieval semantics. This page documents those hooks, explains how they enable internationalization (with particular attention to languages such as Korean and Chinese), and catalogs common failure modes reported by the community along with their mitigations.

Extension Points

Custom Field Extraction (`extractField`)

The extractField option determines how a configured field name is read from a document before being tokenized. The default implementation treats documents as plain objects, but the option can be replaced to support nested fields, computed fields, or alternate storage backends.

const miniSearch = new MiniSearch({
  fields: ['title', 'author.name', 'pubYear'],
  extractField: (document, fieldName) => {
    if (fieldName === 'pubYear') {
      return document.pubDate && document.pubDate.getFullYear().toString()
    }
    return fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
  }
})

Source: README.md and extractField option documentation in src/MiniSearch.ts. Because the extracted value is fed directly into the tokenizer and processTerm pipeline, extractField can return any type — including numbers, dates, or arrays — as long as downstream stages can stringify it. This is the primary lever for supporting structured document shapes.

Custom Tokenization (`tokenize`)

tokenize controls how a string field value is split into terms. Its signature is (text: string, fieldName?: string) => string[], allowing different tokenization strategies per field. The default implementation splits on the regex /[\n\r\p{Z}\p{P}]+/u, which treats any Unicode whitespace or punctuation character as a delimiter.

The function signature and contract are defined in src/MiniSearch.ts:

tokenize?: (text: string, fieldName?: string) => string[],

Tests confirm that the field name is passed as the second argument, enabling per-field strategies. Source: src/MiniSearch.test.js: passes field value and name to tokenizer.
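A per-field strategy might look like the sketch below, which splits a hypothetical `tags` field on commas only and falls back to the default pattern everywhere else. The `tags` field and its comma-separated format are assumptions made for the example.

```javascript
// The default splitting pattern quoted above.
const SPACE_OR_PUNCTUATION = /[\n\r\p{Z}\p{P}]+/u

// Hypothetical rule: `tags` holds comma-separated labels that may contain
// hyphens or spaces, so split that field on commas only.
function tokenize (text, fieldName) {
  if (fieldName === 'tags') {
    return text.split(',').map((tag) => tag.trim()).filter(Boolean)
  }
  return text.split(SPACE_OR_PUNCTUATION).filter(Boolean)
}

// Intended usage (not executed here):
// new MiniSearch({ fields: ['title', 'tags'], tokenize })
```

Because a query string is not tied to a single field, a query such as `sci-fi` would likely go through the default branch and be split in two; a compatible `searchOptions.tokenize` may be needed for such terms to match.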

Term Processing and Expansion (`processTerm`)

processTerm is invoked on every tokenized term and may normalize, stem, or expand it. It may return a single string, an array of strings (each indexed as a separate term), or a falsy value (which discards the term). This is the canonical place to plug in stemming algorithms, synonyms, or morphological analyzers.

Source: processTerm option documentation in src/MiniSearch.ts. Tests verify expansion behavior:

const processTerm = (string) => string === 'foobar' ? ['foo', 'bar'] : string

Source: src/MiniSearch.test.js: allows processTerm to expand a single term into several terms.

Document-Level Boosting (`boostDocument`)

Search options expose boostDocument, a function returning a multiplicative factor for a (document, term) pair. Returning a number > 1 promotes the document, < 1 demotes it, and a falsy value excludes it entirely. This is distinct from field-level boosting and is useful for time-decay or popularity-based ranking. Source: boostDocument option in src/MiniSearch.ts.
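A time-decay boost could be sketched as follows. The `publishedAt` stored field (epoch milliseconds), the 30-day half-life, and the factory name `makeRecencyBoost` are all assumptions for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000

// Builds a boostDocument callback. `publishedAt` is a hypothetical stored
// field that must be listed in storeFields to be available here.
function makeRecencyBoost (now, halfLifeDays = 30) {
  return (id, term, storedFields) => {
    // Neutral factor when the document has no usable date.
    if (!storedFields || typeof storedFields.publishedAt !== 'number') return 1
    const ageDays = Math.max(0, (now - storedFields.publishedAt) / DAY_MS)
    // The bonus halves every `halfLifeDays`; adding 1 keeps the factor above 1.
    return 1 + Math.pow(0.5, ageDays / halfLifeDays)
  }
}

// Intended usage (not executed here):
// miniSearch.search('zen', { boostDocument: makeRecencyBoost(Date.now()) })
```

Passing `now` in explicitly keeps the callback deterministic, which makes the ranking easier to test.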

Query Construction Hooks

Beyond indexing, search-time behavior is also extensible through the prefix, fuzzy, boost, combineWith, and filter options, all defined in src/MiniSearch.ts. prefix and fuzzy accept either a literal or a per-term function, boost maps field names to weights, combineWith selects the boolean operator, and filter is a per-result predicate.

Internationalization

Default Tokenizer Limitations

MiniSearch's default tokenization strategy — splitting on Unicode whitespace and punctuation — works well for languages with explicit word boundaries (English, most European languages). It does not segment languages that do not delimit words with whitespace, including Chinese, Japanese, and Thai, nor does it handle morphological complexity in agglutinative languages such as Korean.

The community has reported that this default behavior breaks Korean search in particular: a query for 학교 ("school") will not match documents containing 학교에 or 학교를 because the tokenizer treats the bound particle as part of the same token. Source: Issue #314: Community Korean tokenizer.

Korean Tokenization (Community Plugin)

A community-maintained morphological tokenizer, garu-minisearch-tokenizer, addresses this gap by integrating a Korean morphological analyzer into MiniSearch's tokenize/processTerm pipeline. The plugin decomposes stems from endings and particles, ensuring that 먹다, 먹었다, and 먹습니다 all share a common stem token. Source: Issue #312: Korean tokenizer plugin.

Practical Internationalization Strategies

Strategy	When to Use	Implementation Hook
Whitespace + punctuation split	English, European languages	Default
Morphological analyzer	Korean, Turkish, Finnish	`tokenize` or `processTerm`
n-gram segmentation	Chinese, Japanese, Thai	`tokenize` returning fixed-length substrings
Stemmer (Porter, Snowball)	English variants, Romance languages	`processTerm`
Synonym expansion	Domain jargon, plurals	`processTerm` returning arrays

For Chinese and similar languages, the most common pattern is to override tokenize to return character bigrams or character n-grams, ensuring every query segment has a chance of overlapping with an indexed segment. Source: Issue #201: Chinese search support.
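A character n-gram tokenizer of that kind can be sketched in a few lines. This is an illustrative implementation, not one taken from the issue thread; the function name `ngramTokenize` is hypothetical.

```javascript
// Splits text into overlapping character n-grams (bigrams by default).
function ngramTokenize (text, n = 2) {
  // Drop whitespace and punctuation, then iterate by code point so that
  // characters outside the Basic Multilingual Plane stay intact.
  const chars = Array.from(text.replace(/[\p{Z}\p{P}\s]+/gu, ''))
  // Inputs no longer than n become a single token (or none when empty).
  if (chars.length <= n) return chars.length > 0 ? [chars.join('')] : []
  const grams = []
  for (let i = 0; i + n <= chars.length; i++) {
    grams.push(chars.slice(i, i + n).join(''))
  }
  return grams
}

// Intended usage (not executed here):
// new MiniSearch({ fields: ['text'], tokenize: (text) => ngramTokenize(text) })
```

This sketch treats all text the same way, so Latin words are cut into bigrams too; mixed-language content would need a tokenizer that switches strategy per script.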

Common Issues and Troubleshooting

Quotation Marks Split by `SPACE_OR_PUNCTUATION`

The default tokenizer regex, /[\n\r\p{Z}\p{P}]+/u, treats Unicode punctuation, including straight (", ') and curly (“, ”, ‘, ’) quotation marks, as delimiters. As a result, song's is split into song and s, which can produce unexpected matches. Developers handling user-generated content with apostrophes should override tokenize with a regex that preserves intra-word apostrophes. Source: Issue #309: tokenize() issue with SPACE_OR_PUNCTUATION and quotes.
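One way to do this is to match words instead of splitting on delimiters. The sketch below is an assumption-laden example rather than a recommended replacement: it keeps a straight or curly apostrophe only when it sits between letters or digits.

```javascript
// Default pattern quoted above: splits on any Unicode separator or punctuation.
const SPACE_OR_PUNCTUATION = /[\n\r\p{Z}\p{P}]+/u

// Runs of letters/digits, optionally joined by an apostrophe that has a
// letter or digit on both sides.
const WORD_WITH_APOSTROPHE = /[\p{L}\p{N}]+(?:['’][\p{L}\p{N}]+)*/gu

function tokenizeKeepingApostrophes (text) {
  return text.match(WORD_WITH_APOSTROPHE) || []
}

// Default behavior, for comparison:
// "song's".split(SPACE_OR_PUNCTUATION) // => ['song', 's']

// Intended usage (not executed here):
// new MiniSearch({ fields: ['title'], tokenize: tokenizeKeepingApostrophes })
```

Note that a match-based tokenizer also discards symbols and combining marks that the split-based default would leave inside tokens, so it is worth testing against real content before adopting it.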

`filter` Applied Only to Final Results

The filter search option is evaluated only once, against the aggregated result set, not against each sub-query in a QueryCombination. The TypeScript signature may suggest otherwise, but the implementation applies filter after combining sub-query results. Source: Issue #304: Filtering not working at sub-queries level. When sub-query filtering is required, post-process the results yourself, or restructure the query so each sub-query carries identical filterable stored fields.

`Cannot read properties of undefined (reading 'keys')`

This runtime error has been observed in production crash reports and traces to internal state during query execution when the index is in an unexpected configuration. Source: Issue #306: Cannot read properties of undefined (reading 'keys'). Mitigation typically involves ensuring that documents are added before searches are issued, and that no replace operations are interleaved with concurrent searches.

Wildcard Symbol Misuse

MiniSearch.wildcard is exposed as a Symbol('*'), and symbols are compared by identity. Neither the *string* "*" nor a freshly constructed Symbol('*') passes the wildcard identity check. Searching with such a look-alike symbol can produce Cannot read properties of undefined (reading 'map') because the internal code path expects the exact exported symbol. Source: Issue #307: BUG global wildcard symbol is not safe. Always use MiniSearch.wildcard rather than constructing a new symbol.
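The identity rule is easy to check in isolation. In the sketch below, `WILDCARD` is a stand-in for MiniSearch.wildcard so that the snippet runs without the library, and `toQuery` is a hypothetical helper for mapping user input to a safe query.

```javascript
// Stand-in for MiniSearch.wildcard (an assumption so the snippet is standalone).
const WILDCARD = Symbol('*')

// A second symbol with the same description is a different value.
const lookalike = Symbol('*')
const sameSymbol = lookalike === WILDCARD // false: symbols compare by identity

// Hypothetical helper: empty input or a bare "*" becomes the shared wildcard
// symbol; anything else is passed through as a plain string query.
function toQuery (input, wildcard = WILDCARD) {
  const trimmed = String(input).trim()
  return trimmed === '' || trimmed === '*' ? wildcard : trimmed
}

// Intended usage (not executed here):
// miniSearch.search(toQuery(userInput, MiniSearch.wildcard))
```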

Scoring Surprises

Several reports (#263, #129) describe cases where documents containing the exact query term score lower than documents with only fuzzy or prefix matches. This typically stems from the BM25+ frequency normalization lower bound (d parameter) or from boostDocument functions that return values < 1 for exact matches. Source: Issue #263: Fuzzy score issues, Issue #129: Issues with scoring. Adjust the bm25 search option (typed as BM25Params) or per-term boost weights to rebalance.

Preserving External Sort Order

When results are pre-sorted by an external criterion (e.g., update time) and search is layered on top, equal-scoring results may be reordered unpredictably. Source: Issue #301: How to correctly retain sorting?. Stable sort behavior in JavaScript is guaranteed only since ES2019; if targeting older environments, sort with a tie-breaker that re-applies the original index.
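A tie-breaker of this kind can be sketched as below, assuming each result `id` matches the `id` of a document in the externally sorted array. The function name `sortWithOriginalOrder` is hypothetical.

```javascript
// `documents` is the externally sorted source array.
function sortWithOriginalOrder (results, documents) {
  // Remember each document's position in the external ordering.
  const position = new Map(documents.map((doc, index) => [doc.id, index]))
  // Copy before sorting so the caller's array is left untouched.
  return results.slice().sort((a, b) =>
    // Higher score first; on equal scores, earlier original position first.
    b.score - a.score || position.get(a.id) - position.get(b.id)
  )
}

// Intended usage (not executed here):
// const ordered = sortWithOriginalOrder(miniSearch.search('zen'), documents)
```

Because the comparator never returns 0 for two distinct documents, the outcome does not depend on whether the engine's sort is stable.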

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 19 structured pitfall item(s), including 4 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.

1. Capability evidence risk: Capability evidence risk requires verification

Severity: high
Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/304

2. Runtime risk: Runtime risk requires verification

Severity: high
Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/306

3. Runtime risk: Runtime risk requires verification

Severity: high
Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/307

4. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/309

5. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/302

6. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/300

7. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/308

8. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/lucaong/minisearch

9. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/lucaong/minisearch

10. Security or permission risk: Security or permission risk requires verification

Severity: medium
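Finding: `no_demo` (the validation record carries this flag; it appears to indicate that no runnable demo was found).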
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/lucaong/minisearch

11. Security or permission risk: Security or permission risk requires verification

Severity: medium
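Finding: `no_demo` (the validation record carries this flag; it appears to indicate that no runnable demo was found).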
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/lucaong/minisearch

12. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/lucaong/minisearch/issues/314

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (the number of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using minisearch with real data or production workflows.

Community Korean tokenizer: garu-minisearch-tokenizer - github / github_issue
Cannot read properties of undefined (reading 'keys') - github / github_issue
Korean tokenizer plugin: garu-minisearch-tokenizer - github / github_issue
Edge case in combineWith:'AND' - github / github_issue
tokenize() issue with SPACE_OR_PUNCTUATION and quotes - github / github_issue
Question: is minisearch siutable for code search? - github / github_issue
[[BUG] global wildcard symbol is not safe.](https://github.com/lucaong/minisearch/issues/307) - github / github_issue
Filtering not working at sub-queries level - github / github_issue
Allow return types other than string in extractField function - github / github_issue
Increase default prefix weight when prefix is requested - github / github_issue
Question about combineWith - github / github_issue
Not including all fields in search - github / github_issue

Source: Project Pack community evidence and pitfall evidence
