kernel-memory Manual - Doramagic.ai

Research project. A Memory solution for users, teams, and applications.

Overview & Core Architecture

Related topics: Ingestion Pipeline & Retrieval (RAG), Extensions, Connectors & Client Integrations, Deployment, Configuration & Customization

Purpose and Scope

Kernel Memory (KM) is an open-source, multi-modal retrieval-augmented generation (RAG) service. It is designed to ingest heterogeneous content (PDFs, Office documents, images, audio, web pages, raw text), transform it into embeddings, store it in a vector database, and expose a unified API for semantic search and answer generation. As described in service/Service/README.md, the project is a *Knowledge Management* system, not merely a vector store: a text-generation LLM is part of the default pipeline so that queries return synthesized answers grounded in retrieved passages.

The repository is organized around a small Core library and a set of pluggable extension projects. The Core provides the ingestion pipeline, the public IKernelMemory interface, and the dependency-injection builder (KernelMemoryBuilder). Extension projects contribute concrete implementations for storage, AI models, and vector databases. This separation lets the same code base run in three different deployment topologies, which is the central architectural decision of the project.

Architectural Pillars

Three Logical Layers

KM separates three concerns, each mapped to one or more projects in the repository:

Layer	Responsibility	Example Projects
Core	Pipeline, interfaces, builder, defaults	`service/Core`, `service/Abstractions`
Connectors	Concrete AI / Storage / Vector-DB adapters	`extensions/OpenAI`, `extensions/Qdrant`, `extensions/AzureBlobs`
Service & Apps	Web API, async pipeline host, evaluation	`service/Service`, `applications/evaluation`

The Core never references a specific vendor; every dependency (LLM, embedding generator, vector DB, document store, OCR engine) is injected through interfaces. This is what allows the same IKernelMemory instance to be reconfigured from OpenAI to Ollama, or from Azure AI Search to Qdrant, by changing builder extension methods (see examples/README.md).
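The vendor-neutral design described above can be illustrated with a minimal, self-contained sketch. All type names here (`ITextGeneratorSketch`, `FakeCloudGenerator`, `FakeLocalGenerator`, `MemorySketch`) are hypothetical stand-ins invented for this example; they are not Kernel Memory types and do not reflect its actual interfaces.

```csharp
using System;

// Swapping the "vendor" is a one-line change where the object graph is composed.
var cloud = new MemorySketch(new FakeCloudGenerator());
var local = new MemorySketch(new FakeLocalGenerator());

Console.WriteLine(cloud.Ask("hello")); // cloud:hello
Console.WriteLine(local.Ask("hello")); // local:hello

// Hypothetical abstraction owned by the "core" layer.
public interface ITextGeneratorSketch
{
    string Generate(string prompt);
}

// Two hypothetical "connectors" implementing the same contract.
public sealed class FakeCloudGenerator : ITextGeneratorSketch
{
    public string Generate(string prompt) => "cloud:" + prompt;
}

public sealed class FakeLocalGenerator : ITextGeneratorSketch
{
    public string Generate(string prompt) => "local:" + prompt;
}

// The "core" depends only on the interface, never on a concrete vendor class.
public sealed class MemorySketch
{
    private readonly ITextGeneratorSketch _generator;

    public MemorySketch(ITextGeneratorSketch generator) => _generator = generator;

    public string Ask(string question) => _generator.Generate(question);
}
```

The calling code (`Ask`) is identical in both cases; only the constructor argument differs, which mirrors the idea that switching providers is a composition-time decision.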

The `IKernelMemory` Surface

The public entry point is the IKernelMemory interface, implemented by the object that KernelMemoryBuilder.Build() returns. From the caller's perspective, KM is a small object with two main groups of methods: the ImportXxxAsync family (for example ImportDocumentAsync) for ingestion, and AskAsync / SearchAsync for retrieval. The service README clarifies that the same interface is used in both *serverless* (in-process) and *service* (web + queue) modes, so application code does not change between the two. The Build() method also accepts options that control which optional services are wired up; this was generalized in 0.96.250116.1 (release notes: *Support Build() options in KM builder extension methods*).

The Ingestion Pipeline

Document ingestion is implemented as a sequence of named *steps* executed by handlers. A canonical sequence is visible in examples/005-dotnet-async-memory-custom-pipeline/README.md:

extract_text — decode the binary document into text (plain decoder or Azure AI Document Intelligence via extensions/AzureAIDocIntel/README.md).
split_text_in_partitions — chunking, delegated to the chunker package (extensions/Chunkers/README.md).
generate_embeddings — call the configured embedding model.
save_memory_records — persist vectors to the configured memory DB.

The pipeline is asynchronous and queue-driven when running as a service: the service README states that the *Core assembly includes also a basic in-memory queue called SimpleQueues, useful for tests and demos*, while production deployments use Azure Queues or RabbitMQ for *reliability and horizontal scaling*. The same pipeline can be customized by registering additional handlers, allowing custom enrichment (summarization, tagging, translation) — see the synthetic-memory example in examples/106-dotnet-retrieve-synthetics/README.md.

flowchart LR
    A[Document Upload] --> B[extract_text]
    B --> C[split_text_in_partitions]
    C --> D[generate_embeddings]
    D --> E[save_memory_records]
    E --> F[(Vector DB)]
    G[User Query] --> H[AskAsync]
    H --> I[Vector Search]
    I --> F
    F --> J[LLM Answer Generation]
    J --> K[Synthesized Answer]
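The ingestion half of the flow can be modeled as a list of named steps, each looked up in a handler table and applied to a shared state object. The sketch below is a toy model under stated assumptions: the step names come from the text above, but `PipelineSketch`, `PipelineRunner`, and `ToyEmbedding` are hypothetical, the "embedding" is just two counts, and nothing here reflects how Kernel Memory implements its handlers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var result = PipelineRunner.Run(
    "  Carbon is an element. It has six protons.  ",
    new[] { "extract_text", "split_text_in_partitions", "generate_embeddings", "save_memory_records" });

Console.WriteLine($"{result.Saved} records saved after {result.CompletedSteps.Count} steps");
// 2 records saved after 4 steps

// Shared state that every step reads and updates (hypothetical type).
public sealed class PipelineSketch
{
    public string RawContent { get; set; } = "";
    public string Text { get; set; } = "";
    public List<string> Partitions { get; set; } = new();
    public List<float[]> Embeddings { get; set; } = new();
    public int Saved { get; set; }
    public List<string> CompletedSteps { get; } = new();
}

public static class ToyEmbedding
{
    // Not a real embedding: just text length and whitespace count.
    public static float[] Embed(string text) =>
        new float[] { text.Length, text.Count(char.IsWhiteSpace) };
}

public static class PipelineRunner
{
    // Step name -> toy handler. Real handlers would call decoders, models, and a DB.
    private static readonly Dictionary<string, Action<PipelineSketch>> Handlers = new()
    {
        ["extract_text"] = p => p.Text = p.RawContent.Trim(),
        ["split_text_in_partitions"] = p => p.Partitions = p.Text
            .Split('.', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .ToList(),
        ["generate_embeddings"] = p => p.Embeddings = p.Partitions.Select(ToyEmbedding.Embed).ToList(),
        ["save_memory_records"] = p => p.Saved = p.Partitions.Count,
    };

    public static PipelineSketch Run(string rawContent, IEnumerable<string> steps)
    {
        var pipeline = new PipelineSketch { RawContent = rawContent };
        foreach (string step in steps)
        {
            Handlers[step](pipeline);          // throws KeyNotFoundException for unknown steps
            pipeline.CompletedSteps.Add(step); // record progress, step by step
        }
        return pipeline;
    }
}
```

Because each step only touches the shared state, steps can be reordered, dropped, or added without changing the runner, which is the property the pipeline design relies on.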

Deployment Modes

KM supports three deployment topologies, all sharing the same IKernelMemory API:

Serverless (in-process) — KernelMemoryBuilder is built inside the host application. No external services are required beyond the configured LLM and vector DB. Suited for small files, tests, and single-tenant apps.
Service (web + async pipeline) — A stand-alone web service accepts uploads and exposes a documented REST API (Swagger UI at /swagger/index.html when running locally). Handlers run in background processes consuming a persistent queue. The official Docker image is published at kernelmemory/service; the source Dockerfile in the repository root can be used for custom builds (see service/Service/README.md).
.NET Aspire — The Aspire extension (extensions/Aspire/README.md) wires KM into an Aspire AppHost for local orchestration and cloud deployment, introduced in 0.95.241216.1 and expanded in subsequent releases.

The Service README warns that, since the 0.96.250115.1 release, *the system throws an exception when mixing volatile and persistent data*, so a deployment must be consistent about whether memory records are ephemeral or durable.

Ecosystem and Extensibility

Around the Core, the repository ships a rich set of official extensions, each published as a separate NuGet package:

AI — OpenAI, Ollama, LlamaSharp (local Llama), Anthropic, Semantic Kernel text completion, and Tiktoken/GPT tokenizers (extensions/OpenAI/README.md, extensions/Ollama/README.md, extensions/LlamaSharp/README.md, extensions/Tiktoken/README.md).
Vector DBs — Qdrant (with a documented caveat about its GUID/INT point-ID limitation forcing an extra round-trip on upsert — extensions/Qdrant/README.md), plus Azure AI Search, Elasticsearch, Postgres, Redis, and SQL Server.
Document Storage — Azure Blob Storage and AWS S3 (with ForcePathStyle support for MinIO added in 0.98.250324.1 — extensions/AzureBlobs/README.md, extensions/AWS/S3/README.md).
OCR / parsing — Azure AI Document Intelligence.

In addition, the tools/ directory (tools/README.md) provides CLI clients (km-cli/upload-file.sh, ask.sh, search.sh) and Docker launch scripts for local vector DBs (Elasticsearch, MSSQL, Qdrant, Redis). The applications/evaluation project (applications/evaluation/README.md) ships a TestSetGenerator that synthesizes evaluation queries from an existing index and computes standard RAG metrics — Faithfulness, Answer Relevancy, Context Recall/Precision, Context Relevancy, Context Entity Recall, Answer Semantic Similarity, and Answer Correctness.

A final architectural trait worth highlighting is the project's commitment to composability over monolithism: every public interface has multiple concrete implementations, and the README's example list explicitly groups topics into *Customizations* (custom handlers, embeddings, decoders, web scrapers) and *Local models and external connectors*. Recent releases reinforce this direction: for example, 0.98.250508.3 added a Japanese text split character and fixed OpenAPI specifications for upload tags/steps, while 0.94.241201.1 introduced response streaming. Changes like these fit the design, in which the Core exposes a small, stable surface and defers everything else to extensions.

Ingestion Pipeline & Retrieval (RAG)

Related topics: Overview & Core Architecture, Extensions, Connectors & Client Integrations, Deployment, Configuration & Customization

Overview

Kernel Memory processes user content through a modular ingestion pipeline and answers user questions through a Retrieval-Augmented Generation (RAG) loop. The pipeline is composed of discrete step handlers that each advance a shared DataPipeline object, while retrieval combines vector search, prompt construction, and LLM-based answer generation. Source: service/Abstractions/Pipeline/IPipelineStepHandler.cs:1-

The IPipelineStepHandler interface defines the contract that every handler implements, making the pipeline composable and extensible. Standard handlers shipped in the Core project cover text extraction, partitioning (chunking), embedding generation, and persisting memory records to the configured vector store. Source: service/Core/Handlers/TextExtractionHandler.cs:1-

Ingestion Pipeline

Default Step Sequence

When a Document is submitted through ImportDocumentAsync, Kernel Memory enqueues a pipeline that flows through a sequence of named steps. Each step is a discrete handler hosted either in-process (serverless mode) or as a background service.

Step name	Handler	Responsibility
`extract_text`	`TextExtractionHandler`	Decode raw files (PDF, DOCX, images via Azure AI Doc Intel, etc.) into plain text
`split_text_in_partitions`	`TextPartitioningHandler`	Chunk text into smaller partitions suitable for embedding and retrieval
`generate_embeddings`	`GenerateEmbeddingsHandler`	Produce vector embeddings for each partition (sequential)
`generate_embeddings_parallel`	`GenerateEmbeddingsParallelHandler`	Variant that batches embedding calls concurrently for higher throughput
`summarize`	`SummarizationHandler`	Optional synthetic memory generation (LLM-based summary of the source)
`save_memory_records`	`SaveRecordsHandler`	Persist partitions and vectors to the configured memory DB

Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-, service/Core/Handlers/TextPartitioningHandler.cs:1-, service/Core/Handlers/GenerateEmbeddingsHandlerBase.cs:1-

Pipeline Data Flow

sequenceDiagram
    participant Client
    participant Queue
    participant Extract as TextExtractionHandler
    participant Chunk as TextPartitioningHandler
    participant Embed as GenerateEmbeddingsHandler
    participant Save as SaveRecordsHandler
    Client->>Queue: ImportDocumentAsync(file, tags, steps)
    Queue->>Extract: extract_text
    Extract->>Chunk: split_text_in_partitions
    Chunk->>Embed: generate_embeddings
    Embed->>Save: save_memory_records
    Save-->>Client: document ready (via IsDocumentReadyAsync)
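The last arrow in the diagram implies that the client has to check for completion. A minimal polling helper might look like the sketch below. `Poller`, `WaitUntilReadyAsync`, and the fake probe are hypothetical names for illustration; in a real application the delegate would presumably wrap a call to IsDocumentReadyAsync on the memory instance.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Fake readiness probe that reports "ready" on the third call.
int calls = 0;
Func<string, Task<bool>> fakeProbe = _ => Task.FromResult(++calls >= 3);

bool ready = await Poller.WaitUntilReadyAsync(
    "doc001", fakeProbe, delay: TimeSpan.FromMilliseconds(10), maxAttempts: 10);

Console.WriteLine($"ready={ready} after {calls} checks"); // ready=True after 3 checks

public static class Poller
{
    // Polls until the probe returns true or the attempt budget is exhausted.
    public static async Task<bool> WaitUntilReadyAsync(
        string documentId,
        Func<string, Task<bool>> isReady,
        TimeSpan delay,
        int maxAttempts,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (await isReady(documentId)) return true;
            await Task.Delay(delay, cancellationToken);
        }
        return false; // caller decides whether to retry, alert, or give up
    }
}
```

Bounding the number of attempts avoids waiting forever when a step never completes, for example because no handler is registered for a requested step name.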

Selecting and Customizing Steps

Steps can be chosen per request via the steps argument, and handlers can run as hosted background services through AddHandlerAsHostedService. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-

// Register each handler as a hosted background service under the step name it serves.
host.Services.AddHandlerAsHostedService<TextExtractionHandler>("extract_text");
host.Services.AddHandlerAsHostedService<TextPartitioningHandler>("split_text_in_partitions");
host.Services.AddHandlerAsHostedService<SummarizationHandler>("summarize");
host.Services.AddHandlerAsHostedService<GenerateEmbeddingsHandler>("generate_embeddings");
host.Services.AddHandlerAsHostedService<SaveRecordsHandler>("save_memory_records");

// Import a document and run only the listed steps. The summarize handler is registered
// above but is not requested here, so no summary is generated for this document.
string docId = await memory.ImportDocumentAsync(
    new Document("inProcessTest")
        .AddFile("file1-Wikipedia-Carbon.txt")
        .AddTag("testName", "example3"),
    steps: new[] {
        "extract_text",
        "split_text_in_partitions",
        "generate_embeddings",
        "save_memory_records"
    });

By dropping summarize from the steps array, callers skip synthetic-data generation; by inserting a custom step name they can wire their own IPipelineStepHandler implementation into the same flow. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-

Retrieval and RAG

Kernel Memory exposes two retrieval primitives:

SearchAsync — returns relevant partitions (and citations) from the memory store without invoking an LLM.
AskAsync — performs full RAG: it searches, builds a grounded prompt from the hits, and asks the configured text generator to produce an answer.
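The difference between the two primitives can be sketched with a toy in-memory store: search ranks stored partitions by vector similarity, while ask takes the search hits, builds a grounded prompt, and hands it to a text generator. Everything below is an assumption for illustration: `RagSketch`, the two-dimensional vectors, the prompt layout, and the stub generator are hypothetical and do not reflect Kernel Memory's internals or prompt templates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var store = new List<(string Text, double[] Vector)>
{
    ("Carbon has six protons.", new[] { 1.0, 0.0 }),
    ("Qdrant is a vector database.", new[] { 0.0, 1.0 }),
};

double[] queryVector = { 0.9, 0.1 }; // pretend embedding of the question

// Search only: relevant partitions, no LLM involved.
List<string> hits = RagSketch.Search(store, queryVector, limit: 1);
Console.WriteLine(hits[0]); // Carbon has six protons.

// Ask: search hits + grounded prompt + a (stubbed) text generator.
string answer = RagSketch.Ask(
    "How many protons does carbon have?",
    hits,
    prompt => $"[stub LLM received a prompt of {prompt.Length} characters]");
Console.WriteLine(answer);

public static class RagSketch
{
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Rank stored partitions by cosine similarity to the query vector.
    public static List<string> Search(
        IEnumerable<(string Text, double[] Vector)> store, double[] query, int limit) =>
        store.OrderByDescending(r => Cosine(r.Vector, query))
             .Take(limit)
             .Select(r => r.Text)
             .ToList();

    // Build a prompt that contains only retrieved facts, then delegate to the generator.
    public static string Ask(string question, IReadOnlyList<string> facts, Func<string, string> generate)
    {
        string prompt = "Facts:\n" + string.Join("\n", facts) + "\n\nQuestion: " + question;
        return generate(prompt);
    }
}
```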

The evaluation harness measures Faithfulness, Answer Relevancy, Context Recall, Context Precision, Context Relevancy, Context Entity Recall, Answer Semantic Similarity, and Answer Correctness. Source: applications/evaluation/README.md:1-

Since release 0.96.250115.1, duplicate facts are discarded by default during RAG answer synthesis, improving precision in the generated output. Source: community release note at packages-0.96.250115.1.

Synthetic memories such as summaries are first-class retrieval targets: the summarize step writes them back through the same indexing path, so they can be returned alongside raw chunks at query time. Source: examples/106-dotnet-retrieve-synthetics/README.md:1-

Configuration Highlights

Chunkers: shipped as a dedicated package, Microsoft.KernelMemory.Chunkers, configurable per deployment. Source: extensions/Chunkers/README.md:1-
Embedding generator: pluggable; defaults to the configured text-embedding model, but custom generators can be substituted. Source: service/Service/README.md:1-
LLM: used both at ingestion (synthetic data) and at answer time; the service has been tested primarily with OpenAI GPT-3.5 and GPT-4. Source: service/Service/README.md:1-
Queue: in-process SimpleQueues for tests and demos; production deployments use Azure Queues or RabbitMQ for reliability and horizontal scaling. Source: service/Service/README.md:1-

Common Failure Modes and Tips

Mixing volatile and persistent data in the same pipeline raises an exception by design (added in 0.96.250115.1). Source: packages-0.96.250115.1
Step name typos cause the pipeline to wait indefinitely — the strings passed to AddHandlerAsHostedService must exactly match those passed in the steps array. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
Async completion: poll IsDocumentReadyAsync after ImportDocumentAsync to confirm that the background handlers finished. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
AWS S3 with MinIO requires ForcePathStyle = true on AWSS3Config (added in 0.98.250324.1). Source: packages-0.98.250324.1
Localization: chunker split characters must match the language; a Japanese split character was added in 0.98.250508.3. Source: packages-0.98.250508.3
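Since a mistyped step name leaves the pipeline waiting, one defensive option is to compare the requested steps against the names registered in the host before submitting a document. The helper below is a hypothetical sketch (`StepCheck` is not part of Kernel Memory), and it assumes the application keeps its own list of registered step names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Names the application registered via AddHandlerAsHostedService (kept by the app itself).
var registered = new HashSet<string>(StringComparer.Ordinal)
{
    "extract_text", "split_text_in_partitions", "generate_embeddings", "save_memory_records"
};

// Note the typo: "split_text_in_partition" is missing the trailing "s".
var requested = new[] { "extract_text", "split_text_in_partition", "generate_embeddings" };

List<string> unknown = StepCheck.FindUnknownSteps(requested, registered);
Console.WriteLine(string.Join(", ", unknown)); // split_text_in_partition

public static class StepCheck
{
    // Returns requested step names that have no registered handler (exact, case-sensitive match).
    public static List<string> FindUnknownSteps(
        IEnumerable<string> requestedSteps, ISet<string> registeredSteps) =>
        requestedSteps.Where(step => !registeredSteps.Contains(step)).ToList();
}
```

Failing fast on a non-empty result turns a silent stall into an immediate, actionable error.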

Extensions, Connectors & Client Integrations

Related topics: Overview & Core Architecture, Ingestion Pipeline & Retrieval (RAG), Deployment, Configuration & Customization

The Kernel Memory repository is built around a small Core package and a large set of satellite extension projects published as independent NuGet packages. The extensions/ folder is the home for these integrations, and it spans three broad families: LLM/embedding connectors, storage and content-extraction connectors, and developer-tooling projects such as .NET Aspire, chunkers, tokenizers, and the evaluation harness. The examples/ folder provides runnable, step-by-step demos for the most common customizations.

Extension Architecture

Core defines the abstract interfaces that any connector must implement, while each project under extensions/ provides a concrete implementation. The service overview makes the separation explicit: Kernel Memory has a clear boundary between the orchestration engine and the underlying storage, embeddings, and LLM dependencies, which is what makes plug-in style extensions practical (Source: service/Service/README.md:18-24). Extensions follow a consistent shape — they expose a typed configuration class plus one or more KernelMemoryBuilder extension methods (e.g. WithOllamaTextGeneration, WithOllamaTextEmbeddingGeneration) that register the dependency in the DI container used by the memory pipeline (Source: extensions/Ollama/README.md:11-23).

Catalog of Official Extensions

Package / Project	Role	Reference
`Microsoft.KernelMemory.AI.Ollama`	LLM and embedding generation via a local Ollama daemon	extensions/Ollama/README.md:1-23
`Microsoft.KernelMemory.AI.LlamaSharp`	On-device Llama inference using LLamaSharp	extensions/LlamaSharp/README.md:1-12
`Microsoft.KernelMemory.AI.Tiktoken`	Token counting/clamping via Tiktoken	extensions/Tiktoken/README.md:1-9
`Microsoft.KernelMemory.Chunkers`	Standalone text partitioning primitives	extensions/Chunkers/README.md:1-9
Aspire extension	.NET Aspire AppHost integration for local/cloud	extensions/Aspire/README.md:1-9
`Microsoft.KernelMemory.DataFormats.AzureAIDocIntel`	Azure AI Document Intelligence for OCR/layout	extensions/AzureAIDocIntel/README.md:1-8
AWS S3 adapter	S3-backed binary content storage (MinIO compatible)	extensions/AWS/S3/README.md:1-9
`km-cli/` shell scripts	`upload`, `ask`, `search` clients over HTTP	tools/README.md:1-30
`applications/evaluation`	Offline RAG quality harness (faithfulness, recall, etc.)	applications/evaluation/README.md:3-13

The catalog is intentionally open: contributors are encouraged to add new connectors under extensions/, and the examples/ folder ships a curated list of sample projects covering custom partitioning, embeddings, content decoders, web scrapers, handlers, and provider integrations (Source: examples/README.md:1-30).

LLM and Embedding Connectors

Every LLM connector wraps a third-party model API and exposes it through the ITextGenerator and ITextEmbeddingGenerator interfaces defined in Core. The Ollama connector is a representative example: it accepts an OllamaConfig containing an endpoint URL plus two OllamaModelConfig entries (one for chat, one for embeddings) and is wired in with two builder calls (Source: extensions/Ollama/README.md:13-23). The same pattern is used by the LlamaSharp connector for fully local Llama inference (Source: extensions/LlamaSharp/README.md:1-12), by the Azure OpenAI and OpenAI connectors, and by the Anthropic connector. The service README recommends GPT-3.5/GPT-4 for production and warns that the available token budget directly impacts summarization and answer quality (Source: service/Service/README.md:12-18).

Token management is a first-class concern. The Tiktoken extension is a tokenizer implementation that any connector can be configured to use for accurate token counts, which is critical for chunking and prompt assembly (Source: extensions/Tiktoken/README.md:1-9). The Chunkers extension complements it with reusable text-splitting primitives (Source: extensions/Chunkers/README.md:1-9) that other pipelines can consume without pulling in the full Core.
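To make the link between token counts and chunking concrete, the following sketch packs text into chunks that stay under a token budget. It is deliberately simplified: it treats each space-separated word as one "token", which is an assumption for illustration only. A real tokenizer such as Tiktoken counts model tokens, which generally differ from words, and `WordChunker` is a hypothetical name, not the Chunkers package API. The code uses Enumerable.Chunk, which requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<string> chunks = WordChunker.Split("a b c d e", maxTokensPerChunk: 2);
Console.WriteLine(string.Join(" | ", chunks)); // a b | c d | e

public static class WordChunker
{
    // Greedy fixed-size packing: every chunk holds at most maxTokensPerChunk words.
    public static List<string> Split(string text, int maxTokensPerChunk)
    {
        if (maxTokensPerChunk <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxTokensPerChunk));

        return text
            .Split(' ', StringSplitOptions.RemoveEmptyEntries) // word-level stand-in for tokens
            .Chunk(maxTokensPerChunk)
            .Select(words => string.Join(" ", words))
            .ToList();
    }
}
```

With an inaccurate token count, chunks can silently exceed the embedding model's input limit or waste prompt budget, which is why the count is delegated to a tokenizer that matches the model.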

Storage, Document Intelligence, and Tooling

The repository ships adapters for storing the binary content that backs memory records outside the vector DB. The AWS S3 adapter uploads and retrieves documents using the standard S3 API; recent work added a ForcePathStyle flag to make the same code path work against MinIO (Source: extensions/AWS/S3/README.md:1-9). For richer content extraction, the Azure AI Document Intelligence adapter enables high-accuracy OCR and layout-aware parsing of images and PDFs (Source: extensions/AzureAIDocIntel/README.md:1-8).

On the developer-experience side, the Aspire extension provides a curated set of AppHost extension methods so the service, vector store, and LLM can be orchestrated through .NET Aspire for local and cloud deployments (Source: extensions/Aspire/README.md:1-9). Shell-based clients for upload, ask, and search live under tools/km-cli/ and are documented alongside Docker helpers for spinning up Elasticsearch, MS SQL, Qdrant, and Redis for local debugging (Source: tools/README.md:1-30). The applications/evaluation project adds an offline quality harness that scores a RAG pipeline on faithfulness, answer relevancy, context recall/precision, context relevancy, context entity recall, answer semantic similarity, and answer correctness (Source: applications/evaluation/README.md:3-13). A TestSetGenerator is also provided, which synthesizes a test set from an existing memory and index using a configurable distribution of question types (Source: applications/evaluation/README.md:13-30).

Integration Pattern

In practice a connector is selected at build time and then ignored by application code. The example for async memory with a custom pipeline shows the typical flow: a KernelMemoryBuilder is created, connectors register their services through builder extension methods, pipeline handlers are registered with AddHandlerAsHostedService, the builder produces an IKernelMemory instance, and the application calls ImportDocumentAsync / AskAsync against the same high-level API regardless of which LLM, embedder, vector DB, or storage backend is wired in (Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:30-58). The example list in examples/README.md covers the most common customizations, including custom partitioning, custom embeddings, custom content decoders, custom web scrapers, custom handlers, and Anthropic/Ollama/LlamaSharp/LM Studio integrations (Source: examples/README.md:6-30). This uniform contract is what makes the extension ecosystem composable: swapping one connector for another is a builder change, not an application-code change.

Deployment, Configuration & Customization

Related topics: Overview & Core Architecture, Ingestion Pipeline & Retrieval (RAG), Extensions, Connectors & Client Integrations

Overview

Kernel Memory supports a wide spectrum of deployment topologies, from fully in-process "serverless" use to a horizontally scalable web service backed by persistent queues. Customization is achieved through extension packages (LLM connectors, vector stores, content decoders, chunkers) and through the KernelMemoryBuilder fluent API. This page summarizes how the project is deployed, configured, and extended, drawing on the official service README, example projects, extensions, and infrastructure deployment guides.

The service exposes a web API for upload and query, plus an asynchronous data pipeline that ingests documents in the background. Source: service/Service/README.md.

Deployment Topologies

Serverless (In-Process)

For small workloads and demos, all logic runs locally inside the host process. No service is deployed; the application uses MemoryServerless and the default C# handlers. Files can be stored on disk or in Azure Blobs depending on configuration. Source: examples/002-dotnet-Serverless/README.md.

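using Microsoft.KernelMemory;

// Build an in-process memory; the OpenAI key is read from the environment.
var memory = new KernelMemoryBuilder()
    .WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
    .Build<MemoryServerless>();

// Import three files as a single document, tagged so it can be filtered later.
await memory.ImportDocumentAsync(new Document("doc012")
    .AddFiles([ "file2.txt", "file3.docx", "file4.pdf" ])
    .AddTag("user", "Blake"));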

Async Pipeline (Custom Handlers)

When reliability and scale matter, ingestion can run via hosted background services, with explicit pipeline steps such as extract_text, split_text_in_partitions, generate_embeddings, and save_memory_records. Source: examples/005-dotnet-AsyncMemoryCustomPipeline/README.md.

Kernel Memory as a Service

The reference deployment packages a web service and an asynchronous handler pipeline as separate, independently scalable components. Persistent queues (Azure Queues, RabbitMQ, or the built-in SimpleQueues for tests) decouple ingestion from the API. Source: service/Service/README.md.

Docker and Azure Infrastructure

A pre-built image is published on Docker Hub (kernelmemory/service). A quick-start in demo mode only requires the OPENAI_API_KEY environment variable:

docker run -e OPENAI_API_KEY="..." -p 9001:9001 -it --rm kernelmemory/service

A production-style run mounts an appsettings.Production.json file into /app. Source: service/Service/README.md.

For full cloud provisioning, the infra/ folder contains an ARM/Bicep template that registers the Microsoft.AlertsManagement, Microsoft.App, and Microsoft.ContainerService resource providers and deploys the entire stack via the "Deploy to Azure" button. The deployment typically takes up to 20 minutes. Source: infra/README.md.

flowchart LR
    Client[Client / Web App] -->|HTTP| API[KM Web Service<br/>:9001]
    API -->|enqueue| Q[(Queue: Azure / RabbitMQ / SimpleQueues)]
    Q --> Worker[Async Pipeline Handlers]
    Worker -->|read/write| Blob[(Blob Storage)]
    Worker -->|embeddings + chunks| Vec[(Vector DB)]
    API -->|search| Vec
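The queue hop in the diagram is what lets the API return quickly while workers process documents at their own pace. The sketch below imitates that decoupling inside a single process using System.Threading.Channels. It is only an analogy under stated assumptions: `QueueSketch` is a hypothetical name, an in-memory channel is neither persistent nor distributed, and this is not how Kernel Memory's queue providers are implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

List<string> processed = await QueueSketch.RunAsync(new[] { "doc001", "doc002" });
Console.WriteLine(string.Join(",", processed)); // doc001,doc002

public static class QueueSketch
{
    public static async Task<List<string>> RunAsync(IEnumerable<string> documentIds)
    {
        var queue = Channel.CreateUnbounded<string>();
        var processed = new List<string>();

        // Worker side: drains the queue independently of the producer.
        Task worker = Task.Run(async () =>
        {
            await foreach (string id in queue.Reader.ReadAllAsync())
            {
                processed.Add(id); // stand-in for running the pipeline steps
            }
        });

        // API side: enqueue each upload and move on without waiting for processing.
        foreach (string id in documentIds)
        {
            await queue.Writer.WriteAsync(id);
        }
        queue.Writer.Complete(); // no more work; lets the worker loop finish

        await worker;
        return processed;
    }
}
```

A persistent queue adds what this sketch lacks: messages survive restarts, and several worker processes on different machines can consume from the same queue.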

Configuration

Configuration follows standard ASP.NET Core conventions. Endpoints and authentication details are stored in appsettings.json and can be overridden by appsettings.Development.json when ASPNETCORE_ENVIRONMENT=Development. Source: examples/007-dotnet-serverless-azure/README.md.
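The override behavior can be pictured as layering key/value sets, where later layers win. The sketch below uses plain dictionaries rather than the ASP.NET Core configuration system, and the keys (`Sketch:Endpoint`, `Sketch:MaxRetries`) are made up for illustration; they are not Kernel Memory settings. `ConfigSketch` is likewise a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for appsettings.json.
var baseSettings = new Dictionary<string, string>
{
    ["Sketch:Endpoint"] = "https://example.invalid",
    ["Sketch:MaxRetries"] = "3",
};

// Stand-in for appsettings.Development.json: overrides only what differs.
var developmentSettings = new Dictionary<string, string>
{
    ["Sketch:Endpoint"] = "http://localhost:9001",
};

var effective = ConfigSketch.Layer(baseSettings, developmentSettings);
Console.WriteLine(effective["Sketch:Endpoint"]);   // http://localhost:9001
Console.WriteLine(effective["Sketch:MaxRetries"]); // 3

public static class ConfigSketch
{
    // Later layers overwrite earlier ones; keys compare case-insensitively,
    // as configuration keys do in ASP.NET Core.
    public static Dictionary<string, string> Layer(params IReadOnlyDictionary<string, string>[] layers)
    {
        var merged = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var layer in layers)
        {
            foreach (var pair in layer)
            {
                merged[pair.Key] = pair.Value;
            }
        }
        return merged;
    }
}
```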

Common configuration areas include:

Area	Notes
LLM endpoint	OpenAI, Azure OpenAI, Anthropic, Ollama, LlamaSharp, LM Studio
Embedding generator	Pluggable; bring your own via `WithCustomEmbeddingGeneration`
Vector store	Azure AI Search, Elasticsearch, Postgres, Qdrant, Redis, MS SQL
Content storage	Local disk, Azure Blobs, AWS S3
Queues	`SimpleQueues` (default), Azure Queues, RabbitMQ
Tokenizer	Selectable via configuration

Source: service/Service/README.md and examples/README.md.

When running the service, persistent queues such as Azure Queues or RabbitMQ are recommended for reliability and horizontal scaling. Source: service/Service/README.md.

Release 0.96.250115.1 introduced a "service config check" that validates the configuration at startup. The same release began throwing an exception when callers inadvertently mix volatile and persistent data. Source: release notes referenced in community context.

Customization & Extensions

Kernel Memory is designed for plug-and-play customization. The extensions/ folder ships first-party adapters, while the examples/ folder demonstrates common customization patterns. Source: examples/README.md.

Extensions

Ollama — Connects to a local Ollama service for both text generation and embeddings. Configure endpoint and per-model token limits. Source: extensions/Ollama/README.md.
AWS S3 — Storage adapter that uploads documents and tracks pipeline state in S3 buckets. Source: extensions/AWS/S3/README.md.
Chunkers — Standalone Microsoft.KernelMemory.Chunkers package for advanced text partitioning, including language-specific separators such as the Japanese split character added in 0.98.250508.3. Source: extensions/Chunkers/README.md.
Aspire — .NET Aspire extensions for local and cloud orchestration of Kernel Memory components. Source: extensions/Aspire/README.md.

Custom Pipelines, Prompts, and Decoders

The example catalogue covers custom partitioning (102), custom embedding generators (103), custom LLMs (104), custom content decoders (108), custom web scrapers (109), and custom ingestion handlers (201). RAG prompts and summarization prompts can also be overridden (101), and context parameters can tune the prompt per request (209). Source: examples/README.md.

For advanced scenarios, a single asynchronous pipeline handler can be deployed as a standalone service (202), and Memory instances can be constructed without KernelMemoryBuilder (210). Source: examples/README.md.

CLI and Operational Tools

The tools/ folder includes shell scripts (upload-file.sh, ask.sh, search.sh) for command-line interaction, scripts to launch Elasticsearch, MS SQL, Qdrant, and Redis containers, and an InteractiveSetup project that generates appsettings.Development.json. Source: tools/README.md.

Common Failure Modes

Mixing volatile and persistent data without explicit configuration now raises an exception (release 0.96.250115.1). Plan your index and storage choices before deployment.
SQL Server-backed deployments require the ICU library, which was added to the Docker image in release 0.98.250323.1. Missing ICU causes globalization-related runtime failures.
MinIO compatibility with AWS S3 requires ForcePathStyle = true in AWSS3Config, added in release 0.98.250324.1.
OpenAPI clients should regenerate against the latest schema, as the /upload endpoint specification for tags and steps was corrected in release 0.98.250508.3.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 7 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: identity.distribution | https://github.com/microsoft/kernel-memory

2. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/microsoft/kernel-memory

3. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory

4. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/microsoft/kernel-memory

5. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/microsoft/kernel-memory

6. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory

7. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 11 (count of project-level external discussion links exposed on this manual page)

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using kernel-memory with real data or production workflows.

Kernel Memory 0.98.250508.3 - github / github_release
Kernel Memory 0.98.250324.1 - github / github_release
Kernel Memory 0.98.250323.1 - github / github_release
Kernel Memory 0.97.250211.1 - github / github_release
Kernel Memory 0.96.250120.1 - github / github_release
Kernel Memory 0.96.250116.1 - github / github_release
Kernel Memory 0.96.250115.1 - github / github_release
Kernel Memory 0.95.241216.2 - github / github_release
Kernel Memory 0.95.241216.1 - github / github_release
Kernel Memory 0.94.241201.1 - github / github_release
Installation risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
