Doramagic Project Pack · Human Manual
kernel-memory
Research project. A Memory solution for users, teams, and applications.
Overview & Core Architecture
Related topics: Ingestion Pipeline & Retrieval (RAG), Extensions, Connectors & Client Integrations, Deployment, Configuration & Customization
Purpose and Scope
Kernel Memory (KM) is an open-source, multi-modal retrieval-augmented generation (RAG) service. It is designed to ingest heterogeneous content (PDFs, Office documents, images, audio, web pages, raw text), transform it into embeddings, store it in a vector database, and expose a unified API for semantic search and answer generation. As described in service/Service/README.md, the project is a *Knowledge Management* system, not merely a vector store: a text-generation LLM is part of the default pipeline so that queries return synthesized answers grounded in retrieved passages.
The repository is organized around a small Core library and a set of pluggable extension projects. The Core provides the ingestion pipeline, the public IKernelMemory interface, and the dependency-injection builder (KernelMemoryBuilder). Extension projects contribute concrete implementations for storage, AI models, and vector databases. This separation lets the same code base run in three different deployment topologies, which is the central architectural decision of the project.
Architectural Pillars
Three Logical Layers
KM cleanly separates three concerns, each mapped to a folder in the repository:
| Layer | Responsibility | Example Projects |
|---|---|---|
| Core | Pipeline, interfaces, builder, defaults | service/Core, service/Abstractions |
| Connectors | Concrete AI / Storage / Vector-DB adapters | extensions/OpenAI, extensions/Qdrant, extensions/AzureBlobs |
| Service & Apps | Web API, async pipeline host, evaluation | service/Service, applications/evaluation |
The Core never references a specific vendor; every dependency (LLM, embedding generator, vector DB, document store, OCR engine) is injected through interfaces. This is what allows the same IKernelMemory instance to be reconfigured from OpenAI to Ollama, or from Azure AI Search to Qdrant, by changing builder extension methods (see examples/README.md).
The `IKernelMemory` Surface
The public entry point is the `IKernelMemory` interface, returned by `KernelMemoryBuilder.Build()`. From the caller's perspective, KM is a small object with two groups of operations: the `Import...Async` methods (such as `ImportDocumentAsync`) for ingestion, and `AskAsync` / `SearchAsync` for retrieval. The service README clarifies that the same interface is used in both *serverless* (in-process) and *service* (web + queue) modes, so application code does not change between the two. The `Build()` method also accepts options that control which optional services are wired up; this was generalized in 0.96.250116.1 (release notes: *Support Build() options in KM builder extension methods*).
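As a minimal sketch of the "same interface in both modes" point, the snippet below picks an implementation at startup and then runs identical calls against it. It assumes the `Microsoft.KernelMemory` NuGet package is referenced (assumed here to provide both `MemoryServerless` and `MemoryWebClient`) and that `OPENAI_API_KEY` is set. `KM_ENDPOINT` is a hypothetical variable name invented for this example, not a setting defined by the project.

```csharp
using Microsoft.KernelMemory;

// KM_ENDPOINT is a hypothetical variable used only in this sketch,
// e.g. "http://127.0.0.1:9001/" when a KM service is running locally.
string? endpoint = Environment.GetEnvironmentVariable("KM_ENDPOINT");

// Both implementations satisfy IKernelMemory, so only this line differs
// between service mode and serverless mode.
IKernelMemory memory = endpoint is not null
    ? new MemoryWebClient(endpoint)
    : new KernelMemoryBuilder()
        .WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
        .Build<MemoryServerless>();

// Ingestion: identical call in both modes.
await memory.ImportTextAsync(
    "Kernel Memory can run in-process or as a web service.",
    documentId: "modes-note");

// In service mode ingestion runs in the background, so wait until it completes.
while (!await memory.IsDocumentReadyAsync("modes-note"))
{
    await Task.Delay(TimeSpan.FromSeconds(1));
}

// Retrieval: identical call in both modes.
var answer = await memory.AskAsync("How can Kernel Memory be run?");
Console.WriteLine(answer.Result);
```

The unbounded wait loop keeps the sketch short; production code would normally add a timeout.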
The Ingestion Pipeline
Document ingestion is implemented as a sequence of named *steps* executed by handlers. A canonical sequence is visible in examples/005-dotnet-async-memory-custom-pipeline/README.md:
1. `extract_text`: decode the binary document into text (plain decoder or Azure AI Document Intelligence via extensions/AzureAIDocIntel/README.md).
2. `split_text_in_partitions`: chunking, delegated to the chunker package (extensions/Chunkers/README.md).
3. `generate_embeddings`: call the configured embedding model.
4. `save_memory_records`: persist vectors to the configured memory DB.
The pipeline is asynchronous and queue-driven when running as a service: the service README states that the *Core assembly includes also a basic in-memory queue called SimpleQueues, useful for tests and demos*, while production deployments use Azure Queues or RabbitMQ for *reliability and horizontal scaling*. The same pipeline can be customized by registering additional handlers, allowing custom enrichment (summarization, tagging, translation) — see the synthetic-memory example in examples/106-dotnet-retrieve-synthetics/README.md.
flowchart LR
A[Document Upload] --> B[extract_text]
B --> C[split_text_in_partitions]
C --> D[generate_embeddings]
D --> E[save_memory_records]
E --> F[(Vector DB)]
G[User Query] --> H[AskAsync]
H --> I[Vector Search]
I --> F
F --> J[LLM Answer Generation]
J --> K[Synthesized Answer]
Deployment Modes
KM supports three deployment topologies, all sharing the same IKernelMemory API:
- Serverless (in-process): `KernelMemoryBuilder` is built inside the host application. No external services are required beyond the configured LLM and vector DB. Suited for small files, tests, and single-tenant apps.
- Service (web + async pipeline): A stand-alone web service accepts uploads and exposes a documented REST API (Swagger UI at `/swagger/index.html` when running locally). Handlers run in background processes consuming a persistent queue. The official Docker image is published at `kernelmemory/service`; the source Dockerfile in the repository root can be used for custom builds (see service/Service/README.md).
- .NET Aspire: The Aspire extension (extensions/Aspire/README.md) wires KM into an Aspire AppHost for local orchestration and cloud deployment, introduced in 0.95.241216.1 and expanded in subsequent releases.
The Service README warns that, since the 0.96.250115.1 release, *the system throws an exception when mixing volatile and persistent data*, so a deployment must be consistent about whether memory records are ephemeral or durable.
Ecosystem and Extensibility
Around the Core, the repository ships a rich set of official extensions, each published as a separate NuGet package:
- AI: OpenAI, Ollama, LlamaSharp (local Llama), Anthropic, Semantic Kernel text completion, and Tiktoken/GPT tokenizers (extensions/OpenAI/README.md, extensions/Ollama/README.md, extensions/LlamaSharp/README.md, extensions/Tiktoken/README.md).
- Vector DBs: Qdrant (with a documented caveat about its GUID/INT point-ID limitation forcing an extra round-trip on upsert; see extensions/Qdrant/README.md), plus Azure AI Search, Elasticsearch, Postgres, Redis, and SQL Server.
- Document Storage: Azure Blob Storage and AWS S3 (with `ForcePathStyle` support for MinIO added in 0.98.250324.1; see extensions/AzureBlobs/README.md, extensions/AWS/S3/README.md).
- OCR / parsing: Azure AI Document Intelligence.
In addition, the tools/ directory (tools/README.md) provides CLI clients (km-cli/upload-file.sh, ask.sh, search.sh) and Docker launch scripts for local vector DBs (Elasticsearch, MSSQL, Qdrant, Redis). The applications/evaluation project (applications/evaluation/README.md) ships a TestSetGenerator that synthesizes evaluation queries from an existing index and computes standard RAG metrics — Faithfulness, Answer Relevancy, Context Recall/Precision, Context Relevancy, Context Entity Recall, Answer Semantic Similarity, and Answer Correctness.
A final architectural trait worth highlighting is the project's commitment to composability over monolithism: every public interface has multiple concrete implementations, and the README's example list explicitly groups topics into *Customizations* (custom handlers, embeddings, decoders, web scrapers) and *Local models and external connectors*. Recent releases reinforce this direction — for example, 0.98.250508.3 added a Japanese text split character and fixed OpenAPI specifications for upload tags/steps, while 0.94.241201.1 introduced response streaming. These changes were made possible precisely because the Core exposes a small, stable surface and defers everything else to extensions.
See Also
- Ingestion Pipeline & Handlers
- Vector Database Connectors
- LLM & Embedding Connectors
- Service Deployment & Docker
- Evaluation & Test-Set Generation
Source: https://github.com/microsoft/kernel-memory / Human Manual
Ingestion Pipeline & Retrieval (RAG)
Related topics: Overview & Core Architecture, Extensions, Connectors & Client Integrations, Deployment, Configuration & Customization
Overview
Kernel Memory processes user content through a modular ingestion pipeline and answers user questions through a Retrieval-Augmented Generation (RAG) loop. The pipeline is composed of discrete step handlers that each advance a shared DataPipeline object, while retrieval combines vector search, prompt construction, and LLM-based answer generation. Source: service/Abstractions/Pipeline/IPipelineStepHandler.cs:1-
The IPipelineStepHandler interface defines the contract that every handler implements, making the pipeline composable and extensible. Standard handlers shipped in the Core project cover text extraction, partitioning (chunking), embedding generation, and persisting memory records to the configured vector store. Source: service/Core/Handlers/TextExtractionHandler.cs:1-
Ingestion Pipeline
Default Step Sequence
When a Document is submitted through ImportDocumentAsync, Kernel Memory enqueues a pipeline that flows through a sequence of named steps. Each step is a discrete handler hosted either in-process (serverless mode) or as a background service.
| Step name | Handler | Responsibility |
|---|---|---|
| `extract_text` | `TextExtractionHandler` | Decode raw files (PDF, DOCX, images via Azure AI Doc Intel, etc.) into plain text |
| `split_text_in_partitions` | `TextPartitioningHandler` | Chunk text into smaller partitions suitable for embedding and retrieval |
| `generate_embeddings` | `GenerateEmbeddingsHandler` | Produce vector embeddings for each partition (sequential) |
| `generate_embeddings_parallel` | `GenerateEmbeddingsParallelHandler` | Variant that batches embedding calls concurrently for higher throughput |
| `summarize` | `SummarizationHandler` | Optional synthetic memory generation (LLM-based summary of the source) |
| `save_memory_records` | `SaveRecordsHandler` | Persist partitions and vectors to the configured memory DB |
Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-, service/Core/Handlers/TextPartitioningHandler.cs:1-, service/Core/Handlers/GenerateEmbeddingsHandlerBase.cs:1-
Pipeline Data Flow
sequenceDiagram
participant Client
participant Queue
participant Extract as TextExtractionHandler
participant Chunk as TextPartitioningHandler
participant Embed as GenerateEmbeddingsHandler
participant Save as SaveRecordsHandler
Client->>Queue: ImportDocumentAsync(file, tags, steps)
Queue->>Extract: extract_text
Extract->>Chunk: split_text_in_partitions
Chunk->>Embed: generate_embeddings
Embed->>Save: save_memory_records
Save-->>Client: document ready (via IsDocumentReadyAsync)
Selecting and Customizing Steps
Steps can be chosen per request via the steps argument, and handlers can run as hosted background services through AddHandlerAsHostedService. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
host.Services.AddHandlerAsHostedService<TextExtractionHandler>("extract_text");
host.Services.AddHandlerAsHostedService<TextPartitioningHandler>("split_text_in_partitions");
host.Services.AddHandlerAsHostedService<SummarizationHandler>("summarize");
host.Services.AddHandlerAsHostedService<GenerateEmbeddingsHandler>("generate_embeddings");
host.Services.AddHandlerAsHostedService<SaveRecordsHandler>("save_memory_records");
string docId = await memory.ImportDocumentAsync(
new Document("inProcessTest")
.AddFile("file1-Wikipedia-Carbon.txt")
.AddTag("testName", "example3"),
steps: new[] {
"extract_text",
"split_text_in_partitions",
"generate_embeddings",
"save_memory_records"
});
By dropping summarize from the steps array, callers skip synthetic-data generation; by inserting a custom step name they can wire their own IPipelineStepHandler implementation into the same flow. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
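Because handlers and requested steps are linked only by string names, a small pre-flight check can catch a mismatch before a document is submitted. The helper below is a hypothetical guard written for this manual in plain .NET; `FindUnregisteredSteps` is not part of Kernel Memory, and the list of registered names has to be maintained by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns every requested step that has no registered handler.
// The comparison is ordinal and case-sensitive, so "Extract_Text" would be reported.
static IReadOnlyList<string> FindUnregisteredSteps(
    IEnumerable<string> registered, IEnumerable<string> requested)
{
    var known = new HashSet<string>(registered, StringComparer.Ordinal);
    return requested.Where(step => !known.Contains(step)).ToList();
}

// Names passed to AddHandlerAsHostedService in the host setup.
string[] registeredSteps =
{
    "extract_text", "split_text_in_partitions", "summarize",
    "generate_embeddings", "save_memory_records"
};

// Names about to be passed in the steps argument (note the deliberate typo).
string[] requestedSteps =
{
    "extract_text", "split_text_in_partitions",
    "generate_embedings", "save_memory_records"
};

var missing = FindUnregisteredSteps(registeredSteps, requestedSteps);
if (missing.Count > 0)
{
    // prints: No handler registered for: generate_embedings
    Console.WriteLine($"No handler registered for: {string.Join(", ", missing)}");
}
```

Running the check just before `ImportDocumentAsync` turns a silent stall into an immediate, readable error.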
Retrieval and RAG
Kernel Memory exposes two retrieval primitives:
- `SearchAsync`: returns relevant partitions (and citations) from the memory store without invoking an LLM.
- `AskAsync`: performs full RAG by searching, building a grounded prompt from the hits, and asking the configured text generator to produce an answer.
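The difference between the two primitives can be sketched as follows. The example assumes the `Microsoft.KernelMemory` NuGet package and an `OPENAI_API_KEY` environment variable; the sample text, document ID, and question are made up for illustration, and the result property names should be checked against the installed package version.

```csharp
using Microsoft.KernelMemory;

var memory = new KernelMemoryBuilder()
    .WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
    .Build<MemoryServerless>();

// Illustrative content; serverless import completes before the call returns.
await memory.ImportTextAsync(
    "Carbon is a chemical element with the symbol C and atomic number 6.",
    documentId: "carbon-note");

const string question = "What is the atomic number of carbon?";

// SearchAsync: retrieval only, no text generation involved.
var search = await memory.SearchAsync(question, limit: 3);
foreach (var citation in search.Results)
{
    foreach (var partition in citation.Partitions)
    {
        Console.WriteLine($"[{partition.Relevance:F2}] {partition.Text}");
    }
}

// AskAsync: retrieval plus answer generation grounded in the retrieved partitions.
var answer = await memory.AskAsync(question);
Console.WriteLine(answer.Result);
foreach (var source in answer.RelevantSources)
{
    Console.WriteLine($"source: {source.SourceName}");
}
```

`SearchAsync` is the cheaper call when the application only needs passages, for example to build its own prompt.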
The evaluation harness measures Faithfulness, Answer Relevancy, Context Recall, Context Precision, Context Relevancy, Context Entity Recall, Answer Semantic Similarity, and Answer Correctness. Source: applications/evaluation/README.md:1-
Since release 0.96.250115.1, duplicate facts are discarded by default during RAG answer synthesis, improving precision in the generated output. Source: community release note at packages-0.96.250115.1. Synthetic memories such as summaries are first-class retrieval targets — the summarize step writes them back through the same indexing path, so they can be returned alongside raw chunks at query time. Source: examples/106-dotnet-retrieve-synthetics/README.md:1-
Configuration Highlights
- Chunkers: shipped as a dedicated package, `Microsoft.KernelMemory.Chunkers`, configurable per deployment. Source: extensions/Chunkers/README.md:1-
- Embedding generator: pluggable; defaults to the configured text-embedding model, but custom generators can be substituted. Source: service/Service/README.md:1-
- LLM: used both at ingestion (synthetic data) and at answer time; the service has been tested primarily with OpenAI GPT-3.5 and GPT-4. Source: service/Service/README.md:1-
- Queue: in-process `SimpleQueues` for tests and demos; production deployments use Azure Queues or RabbitMQ for reliability and horizontal scaling. Source: service/Service/README.md:1-
Common Failure Modes and Tips
- Mixing volatile and persistent data in the same pipeline raises an exception by design (added in `0.96.250115.1`). Source: packages-0.96.250115.1
- Step name typos cause the pipeline to wait indefinitely: the strings passed to `AddHandlerAsHostedService` must exactly match those passed in the `steps` array. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
- Async completion: poll `IsDocumentReadyAsync` after `ImportDocumentAsync` to confirm that the background handlers finished. Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:1-
- AWS S3 with MinIO requires `ForcePathStyle = true` on `AWSS3Config` (added in `0.98.250324.1`). Source: packages-0.98.250324.1
- Localization: chunker split characters must match the language; a Japanese split character was added in `0.98.250508.3`. Source: packages-0.98.250508.3
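For the async-completion tip, a bounded polling loop avoids waiting forever when a pipeline stalls (for instance after a step-name typo). The helper below is a generic sketch in plain .NET, not a Kernel Memory API: `WaitUntilAsync` and its parameters are names chosen for this example, and the probe is simulated so the snippet runs on its own.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: calls the probe until it returns true or the timeout elapses.
static async Task<bool> WaitUntilAsync(
    Func<CancellationToken, Task<bool>> probe,
    TimeSpan timeout,
    TimeSpan interval,
    CancellationToken ct = default)
{
    var clock = Stopwatch.StartNew();
    do
    {
        if (await probe(ct)) return true;
        await Task.Delay(interval, ct);
    } while (clock.Elapsed < timeout);

    return false;
}

// Simulated readiness check that succeeds on the third call.
// With Kernel Memory the probe would instead be something like:
//   ct => memory.IsDocumentReadyAsync(docId, cancellationToken: ct)
int calls = 0;
bool ready = await WaitUntilAsync(
    _ => Task.FromResult(++calls >= 3),
    timeout: TimeSpan.FromSeconds(5),
    interval: TimeSpan.FromMilliseconds(50));

Console.WriteLine($"ready={ready} after {calls} probes"); // prints: ready=True after 3 probes
```

A `false` result is the signal to inspect the queue and the registered step names rather than to keep waiting.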
See Also
- Service architecture overview: service/Service/README.md
- Examples index (serverless, async, custom pipelines, RAG): examples/README.md
- Text chunkers extension: extensions/Chunkers/README.md
- Evaluation harness (RAG quality metrics): applications/evaluation/README.md
Extensions, Connectors & Client Integrations
Related topics: Overview & Core Architecture, Ingestion Pipeline & Retrieval (RAG), Deployment, Configuration & Customization
The Kernel Memory repository is built around a small Core package and a large set of satellite extension projects published as independent NuGet packages. The extensions/ folder is the home for these integrations, and it spans three broad families: LLM/embedding connectors, storage and content-extraction connectors, and developer-tooling projects such as .NET Aspire, chunkers, tokenizers, and the evaluation harness. The examples/ folder provides runnable, step-by-step demos for the most common customizations.
Extension Architecture
Core defines the abstract interfaces that any connector must implement, while each project under extensions/ provides a concrete implementation. The service overview makes the separation explicit: Kernel Memory has a clear boundary between the orchestration engine and the underlying storage, embeddings, and LLM dependencies, which is what makes plug-in style extensions practical (Source: service/Service/README.md:18-24). Extensions follow a consistent shape — they expose a typed configuration class plus one or more KernelMemoryBuilder extension methods (e.g. WithOllamaTextGeneration, WithOllamaTextEmbeddingGeneration) that register the dependency in the DI container used by the memory pipeline (Source: extensions/Ollama/README.md:11-23).
Catalog of Official Extensions
| Package / Project | Role | Reference |
|---|---|---|
| `Microsoft.KernelMemory.AI.Ollama` | LLM and embedding generation via a local Ollama daemon | extensions/Ollama/README.md:1-23 |
| `Microsoft.KernelMemory.AI.LlamaSharp` | On-device Llama inference using LLamaSharp | extensions/LlamaSharp/README.md:1-12 |
| `Microsoft.KernelMemory.AI.Tiktoken` | Token counting/clamping via Tiktoken | extensions/Tiktoken/README.md:1-9 |
| `Microsoft.KernelMemory.Chunkers` | Standalone text partitioning primitives | extensions/Chunkers/README.md:1-9 |
| `Microsoft.KernelMemory.AI` (Aspire) | .NET Aspire AppHost integration for local/cloud | extensions/Aspire/README.md:1-9 |
| `Microsoft.KernelMemory.DataFormats.AzureAIDocIntel` | Azure AI Document Intelligence for OCR/layout | extensions/AzureAIDocIntel/README.md:1-8 |
| AWS S3 adapter | S3-backed binary content storage (MinIO compatible) | extensions/AWS/S3/README.md:1-9 |
| `km-cli/` shell scripts | upload, ask, search clients over HTTP | tools/README.md:1-30 |
| `applications/evaluation` | Offline RAG quality harness (faithfulness, recall, etc.) | applications/evaluation/README.md:3-13 |
The catalog is intentionally open: contributors are encouraged to add new connectors under extensions/, and the examples/ folder ships a curated list of sample projects covering custom partitioning, embeddings, content decoders, web scrapers, handlers, and provider integrations (Source: examples/README.md:1-30).
LLM and Embedding Connectors
Every LLM connector wraps a third-party model API and exposes it through the ITextGenerator and ITextEmbeddingGenerator interfaces defined in Core. The Ollama connector is a representative example: it accepts an OllamaConfig containing an endpoint URL plus two OllamaModelConfig entries (one for chat, one for embeddings) and is wired in with two builder calls (Source: extensions/Ollama/README.md:13-23). The same pattern is used by the LlamaSharp connector for fully local Llama inference (Source: extensions/LlamaSharp/README.md:1-12), by the Azure OpenAI and OpenAI connectors, and by the Anthropic connector. The service README recommends GPT-3.5/GPT-4 for production and warns that the available token budget directly impacts summarization and answer quality (Source: service/Service/README.md:12-18).
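The Ollama wiring described above might look like the following. This is a sketch recalled from the extension README rather than a verified listing: the endpoint, model names, and token limits are placeholder values, and the `OllamaModelConfig(name, maxTokens)` constructor shape, the namespace, and the omission of an explicit tokenizer argument are assumptions to check against the installed `Microsoft.KernelMemory.AI.Ollama` package.

```csharp
using Microsoft.KernelMemory;
using Microsoft.KernelMemory.AI.Ollama;

// Placeholder values: adjust the endpoint and models to the local Ollama setup.
var config = new OllamaConfig
{
    Endpoint = "http://localhost:11434",
    // One model entry for text generation, one for embeddings.
    TextModel = new OllamaModelConfig("phi3:medium-128k", 131072),
    EmbeddingModel = new OllamaModelConfig("nomic-embed-text", 2048)
};

// Two builder calls register the connector for both roles.
var memory = new KernelMemoryBuilder()
    .WithOllamaTextGeneration(config)
    .WithOllamaTextEmbeddingGeneration(config)
    .Build<MemoryServerless>();

// From here on, application code is the same as with any other connector.
await memory.ImportTextAsync("Ollama runs language models locally.", documentId: "ollama-note");
var answer = await memory.AskAsync("Where does Ollama run models?");
Console.WriteLine(answer.Result);
```

Switching to another provider would replace only the configuration object and the two `With...` calls, which is the composability point the section makes.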
Token management is a first-class concern. The Tiktoken extension is a tokenizer implementation that any connector can be configured to use for accurate token counts, which is critical for chunking and prompt assembly (Source: extensions/Tiktoken/README.md:1-9). The Chunkers extension complements it with reusable text-splitting primitives (Source: extensions/Chunkers/README.md:1-9) that other pipelines can consume without pulling in the full Core.
Storage, Document Intelligence, and Tooling
The repository ships adapters for storing the binary content that backs memory records outside the vector DB. The AWS S3 adapter uploads and retrieves documents using the standard S3 API; recent work added a ForcePathStyle flag to make the same code path work against MinIO (Source: extensions/AWS/S3/README.md:1-9). For richer content extraction, the Azure AI Document Intelligence adapter enables high-accuracy OCR and layout-aware parsing of images and PDFs (Source: extensions/AzureAIDocIntel/README.md:1-8).
On the developer-experience side, the Aspire extension provides a curated set of AppHost extension methods so the service, vector store, and LLM can be orchestrated through .NET Aspire for local and cloud deployments (Source: extensions/Aspire/README.md:1-9). Shell-based clients for upload, ask, and search live under tools/km-cli/ and are documented alongside Docker helpers for spinning up Elasticsearch, MS SQL, Qdrant, and Redis for local debugging (Source: tools/README.md:1-30). The applications/evaluation project adds an offline quality harness that scores a RAG pipeline on faithfulness, answer relevancy, context recall/precision, context relevancy, context entity recall, answer semantic similarity, and answer correctness (Source: applications/evaluation/README.md:3-13). A TestSetGenerator is also provided, which synthesizes a test set from an existing memory and index using a configurable distribution of question types (Source: applications/evaluation/README.md:13-30).
Integration Pattern
In practice a connector is selected at build time and then ignored by application code. The example for async memory with a custom pipeline shows the typical flow: a KernelMemoryBuilder is created, extensions register their services via methods such as AddHandlerAsHostedService, the builder produces a Memory (or async equivalent), and the application calls ImportDocumentAsync / AskAsync against the same high-level API regardless of which LLM, embedder, vector DB, or storage backend is wired in (Source: examples/005-dotnet-async-memory-custom-pipeline/README.md:30-58). The list of example projects under examples/README.md covers the most common customizations, including custom partitioning, custom embeddings, custom content decoders, custom web scrapers, custom handlers, and Anthropic/Ollama/LlamaSharp/LM Studio integrations (Source: examples/README.md:6-30). This uniform contract is what makes the extension ecosystem composable: swapping one connector for another is a builder change, not an application-code change.
See Also
- Service deployment and Docker: service/Service/README.md
- Example catalog: examples/README.md
- Evaluation harness: applications/evaluation/README.md
- Tooling scripts and CLI: tools/README.md
Deployment, Configuration & Customization
Related topics: Overview & Core Architecture, Ingestion Pipeline & Retrieval (RAG), Extensions, Connectors & Client Integrations
Overview
Kernel Memory supports a wide spectrum of deployment topologies, from fully in-process "serverless" use to a horizontally scalable web service backed by persistent queues. Customization is achieved through extension packages (LLM connectors, vector stores, content decoders, chunkers) and through the KernelMemoryBuilder fluent API. This page summarizes how the project is deployed, configured, and extended, drawing on the official service README, example projects, extensions, and infrastructure deployment guides.
The service exposes a web API for upload and query, plus an asynchronous data pipeline that ingests documents in the background. Source: service/Service/README.md.
Deployment Topologies
Serverless (In-Process)
For small workloads and demos, all logic runs locally inside the host process. No service is deployed; the application uses MemoryServerless and the default C# handlers. Files can be stored on disk or in Azure Blobs depending on configuration. Source: examples/002-dotnet-Serverless/README.md.
var memory = new KernelMemoryBuilder()
.WithOpenAIDefaults(Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
.Build<MemoryServerless>();
await memory.ImportDocumentAsync(new Document("doc012")
.AddFiles([ "file2.txt", "file3.docx", "file4.pdf" ])
.AddTag("user", "Blake"));
Async Pipeline (Custom Handlers)
When reliability and scale matter, ingestion can run via hosted background services, with explicit pipeline steps such as extract_text, split_text_in_partitions, generate_embeddings, and save_memory_records. Source: examples/005-dotnet-AsyncMemoryCustomPipeline/README.md.
Kernel Memory as a Service
The reference deployment packages a web service and an asynchronous handler pipeline as separate, independently scalable components. Persistent queues (Azure Queues, RabbitMQ, or the built-in SimpleQueues for tests) decouple ingestion from the API. Source: service/Service/README.md.
Docker and Azure Infrastructure
A pre-built image is published on Docker Hub (kernelmemory/service). A quick-start in demo mode only requires the OPENAI_API_KEY environment variable:
docker run -e OPENAI_API_KEY="..." -p 9001:9001 -it --rm kernelmemory/service
A production-style run mounts an appsettings.Production.json file into /app. Source: service/Service/README.md.
For full cloud provisioning, the infra/ folder contains an ARM/Bicep template that registers the Microsoft.AlertsManagement, Microsoft.App, and Microsoft.ContainerService resource providers and deploys the entire stack via the "Deploy to Azure" button. The deployment typically takes up to 20 minutes. Source: infra/README.md.
flowchart LR
Client[Client / Web App] -->|HTTP| API[KM Web Service<br/>:9001]
API -->|enqueue| Q[(Queue: Azure / RabbitMQ / SimpleQueues)]
Q --> Worker[Async Pipeline Handlers]
Worker -->|read/write| Blob[(Blob Storage)]
Worker -->|embeddings + chunks| Vec[(Vector DB)]
API -->|search| Vec
Configuration
Configuration follows standard ASP.NET Core conventions. Endpoints and authentication details are stored in appsettings.json and can be overridden by appsettings.Development.json when ASPNETCORE_ENVIRONMENT=Development. Source: examples/007-dotnet-serverless-azure/README.md.
Common configuration areas include:
| Area | Notes |
|---|---|
| LLM endpoint | OpenAI, Azure OpenAI, Anthropic, Ollama, LlamaSharp, LM Studio |
| Embedding generator | Pluggable; bring your own via WithCustomEmbeddingGeneration |
| Vector store | Azure AI Search, Elasticsearch, Postgres, Qdrant, Redis, MS SQL |
| Content storage | Local disk, Azure Blobs, AWS S3 |
| Queues | SimpleQueues (default), Azure Queues, RabbitMQ |
| Tokenizer | Selectable via configuration (GA 1.0.0) |
Source: service/Service/README.md and examples/README.md.
For the service, the README recommends persistent queues such as Azure Queues or RabbitMQ for reliability and horizontal scaling. Source: service/Service/README.md.
Release 0.96.250115.1 introduced a "service config check" that validates the configuration at startup, and the same release began throwing an exception when callers inadvertently mix volatile and persistent data. Source: release notes referenced in community context.
Customization & Extensions
Kernel Memory is designed for plug-and-play customization. The extensions/ folder ships first-party adapters, while the examples/ folder demonstrates common customization patterns. Source: examples/README.md.
Extensions
- Ollama: Connects to a local Ollama service for both text generation and embeddings. Configure endpoint and per-model token limits. Source: extensions/Ollama/README.md.
- AWS S3: Storage adapter that uploads documents and tracks pipeline state in S3 buckets. Source: extensions/AWS/S3/README.md.
- Chunkers: Standalone `Microsoft.KernelMemory.Chunkers` package for advanced text partitioning, including language-specific separators such as the Japanese split character added in 0.98.250508.3. Source: extensions/Chunkers/README.md.
- Aspire: .NET Aspire extensions for local and cloud orchestration of Kernel Memory components. Source: extensions/Aspire/README.md.
Custom Pipelines, Prompts, and Decoders
The example catalogue covers custom partitioning (102), custom embedding generators (103), custom LLMs (104), custom content decoders (108), custom web scrapers (109), and custom ingestion handlers (201). RAG prompts and summarization prompts can also be overridden (101), and context parameters can tune the prompt per request (209). Source: examples/README.md.
For advanced scenarios, a single asynchronous pipeline handler can be deployed as a standalone service (202), and Memory instances can be constructed without KernelMemoryBuilder (210). Source: examples/README.md.
CLI and Operational Tools
The tools/ folder includes shell scripts (upload-file.sh, ask.sh, search.sh) for command-line interaction, scripts to launch Elasticsearch, MS SQL, Qdrant, and Redis containers, and an InteractiveSetup project that generates appsettings.Development.json. Source: tools/README.md.
Common Failure Modes
- Mixing volatile and persistent data without explicit configuration now raises an exception (release 0.96.250115.1). Plan your index and storage choices before deployment.
- SQL Server-backed deployments require the ICU library, which was added to the Docker image in release 0.98.250323.1. Missing ICU causes globalization-related runtime failures.
- MinIO compatibility with AWS S3 requires `ForcePathStyle = true` in `AWSS3Config`, added in release 0.98.250324.1.
- OpenAPI clients should regenerate against the latest schema, as the `/upload` endpoint specification for `tags` and `steps` was corrected in release 0.98.250508.3.
See Also
- extensions/Chunkers/README.md — Text partitioning extensions
- extensions/Aspire/README.md — .NET Aspire orchestration
- extensions/Ollama/README.md — Local LLM via Ollama
- extensions/AWS/S3/README.md — S3 storage adapter
- infra/README.md — Azure deployment accelerator
- tools/README.md — CLI scripts and dev tools
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 7 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: identity.distribution | https://github.com/microsoft/kernel-memory
2. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/microsoft/kernel-memory
3. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory
4. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/microsoft/kernel-memory
5. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/microsoft/kernel-memory
6. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory
7. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/kernel-memory
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using kernel-memory with real data or production workflows.
- Kernel Memory 0.98.250508.3 - github / github_release
- Kernel Memory 0.98.250324.1 - github / github_release
- Kernel Memory 0.98.250323.1 - github / github_release
- Kernel Memory 0.97.250211.1 - github / github_release
- Kernel Memory 0.96.250120.1 - github / github_release
- Kernel Memory 0.96.250116.1 - github / github_release
- Kernel Memory 0.96.250115.1 - github / github_release
- Kernel Memory 0.95.241216.2 - github / github_release
- Kernel Memory 0.95.241216.1 - github / github_release
- Kernel Memory 0.94.241201.1 - github / github_release
- Installation risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence