ktx Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

ktx

ktx is an executable context layer for data and analytics agents 🐙. It lets Claude Code, Codex, or other AI agents query analytical databases accurately and with full context of your company.

Overview & System Architecture

Related topics: Database Connectors & Context Source Adapters, Semantic Layer, Wiki & Hybrid Search (Context Engine), Agent Integration — CLI, MCP Server & LLM Runtime

1. What ktx Is

ktx is a self-improving context layer that teaches agents how to query a warehouse accurately — from approved metric definitions, joinable columns, and business knowledge it builds and maintains for you. It is local-first, read-only by design, and works on top of an existing SQL warehouse rather than replacing it Source: README.md.

The project maps a warehouse, builds a semantic layer, absorbs wiki / Notion / team knowledge, flags contradictions across sources, and ships CLI plus MCP tools that agents (Claude Code, Codex, Cursor, OpenCode) can call at execution time. It targets PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite, and integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion Source: README.md.

The repository is licensed under Apache-2.0 and published as the npm package @kaelio/ktx Source: package.json.

2. Workspace Layout

The repository is a pnpm + uv workspace. The root package.json coordinates TypeScript packages and Python projects Source: package.json. The CLI lives in packages/cli and is the published npm package; it embeds the core context engine, LLM and embedding providers, and database scan connectors Source: README.md.

The Python side hosts two projects under python/:

Path	Purpose
`python/ktx-sl`	Semantic-layer query planning
`python/ktx-daemon`	Portable compute service (HTTP server) for heavy/portable work

The daemon exposes a long-lived HTTP server (default 127.0.0.1:8765) and command entry points such as lookml-parse, embedding-compute, embedding-compute-bulk, semantic-generate-sources, database-introspect, and code-execute Source: python/ktx-daemon/src/ktx_daemon/__main__.py.

3. High-Level System Architecture

The system separates user-facing orchestration (Node/TypeScript CLI) from portable, language-agnostic compute (Python daemon). The CLI owns project state, command parsing, agent integration, and provider wiring; the daemon owns deterministic planning work — LookML parsing, embedding computation, semantic-layer plan generation, and SQL generation — that benefits from Python's data tooling.

flowchart LR
    User([User / Agent]) -->|CLI / MCP| CLI["packages/cli<br/>(TypeScript)"]
    CLI -->|HTTP JSON| Daemon["python/ktx-daemon<br/>(uv-managed Python)"]
    CLI -->|SQL read-only| Warehouse[(SQL Warehouse)]
    CLI -->|catalog fetch| Sources["Context Sources<br/>(dbt, LookML, Looker,<br/>Metabase, Notion, ...)"]
    Daemon --> SL["ktx-sl<br/>(semantic-layer planner)"]
    Daemon --> Embed["sentence-transformers<br/>(local embeddings)"]
    CLI --> LLM["LLM Providers<br/>(Anthropic, Vertex,<br/>AI Gateway, Claude Agent,<br/>Codex SDK)"]
    CLI --> State[(".ktx/<br/>local state")]
    Sources --> Wiki["wiki/ tree"]
    Sources --> SL

4. Core Subsystems

4.1 CLI and Project Bootstrap

The CLI is published from packages/cli and depends on @modelcontextprotocol/sdk, commander, ink, @notionhq/client, snowflake-sdk, mysql2, mssql, pg, better-sqlite3, openai, @openai/codex-sdk, posthog-node, and zod Source: packages/cli/package.json. The setup command orchestrates project creation, provider configuration, connection setup, agent integration, and runtime installation Source: packages/cli/src/commands/setup-commands.ts. Supported agent targets include claude-code, claude-desktop, codex, cursor, opencode, and universal, with --global / --local scopes and an --install-dir override Source: packages/cli/src/commands/setup-commands.ts.

4.2 Context Engine and Semantic Layer

The context engine ingests databases, BI tools, modeling code, and knowledge content, then organizes, deduplicates, and flags contradictions for human review. The semantic layer combines raw tables and high-level metrics through a join graph that automatically resolves chasm and fan traps, so agents fetch metrics declaratively instead of rewriting canonical SQL each time Source: README.md.

The Python semantic-layer module validates duplicate measure names, builds column provenance, and plans queries through query_semantic_layer, emitting sl_plan_completed and sql_gen_completed telemetry events Source: python/ktx-daemon/src/ktx_daemon/semantic_layer.py. Source generation from schema scans infers column roles (time / default), generates measures from numeric columns, and normalizes relationship declarations (MANY_TO_ONE, ONE_TO_MANY, ONE_TO_ONE) Source: python/ktx-daemon/src/ktx_daemon/source_generation.py.

4.3 LookML and Embeddings

LookML projects are parsed by parse_lookml_project, which collects constants, parses files, resolves extends/refinements and column references, and emits views and joins filtered against skipped views Source: python/ktx-daemon/src/ktx_daemon/lookml.py.

Embeddings use a local sentence-transformers backend by default (all-MiniLM-L6-v2, 384 dimensions, max batch size 100) and expose both single and bulk endpoints Source: python/ktx-daemon/src/ktx_daemon/embeddings.py.
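The batch cap matters mostly to callers of the bulk endpoint. Below is a minimal sketch of client-side chunking, assuming the cap of 100 applies per request. `chunk_texts` and `MAX_BATCH_SIZE` are hypothetical names for illustration, not part of ktx.

```python
from typing import Iterator, List, Sequence

# Mirrors the documented default cap; the constant name is illustrative.
MAX_BATCH_SIZE = 100


def chunk_texts(
    texts: Sequence[str], max_batch_size: int = MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `max_batch_size` long."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be positive")
    for start in range(0, len(texts), max_batch_size):
        yield list(texts[start:start + max_batch_size])


# 250 inputs split into three requests.
batch_sizes = [len(batch) for batch in chunk_texts([f"doc {i}" for i in range(250)])]
```

Each yielded batch could then be sent as one bulk request; how the daemon reacts to an oversized batch is not described here, so chunking on the caller side is the conservative choice.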

4.4 Telemetry and Operations

Both the CLI and the daemon emit a shared, schema-validated telemetry catalog. The daemon emits a Python-specific subset (daemon_started, daemon_stopped, sl_plan_completed, sql_gen_completed) and builds a common envelope including cliVersion, nodeVersion (Python version), osPlatform, arch, runtime (daemon-py), and isCi Source: python/ktx-daemon/src/ktx_daemon/telemetry/events.py. The schema enumerates each event's allowed fields and rejects unknown events Source: python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json.

5. Project Layout on Disk

A ktx setup project lays out files in a predictable shape Source: README.md:

my-project/
├── ktx.yaml                         # Project configuration
├── semantic-layer/<connection-id>/  # YAML semantic sources
├── wiki/global/                     # Shared business context
├── wiki/user/<user-id>/             # User-scoped notes
├── raw-sources/<connection-id>/     # Ingest artifacts and reports
└── .ktx/                            # Local state and secrets, git-ignored

Project resolution defaults to KTX_PROJECT_DIR, then the nearest ktx.yaml, then the current directory; pass --project-dir <path> when scripting Source: README.md. Users commit ktx.yaml, semantic-layer/, and wiki/, while .ktx/ stays local.
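The resolution order above can be sketched in a few lines. This is an illustration of the documented precedence, not the CLI's actual code (which is TypeScript); `resolve_project_dir` and its parameters are hypothetical names.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_project_dir(
    cwd: Path,
    env: Mapping[str, str] = os.environ,
    explicit: Optional[Path] = None,
) -> Path:
    """Return the project directory using the documented precedence."""
    # An explicit --project-dir value overrides everything else.
    if explicit is not None:
        return explicit
    # KTX_PROJECT_DIR, when set and non-empty.
    from_env = env.get("KTX_PROJECT_DIR")
    if from_env:
        return Path(from_env)
    # Nearest ktx.yaml, walking from cwd up to the filesystem root.
    for candidate in (cwd, *cwd.parents):
        if (candidate / "ktx.yaml").is_file():
            return candidate
    # Otherwise fall back to the current directory.
    return cwd


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "ktx.yaml").write_text("")  # empty placeholder file
    nested = root / "notebooks" / "q3"
    nested.mkdir(parents=True)
    found_by_walk = resolve_project_dir(nested, env={})
    found_by_env = resolve_project_dir(
        nested, env={"KTX_PROJECT_DIR": str(root / "notebooks")}
    )
```

The sketch shows why scripts that change directories benefit from an explicit path: without one, the result depends on both the environment and where the script happens to run.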

6. Extension Points Relevant to the Community

Recent issues describe planned adapters and connectors that plug into the same architecture: Sigma context sources modeled on the Looker adapter (#168), Amazon Redshift read-only connectors (#161), Confluence page-based adapters modeled on Notion (#169), and MongoDB treated as a primary context source alongside ClickHouse (#305). Version 0.13.1 (2026-06-23) shipped a SQL Server CTE-hoisting fix, and v0.13.0 introduced cross-database federation for DuckDB and a ktx setup --agents install directory override — all of which extend the existing CLI + daemon split rather than restructuring it.

Database Connectors & Context Source Adapters

Related topics: Overview & System Architecture, Semantic Layer, Wiki & Hybrid Search (Context Engine)

Architecture & Scope

ktx distinguishes two complementary extension surfaces that together build the context an AI agent consumes:

Database connectors under packages/cli/src/connectors/ provide read-only access to SQL warehouses. They drive schema introspection, table/column sampling, and (where supported) query-history ingestion used by ktx ingest Source: [README.md:1-118].
Context source adapters under packages/context/src/ingest/adapters/ ingest business knowledge (BI tools, transformation manifests, wiki pages) and convert it into the on-disk wiki/ and semantic-layer/ trees Source: [README.md:1-118].

The README makes the split explicit: ktx "works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite" on the warehouse side, and "integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion" on the knowledge side Source: [README.md:67-72].

A separate Python daemon (python/ktx-daemon) owns the heavy parsing work that the CLI orchestrates — LookML projects, semantic-layer planning, embedding compute, and source generation from raw schema scans Source: [python/ktx-daemon/src/ktx_daemon/__main__.py:1-79].

flowchart LR
  CLI[ktx CLI<br/>packages/cli] -->|ingest| DB[Database Connectors]
  CLI -->|ingest| AD[Context Source Adapters]
  DB -->|schema/sample| DAEMON[ktx-daemon<br/>Python]
  AD -->|raw artifacts| DAEMON
  DAEMON -->|semantic-layer/| FS[semantic-layer/ tree]
  DAEMON -->|wiki/| FS2[wiki/ tree]
  FS --> AGENT[Agents via CLI/MCP]
  FS2 --> AGENT

Database Connectors

The repository is a pnpm + uv workspace, and connector code lives under packages/cli/src/connectors/; the retired packages/connector-* package layout is explicitly not the convention Source: [README.md:137-150]. Each driver is shipped via a focused npm dependency so the install footprint stays minimal:

Driver	Runtime dependency	Source
PostgreSQL	`pg ^8.21.0`	packages/cli/package.json:1-41
MySQL	`mysql2 ^3.22.3`	packages/cli/package.json:1-41
SQL Server	`mssql ^12.5.4`	packages/cli/package.json:1-41
Snowflake	`snowflake-sdk ^2.4.2`	packages/cli/package.json:1-41
BigQuery	`@google-cloud/bigquery` (transitive)	packages/cli/package.json:1-41
ClickHouse	`@clickhouse/client`	packages/cli/package.json:1-41
SQLite	`better-sqlite3 ^12.10.0`	packages/cli/package.json:1-41

These connectors are read-only by design, matching the "Read-only by design" column in the README's comparison table Source: [README.md:31-35]. A first SQL Server bug fix in v0.13.1 ("hoist leading CTEs out of row-limit derived-table wrap") is illustrative of the kind of correctness work the connectors receive as new dialects and features are exercised Source: [v0.13.1 release notes].

Community requests expand the supported matrix: Amazon Redshift and Amazon Athena are tracked as scoped "read-only connector" issues that mirror the existing SQL Server pattern Source: [issue #161, issue #164]. MongoDB is requested as a *primary* context source alongside the SQL warehouses, on the same model as the existing ClickHouse support Source: [issue #305].

Context Source Adapters

Context source adapters transform third-party artifacts into the on-disk semantic-layer/ and wiki/ trees that the semantic-layer daemon consumes. The adapters in scope today are:

dbt — manifests and models become semantic-layer sources.
MetricFlow — metric definitions become reusable measures.
LookML — parsed by the daemon via lookml-parser and lkml, then resolved into KSL-ready structures Source: [python/ktx-daemon/src/ktx_daemon/lookml.py:1-97].
Looker — modeled as the reference adapter for new BI-tool integrations Source: [issue #168].
Metabase — questions/cards/dashboards become wiki pages.
Notion — pages are converted to Markdown and dropped into wiki/global/ or wiki/user/ Source: [README.md:104-114].

Adapter outputs feed two production paths:

Semantic-layer planning. The daemon loads SourceDefinition models and validates that no two measures on the same source share a name before any join planning runs Source: [python/ktx-daemon/src/ktx_daemon/semantic_layer.py:1-58].
Wiki ingestion. Page-based adapters (Notion today; Confluence proposed) produce Markdown files that the CLI later indexes and the daemon later embeds via ComputeEmbeddingBulkRequest (default 384-dim all-MiniLM-L6-v2) Source: [python/ktx-daemon/src/ktx_daemon/embeddings.py:1-99].

Open community work targets Sigma Computing (workbooks, datasets, calculated columns), Confluence Cloud spaces, and MongoDB; each issue explicitly names an existing adapter to model the new one on, which keeps the contract uniform across BI tools and page-based knowledge bases Source: [issue #168, issue #169, issue #305].

Configuration via Setup

Connectors and adapters are wired in by ktx setup. setup-commands.ts defines the option names that gate each integration, and the whitelist of options that trigger the source-configuration flow is explicit and centralized:

sourceAuthTokenRef
sourceUrl
sourceApiKeyRef
sourceClientId, sourceClientSecretRef
sourceWarehouseConnectionId
sourceProjectName
sourceProfilesPath
sourceTarget
metabaseDatabaseId
notionCrawlMode
skipSources

Source: [packages/cli/src/commands/setup-commands.ts:1-21]
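As a rough illustration of how such a whitelist can gate a flow, the sketch below checks whether any listed option was supplied. The real logic lives in the TypeScript CLI and is not reproduced here; `triggers_source_flow` is a hypothetical helper, and treating `None` as "not passed" is an assumption.

```python
from typing import Any, Mapping

# Option names copied from the list above.
SOURCE_OPTION_NAMES = frozenset({
    "sourceAuthTokenRef", "sourceUrl", "sourceApiKeyRef", "sourceClientId",
    "sourceClientSecretRef", "sourceWarehouseConnectionId", "sourceProjectName",
    "sourceProfilesPath", "sourceTarget", "metabaseDatabaseId",
    "notionCrawlMode", "skipSources",
})


def triggers_source_flow(options: Mapping[str, Any]) -> bool:
    """True when at least one whitelisted option was supplied."""
    # Assumption: an option that was not passed is absent or None.
    return any(options.get(name) is not None for name in SOURCE_OPTION_NAMES)
```

For example, `triggers_source_flow({"sourceUrl": "https://example.test"})` is true, while options that only concern agent integration do not start the source-configuration flow.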

The same command also drives agent integration with explicit --agents, --target (claude-code | claude-desktop | codex | cursor | opencode | universal), --global, --local, --install-dir, and --skip-agents options, plus connection-setup recovery added in v0.9.0 Source: [packages/cli/src/commands/setup-commands.ts:23-41, v0.9.0 release notes]. v0.6.0 introduced a --skip-context-sources clack-style tree picker so users can opt out of individual adapters without re-running the whole wizard Source: [v0.6.0 release notes].

After setup, every configured connection and source feeds ktx ingest, which always builds the enriched semantic layer (the legacy --fast flag was removed in v0.8.0) Source: [v0.8.0 release notes]. Successful runs emit typed telemetry events (ingest_completed, wiki_query_completed, mcp_request_completed) with strict JSON-Schema field whitelists — no file paths, SQL, table names, or argv are recorded in the catalog payload Source: [python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json:1-118, python/ktx-daemon/src/ktx_daemon/telemetry/events.py:1-58].

Common Failure Modes

Wrong package location. New contributors often reach for packages/connector-<name>/; that layout was retired in favor of modules inside packages/cli Source: [README.md:137-150, issue #161].
Not using --skip-context-sources. Without the picker, a misconfigured adapter (for example, a Notion token without the right scopes) blocks the entire setup run; v0.6.0 added the picker specifically to make individual sources skippable Source: [v0.6.0 release notes].
Treating ingest as optional. Since v0.8.0 the --fast flag is gone and ktx ingest always builds the enriched context, so a project that completes setup without running ingest will report "Context sources configured: yes" but have no wiki/ or semantic-layer/ artifacts on disk Source: [v0.8.0 release notes].
Dialect-specific regressions. Dialect fixes (e.g., the SQL Server CTE hoisting bug fixed in v0.13.1) ship in patch releases; upgrade to the latest patch release before filing dialect issues Source: [v0.13.1 release notes].

Semantic Layer, Wiki & Hybrid Search (Context Engine)

Related topics: Overview & System Architecture, Database Connectors & Context Source Adapters, Agent Integration — CLI, MCP Server & LLM Runtime

Purpose and Scope

The Context Engine is the heart of ktx. It ingests warehouse metadata, business knowledge, and approved metric definitions, then exposes a single, hybrid search surface that AI agents (Claude Code, Codex, Cursor, OpenCode) can query at execution time. The engine is split into two cooperating corpora:

A semantic layer (ktx-sl, Python) that resolves a join graph of raw tables and metric definitions into dialect-correct SQL.
A wiki tree (wiki/global/, wiki/user/<user-id>/) of Markdown pages that capture human context — definitions, runbooks, and policies.

Both corpora are indexed with local embeddings so the CLI and MCP server can return combined full-text and semantic matches from a single query such as ktx sl "revenue" or ktx wiki "refund policy" (see README.md §First commands). The engine is read-only by design: agents consume definitions, they do not author them.

The portable compute half of the engine lives in python/ktx-daemon, exposed as a FastAPI service in python/ktx-daemon/src/ktx_daemon/app.py and as subcommands in python/ktx-daemon/src/ktx_daemon/__main__.py.

Semantic Layer Pipeline

Plan and Generate SQL

A semantic-layer query begins with a SemanticLayerQueryRequest carrying a sources list, a declarative query, a dialect, and an optional projectId (python/ktx-daemon/src/ktx_daemon/semantic_layer.py:SemanticLayerQueryRequest). The query_semantic_layer function:

Loads raw sources into SourceDefinition objects and rejects duplicate source names.
Validates measure uniqueness per source via validate_measure_duplicates (from semantic_layer.duplicate_check).
Runs the SemanticEngine planner and emits a QueryResult with resolved measures, joined dimensions, and provenance.
Rewrites the response columns so that dimension expressions and qualified measure references surface under stable names — dimension.expr is preferred, otherwise the qualified_ref of the measure is used (python/ktx-daemon/src/ktx_daemon/semantic_layer.py:_response_columns).
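The validation and column-naming steps above can be sketched as follows. The dataclasses and function names are simplified stand-ins invented for illustration; they are not the daemon's SourceDefinition model or its actual helpers.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measure:
    name: str


@dataclass
class Source:
    name: str
    measures: List[Measure] = field(default_factory=list)


def find_duplicates(names: List[str]) -> List[str]:
    """Return names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [n for n in dict.fromkeys(names) if counts[n] > 1]


def validate_sources(sources: List[Source]) -> List[str]:
    """Collect errors for duplicate source names and per-source duplicate measures."""
    errors = [
        f"duplicate source name: {n}"
        for n in find_duplicates([s.name for s in sources])
    ]
    for source in sources:
        for dup in find_duplicates([m.name for m in source.measures]):
            errors.append(f"duplicate measure '{dup}' in source '{source.name}'")
    return errors


def response_column_name(dimension_expr: Optional[str], qualified_ref: str) -> str:
    """Prefer the dimension expression; otherwise use the measure's qualified ref."""
    return dimension_expr if dimension_expr else qualified_ref


orders = Source("orders", [Measure("revenue"), Measure("revenue"), Measure("count")])
customers = Source("customers", [Measure("count")])
errors = validate_sources([orders, customers])
```

Note that the same measure name on two different sources (`count` here) is not an error in this sketch; uniqueness is checked per source, as described above.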

The same module exposes validate_semantic_layer for a ValidateSourcesRequest, returning per-source warnings and error strings without executing a query. The planner itself lives in python/ktx-sl and is shared by the daemon's subcommands and its HTTP service, so planner behavior is identical in both code paths.

Bootstrap Sources from Scan and LookML

Two adapters build the corpus the planner consumes:

Schema scan → YAML. python/ktx-daemon/src/ktx_daemon/source_generation.py infers each column's time vs default role from its type via _TIME_PATTERN, classifies primary keys with _ID_PATTERN, derives a grain from the PK columns, and synthesizes sum/avg/count/min/max measures with description text from column comments. Joins are inverted (many_to_one ↔ one_to_many) before being attached to the right source.
LookML → KSL. python/ktx-daemon/src/ktx_daemon/lookml.py parses manifest.lkml constants, resolves extends / refinements, substitutes ${TABLE} and ${view.field} references, normalizes type strings through LOOKML_TYPE_MAP, and emits views + joins ready for the semantic-layer planner. Joins whose target view was skipped are dropped.
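To make the schema-scan step concrete, here is a small sketch of role inference, measure synthesis, and relationship inversion. The regular expressions are hypothetical (the real _TIME_PATTERN and _ID_PATTERN are not shown in this manual), and skipping identifier columns when generating measures is an assumption.

```python
import re
from typing import Dict, List

# Hypothetical patterns, chosen only for illustration.
TIME_TYPE = re.compile(r"date|time", re.IGNORECASE)
NUMERIC_TYPE = re.compile(
    r"^(?:(?:big|small|tiny)?int(?:eger)?|numeric|decimal|float|double|real)\b",
    re.IGNORECASE,
)
ID_NAME = re.compile(r"(?:^|_)id$", re.IGNORECASE)

INVERSE = {
    "many_to_one": "one_to_many",
    "one_to_many": "many_to_one",
    "one_to_one": "one_to_one",
}


def column_role(sql_type: str) -> str:
    """Classify a column as 'time' or 'default' from its SQL type."""
    return "time" if TIME_TYPE.search(sql_type) else "default"


def measures_for(column: str, sql_type: str) -> List[Dict[str, str]]:
    """Synthesize aggregate measures for numeric, non-identifier columns."""
    if not NUMERIC_TYPE.match(sql_type) or ID_NAME.search(column):
        return []  # assumption: identifier columns get no aggregates
    return [
        {"name": f"{agg}_{column}", "expr": f"{agg.upper()}({column})"}
        for agg in ("sum", "avg", "count", "min", "max")
    ]


def invert_relationship(kind: str) -> str:
    """Flip a relationship so it can be attached to the other side of the join."""
    return INVERSE[kind.lower()]


amount_measures = [m["name"] for m in measures_for("amount", "NUMERIC(10,2)")]
```

Type-based heuristics like these are easy to get wrong at the edges (an `interval` column is not an integer), which is one reason generated sources are worth reviewing before they are committed.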

Both adapters are reachable from the daemon CLI (semantic-generate-sources, lookml-parse, database-introspect) and from HTTP routes registered in python/ktx-daemon/src/ktx_daemon/app.py.
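The LookML reference substitution and join filtering described above can be approximated with a single-pass sketch. It assumes field SQL has already been collected into a flat mapping and does not recurse into nested references; `substitute_refs` and `drop_joins_to_skipped` are hypothetical names, not functions from lookml.py.

```python
import re
from typing import Dict, List, Set

# Matches ${view.field}; ${TABLE} has no dot and is handled separately.
FIELD_REF = re.compile(r"\$\{(\w+)\.(\w+)\}")


def substitute_refs(sql: str, table_alias: str, field_sql: Dict[str, str]) -> str:
    """Replace ${TABLE} with an alias and ${view.field} with known field SQL."""
    sql = sql.replace("${TABLE}", table_alias)

    def expand(match: "re.Match[str]") -> str:
        key = f"{match.group(1)}.{match.group(2)}"
        # Unknown references are left untouched so a caller can report them.
        return field_sql.get(key, match.group(0))

    return FIELD_REF.sub(expand, sql)


def drop_joins_to_skipped(
    joins: List[Dict[str, str]], skipped_views: Set[str]
) -> List[Dict[str, str]]:
    """Keep only joins whose target view was not skipped."""
    return [j for j in joins if j["view"] not in skipped_views]


resolved = substitute_refs(
    "${TABLE}.amount - ${orders.discount}",
    table_alias="orders",
    field_sql={"orders.discount": "orders.discount_amt"},
)
kept = drop_joins_to_skipped([{"view": "users"}, {"view": "events"}], {"events"})
```

Because dropped joins simply disappear from the output, a caller that cares about completeness should compare the result against the skipped-view list rather than assume every declared join survived.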

Engine Endpoints

The FastAPI app wires the planner, validator, embedder, and adapters into the routes that ktx setup enables:

Endpoint group	Purpose
`POST /semantic-layer/query`	Plan + render SQL for a metric query
`POST /semantic-layer/validate`	Source-level validation with warnings
`POST /lookml/parse`	Convert a LookML bundle into KSL sources
`POST /semantic-generate-sources`	Build sources from a schema scan
`POST /embedding/compute` and `.../compute-bulk`	Local sentence-transformer vectors
`POST /code/execute` (opt-in via `--enable-code-execution`)	Sandboxed Python with `numpy` JSON helpers

Subcommands mirror the same operations (__main__.py) so they can be invoked from uv run ktx-daemon … in CI or local scripts.
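For scripted use, a request to the query route could be assembled along these lines. The payload keys follow the SemanticLayerQueryRequest fields named earlier (sources, query, dialect, optional projectId), but the exact JSON casing, the inner shape of `query`, and the dialect string are assumptions; `build_query_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

DAEMON_URL = "http://127.0.0.1:8765"  # documented default bind address


def build_query_request(
    sources: List[Dict[str, Any]],
    query: Dict[str, Any],
    dialect: str,
    project_id: Optional[str] = None,
    base_url: str = DAEMON_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /semantic-layer/query."""
    payload: Dict[str, Any] = {"sources": sources, "query": query, "dialect": dialect}
    if project_id is not None:
        payload["projectId"] = project_id
    return urllib.request.Request(
        url=f"{base_url}/semantic-layer/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The query body below is a made-up shape, used only to show the call.
request = build_query_request(
    sources=[],
    query={"measures": ["orders.revenue"]},
    dialect="postgres",
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` while the daemon is running; the response format is not covered by this sketch.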

Wiki and Hybrid Search

The wiki half of the engine is a Markdown tree under the project root. ktx wiki "<query>" performs hybrid search — keyword plus semantic embedding match — over wiki/global/ and the per-user wiki/user/<user-id>/ namespace (README.md §Project Layout). Embeddings are produced locally through python/ktx-daemon/src/ktx_daemon/embeddings.py, which exposes a SentenceTransformer provider with these defaults:

Setting	Value	Source
Model	`all-MiniLM-L6-v2`	`DEFAULT_SENTENCE_TRANSFORMER_MODEL`
Dimensions	`384`	`DEFAULT_EMBEDDING_DIMENSIONS`
Batch size cap	`100`	`DEFAULT_MAX_BATCH_SIZE`

At search time, the CLI and MCP server combine a full-text scan of the indexed Markdown with a vector similarity lookup against the same index, so a single natural-language query returns both metric definitions from the semantic layer and human-authored notes. Telemetry confirms the loop: wiki_query_completed records queryLength, resultCount, durationMs, and outcome (python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json), and mcp_request_completed is emitted for sampled MCP tool calls.
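This manual does not say how the two result lists are merged. One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion, sketched below as a swapped-in technique rather than ktx's actual method. The term-overlap scorer is a crude stand-in for BM25, and the two-dimensional vectors are toy data.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_rank(query: str, docs: Dict[str, str]) -> List[str]:
    """Rank documents by the number of query terms they contain."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(text.lower().split())) for d, text in docs.items()}
    return [d for d in sorted(scores, key=lambda d: -scores[d]) if scores[d] > 0]


def vector_rank(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]
) -> List[str]:
    """Rank documents by cosine similarity to the query vector."""
    return sorted(doc_vecs, key=lambda d: -cosine(query_vec, doc_vecs[d]))


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse rankings: each list contributes 1 / (k + rank) per document."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=lambda d: -fused[d])


docs = {
    "refunds": "refund policy for annual plans",
    "revenue": "net revenue definition",
    "oncall": "on-call runbook",
}
doc_vecs = {"refunds": [0.9, 0.1], "revenue": [0.2, 0.8], "oncall": [0.5, 0.5]}
results = reciprocal_rank_fusion([
    keyword_rank("refund policy", docs),
    vector_rank([1.0, 0.0], doc_vecs),
])
```

Rank fusion has the practical advantage of not needing the keyword and vector scores to be on the same scale, which is why it is a frequent default for hybrid search.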

flowchart LR
    A[Source: dbt / LookML / schema scan] --> B[source_generation.py / lookml.py]
    B --> S[(semantic-layer/ YAML)]
    W[wiki/global + wiki/user Markdown] --> I[(Indexer: BM25 + 384-d embeddings)]
    S --> I
    I --> H[Hybrid Search]
    Q[ktx sl / ktx wiki / MCP tool] --> H
    H --> R[SQL plan or wiki excerpt]
    R --> Agent[Claude Code / Codex / Cursor]

Configuration, Telemetry, and Failure Modes

Project resolution defaults to KTX_PROJECT_DIR, then the nearest ktx.yaml, then the current directory (README.md §Project Layout). The engine surfaces its readiness through ktx status, which reports LLM readiness, embedding readiness, configured databases, configured context sources, and whether the ktx context has been built (ktx context built: yes). The hybrid search layer depends on all of these.

Telemetry is opt-out and schema-validated by events.schema.json. sl_plan_completed, sql_gen_completed, wiki_query_completed, and mcp_request_completed are the events most relevant to diagnosing context-engine behavior; the runtime field distinguishes node from daemon-py so mixed-runtime runs are visible. Error payloads explicitly redact SQL text, row data, secrets, argv, and user-typed prompt text, per README.md §Telemetry.
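An allow-list of this kind can be pictured as a simple lookup that rejects anything not declared. The sketch below uses the wiki_query_completed field names mentioned in this manual; it does not reproduce the layout of the real events.schema.json, and `validate_event` is a hypothetical helper.

```python
from typing import Any, Dict, FrozenSet, Mapping

# Only one event is listed here; the real catalog declares more.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "wiki_query_completed": frozenset(
        {"queryLength", "resultCount", "durationMs", "outcome"}
    ),
}


def validate_event(name: str, fields: Mapping[str, Any]) -> None:
    """Raise ValueError for events or fields outside the allow-list."""
    allowed = ALLOWED_FIELDS.get(name)
    if allowed is None:
        raise ValueError(f"unknown event: {name}")
    extra = sorted(set(fields) - allowed)
    if extra:
        raise ValueError(f"unexpected fields for {name}: {extra}")
```

The useful property is that a field such as the raw query text cannot be recorded by accident: unless it is declared in the schema, the event is rejected before it is sent.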

Common failure modes that users report:

Plan fails on duplicate measure names. The planner raises on a duplicate measure inside the same source, surfaced through _validate_duplicate_measure_names.
LookML joins dropped silently. When a referenced view is skipped (e.g., unsupported type), parse_lookml_project filters the join out; the skipped_views and warnings lists are the only signal.
Stale wiki after edits. Hybrid search returns outdated pages until the index is rebuilt; re-run ktx ingest after editing wiki/ or semantic-layer/.
Missing context sources. Community issues request integrations that slot into the existing ingest pipeline: context source adapters under packages/context/src/ingest/adapters/ for Sigma (#168, modeled on Looker) and Confluence (#169, modeled on Notion), a read-only Redshift connector (#161), and MongoDB as a primary context source (#305).

Agent Integration — CLI, MCP Server & LLM Runtime

Related topics: Overview & System Architecture, Semantic Layer, Wiki & Hybrid Search (Context Engine)

Overview

ktx is a self-improving context layer that gives agents (Claude Code, Codex, Cursor, OpenCode) approved metric definitions, joinable columns, and business knowledge for querying a warehouse. Source: README.md.

The agent integration layer is split across three runtimes that cooperate at execution time:

TypeScript CLI (packages/cli) — the user-facing command surface, project bootstrap, status checks, ingestion orchestration, and shell completions.
MCP server — exposed by the CLI both over stdio (for in-process agent clients) and HTTP (for concurrent multi-agent use), backed by a managed local daemon.
Python compute daemon (python/ktx-daemon) — a FastAPI portable compute service that handles LookML parsing, semantic-layer SQL generation, schema introspection, embeddings, and optional sandboxed code execution.

LLM backends are pluggable: Anthropic API, Google Vertex AI, an AI Gateway, the local Claude Code session via the Claude Agent SDK, or Codex authentication via the Codex SDK. The Codex backend was added in v0.9.0 (issue #253). Source: packages/cli/package.json (dependencies include @anthropic-ai/claude-agent-sdk and @openai/codex-sdk).

flowchart LR
    User[User / Agent client] -->|invokes| CLI[ktx CLI - packages/cli]
    CLI -->|reads/writes| Project[(ktx.yaml, semantic-layer/, wiki/)]
    CLI -->|spawns| Daemon[ktx-daemon - FastAPI :8765]
    CLI -->|registers| MCP[MCP server stdio / HTTP]
    MCP -->|tool calls| Daemon
    Daemon -->|SQL/embeddings| Warehouse[(Warehouse)]
    Daemon -->|LLM calls| LLM[Anthropic / Vertex / Codex / AI Gateway]
    MCP -->|tools| Agents[Claude Code / Codex / Cursor / OpenCode]

CLI Command Surface

The CLI is the entry point and is published to npm as @kaelio/ktx. After npm install -g @kaelio/ktx and ktx setup, the project is ready and ktx status reports readiness flags (LLM ready, Embeddings ready, Databases configured, Context sources configured, ktx context built, Agent integration ready). Source: README.md.

First-command table from the README:

Command	Purpose
`ktx setup`	Create, resume, or update a ktx project
`ktx status`	Check project readiness
`ktx ingest`	Build context for every configured connection
`ktx sl "revenue"`	Search semantic sources
`ktx wiki "refund policy"`	Search local wiki pages
`ktx mcp start`	Start the MCP server for agent clients

Setup, Status, and Ingest

ktx setup is interactive and walks the user through provider and connection configuration, agent integration install, and ingestion. v0.13.0 added an install directory override for ktx setup --agents (issue #298); v0.12.0 added a wordmark banner (issue #290). Source: README.md.

ktx ingest accepts a connection id, --all, and query-history toggles. The --fast flag (added in v0.5.0) was retired in v0.8.0, so ingestion always builds enriched context. v0.8.0 also added profiling for ingest runs that separates model time from tool time (issue #249). Source: packages/cli/src/commands/ingest-commands.ts.

Knowledge and Semantic-Layer Search

The wiki subcommand lists, searches, or reads local wiki pages, with --user-id, --limit, and --output (pretty / plain / json) options. Source: packages/cli/src/commands/knowledge-commands.ts. A matching sl subcommand searches semantic-layer entities by name or description.

Shell Completion

ktx completion <shell> prints an evaluation-ready script for zsh or bash; the hidden __complete subcommand streams newline-separated candidates so TAB-press never fails. Source: packages/cli/src/commands/completion-commands.ts.

MCP Server

The mcp command group manages the MCP HTTP server; the bare ktx mcp invocation prints current daemon status (URL, PID, token-auth state, project dir). If ktx status prints a ktx mcp start --project-dir ... hint, the MCP server is not running and must be started manually before opening the agent client. Source: README.md.

Two transport modes are supported:

stdio — ktx mcp stdio runs the MCP server in the calling process; ideal for single-agent in-process integration.
HTTP — a managed local daemon bound by default to 127.0.0.1:8765 (configurable via --host / --port), serving concurrent agent clients with optional token auth.

Source: packages/cli/src/commands/mcp-commands.ts and the daemon entry point at python/ktx-daemon/src/ktx_daemon/__main__.py (the serve-http subcommand accepts --host, --port, --log-level, and --enable-code-execution).

The daemon's FastAPI app exposes endpoints for semantic-layer query/validation, LookML parsing, source generation, SQL analysis (read-only validation + batch analysis), database introspection, embedding computation (single and bulk), and optionally POST /code/execute when launched with --enable-code-execution. Source: python/ktx-daemon/src/ktx_daemon/app.py.

The --enable-code-execution flag is gated because exposing POST /code/execute over HTTP executes arbitrary Python in the daemon process and is therefore opt-in.

LLM Runtime and Managed Python

The CLI delegates heavy compute (LookML resolution, sqlglot SQL transforms, local embeddings) to the Python daemon, which the CLI self-provisions via a pinned uv runtime. v0.12.0 introduced deferred MCP Python runtime installation; v0.9.0 added the Codex LLM backend. Source: README.md and the dependency list at packages/cli/package.json (the CLI depends on @modelcontextprotocol/sdk, @openai/codex-sdk, and ai for multi-provider routing).

Project layout under a configured repo:

my-project/
├── ktx.yaml                         # Project configuration
├── semantic-layer/<connection-id>/  # YAML semantic sources
├── wiki/global/                     # Shared business context
├── wiki/user/<user-id>/             # User-scoped notes
├── raw-sources/<connection-id>/     # Ingest artifacts and reports
└── .ktx/                            # Local state and secrets, git-ignored

Source: README.md. ktx.yaml, semantic-layer/, and wiki/ are committed; .ktx/ stays local. Project resolution order is KTX_PROJECT_DIR, then the nearest ktx.yaml, then the current directory; --project-dir <path> overrides this for scripted use.

Telemetry from the daemon is shaped by an allow-list schema (x-ktx-catalog) covering daemon_started, daemon_stopped, sl_plan_completed, and sql_gen_completed. Common envelope fields include cliVersion, nodeVersion (actually Python runtime here), osPlatform, osRelease, arch, runtime ("daemon-py" or "node"), and isCi. Source: python/ktx-daemon/src/ktx_daemon/telemetry/events.py and python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json. Per the README, no file paths, hostnames, SQL text, schema/table/column names, raw env values, argv, or typed prompt text are recorded by catalog telemetry.

Common Failure Modes

ktx mcp start --project-dir ... shown in ktx status — the MCP daemon is not running; start it before opening the agent client. Source: README.md.
Wrong project directory — scripts that change cwd need --project-dir, otherwise resolution falls through KTX_PROJECT_DIR and the nearest ktx.yaml. Source: README.md.
Long LookML projects — LookML is parsed in-memory by the daemon; complex projects with many views produce warnings and skipped views that the response surfaces through ParseLookMLResponse.skipped_views and .warnings. Source: python/ktx-daemon/src/ktx_daemon/lookml.py.
Telemetry noise — disabled via the documented opt-out; the README also warns that Posthog error tracking can include stack frames with local paths.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 26 structured pitfall item(s), including 5 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

Severity: high
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/Kaelio/ktx/issues/305

2. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Developers should check this security_permissions risk before relying on the project: Add Sigma context source
User impact: Developers may expose sensitive permissions or credentials: Add Sigma context source
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Add Sigma context source. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_issue | https://github.com/Kaelio/ktx/issues/168

3. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/Kaelio/ktx/issues/161

4. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/Kaelio/ktx/issues/169

5. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/Kaelio/ktx/issues/168

6. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v0.12.0
User impact: Upgrade or migration may change expected behavior: v0.12.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v0.12.0. Context: Observed when using python
Evidence: failure_mode_cluster:github_release | https://github.com/Kaelio/ktx/releases/tag/v0.12.0

7. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v0.13.0
User impact: Upgrade or migration may change expected behavior: v0.13.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v0.13.0. Context: Observed during installation or first-run setup.
Evidence: failure_mode_cluster:github_release | https://github.com/Kaelio/ktx/releases/tag/v0.13.0

8. Installation risk: Installation risk requires verification

Severity: medium
Finding: Developers should check this installation risk before relying on the project: v0.6.0
User impact: Upgrade or migration may change expected behavior: v0.6.0
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: v0.6.0. Context: Observed when using node, windows
Evidence: failure_mode_cluster:github_release | https://github.com/Kaelio/ktx/releases/tag/v0.6.0

9. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/Kaelio/ktx

10. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: Add Amazon Redshift connector
User impact: Developers may misconfigure credentials, environment, or host setup: Add Amazon Redshift connector
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Add Amazon Redshift connector. Context: Observed during installation or first-run setup.
Evidence: failure_mode_cluster:github_issue | https://github.com/Kaelio/ktx/issues/161

11. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: Add Confluence context source
User impact: Developers may misconfigure credentials, environment, or host setup: Add Confluence context source
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Add Confluence context source. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_issue | https://github.com/Kaelio/ktx/issues/169

12. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Developers should check this configuration risk before relying on the project: Add MongoDB connector
User impact: Developers may misconfigure credentials, environment, or host setup: Add MongoDB connector
Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Add MongoDB connector. Context: Source discussion did not expose a precise runtime context.
Evidence: failure_mode_cluster:github_issue | https://github.com/Kaelio/ktx/issues/305

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page).

Use: review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using ktx with real data or production workflows.

Add Sigma context source - github / github_issue
Add Amazon Redshift connector - github / github_issue
Add Confluence context source - github / github_issue
Add MongoDB connector - github / github_issue
v0.13.1 - github / github_release
v0.13.0 - github / github_release
v0.12.0 - github / github_release
v0.11.0 - github / github_release
v0.10.0 - github / github_release
v0.9.0 - github / github_release
v0.8.0 - github / github_release
v0.7.0 - github / github_release

Source: Project Pack community evidence and pitfall evidence
