Kiln Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

Kiln

Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.

Kiln Overview and System Architecture

Kiln is an integrated workbench for the full AI development loop — evals, optimization, prompts, RAG, fine-tuning, synthetic data, agents, and tools — packaged as a desktop application with an MIT-licensed Python core library and a REST server. The platform is designed so that non-technical team members (PMs, subject-matter experts, QA) can collaborate on AI systems alongside engineers, without writing code, while engineers retain a full programmatic surface for production deployment.

Purpose and Scope

The core value proposition described in the project README is that Kiln unifies the entire AI workflow that is normally scattered across notebooks, scripts, dashboards, and vendor UIs. As stated in the repository's main description, it is "a workbench for the full AI development loop" that ships as a desktop app plus an open Python library so "the same tasks [can be deployed] to production."

Source: README.md

The platform is intentionally flexible about model access — users can bring their own API keys (OpenAI, OpenRouter, etc.) or run fully offline using Ollama. Datasets are stored as open JSON files that the user owns and controls, making them easy to version, diff, and migrate.

Source: libs/core/README.md

A commercial tier, Kiln Pro, adds the AI Assistant, Auto-Optimize, and the Eval Builder. The README clarifies that Pro is opt-in and "the core Kiln app remains fully functional without it," meaning the open-source portions are not crippled by the paid features.

Source: README.md

System Components

Kiln is organized into several distinct packages, each with its own PyPI distribution and release cadence. The repository reflects this in its layout under libs/ and app/.

Component Overview

Component	Purpose	Distribution
Desktop App	Cross-platform GUI (Mac/Windows/Linux) for non-technical users	Source-available under `app/` (fair-code)
Core Python Library	Data model, task runner, RAG, agents, CLI	`kiln-ai` (MIT)
REST Server	FastAPI server exposing Kiln operations over HTTP	`kiln-server`
MCP Server	Allows Kiln tools (e.g. Search) to be called from MCP clients	Ships with `kiln-server` (`kiln_mcp` command)
CLI	Typer-based command-line entry into the core library	Bundled in `kiln-ai`

Sources:

README.md
libs/core/README.md
libs/server/README.md
libs/server/kiln_ai/mcp/README.md

Architecture Diagram

flowchart LR
    UI[Desktop App UI<br/>app/desktop]
    GS[Git Auto Sync<br/>app/desktop/git_sync]
    CLI[Kiln CLI<br/>kiln_ai/cli/cli.py]
    CORE[Core Library<br/>kiln_ai/datamodel, agents, RAG, fine-tuning]
    SRV[REST Server<br/>kiln_server]
    MCP[MCP Server<br/>kiln_mcp]
    EXT[External AI Providers<br/>OpenAI, OpenRouter, Ollama, Fireworks, Vertex]
    FS[(Project Folder<br/>.kiln JSON files)]

    UI --> CORE
    UI --> GS
    CLI --> CORE
    SRV --> CORE
    MCP --> CORE
    CORE --> EXT
    CORE --> FS
    GS --> FS

The desktop app, CLI, REST server, and MCP server all sit on top of the same core Python library. This shared foundation is what allows a task prototyped in the GUI to be exported via the package_project command and run unmodified from code, a server, or an MCP client.

Source: libs/core/kiln_ai/cli/commands/test_package_project.py

Data Model and Project Layout

Kiln projects are stored as a directory of small JSON files, most using the .kiln extension. The README explains the rationale:

Git compatibility: Kiln project folders are easy to collaborate on in git. The filenames use unique IDs to avoid conflicts and allow many people to work in parallel. The files are small and easy to compare using standard diff tools.

Source: libs/core/README.md

The nesting is:

Project — top-level container, referenced via project.kiln
Task — defines instructions, input/output schemas, and requirements (task.kiln)
TaskRun — a single execution sample with input, output, and human ratings
Finetune — configuration and status for fine-tuning jobs on the task's data
Prompt — a versioned prompt template for the task
DatasetSplit — frozen train/test/validation partitions of task runs

The core library enforces schema validation on load and save, so the recommended way to manipulate data is through the kiln_ai.datamodel classes rather than direct JSON edits.

Source: libs/core/README.md
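Because a project is just nested JSON files, it can be read with nothing but the standard library. The sketch below builds a toy project folder and flattens its runs into one row per run, a shape that a dataframe library could consume directly. The `tasks/` and `runs/` folder names, the `task_run.kiln` filename, and every field name are assumptions made for illustration; only `project.kiln` and `task.kiln` come from the text above. For anything that writes data, the `kiln_ai.datamodel` classes remain the recommended path.

```python
import json
import tempfile
from pathlib import Path


def write_json(path: Path, payload: dict) -> None:
    """Write one small JSON file, creating parent folders as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def build_demo_project(root: Path) -> Path:
    """Create a toy project folder. Layout and field names are assumptions."""
    write_json(root / "project.kiln", {"name": "Demo project"})
    task_dir = root / "tasks" / "task_001"
    write_json(task_dir / "task.kiln", {"name": "Summarize"})
    for i, rating in enumerate([5.0, 3.0]):
        write_json(
            task_dir / "runs" / f"run_{i:03d}" / "task_run.kiln",
            {"input": f"text {i}", "output": f"summary {i}", "rating": rating},
        )
    return root


def load_rows(project_root: Path) -> list[dict]:
    """Flatten every run under every task into one row per run."""
    rows = []
    for task_file in sorted(project_root.glob("tasks/*/task.kiln")):
        task = json.loads(task_file.read_text(encoding="utf-8"))
        for run_file in sorted(task_file.parent.glob("runs/*/task_run.kiln")):
            run = json.loads(run_file.read_text(encoding="utf-8"))
            rows.append({"task": task["name"], **run})
    return rows


with tempfile.TemporaryDirectory() as tmp:
    demo_rows = load_rows(build_demo_project(Path(tmp)))

print(demo_rows)
```

Reading this way bypasses the library's schema validation, so it suits read-only analysis rather than edits.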

CLI, Server, and Collaboration Surfaces

Command-Line Interface

The CLI is a Typer application registered at kiln_ai/cli/cli.py. It currently exposes three subcommands:

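# libs/core/kiln_ai/cli/cli.py (excerpt: the imports of typer and of the projects, tasks, and package_project modules are omitted)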
app = typer.Typer(help="Kiln AI CLI - Build AI systems with evals, data gen, fine-tuning, and more.")
app.add_typer(projects.app, name="projects")
app.add_typer(tasks.app, name="tasks")
app.command(name="package_project")(package_project.package_project)

The package_project command produces a minimal export folder containing only the files needed to run a task in production, rather than the full history of runs and evals. The unit tests in test_package_project.py confirm the contract: a project, a task, and a default TaskRunConfig are validated and bundled into a deployable artifact.

Sources:

libs/core/kiln_ai/cli/cli.py
libs/core/kiln_ai/cli/commands/test_package_project.py
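The idea of a minimal export can be illustrated without the real command. The sketch below is not the implementation of package_project; it only shows the pruning principle of copying definition files and leaving run history behind. The `KEEP_NAMES` rule, the `package_minimal` helper, and the folder layout are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical rule: keep definition files, drop per-run history.
KEEP_NAMES = {"project.kiln", "task.kiln"}


def package_minimal(src: Path, dest: Path) -> list[str]:
    """Copy only files named in KEEP_NAMES, preserving relative paths."""
    copied = []
    for path in sorted(src.rglob("*.kiln")):
        if path.name not in KEEP_NAMES:
            continue  # skip run files and anything else not needed at runtime
        rel = path.relative_to(src)
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(rel.as_posix())
    return copied


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    # Assumed layout: one project, one task, one historical run.
    for rel in ["project.kiln", "tasks/t1/task.kiln", "tasks/t1/runs/r1/task_run.kiln"]:
        f = src / rel
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_text("{}", encoding="utf-8")
    packaged = package_minimal(src, Path(tmp) / "out")

print(packaged)
```

The real command also validates the project, task, and default run config before bundling, which this sketch omits.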

REST Server

The REST server is published separately as kiln-server. It runs via the kiln_server command and accepts standard options for host, port, log level, and auto-reload. Its OpenAPI schema is published as a reference for client generation; the desktop app embeds a generated client, visible in the auto-generated model classes such as SyntheticDataGenerationSessionConfigInput and OutputFileInfo.

Sources:

libs/server/README.md
app/desktop/studio_server/api_client/kiln_ai_server_client/models/output_file_info.py

MCP Server

A separate kiln_mcp command (also from the kiln_server distribution) exposes Kiln's tools — notably the Search/RAG tool — to any MCP-compatible client such as Cursor or VS Code. It is currently flagged as Beta and "not designed for production workloads," and requires that search indexing has already been run in the desktop app on the same machine.

Source: libs/server/kiln_ai/mcp/README.md

Git-Native Collaboration

The desktop app includes an automatic Git sync layer implemented as HTTP middleware. Key properties:

Operates inside a hidden .git-projects/ directory inside the user's Kiln Projects folder, so user-installed editors and IDEs never interfere.
Stays within ~15 seconds of the remote via background polling; going offline blocks reads/writes (503 errors) rather than silently diverging.
Requires a developer to create the initial repository, but otherwise lets non-technical users connect with a personal access token.

Source: app/desktop/git_sync/README.md
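Since an offline sync layer answers with 503 instead of accepting writes, a script that talks to the app may want to wait and retry rather than fail immediately. The following is a minimal client-side sketch of that pattern with exponential backoff. The `call_with_retry` helper, its parameters, and the decision to retry at all are assumptions; the sync layer itself does not prescribe a retry policy.

```python
import time
from typing import Callable


def call_with_retry(
    request: Callable[[], int],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `request` until it returns a status other than 503.

    `request` is any callable returning an HTTP status code. Delays double
    after each 503, and the last status is returned when attempts run out.
    """
    status = 503
    for attempt in range(attempts):
        status = request()
        if status != 503:
            return status
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status


# Demo with a fake request that succeeds on the third call.
responses = iter([503, 503, 200])
delays: list[float] = []
final_status = call_with_retry(lambda: next(responses), sleep=delays.append)
print(final_status, delays)
```

Injecting `sleep` keeps the demo instant; in real use the default `time.sleep` applies the delays.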

Feature Surface (and Community Signals)

The README advertises ten high-level capabilities: Intuitive App, Eval Builder, Auto-Optimize, AI Assistant, Git-native collaboration, RAG, Subagents, Synthetic Data Generation, Fine-Tuning, and an Open Python library. These map onto community priorities visible in the issue tracker:

Expose model parameters in UI (issue #31) — users want control over temperature, max_tokens, and similar settings to be first-class in the GUI, which the Kiln run-config model is designed to support.
Kiln-Unsloth bridge for usability (issue #251) — community members want a tighter end-to-end path from Kiln export through Unsloth fine-tuning back into Ollama; the current flow stops at a JSONL export, which is the gap this feature request targets.
Manual data entry / correction (issue #115) — a recurring request from data-curation workflows, addressed by Kiln's open-JSON task-run model.

Common Failure Modes and Design Limits

A few boundaries are explicitly stated and worth knowing before relying on Kiln:

Kiln MCP server is Beta and local-only. It "isn't designed for production workloads," and depends on a pre-indexed project on the same machine. Source: libs/server/kiln_ai/mcp/README.md
Git sync is online-only by design. Going offline blocks reads and writes rather than allowing silent divergence, so workflows that require intermittent connectivity need a different sync strategy. Source: app/desktop/git_sync/README.md
Some features require Kiln Pro. Auto-Optimize, the AI Assistant, and the Eval Builder are gated behind the paid tier, though the rest of the app and library remain usable without it. Source: README.md

AI/ML Pipeline: Models, Adapters, Evals, Optimizers, Fine-Tuning, RAG, and Agents

Overview

Kiln provides an end-to-end AI/ML development workbench that unifies the entire pipeline — model selection, evaluation, prompt optimization, fine-tuning, retrieval-augmented generation (RAG), and agent composition — into a single coherent system. The platform is composed of two primary artifacts: a cross-platform desktop application (app/) used by teams to author tasks and collaborate via Git, and an MIT-licensed Python library (libs/core/) that ships the same projects to production without rewrites.

Source: README.md:18-26

The pipeline is built around a shared data model: every Kiln project is a directory of .kiln JSON files representing Project → Task → TaskRun / Finetune / Prompt / DatasetSplit. This design makes datasets Git-friendly, diff-friendly, and easy to load with standard tools such as Pandas or Polars.

Source: libs/core/README.md:60-77

Architecture: Models and Adapters

Kiln normalizes access to dozens of model providers through an adapter layer. The adapter layer abstracts over OpenAI, Anthropic, Gemini, Bedrock, Ollama, OpenRouter, Fireworks, Groq, and any OpenAI-compatible endpoint, allowing models to be swapped without changing task definitions.

Source: README.md:10-14

At the foundation, a Task is paired with a TaskRunConfig (such as KilnAgentRunConfigProperties) that selects a model, provider, prompt generator, and structured-output mode. CLI tooling in kiln_ai.cli.commands.package_project then packages a minimal project directory containing only the artifacts required to run a task in production, pruning thousands of historical run files.

Source: libs/core/kiln_ai/cli/commands/test_package_project.py:38-65

flowchart LR
    UI[Kiln Desktop App] -->|git sync| Repo[(Project .kiln files)]
    Repo --> Lib[kiln-ai Python library]
    Lib --> Adapter[Model Adapter Layer]
    Adapter --> OpenAI
    Adapter --> Anthropic
    Adapter --> Ollama
    Adapter --> Custom[OpenAI-compatible]
    Adapter --> MCP[MCP Servers]
    Lib --> Eval[Eval Runner]
    Eval --> Judge[LLM-as-Judge]
    Lib --> RAG[RAG / Search Tools]
    Lib --> Agent[Subagent Orchestrator]

Evaluation and Rating System

The evaluation subsystem supports multiple rating primitives so that human reviewers and automated judges can score outputs in compatible ways. The TaskOutputRating model documents this directly:

Supports: five_star: 1–5 star ratings; pass_fail: boolean pass/fail (1.0 = pass, 0.0 = fail); pass_fail_critical: tri-state (1.0 = pass, 0.0 = fail, -1.0 = critical fail).

Source: task_output_rating.py:23-26
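The three rating types can be read as three small sets of allowed values. The sketch below encodes them and rejects anything outside the set. It is an illustration of the rules quoted above, not the library's validator; the `RatingType` enum, the `validate_rating` helper, and the restriction of five_star to whole stars are assumptions.

```python
from enum import Enum


class RatingType(str, Enum):
    FIVE_STAR = "five_star"
    PASS_FAIL = "pass_fail"
    PASS_FAIL_CRITICAL = "pass_fail_critical"


# Allowed numeric values per rating type, following the quoted description.
# Assumption: five_star accepts whole stars only.
ALLOWED_VALUES = {
    RatingType.FIVE_STAR: {1.0, 2.0, 3.0, 4.0, 5.0},
    RatingType.PASS_FAIL: {0.0, 1.0},
    RatingType.PASS_FAIL_CRITICAL: {-1.0, 0.0, 1.0},
}


def validate_rating(rating_type: str, value: float) -> float:
    """Return the value as a float, or raise ValueError if it is not allowed."""
    kind = RatingType(rating_type)  # raises ValueError for unknown types
    number = float(value)
    if number not in ALLOWED_VALUES[kind]:
        raise ValueError(f"{number} is not a valid {kind.value} rating")
    return number


print(validate_rating("five_star", 4))
print(validate_rating("pass_fail_critical", -1.0))
```

Storing every rating as a float is what lets human reviewers and automated judges share one numeric field while the type tag says how to interpret it.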

Per-requirement ratings are captured via RequirementRating, which pairs a numeric value with a rating type, enabling fine-grained evaluation against a specification.

Source: requirement_rating.py:13-19

The Eval Builder feature (introduced and iterated across v0.21–v1.0.3 releases) auto-generates a judge plus a synthetic eval dataset, allowing teams to align models to their preferences in roughly ten minutes.

Source: README.md:34-36

Optimizers, Specs, and Fine-Tuning

Kiln’s optimization surfaces cover three distinct layers:

Layer	Mechanism	Source
Prompt	Kiln Copilot / Specs refine a `Specification` through interactive Q&A (`SubmitAnswersRequest`, `RefineSpecInput`)	`submit_answers_request.py`, `refine_spec_input.py`
Synthetic Data	`SyntheticDataGenerationSessionConfig` orchestrates topic, input, and output generation steps	`synthetic_data_generation_session_config_input.py`
Fine-Tuning	`Finetune` records track status and configuration; zero-code fine-tuning is offered across 60+ models	`libs/core/README.md:60-77`

The ClarifySpecOutput payload bundles examples_for_feedback, a judge_result, and an sdg_session_config, demonstrating how specifications, evaluator feedback, and synthetic-data generation are wired together in a single round-trip.

Source: clarify_spec_output.py:18-22

For fine-tuning, exported project directories include JSONL consumed by hosted trainers (Fireworks, Together, Vertex). Community issue #251 (“Kiln-Unsloth Bridge for Usability”) highlights that local users currently must export JSONL, run a notebook, convert to GGUF, and re-import through Ollama; the issue asks for this path to be streamlined.

Source: README.md:18-26, community context #251

RAG, Agents, and Output Artifacts

RAG is implemented as a first-class tool type. Documents (PDF, image, video, audio) can be dropped into a project to produce a vector index and a corresponding eval that is synthesized from the user’s own documents. RAG evals and tool-use evals were added together with the v0.22–v0.23 agent milestones.

Source: README.md:44-50

Agents are composed hierarchically: any Task can be exposed as a tool, allowing subagents to run in isolated, focused context windows. MCP servers extend this with the open Agent Skills standard (v0.26) and tool-server integrations.

Source: README.md:48-50

Generated artifacts — including fine-tuned checkpoints and RAG indexes — are exposed through OutputFileInfo, which pairs a human-readable name with a signed_url and mime_type so they can be downloaded by clients.

Source: output_file_info.py:14-18

Configuration and CLI Surface

The MIT-licensed CLI (kiln-ai) is exposed via Typer subcommands for projects, tasks, and package_project, enabling headless packaging for deployment:

pip install kiln-ai
python -m kiln_ai.cli.cli package_project --help

Source: libs/core/kiln_ai/cli/cli.py:1-13

A companion REST server (kiln-server) exposes the same engine over HTTP, with OpenAPI documentation published for clients.

Source: libs/server/README.md:15-22

Common Failure Modes and Community Pain Points

Three community issues consistently surface around the pipeline:

Issue #31 — *Expose model parameters in UI (temperature, max_tokens, etc.)* — Users want first-class UI controls and persistence for sampling parameters. Until addressed, parameters must be passed through run-config properties programmatically.
Issue #115 — *Allow manual data entry/correction* — Reviewers want to fix or augment TaskRun records without rerunning the pipeline. The data model is already permissive (open JSON with validation in the library), but the UI flow for manual edits is still a frequent request.
Issue #251 — *Kiln-Unsloth Bridge* — Local fine-tuning ergonomics are the largest gap; users want one-click export → GGUF → Ollama registration.

Source: community context #31, #115, #251

Desktop App, Web UI, and the Kiln Chat Assistant

Overview

Kiln ships as three coordinated surfaces that share one data model and one set of Python primitives:

A cross-platform desktop app (Mac, Windows, Linux) that wraps a local web UI and runs an embedded REST server.
An in-app Web UI (SvelteKit) used by both technical and non-technical teammates for evals, datasets, RAG, fine-tuning, and prompts.
The Kiln Chat Assistant, an agentic chat panel embedded in the UI that can run evals, propose optimizations, and call backend APIs with streaming status.

The desktop app's positioning is described as "a workbench for the full AI development loop: evals, optimization, prompts, RAG, fine-tuning, synthetic data, agents, and tools - all working together" Source: README.md:5-7. The same file notes that "The MIT-licensed Python library ships the same tasks to production" Source: README.md:7-8, confirming that what is configured in the UI is portable to a deployable Python pipeline.

The Kiln Chat Assistant was introduced in v0.28.0 as "a new chat/assistant panel in the app" and continues to evolve; v1.0.3 ships "Improved Assistant: Kiln Assistant can now call APIs with streaming status, like evals" Source: README.md:3-5. Kiln Pro is the opt-in service tier that unlocks Assistant-driven flows like Auto-Optimize and the Eval Builder Source: README.md:39-41.

Architecture: Desktop Shell, Web UI, and Backend Services

The desktop bundle embeds the web UI and a local REST server, so the UI talks to an in-process backend rather than a remote cloud. The Studio Server ships an auto-generated Python client whose model catalog reveals the API surface available to the chat assistant.

flowchart LR
  User[User / Team] --> Desktop[Kiln Desktop App]
  Desktop --> WebUI[SvelteKit Web UI]
  WebUI -->|REST + streaming| StudioServer[Studio Server / FastAPI]
  StudioServer --> CoreLib[kiln_ai Python library]
  CoreLib --> Providers[(OpenAI / OpenRouter / Ollama / Fireworks / Vertex)]
  Desktop -. optional .-> GitSync[Git Auto Sync]
  GitSync -. .-> RemoteRepo[(Git Remote)]
  StudioServer -->|kiln_mcp stdio/HTTP| MCPClients[Cursor / VSCode / 3rd-party MCP]

Key architectural facts from the source tree:

The desktop app's auto-sync layer treats the repo as Kiln-owned, clones into a hidden .git-projects/ directory, and stays within 15 seconds of the remote by polling Source: app/desktop/git_sync/README.md:11-15.
A standalone kiln_server PyPI package exposes the same REST surface headlessly, configured via --host, --port, --log-level, and --auto-reload Source: libs/server/README.md:17-24.
A separate kiln_mcp binary launches the Kiln MCP server so external MCP clients (Cursor, VSCode, etc.) can call Kiln search tools Source: libs/server/kiln_server/mcp/README.md:11-16.
The Python library is the shared datamodel; projects are directories of .kiln JSON files, with tasks, runs, finetunes, prompts, and dataset splits nested under a Project Source: libs/core/README.md:38-50.

Kiln Chat Assistant (Assistant Panel)

The assistant is an in-app chat panel where users "ask, and Kiln can help you build and optimize any AI system" Source: README.md:5-7. Its capabilities, as surfaced in release notes and the API client, include:

Streaming tool calls. The assistant invokes backend APIs (evals, runs, optimizations) and streams progress back into the chat UI rather than blocking on a single response. This is the headline v1.0.3 improvement Source: README.md:3-5 and is implemented against the chat endpoints listed in the Studio Server client.
Session persistence. The API exposes chat session lifecycle endpoints (/v1/chat/sessions, /v1/chat/sessions/{session_id}) and a POST /v1/chat handler, with explicit 400/404/426/500 response variants Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:36-44. HTTP 426 is the standard "Upgrade Required" status; here it appears to act as a capability gate for Kiln Pro-only features, though the generated client alone does not confirm that mapping.
OpenAI-compatible chat payloads. Request bodies use ChatCompletionContentPartTextParam with a typed "type": "text" discriminator and an open additional_properties bag for forward compatibility Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/chat_completion_content_part_text_param.py:11-22.
Tool outputs as files. When the assistant produces structured artifacts, the API returns them via OutputFileInfo objects carrying name, mime_type, and a short-lived signed_url Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/output_file_info.py:13-19. This is how eval reports, synthetic datasets, and exported prompts are delivered back to the chat panel.
Job control. Long-running operations (evals, synthetic data generation, prompt optimization) are coordinated through JobStartResponse and JobStatusResponse so the chat UI can show progress Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:31-34.

The auto-generated SDK also exposes models for SyntheticDataGenerationSessionConfigInput, which configures three sequential steps (topic_generation_config, input_generation_config, output_generation_config) Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config_input.py:21-25 - the assistant drives these sessions on the user's behalf.
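The job-control pattern described in the list above amounts to "start, then poll until done". The sketch below shows such a loop against a fake backend. The `status` and `progress` field names, the terminal state values, and the `wait_for_job` helper are hypothetical; the real JobStartResponse and JobStatusResponse fields should be taken from the generated client.

```python
import time
from typing import Callable

# Hypothetical terminal states; the real JobStatusResponse values may differ.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    poll_interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
    on_update: Callable[[dict], None] = print,
) -> dict:
    """Poll until the job reaches a terminal state, reporting each update."""
    for _ in range(max_polls):
        status = get_status(job_id)
        on_update(status)  # e.g. stream progress into a chat panel
        if status["status"] in TERMINAL_STATES:
            return status
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Demo with a fake backend that finishes on the third poll.
_fake_statuses = iter([
    {"status": "running", "progress": 0.3},
    {"status": "running", "progress": 0.8},
    {"status": "succeeded", "progress": 1.0},
])
updates: list[dict] = []
final = wait_for_job(
    lambda _job_id: next(_fake_statuses),
    "job-123",
    sleep=lambda _seconds: None,
    on_update=updates.append,
)
print(final["status"], len(updates))
```

Capping the number of polls keeps a stalled job from hanging the caller indefinitely.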

Common Usage Patterns and Failure Modes

Git-native collaboration. Non-technical teammates never touch git directly; the assistant and UI operate against a Kiln-managed clone Source: app/desktop/git_sync/README.md:5-9. Going offline will not silently diverge - reads and writes return 503 until the sync layer reconnects Source: app/desktop/git_sync/README.md:13-15.

Headless / production use. Anything built in the UI can be packaged into a minimal deployment via the CLI (package_project and package_project_for_training are exercised in the core test suite) Source: libs/core/kiln_ai/cli/commands/test_package_project.py:18-37, and then served via kiln_server or imported directly from the kiln_ai library Source: libs/core/README.md:73-78.

MCP integration. For IDE-driven workflows, kiln_mcp exposes Kiln search tools over streamable-http or stdio and is registered in each client's mcp.json. Indexing must be run from the desktop app first, on the same machine Source: libs/server/kiln_server/mcp/README.md:11-25.

Community-driven gaps. Several open issues map directly to features the assistant/UI is expected to surface but currently does not expose well:

*Issue #115 ("Allow manual data entry/correction")* highlights that human-in-the-loop editing of generated runs is a recurring request - relevant because the assistant currently drives generation but manual override UX is limited.
*Issue #31 ("Expose model parameters in UI")* points to a gap in surfacing temperature, max_tokens, etc. in the UI; KilnAgentRunConfigProperties is the underlying model that would need a UI binding Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:48-49.

Failure modes to watch for. The chat endpoints declare 426 and 500 response variants: a 426 signals an upgrade-required gate (for Pro features) and a 500 signals a server error Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:39-44. If the local REST server is unreachable altogether, the request fails at the connection level and no HTTP status is returned. If Git auto-sync is offline, writes will block with 503 rather than silently diverging Source: app/desktop/git_sync/README.md:13-15. If MCP search tools return empty results, confirm that desktop-app indexing has completed on the same machine Source: libs/server/kiln_server/mcp/README.md:13-15.
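For scripts that call the local API, the status codes named in this section can be turned into a small triage table. The sketch below is only a lookup of the meanings stated in this manual; the `STATUS_HINTS` wording and the `explain_status` helper are illustrative and not part of any Kiln package.

```python
# Hints paraphrase this manual; they are not messages returned by the server.
STATUS_HINTS = {
    400: "bad request: check the request payload",
    404: "not found: the chat session or resource id may be wrong",
    426: "upgrade required: the feature may be gated behind Kiln Pro",
    500: "server error: check the local server logs",
    503: "sync offline: Git auto-sync has lost the remote, retry once reconnected",
}


def explain_status(status: int) -> str:
    """Return a short triage hint for an HTTP status code."""
    return STATUS_HINTS.get(status, f"unexpected status {status}: check the server logs")


for code in (426, 503, 418):
    print(code, "->", explain_status(code))
```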

Backend REST API, Data Model, Git Sync, and Extensibility

Kiln is structured around four tightly integrated layers: a REST server that exposes Kiln capabilities to clients and integrations, a file-based data model that backs every project, a Git Sync subsystem that provides transparent collaboration, and an extensibility surface for custom models, tools, and packaging. Together, these layers turn Kiln into a programmable workbench rather than a closed application.

REST Server and API Surface

The kiln_server package is a standalone PyPI distribution that runs the Kiln backend over HTTP. Source: libs/server/README.md. The server can be launched as a CLI with configurable host, port, log level, and an --auto-reload flag for development:

uv tool install kiln_server
kiln_server --host 0.0.0.0 --port 8000 --log-level info

The HTTP surface is generated as an OpenAPI specification; full schema docs are published at https://kiln-ai.github.io/Kiln/kiln_server_openapi_docs/index.html. Source: libs/server/README.md.
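Once the server is running, a quick reachability probe is a common first step before calling other endpoints. The sketch below uses only the standard library. The `/health` path is an assumption inferred from the `HealthHealthGet` model name and should be confirmed against the published OpenAPI docs; `check_health` is a hypothetical helper.

```python
import urllib.request


def check_health(base_url: str, path: str = "/health", timeout: float = 2.0) -> bool:
    """Return True if GET base_url + path answers with HTTP 200.

    Connection failures, timeouts, and HTTP error statuses all yield False.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False


# Usage, assuming a server started with: kiln_server --host 0.0.0.0 --port 8000
# print(check_health("http://127.0.0.1:8000"))
```

Returning a plain boolean keeps the probe easy to use in startup scripts, at the cost of hiding why a check failed.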

The API exposes a wide range of capabilities, organized by domain. The generated client model registry enumerates the main endpoint families:

Domain	Example endpoints / models	Source
Chat / Assistant	`HandleChatV1ChatPost`, `ListSessionsV1ChatSessionsGet`, `GetSessionV1ChatSessionsSessionIdGet`	models/__init__.py
Synthetic data generation	`GenerateBatchInput`, `SyntheticDataGenerationSessionConfigInput`, `JobStartResponse`, `JobStatusResponse`	generate_batch_input.py
Specs / Eval builder	`ClarifySpecOutput`, `RefineSpecInput`, `RefineSpecApiOutput`, `SubmitAnswersRequest`	clarify_spec_output.py, refine_spec_input.py
Outputs / Files	`OutputFileInfo` (signed URLs for downloads)	output_file_info.py
Health	`HealthHealthGet`	models/__init__.py

Request and response payloads are attrs-defined dataclasses serialized via to_dict / from_dict, e.g. RefineSpecInput carries a target_task_info, target_specification, and a list of examples_with_feedback. Source: refine_spec_input.py.
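The to_dict / from_dict pattern, together with an open bag for unknown keys, can be sketched in a few lines. The example below uses the standard library's dataclasses instead of attrs and models a text content part with a "type": "text" discriminator. The `TextPart` class is a hypothetical stand-in, not the generated ChatCompletionContentPartTextParam.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TextPart:
    """Hypothetical stand-in for a generated client model."""

    text: str
    type: str = "text"
    # Unknown keys are kept here so newer server fields survive a round trip.
    additional_properties: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        data = dict(self.additional_properties)
        data.update({"text": self.text, "type": self.type})
        return data

    @classmethod
    def from_dict(cls, src: dict[str, Any]) -> "TextPart":
        data = dict(src)  # copy so the caller's dict is not mutated
        text = data.pop("text")
        part_type = data.pop("type")
        if part_type != "text":
            raise ValueError(f"unexpected discriminator: {part_type!r}")
        return cls(text=text, type=part_type, additional_properties=data)


payload = {"type": "text", "text": "Run the eval", "future_flag": True}
part = TextPart.from_dict(payload)
print(part.additional_properties)
print(part.to_dict() == payload)
```

Keeping unrecognized keys instead of dropping them is what gives the client forward compatibility when the server adds fields.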

File-Based Data Model

Every Kiln project is a directory of .kiln files (mostly JSON) describing tasks, runs, prompts, fine-tuning jobs, and dataset splits. This design was chosen for git compatibility, easy diffing, and trivial loading into pandas/polars. Source: libs/core/README.md.

flowchart TD
    P[Project folder<br/>project.kiln] --> T1[Task A<br/>task.kiln]
    P --> T2[Task B<br/>task.kiln]
    T1 --> R1[TaskRun files]
    T1 --> F1[Finetune config & status]
    T1 --> PR1[Prompt definitions]
    T1 --> DS1[DatasetSplit<br/>train / test / val]
    T2 --> R2[TaskRun files]
    T2 --> DS2[DatasetSplit]

Each task validates its input/output against JSON schemas when present, and the kiln_ai.datamodel package enforces structure on load/save. Source: libs/core/README.md. The CLI test fixtures in test_package_project.py demonstrate the canonical shape: a Project saves to project.kiln, contains Task objects that save to task.kiln, and references a TaskRunConfig (with a KilnAgentRunConfigProperties block) that is set as task.default_run_config_id. Source: test_package_project.py.

Git Sync and Collaboration

Git Sync is the layer that turns a local project folder into a multi-user, cloud-synced workspace without exposing git to non-technical users. Source: app/desktop/git_sync/README.md.

Key design choices:

Kiln owns the repo. Auto-sync clones into a hidden .git-projects/ directory inside the projects folder; git status is the single source of truth. Source: app/desktop/git_sync/README.md.
Online-only. Background polling keeps the local repo within 15 seconds of the remote; offline operation returns 503 errors rather than allowing silent divergence. Source: app/desktop/git_sync/README.md.
Multi-user safety. The data model uses small, frequent commits and most objects are immutable/append-only, so merge conflicts are extremely rare. Source: app/desktop/git_sync/README.md.
Setup UX. Users authenticate with a personal access token via a deep link to GitHub; no git CLI knowledge is required. Source: app/desktop/git_sync/README.md.

This was introduced in v0.28.0 ("Automatic Git Sync") and remains a core v1.0 feature; the project README explicitly markets *"Git-native collaboration — the app syncs to Git automatically."* Source: README.md.

Extensibility and Packaging

Kiln is intentionally open at its edges. Three extensibility hooks are documented in the core library:

Custom models and providers. Code can register additional model providers alongside OpenAI, OpenRouter, Ollama, etc. Source: libs/core/README.md.
Programmatic runs. Tasks can be built, executed, and tagged in pure Python via kiln_ai.datamodel.TaskRun, with optional source and created_by metadata. Source: libs/core/README.md.
Project packaging and export. The CLI module kiln_ai.cli.commands exposes helpers like package_project, package_project_for_training, export_task, export_task_runs, export_evals, export_documents, export_skills, and export_tool_servers. These produce a minimal folder for deployment or for fine-tuning providers (Fireworks, Together, Vertex). Source: test_package_project.py.

Community interest in this area is concrete: issue #251 requests a "Kiln-Unsloth bridge" because users currently must export JSONL, run a notebook, build GGUF, import to Ollama, and finally register the model — a gap that future packaging helpers could close. Source: Community issue #251. Likewise, issue #31 asks for UI exposure of generation parameters (temperature, max_tokens), which would complement the provider-level extensibility already available in the library. Source: Community issue #31.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.

1. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/Kiln-AI/Kiln

2. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln

3. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/Kiln-AI/Kiln

4. Security or permission risk: requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/Kiln-AI/Kiln

5. Maintenance risk: requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln

6. Maintenance risk: requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln

Source: Doramagic discovery, validation, and Project Pack records
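
Every item above recommends reproducing the install and quickstart path in an isolated environment. The following is a minimal sketch of one way to do that with Python's standard `venv` module. It is not taken from the Kiln or Doramagic documentation: the helper names `create_isolated_env` and `install_command` are hypothetical, and the package to install is left as a parameter because this page does not state the official install command.

```python
import sys
import venv
from pathlib import Path


def create_isolated_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a clean virtual environment and return the path to its interpreter."""
    # clear=True wipes any earlier environment so each check starts from scratch.
    venv.create(env_dir, clear=True, with_pip=with_pip)
    # The interpreter lives in a platform-dependent subdirectory.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(python: Path, package: str) -> list[str]:
    """Build (but do not run) the pip command that installs `package` into the env."""
    # Hypothetical helper: the package name must come from the official README.
    return [str(python), "-m", "pip", "install", package]
```

Passing the returned list to `subprocess.run(cmd, check=True)` would perform the install inside the throwaway environment. Take the package name from the project's README rather than from this sketch, and delete the environment directory once the check is done.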

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 11 (the count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Kiln with real data or production workflows.

Kiln Desktop - v1.0.3 - github / github_release
Kiln Desktop - v1.0 - github / github_release
Kiln Desktop - v0.28.0 - github / github_release
Kiln Desktop - v0.26.0 - github / github_release
Kiln Desktop - v0.25.0 - github / github_release
Kiln Desktop - v0.24.0 - github / github_release
Kiln Desktop - v0.23.0 - github / github_release
Kiln Desktop - v0.22.0 - github / github_release
v0.21.1 - github / github_release
v0.21.0 - github / github_release
Capability evidence risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
