Doramagic Project Pack · Human Manual
Kiln
Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
Kiln Overview and System Architecture
Related topics: AI/ML Pipeline: Models, Adapters, Evals, Optimizers, Fine-Tuning, RAG, and Agents, Desktop App, Web UI, and the Kiln Chat Assistant, Backend REST API, Data Model, Git Sync, and Extensibility
Kiln is an integrated workbench for the full AI development loop — evals, optimization, prompts, RAG, fine-tuning, synthetic data, agents, and tools — packaged as a desktop application with an MIT-licensed Python core library and a REST server. The platform is designed so that non-technical team members (PMs, subject-matter experts, QA) can collaborate on AI systems alongside engineers, without writing code, while engineers retain a full programmatic surface for production deployment.
Purpose and Scope
The core value proposition described in the project README is that Kiln unifies the entire AI workflow that is normally scattered across notebooks, scripts, dashboards, and vendor UIs. As stated in the repository's main description, it is "a workbench for the full AI development loop" that ships as a desktop app plus an open Python library so "the same tasks [can be deployed] to production."
Source: README.md
The platform is intentionally flexible about model access — users can bring their own API keys (OpenAI, OpenRouter, etc.) or run fully offline using Ollama. Datasets are stored as open JSON files that the user owns and controls, making them easy to version, diff, and migrate.
Source: libs/core/README.md
A commercial tier, Kiln Pro, adds the AI Assistant, Auto-Optimize, and the Eval Builder. The README clarifies that Pro is opt-in and "the core Kiln app remains fully functional without it," meaning the open-source portions are not crippled by the paid features.
Source: README.md
System Components
Kiln is organized into several distinct packages, each with its own PyPI distribution and release cadence. The repository reflects this in its layout under libs/ and app/.
Component Overview
| Component | Purpose | Distribution |
|---|---|---|
| Desktop App | Cross-platform GUI (Mac/Windows/Linux) for non-technical users | Source-available under app/ (fair-code) |
| Core Python Library | Data model, task runner, RAG, agents, CLI | kiln-ai (MIT) |
| REST Server | FastAPI server exposing Kiln operations over HTTP | kiln-server |
| MCP Server | Allows Kiln tools (e.g. Search) to be called from MCP clients | Ships with kiln-server (kiln_mcp command) |
| CLI | Typer-based command-line entry into the core library | Bundled in kiln-ai |
Sources:
- README.md
- libs/core/README.md
- libs/server/README.md
- libs/server/kiln_ai/mcp/README.md
Architecture Diagram
flowchart LR
UI[Desktop App UI<br/>app/desktop]
GS[Git Auto Sync<br/>app/desktop/git_sync]
CLI[Kiln CLI<br/>kiln_ai/cli/cli.py]
CORE[Core Library<br/>kiln_ai/datamodel, agents, RAG, fine-tuning]
SRV[REST Server<br/>kiln_server]
MCP[MCP Server<br/>kiln_mcp]
EXT[External AI Providers<br/>OpenAI, OpenRouter, Ollama, Fireworks, Vertex]
FS[(Project Folder<br/>.kiln JSON files)]
UI --> CORE
UI --> GS
CLI --> CORE
SRV --> CORE
MCP --> CORE
CORE --> EXT
CORE --> FS
GS --> FS
The desktop app, CLI, REST server, and MCP server all sit on top of the same core Python library. This shared foundation is what allows a task prototyped in the GUI to be exported via the package_project command and run unmodified from code, a server, or an MCP client.
Source: libs/core/kiln_ai/cli/commands/test_package_project.py
Data Model and Project Layout
Kiln projects are stored as a directory of small JSON files, most using the .kiln extension. The README explains the rationale:
Git compatibility: Kiln project folders are easy to collaborate on in git. The filenames use unique IDs to avoid conflicts and allow many people to work in parallel. The files are small and easy to compare using standard diff tools.
Source: libs/core/README.md
The nesting is:
- Project: top-level container, referenced via project.kiln
- Task: defines instructions, input/output schemas, and requirements (task.kiln)
- TaskRun: a single execution sample with input, output, and human ratings
- Finetune: configuration and status for fine-tuning jobs on the task's data
- Prompt: a versioned prompt template for the task
- DatasetSplit: frozen train/test/validation partitions of task runs
The core library enforces schema validation on load and save, so the recommended way to manipulate data is through the kiln_ai.datamodel classes rather than direct JSON edits.
Source: libs/core/README.md
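As a rough illustration of the layout described above, the sketch below walks a project folder using only the standard library and groups the .kiln files it finds by filename. The function name index_kiln_files is hypothetical, and the sketch assumes nothing about the fields inside each file. It is a read-only inspection aid, not a substitute for the kiln_ai.datamodel classes, which remain the recommended way to load and modify data.

```python
import json
from collections import defaultdict
from pathlib import Path


def index_kiln_files(project_dir: str) -> dict[str, list[dict]]:
    """Group parsed .kiln JSON documents by filename (e.g. 'task.kiln').

    Hypothetical helper for illustration: it only reads files and does
    not apply any of the schema validation the core library performs.
    """
    grouped: dict[str, list[dict]] = defaultdict(list)
    # rglob walks nested folders, matching the Project -> Task -> ... nesting.
    for path in sorted(Path(project_dir).rglob("*.kiln")):
        with path.open(encoding="utf-8") as handle:
            grouped[path.name].append(json.load(handle))
    return dict(grouped)
```

Calling index_kiln_files on a project folder returns a dictionary keyed by filename, so the length of the "task.kiln" entry gives a quick count of tasks without touching any file on disk.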
CLI, Server, and Collaboration Surfaces
Command-Line Interface
The CLI is a Typer application registered at kiln_ai/cli/cli.py. It currently exposes three subcommands:
# libs/core/kiln_ai/cli/cli.py
app = typer.Typer(help="Kiln AI CLI - Build AI systems with evals, data gen, fine-tuning, and more.")
app.add_typer(projects.app, name="projects")
app.add_typer(tasks.app, name="tasks")
app.command(name="package_project")(package_project.package_project)
The package_project command produces a minimal export folder containing only the files needed to run a task in production, rather than the full history of runs and evals. The unit tests in test_package_project.py confirm the contract: a project, a task, and a default TaskRunConfig are validated and bundled into a deployable artifact.
Source: libs/core/kiln_ai/cli/cli.py Source: libs/core/kiln_ai/cli/commands/test_package_project.py
REST Server
The REST server is published separately as kiln-server. It runs via the kiln_server command and accepts standard options for host, port, log level, and auto-reload. Its OpenAPI schema is published as a reference for client generation; the desktop app embeds a generated client, visible in the auto-generated model classes such as SyntheticDataGenerationSessionConfigInput and OutputFileInfo.
Source: libs/server/README.md Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/output_file_info.py
MCP Server
A separate kiln_mcp command (also from the kiln_server distribution) exposes Kiln's tools — notably the Search/RAG tool — to any MCP-compatible client such as Cursor or VS Code. It is currently flagged as Beta and "not designed for production workloads," and requires that search indexing has already been run in the desktop app on the same machine.
Source: libs/server/kiln_ai/mcp/README.md
Git-Native Collaboration
The desktop app includes an automatic Git sync layer that lives at the HTTP middleware layer. Key properties:
- Operates inside a hidden .git-projects/ directory inside the user's Kiln Projects folder, so user-installed editors and IDEs never interfere.
- Stays within ~15 seconds of the remote via background polling; going offline blocks reads/writes (503 errors) rather than silently diverging.
- Requires a developer to create the initial repository, but otherwise lets non-technical users connect with a personal access token.
Source: app/desktop/git_sync/README.md
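The "block rather than diverge" behavior can be pictured as a small gate in front of every request. The sketch below is one possible shape for such a gate, not the desktop app's actual middleware: the function name, the use of sync staleness as the offline signal, and the exact 15-second threshold are all assumptions made for illustration.

```python
import time
from typing import Optional

# Assumption: treat the "~15 seconds" figure from the text as a hard limit.
MAX_STALENESS_SECONDS = 15.0


def sync_gate_status(last_sync_at: Optional[float], now: Optional[float] = None) -> int:
    """Return the HTTP status a request would receive from the gate.

    200 means the last successful sync is recent enough to serve the
    request; 503 means the gate refuses reads and writes so that local
    and remote state cannot silently drift apart.
    """
    current = time.monotonic() if now is None else now
    if last_sync_at is None:
        # Never synced successfully: refuse rather than serve unknown state.
        return 503
    age = current - last_sync_at
    return 200 if age <= MAX_STALENESS_SECONDS else 503
```

Passing the clock in as a parameter keeps the gate easy to test, for example sync_gate_status(100.0, now=110.0) for a sync that happened ten seconds ago.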
Feature Surface (and Community Signals)
The README advertises ten high-level capabilities: Intuitive App, Eval Builder, Auto-Optimize, AI Assistant, Git-native collaboration, RAG, Subagents, Synthetic Data Generation, Fine-Tuning, and an Open Python library. These map onto community priorities visible in the issue tracker:
- Expose model parameters in UI (issue #31): users want control over temperature, max_tokens, and similar settings to be first-class in the GUI, which the Kiln run-config model is designed to support.
- Kiln-Unsloth bridge for usability (issue #251): community members want a tighter end-to-end path from Kiln export through Unsloth fine-tuning back into Ollama; the current flow stops at a JSONL export, which is the gap this feature request targets.
- Manual data entry / correction (issue #115): a recurring request from data-curation workflows, addressed by Kiln's open-JSON task-run model.
Common Failure Modes and Design Limits
A few boundaries are explicitly stated and worth knowing before relying on Kiln:
- Kiln MCP server is Beta and local-only. It "isn't designed for production workloads," and depends on a pre-indexed project on the same machine. Source: libs/server/kiln_ai/mcp/README.md
- Git sync is online-only by design. Going offline blocks reads and writes rather than allowing silent divergence, so workflows that require intermittent connectivity need a different sync strategy. Source: app/desktop/git_sync/README.md
- Some features require Kiln Pro. Auto-Optimize, the AI Assistant, and the Eval Builder are gated behind the paid tier, though the rest of the app and library remain usable without it. Source: README.md
See Also
- Data Model and Tasks
- Fine-Tuning Guide
- RAG and Search Tools
- Agents and Subtasks
- REST API Reference
- Git Auto Sync
AI/ML Pipeline: Models, Adapters, Evals, Optimizers, Fine-Tuning, RAG, and Agents
Related topics: Kiln Overview and System Architecture, Backend REST API, Data Model, Git Sync, and Extensibility
Overview
Kiln provides an end-to-end AI/ML development workbench that unifies the entire pipeline — model selection, evaluation, prompt optimization, fine-tuning, retrieval-augmented generation (RAG), and agent composition — into a single coherent system. The platform is composed of two primary artifacts: a cross-platform desktop application (app/) used by teams to author tasks and collaborate via Git, and an MIT-licensed Python library (libs/core/) that ships the same projects to production without rewrites.
Source: README.md:18-26
The pipeline is built around a shared data model: every Kiln project is a directory of .kiln JSON files representing Project → Task → TaskRun / Finetune / Prompt / DatasetSplit. This design makes datasets Git-friendly, diff-friendly, and easy to load with standard tools such as Pandas or Polars.
Source: libs/core/README.md:60-77
Architecture: Models and Adapters
Kiln normalizes access to dozens of model providers through an adapter layer. The adapter layer abstracts over OpenAI, Anthropic, Gemini, Bedrock, Ollama, OpenRouter, Fireworks, Groq, and any OpenAI-compatible endpoint, allowing models to be swapped without changing task definitions.
Source: README.md:10-14
At the foundation, a Task is paired with a TaskRunConfig (such as KilnAgentRunConfigProperties) that selects a model, provider, prompt generator, and structured-output mode. CLI tooling in kiln_ai.cli.commands.package_project then packages a minimal project directory containing only the artifacts required to run a task in production, pruning thousands of historical run files.
Source: libs/core/kiln_ai/cli/commands/test_package_project.py:38-65
flowchart LR
UI[Kiln Desktop App] -->|git sync| Repo[(Project .kiln files)]
Repo --> Lib[kiln-ai Python library]
Lib --> Adapter[Model Adapter Layer]
Adapter --> OpenAI
Adapter --> Anthropic
Adapter --> Ollama
Adapter --> Custom[OpenAI-compatible]
Adapter --> MCP[MCP Servers]
Lib --> Eval[Eval Runner]
Eval --> Judge[LLM-as-Judge]
Lib --> RAG[RAG / Search Tools]
Lib --> Agent[Subagent Orchestrator]
Evaluation and Rating System
The evaluation subsystem supports multiple rating primitives so that human reviewers and automated judges can score outputs in compatible ways. The TaskOutputRating model documents this directly:
Supports:
- five_star: 1–5 star ratings
- pass_fail: boolean pass/fail (1.0 = pass, 0.0 = fail)
- pass_fail_critical: tri-state (1.0 = pass, 0.0 = fail, -1.0 = critical fail)
Source: task_output_rating.py:23-26
Per-requirement ratings are captured via RequirementRating, which pairs a numeric value with a rating type, enabling fine-grained evaluation against a specification.
Source: requirement_rating.py:13-19
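The three rating types and their numeric encodings can be expressed as a small validation routine. The sketch below is an independent illustration of those rules, not the code in task_output_rating.py: the enum and function names are hypothetical, and it assumes that five_star accepts any value in the inclusive range 1.0 to 5.0.

```python
from enum import Enum


class RatingType(str, Enum):
    """Rating type names as listed in the text above."""

    FIVE_STAR = "five_star"
    PASS_FAIL = "pass_fail"
    PASS_FAIL_CRITICAL = "pass_fail_critical"


# Discrete encodings quoted in the text: 1.0 = pass, 0.0 = fail, -1.0 = critical fail.
_ALLOWED_VALUES = {
    RatingType.PASS_FAIL: {0.0, 1.0},
    RatingType.PASS_FAIL_CRITICAL: {-1.0, 0.0, 1.0},
}


def is_valid_rating(rating_type: RatingType, value: float) -> bool:
    """Check that a numeric value is legal for the given rating type."""
    if rating_type is RatingType.FIVE_STAR:
        # Assumption: any value from 1 to 5 inclusive is acceptable.
        return 1.0 <= value <= 5.0
    return value in _ALLOWED_VALUES[rating_type]
```

Keeping the value numeric for every rating type, as the text describes, means human ratings and automated judge scores can share one storage shape while the type field says how to interpret the number.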
The Eval Builder feature (introduced and iterated across v0.21–v1.0.3 releases) auto-generates a judge plus a synthetic eval dataset, allowing teams to align models to their preferences in roughly ten minutes.
Source: README.md:34-36
Optimizers, Specs, and Fine-Tuning
Kiln’s optimization surfaces cover three distinct layers:
| Layer | Mechanism | Source |
|---|---|---|
| Prompt | Kiln Copilot / Specs refine a Specification through interactive Q&A (SubmitAnswersRequest, RefineSpecInput) | submit_answers_request.py, refine_spec_input.py |
| Synthetic Data | SyntheticDataGenerationSessionConfig orchestrates topic, input, and output generation steps | synthetic_data_generation_session_config_input.py |
| Fine-Tuning | Finetune records track status and configuration; zero-code fine-tuning is offered across 60+ models | libs/core/README.md:60-77 |
The ClarifySpecOutput payload bundles examples_for_feedback, a judge_result, and an sdg_session_config, demonstrating how specifications, evaluator feedback, and synthetic-data generation are wired together in a single round-trip.
Source: clarify_spec_output.py:18-22
For fine-tuning, exported project directories include JSONL consumed by hosted trainers (Fireworks, Together, Vertex). Community issue #251 (“Kiln-Unsloth Bridge for Usability”) highlights that local users currently must export JSONL, run a notebook, convert to GGUF, and re-import through Ollama, a multi-step flow the issue asks Kiln to streamline.
Source: README.md:18-26, community context #251
RAG, Agents, and Output Artifacts
RAG is implemented as a first-class tool type. Documents (PDF, image, video, audio) can be dropped into a project to produce a vector index and a corresponding eval that is synthesized from the user’s own documents. RAG evals and tool-use evals were added together with the v0.22–v0.23 agent milestones.
Source: README.md:44-50
Agents are composed hierarchically: any Task can be exposed as a tool, allowing subagents to run in isolated, focused context windows. MCP servers extend this with the open Agent Skills standard (v0.26) and tool-server integrations.
Source: README.md:48-50
Generated artifacts — including fine-tuned checkpoints and RAG indexes — are exposed through OutputFileInfo, which pairs a human-readable name with a signed_url and mime_type so they can be downloaded by clients.
Source: output_file_info.py:14-18
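To make the shape of such a payload concrete, the sketch below models the three fields named above with a plain dataclass and a dictionary round trip. It is a stand-in written for illustration: the class name FileInfoSketch is hypothetical, and the real OutputFileInfo class is generated code that may carry extra fields or behave differently.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class FileInfoSketch:
    """Illustrative stand-in holding the three fields named in the text."""

    name: str        # human-readable file name
    signed_url: str  # download URL handed to the client
    mime_type: str   # content type, e.g. "application/json"

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "FileInfoSketch":
        # Indexing (rather than .get) makes a missing field fail loudly.
        return cls(
            name=data["name"],
            signed_url=data["signed_url"],
            mime_type=data["mime_type"],
        )

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
```

A client would typically parse the response body into such an object and then fetch signed_url with an ordinary HTTP GET, using mime_type to decide how to display or store the result.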
Configuration and CLI Surface
The MIT-licensed CLI (kiln-ai) is exposed via Typer subcommands for projects, tasks, and package_project, enabling headless packaging for deployment:
pip install kiln-ai
python -m kiln_ai.cli.cli package_project --help
Source: libs/core/kiln_ai/cli/cli.py:1-13
A companion REST server (kiln-server) exposes the same engine over HTTP, with OpenAPI documentation published for clients.
Source: libs/server/README.md:15-22
Common Failure Modes and Community Pain Points
Three community issues consistently surface around the pipeline:
- Issue #31, *Expose model parameters in UI (temperature, max_tokens, etc.)*: Users want first-class UI controls and persistence for sampling parameters. Until addressed, parameters must be passed through run-config properties programmatically.
- Issue #115, *Allow manual data entry/correction*: Reviewers want to fix or augment TaskRun records without rerunning the pipeline. The data model is already permissive (open JSON with validation in the library), but the UI flow for manual edits is still a frequent request.
- Issue #251, *Kiln-Unsloth Bridge*: Local fine-tuning ergonomics are the largest gap; users want one-click export → GGUF → Ollama registration.
Source: community context #31, #115, #251
Source: https://github.com/Kiln-AI/Kiln / Human Manual
Desktop App, Web UI, and the Kiln Chat Assistant
Related topics: Kiln Overview and System Architecture, AI/ML Pipeline: Models, Adapters, Evals, Optimizers, Fine-Tuning, RAG, and Agents, Backend REST API, Data Model, Git Sync, and Extensibility
Overview
Kiln ships as three coordinated surfaces that share one data model and one set of Python primitives:
- A cross-platform desktop app (Mac, Windows, Linux) that wraps a local web UI and runs an embedded REST server.
- An in-app Web UI (SvelteKit) used by both technical and non-technical teammates for evals, datasets, RAG, fine-tuning, and prompts.
- The Kiln Chat Assistant, an agentic chat panel embedded in the UI that can run evals, propose optimizations, and call backend APIs with streaming status.
The desktop app's positioning is described as "a workbench for the full AI development loop: evals, optimization, prompts, RAG, fine-tuning, synthetic data, agents, and tools - all working together" Source: README.md:5-7. The same file notes that "The MIT-licensed Python library ships the same tasks to production" Source: README.md:7-8, confirming that what is configured in the UI is portable to a deployable Python pipeline.
The Kiln Chat Assistant was introduced in v0.28.0 as "a new chat/assistant panel in the app" and continues to evolve; v1.0.3 ships "Improved Assistant: Kiln Assistant can now call APIs with streaming status, like evals" Source: README.md:3-5. Kiln Pro is the opt-in service tier that unlocks Assistant-driven flows like Auto-Optimize and the Eval Builder Source: README.md:39-41.
Architecture: Desktop Shell, Web UI, and Backend Services
The desktop bundle embeds the web UI and a local REST server, so the UI talks to an in-process backend rather than a remote cloud. The Studio Server ships an auto-generated Python client whose model catalog reveals the API surface available to the chat assistant.
flowchart LR
User[User / Team] --> Desktop[Kiln Desktop App]
Desktop --> WebUI[SvelteKit Web UI]
WebUI -->|REST + streaming| StudioServer[Studio Server / FastAPI]
StudioServer --> CoreLib[kiln_ai Python library]
CoreLib --> Providers[(OpenAI / OpenRouter / Ollama / Fireworks / Vertex)]
Desktop -. optional .-> GitSync[Git Auto Sync]
GitSync -.-> RemoteRepo[(Git Remote)]
StudioServer -->|kiln_mcp stdio/HTTP| MCPClients[Cursor / VSCode / 3rd-party MCP]
Key architectural facts from the source tree:
- The desktop app's auto-sync layer treats the repo as Kiln-owned, clones into a hidden .git-projects/ directory, and stays within 15 seconds of the remote by polling Source: app/desktop/git_sync/README.md:11-15.
- A standalone kiln_server PyPI package exposes the same REST surface headlessly, configured via --host, --port, --log-level, and --auto-reload Source: libs/server/README.md:17-24.
- A separate kiln_mcp binary launches the Kiln MCP server so external MCP clients (Cursor, VSCode, etc.) can call Kiln search tools Source: libs/server/kiln_server/mcp/README.md:11-16.
- The Python library is the shared datamodel; projects are directories of .kiln JSON files, with tasks, runs, finetunes, prompts, and dataset splits nested under a Project Source: libs/core/README.md:38-50.
Kiln Chat Assistant (Assistant Panel)
The assistant is an in-app chat panel where users "ask, and Kiln can help you build and optimize any AI system" Source: README.md:5-7. Its capabilities, as surfaced in release notes and the API client, include:
- Streaming tool calls. The assistant invokes backend APIs (evals, runs, optimizations) and streams progress back into the chat UI rather than blocking on a single response. This is the headline v1.0.3 improvement Source: README.md:3-5 and is implemented against the chat endpoints listed in the Studio Server client.
- Session persistence. The API exposes chat session lifecycle endpoints (/v1/chat/sessions, /v1/chat/sessions/{session_id}) and a POST /v1/chat handler, with explicit 400/404/426/500 response variants Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:36-44. The 426 response code is used to signal an upgrade-required / capability gate, which maps to Kiln Pro-only features.
- OpenAI-compatible chat payloads. Request bodies use ChatCompletionContentPartTextParam with a typed "type": "text" discriminator and an open additional_properties bag for forward compatibility Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/chat_completion_content_part_text_param.py:11-22.
- Tool outputs as files. When the assistant produces structured artifacts, the API returns them via OutputFileInfo objects carrying name, mime_type, and a short-lived signed_url Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/output_file_info.py:13-19. This is how eval reports, synthetic datasets, and exported prompts are delivered back to the chat panel.
- Job control. Long-running operations (evals, synthetic data generation, prompt optimization) are coordinated through JobStartResponse and JobStatusResponse so the chat UI can show progress Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:31-34.
The auto-generated SDK also exposes models for SyntheticDataGenerationSessionConfigInput, which configures three sequential steps (topic_generation_config, input_generation_config, output_generation_config) Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config_input.py:21-25 - the assistant drives these sessions on the user's behalf.
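The start-then-poll pattern behind job control can be sketched without reference to the real client. In the example below, the function name wait_for_job, the status strings, and the polling parameters are all hypothetical; the source names JobStartResponse and JobStatusResponse but does not describe their fields, so the status lookup is passed in as a plain callable.

```python
import time
from typing import Callable

# Hypothetical status strings; the real JobStatusResponse values may differ.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it reaches a terminal state or the budget runs out.

    fetch_status is any zero-argument callable returning the current
    status string, for example a thin wrapper around an HTTP request.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # Skip the final sleep: no further poll would follow it.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

Injecting both the status lookup and the sleep function keeps the loop independent of any particular HTTP client and lets a UI replace the sleep with its own scheduler.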
Common Usage Patterns and Failure Modes
Git-native collaboration. Non-technical teammates never touch git directly; the assistant and UI operate against a Kiln-managed clone Source: app/desktop/git_sync/README.md:5-9. Going offline will not silently diverge - reads and writes return 503 until the sync layer reconnects Source: app/desktop/git_sync/README.md:13-15.
Headless / production use. Anything built in the UI can be packaged into a minimal deployment via the CLI (package_project and package_project_for_training are exercised in the core test suite) Source: libs/core/kiln_ai/cli/commands/test_package_project.py:18-37, and then served via kiln_server or imported directly from the kiln_ai library Source: libs/core/README.md:73-78.
MCP integration. For IDE-driven workflows, kiln_mcp exposes Kiln search tools over streamable-http or stdio and is registered in each client's mcp.json. Indexing must be run from the desktop app first, on the same machine Source: libs/server/kiln_server/mcp/README.md:11-25.
Community-driven gaps. Several open issues map directly to features the assistant/UI is expected to surface but currently does not expose well:
- *Issue #115 ("Allow manual data entry/correction")* highlights that human-in-the-loop editing of generated runs is a recurring request - relevant because the assistant currently drives generation but manual override UX is limited.
- *Issue #31 ("Expose model parameters in UI")* points to a gap in surfacing temperature, max_tokens, etc. in the UI; KilnAgentRunConfigProperties is the underlying model that would need a UI binding Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:48-49.
Failure modes to watch for. Chat requests can fail with a 426 (upgrade required, which appears to gate Pro features) or a 500 (server error), depending on the cause Source: app/desktop/studio_server/api_client/kiln_ai_server_client/models/__init__.py:39-44. If the local REST server is not running at all, the request fails at the connection level and no status code is returned. If Git auto-sync is offline, reads and writes will block with 503 rather than silently diverging Source: app/desktop/git_sync/README.md:13-15. If MCP search tools return empty results, confirm that desktop-app indexing has completed on the same machine Source: libs/server/kiln_server/mcp/README.md:13-15.
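The status codes mentioned in this section lend themselves to a small triage table on the client side. The helper below is a hypothetical sketch: the function name and the hint wording are invented for illustration, and the reading of 426 as a Pro capability gate follows this manual's interpretation rather than a documented guarantee.

```python
# Hints keyed by HTTP status code, drawn from the failure modes in this section.
_FAILURE_HINTS = {
    426: "Upgrade required: the feature may need Kiln Pro.",
    500: "Server error: check the local server logs.",
    503: "Sync layer offline: restore connectivity and retry.",
}


def triage_failure(status_code: int) -> str:
    """Map a failed response's status code to a first troubleshooting step."""
    return _FAILURE_HINTS.get(
        status_code,
        f"Unexpected status {status_code}: inspect the response body.",
    )
```

A connection-level failure, such as the local server not running, never reaches this function because no status code exists; a caller would handle that case in its exception handling around the request itself.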
Backend REST API, Data Model, Git Sync, and Extensibility
Related topics: Kiln Overview and System Architecture, AI/ML Pipeline: Models, Adapters, Evals, Optimizers, Fine-Tuning, RAG, and Agents, Desktop App, Web UI, and the Kiln Chat Assistant
Kiln is structured around four tightly integrated layers: a REST server that exposes Kiln capabilities to clients and integrations, a file-based data model that backs every project, a Git Sync subsystem that provides transparent collaboration, and an extensibility surface for custom models, tools, and packaging. Together, these layers turn Kiln into a programmable workbench rather than a closed application.
REST Server and API Surface
The kiln_server package is a standalone PyPI distribution that runs the Kiln backend over HTTP. Source: libs/server/README.md. The server can be launched as a CLI with configurable host, port, log level, and an --auto-reload flag for development:
uv tool install kiln_server
kiln_server --host 0.0.0.0 --port 8000 --log-level info
The HTTP surface is generated as an OpenAPI specification; full schema docs are published at https://kiln-ai.github.io/Kiln/kiln_server_openapi_docs/index.html. Source: libs/server/README.md.
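When the server is launched from a script rather than a shell, the options above can be assembled programmatically. The sketch below builds the argument list only; the function name and default values are hypothetical, and it assumes --auto-reload is a value-less flag, as the wording above suggests.

```python
def build_server_command(
    host: str = "127.0.0.1",
    port: int = 8000,
    log_level: str = "info",
    auto_reload: bool = False,
) -> list[str]:
    """Assemble an argv list for the kiln_server command.

    Only the options named in the text are used. Returning a list
    (rather than a shell string) avoids quoting problems when the
    result is passed to subprocess.
    """
    command = [
        "kiln_server",
        "--host", host,
        "--port", str(port),
        "--log-level", log_level,
    ]
    if auto_reload:
        # Assumption: the flag takes no value.
        command.append("--auto-reload")
    return command
```

The resulting list can be handed to subprocess.run(build_server_command(), check=True) once the kiln_server package is installed. Binding to 127.0.0.1 by default, instead of the 0.0.0.0 shown in the shell example, is a cautious choice for local development.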
The API exposes a wide range of capabilities, organized by domain. The generated client model registry enumerates the main endpoint families:
| Domain | Example endpoints / models | Source |
|---|---|---|
| Chat / Assistant | HandleChatV1ChatPost, ListSessionsV1ChatSessionsGet, GetSessionV1ChatSessionsSessionIdGet | models/__init__.py |
| Synthetic data generation | GenerateBatchInput, SyntheticDataGenerationSessionConfigInput, JobStartResponse, JobStatusResponse | generate_batch_input.py |
| Specs / Eval builder | ClarifySpecOutput, RefineSpecInput, RefineSpecApiOutput, SubmitAnswersRequest | clarify_spec_output.py, refine_spec_input.py |
| Outputs / Files | OutputFileInfo (signed URLs for downloads) | output_file_info.py |
| Health | HealthHealthGet | models/__init__.py |
Request and response payloads are attrs-defined dataclasses serialized via to_dict / from_dict, e.g. RefineSpecInput carries a target_task_info, target_specification, and a list of examples_with_feedback. Source: refine_spec_input.py.
File-Based Data Model
Every Kiln project is a directory of .kiln files (mostly JSON) describing tasks, runs, prompts, fine-tuning jobs, and dataset splits. This design was chosen for git compatibility, easy diffing, and trivial loading into pandas/polars. Source: libs/core/README.md.
flowchart TD
P[Project folder<br/>project.kiln] --> T1[Task A<br/>task.kiln]
P --> T2[Task B<br/>task.kiln]
T1 --> R1[TaskRun files]
T1 --> F1[Finetune config & status]
T1 --> PR1[Prompt definitions]
T1 --> DS1[DatasetSplit<br/>train / test / val]
T2 --> R2[TaskRun files]
T2 --> DS2[DatasetSplit]
Each task validates its input/output against JSON schemas when present, and the kiln_ai.datamodel package enforces structure on load/save. Source: libs/core/README.md. The CLI test fixtures in test_package_project.py demonstrate the canonical shape: a Project saves to project.kiln, contains Task objects that save to task.kiln, and references a TaskRunConfig (with a KilnAgentRunConfigProperties block) that is set as task.default_run_config_id. Source: test_package_project.py.
Git Sync and Collaboration
Git Sync is the layer that turns a local project folder into a multi-user, cloud-synced workspace without exposing git to non-technical users. Source: app/desktop/git_sync/README.md.
Key design choices:
- Kiln owns the repo. Auto-sync clones into a hidden .git-projects/ directory inside the projects folder; git status is the single source of truth. Source: app/desktop/git_sync/README.md.
- Online-only. Background polling keeps the local repo within 15 seconds of the remote; offline operation returns 503 errors rather than allowing silent divergence. Source: app/desktop/git_sync/README.md.
- Multi-user safety. The data model uses small, frequent commits and most objects are immutable/append-only, so merge conflicts are extremely rare. Source: app/desktop/git_sync/README.md.
- Setup UX. Users authenticate with a personal access token via a deep link to GitHub; no git CLI knowledge is required. Source: app/desktop/git_sync/README.md.
This was introduced in v0.28.0 ("Automatic Git Sync") and remains a core v1.0 feature; the project README explicitly markets *"Git-native collaboration — the app syncs to Git automatically."* Source: README.md.
Extensibility and Packaging
Kiln is intentionally open at its edges. Three extensibility hooks are documented in the core library:
- Custom models and providers. Code can register additional model providers alongside OpenAI, OpenRouter, Ollama, etc. Source: libs/core/README.md.
- Programmatic runs. Tasks can be built, executed, and tagged in pure Python via kiln_ai.datamodel.TaskRun, with optional source and created_by metadata. Source: libs/core/README.md.
- Project packaging and export. The CLI module kiln_ai.cli.commands exposes helpers like package_project, package_project_for_training, export_task, export_task_runs, export_evals, export_documents, export_skills, and export_tool_servers. These produce a minimal folder for deployment or for fine-tuning providers (Fireworks, Together, Vertex). Source: test_package_project.py.
Community interest in this area is concrete: issue #251 requests a "Kiln-Unsloth bridge" because users currently must export JSONL, run a notebook, build GGUF, import to Ollama, and finally register the model — a gap that future packaging helpers could close. Source: Community issue #251. Likewise, issue #31 asks for UI exposure of generation parameters (temperature, max_tokens), which would complement the provider-level extensibility already available in the library. Source: Community issue #31.
Doramagic Pitfall Log
Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.
1. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/Kiln-AI/Kiln
2. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln
3. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/Kiln-AI/Kiln
4. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/Kiln-AI/Kiln
5. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln
6. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/Kiln-AI/Kiln
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Kiln with real data or production workflows.
- Kiln Desktop - v1.0.3 - github / github_release
- Kiln Desktop - v1.0 - github / github_release
- Kiln Desktop - v0.28.0 - github / github_release
- Kiln Desktop - v0.26.0 - github / github_release
- Kiln Desktop - v0.25.0 - github / github_release
- Kiln Desktop - v0.24.0 - github / github_release
- Kiln Desktop - v0.23.0 - github / github_release
- Kiln Desktop - v0.22.0 - github / github_release
- v0.21.1 - github / github_release
- v0.21.0 - github / github_release
- Capability evidence risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence