# https://github.com/Zhonghao1995/agentic-swmm-workflow Project Manual

Generated at: 2026-07-03 21:43:18 UTC

## Table of Contents

- [Project Overview and Goals](#page-1)
- [Agent Runtime and Orchestration](#page-2)
- [MCP Servers and Skill Layer](#page-3)
- [SWMM Build, Run, and Network Synthesis](#page-4)
- [Calibration, Uncertainty, and Water Quality](#page-5)
- [GIS, Climate, Plot, Report, and Review Skills](#page-6)
- [Modelling Memory and Learning Layer](#page-7)
- [Audit, Provenance, and Verification](#page-8)
- [Installation Paths and Runtime Configuration](#page-9)
- [CLI Commands and Doctor Diagnostics](#page-10)
- [Extensibility, Agent Runtimes, and Cross-Project Integration](#page-11)
- [Operations, Workflows, and Common Failure Modes](#page-12)

<a id='page-1'></a>

## Project Overview and Goals

### Related Pages

Related topics: [Agent Runtime and Orchestration](#page-2), [Extensibility, Agent Runtimes, and Cross-Project Integration](#page-11)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/README.md)
- [CITATION.cff](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/CITATION.cff)
- [CHANGELOG.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/CHANGELOG.md)
- [pyproject.toml](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/pyproject.toml)
- [src/aiswmm/__init__.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/src/aiswmm/__init__.py)
- [src/aiswmm/cli.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/src/aiswmm/cli.py)
- [docs/index.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/index.md)
</details>

# Project Overview and Goals

## Purpose and Scope

Agentic SWMM is an auditable, reproducible stormwater-modelling workflow that pairs EPA SWMM with an agent runtime and the Model Context Protocol (MCP). The project is published under the PyPI distribution name `aiswmm` and is described in its companion paper *"Agentic SWMM: Auditable and Reproducible Stormwater Modelling Workflow with Agent Skills and Model Context Protocol"* in *AI for Engineering* (DOI: 10.3390/aieng1010005). `Source: [README.md:1-40]()` `Source: [CITATION.cff:1-20]()`

The package's stated mission is to collapse the gap between a natural-language request and a deterministic, byte-stable SWMM run. A single sentence that names a WGS84 bounding box can drive the full pipeline: synthesise the drainage network from public OSM and DEM data, execute SWMM, write an audit dossier, and render a spatial network map, all deposited under a canonical `runs/<date>/<id>/` layout. `Source: [CHANGELOG.md:1-30]()` `Source: [README.md:42-90]()`

The codebase is organised so that every run is inspectable after the fact. Hashes, configuration snapshots, prompts, tool calls, and SWMM I/O are written alongside the model, making each artefact independently verifiable. `Source: [docs/index.md:1-60]()`

## Core Capabilities

The v0.7.x line consolidates six user-visible capability groups:

| Capability | Description |
|---|---|
| Natural-language synthesis | One sentence + bounding box → end-to-end SWMM run |
| Agent skills & MCP tools | Typed-tool surface exposed to the runtime |
| Calibration & sensitivity | Agent-reachable parameter sweeps and fits |
| Design storms | Synthetic event generation for planning studies |
| Memory observability | `aiswmm doctor` and `aiswmm memory repair-sessions` |
| Install UX | One-line installers for Windows, macOS, and Linux |

`Source: [CHANGELOG.md:30-120]()` `Source: [src/aiswmm/cli.py:1-80]()`

A single CLI entry point, `aiswmm`, is registered through `pyproject.toml` and dispatches to verbs such as `doctor`, `memory`, and skill invocations. The Claude Agent SDK is offered as an optional extra via `pip install "aiswmm[claude]==…"`, isolating the heavy agent dependency from the default install. `Source: [pyproject.toml:1-80]()` `Source: [src/aiswmm/__init__.py:1-40]()`

## Architecture and Workflow

```mermaid
flowchart LR
    A[NL request + WGS84 bbox] --> B[Skill router]
    B --> C[SWMManywhere synthesis]
    C --> D[SWMM engine run]
    D --> E[Audit dossier]
    D --> F[Spatial network map]
    B -.tools.-> G[(MCP typed tools)]
    D --> H[(runs/date/id/)]
    E --> H
    F --> H
```

The user-facing flow starts with a free-form sentence; the skill router matches it against consolidated keyword tables and invokes the appropriate MCP tools. Network synthesis is delegated to SWMManywhere (Imperial College), and the resulting INP file is executed by an embedded SWMM binary. Determinism is enforced through pinned dependency versions and auto-built Docker images that are immutable once tagged. `Source: [CHANGELOG.md:120-200]()` `Source: [src/aiswmm/cli.py:80-160]()`

Every run writes a self-describing directory so external reviewers can replay, diff, or audit a study without contacting the original analyst. `Source: [docs/index.md:60-140]()`

## Release Lineage and Stability Guarantees

The project separates stable releases from pre-releases using the version semantics defined in PEP 440. According to the release notes, `pip install aiswmm` resolves to the latest stable tag (pip skips pre-releases unless one is requested explicitly), while `pip install aiswmm==0.7.0a1` (and similar alpha pins) locks to a specific pre-release. The v0.6.4 tag is the first non-pre-release on the 0.6.x line and underwrites the byte-reproducibility claims of the companion paper. `Source: [CHANGELOG.md:200-280]()` `Source: [pyproject.toml:80-140]()`

Maintenance signals visible to operators include the `aiswmm doctor` health check, which classifies the Sessions store into `ok / corrupt / unreadable / absent` and exits non-zero on data-loss conditions so CI pipelines can catch regressions. The companion `aiswmm memory repair-sessions` verb was introduced specifically to make memory stores self-healing rather than write-only. `Source: [CHANGELOG.md:280-360]()` `Source: [src/aiswmm/cli.py:160-240]()`
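The four-state classification can be pictured with a minimal sketch. The function names, the assumption that the store is a single JSON file, and the exact exit-code mapping are illustrative only; the real check behind `aiswmm doctor` may work differently.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical: the states treated as data-loss conditions.
DATA_LOSS_STATES = {"corrupt", "unreadable"}


def classify_sessions_store(path: Path) -> str:
    """Classify a sessions store, assumed here to be one JSON file."""
    if not path.exists():
        return "absent"
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return "unreadable"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "corrupt"
    return "ok"


def exit_code(state: str) -> int:
    """Non-zero only for data-loss conditions, so a CI job fails on them."""
    return 1 if state in DATA_LOSS_STATES else 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "sessions.json"
        print(classify_sessions_store(store))  # file has not been created yet
        store.write_text("{not json", encoding="utf-8")
        state = classify_sessions_store(store)
        print(state, exit_code(state))
```

Keeping classification separate from the exit-code decision means a missing store can be reported without failing a pipeline that has simply never run a session.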

In summary, the project positions itself as the reproducible, agent-driven control plane over a traditional hydrological-and-hydraulic simulator, with installation, observability, and determinism treated as first-class concerns rather than afterthoughts. `Source: [README.md:90-160]()` `Source: [docs/index.md:140-220]()`

---

<a id='page-2'></a>

## Agent Runtime and Orchestration

### Related Pages

Related topics: [MCP Servers and Skill Layer](#page-3), [Operations, Workflows, and Common Failure Modes](#page-12)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/agent/runtime.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/runtime.py)
- [agentic_swmm/agent/repl.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/repl.py)
- [agentic_swmm/agent/planner.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/planner.py)
- [agentic_swmm/agent/intent_classifier.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/intent_classifier.py)
- [agentic_swmm/agent/executor.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/executor.py)
- [agentic_swmm/agent/tool_registry.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/tool_registry.py)
- [agentic_swmm/cli/main.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/cli/main.py)
- [agentic_swmm/memory/sessions.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/sessions.py)
</details>

# Agent Runtime and Orchestration

The Agent Runtime is the control plane of `aiswmm`. It sits between the user-facing REPL or programmatic entry points and the deterministic SWMM execution layer, deciding *what* the user wants, *how* to break that intent into typed tool calls, and *when* to persist state so that every modelling run is auditable and reproducible. v0.7.0 promoted the runtime to a stable, documented subsystem and v0.7.2 extended it to reach calibration and sensitivity tools.

Source: [agentic_swmm/agent/runtime.py:1-40]()

## High-Level Responsibilities

The runtime coordinates five concerns that the rest of the system depends on:

1. **Intent grounding** — converting a free-form natural-language request into a structured intent.
2. **Planning** — turning the intent into an ordered list of tool invocations and side-effects.
3. **Dispatch** — handing each step to a typed tool implementation registered in the tool registry.
4. **Persistence** — recording session state, run manifests, and audit artefacts in `runs/<date>/<id>/`.
5. **Self-checks** — surfacing environment diagnostics through `aiswmm doctor`.

Source: [agentic_swmm/agent/runtime.py:42-110]()

## Request Lifecycle

A single turn follows a predictable pipeline. The REPL captures input and forwards it to the runtime; the runtime never acts on raw text directly, so every request is classified and its arguments validated before any tool runs.

```mermaid
flowchart LR
    U[User input] --> R[REPL<br/>repl.py]
    R --> IC[Intent Classifier<br/>intent_classifier.py]
    IC --> P[Planner<br/>planner.py]
    P --> E[Executor<br/>executor.py]
    E --> TR[Tool Registry<br/>tool_registry.py]
    TR --> S[Session Store<br/>memory/sessions.py]
    E --> A[Audit artefacts<br/>runs/&lt;date&gt;/&lt;id&gt;/]
    A --> U
```

1. **REPL intake** — `repl.py` owns greeting, warm-intro caching (fixed in v0.6.2a1 to fire once per session rather than every greeting), and command parsing. It also dispatches non-agent verbs such as `aiswmm doctor` or `aiswmm memory repair-sessions` directly without invoking the runtime. Source: [agentic_swmm/agent/repl.py:30-90]()
2. **Intent classification** — `intent_classifier.py` extracts the request type (network synthesis, calibration, sensitivity, design storm, etc.) and any constraints (bounding box, return period, output path). As of v0.6.3a1, the keyword-matching logic was consolidated into this single module instead of being scattered across six callers. Source: [agentic_swmm/agent/intent_classifier.py:1-70]()
3. **Planning** — `planner.py` produces an ordered, typed step list. Each step references a tool name and pre-validated arguments. The planner is the only component that decides ordering; downstream consumers treat the plan as immutable. Source: [agentic_swmm/agent/planner.py:25-95]()
4. **Execution** — `executor.py` walks the plan, resolves tools through the registry, executes them in a sandboxed subprocess for SWMM calls, and streams results back to the REPL. It is also responsible for failure handling and run-dossier generation. Source: [agentic_swmm/agent/executor.py:1-80]()
5. **Tool resolution** — `tool_registry.py` is a typed catalogue. v0.7.2 introduced three new skills — calibration, sensitivity, and design storms — which had to be registered here before the planner could reach them. Source: [agentic_swmm/agent/tool_registry.py:1-60]()
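The five steps above can be sketched as three small functions wired together. Every name here (`Step`, `classify_intent`, `plan`, `execute`, the keyword table, the tool names) is a hypothetical stand-in, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    """One planned tool invocation (hypothetical shape)."""
    tool: str
    args: dict


def classify_intent(text: str) -> str:
    """Toy keyword table; the real classifier also extracts constraints."""
    keywords = {"calibrat": "calibration", "sensitiv": "sensitivity", "storm": "design_storm"}
    lowered = text.lower()
    for needle, intent in keywords.items():
        if needle in lowered:
            return intent
    return "network_synthesis"


def plan(intent: str) -> tuple[Step, ...]:
    """Return an ordered step list; a tuple keeps the plan immutable."""
    if intent == "network_synthesis":
        return (Step("synthesise_network", {}), Step("run_simulation", {}))
    return (Step(intent, {}),)


def execute(steps, registry: dict[str, Callable[[dict], str]]) -> list[str]:
    """Resolve each step through the registry and run the steps in order."""
    return [registry[step.tool](step.args) for step in steps]


# Stub registry: each tool just reports that it ran.
registry = {
    name: (lambda args, n=name: f"{n}: done")
    for name in ("synthesise_network", "run_simulation", "calibration", "sensitivity", "design_storm")
}
print(execute(plan(classify_intent("Model the area in this bounding box")), registry))
```

Because the planner returns a tuple of frozen steps, the executor cannot reorder or extend the plan, which mirrors the rule that downstream consumers treat the plan as immutable.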

## Session and Run Persistence

The runtime writes deterministic artefacts for every turn. The layout under `runs/<date>/<id>/` is the canonical audit dossier referenced by the v0.7.1 release notes, whose pipeline is: synthesise network → run SWMM → write dossier → render map. The sessions store is independently validated by `aiswmm doctor`, which reports one of four states (`ok`, `corrupt`, `unreadable`, or `absent`) and exits non-zero on `corrupt` or `unreadable` so that CI catches data-loss conditions. A companion `aiswmm memory repair-sessions` verb (v0.7.0a2) attempts to restore damaged stores before future runs rely on them.

Source: [agentic_swmm/agent/runtime.py:140-210]()

Source: [agentic_swmm/memory/sessions.py:45-120]()

## Extension Points

Operators add capabilities by registering new tools, not by patching the runtime. The contract is:

| Concern | Where to edit |
|---|---|
| New typed tool | Add an entry to `tool_registry.py` |
| New intent category | Extend `intent_classifier.py` keywords |
| New planning strategy | Subclass `planner.py` |
| New shell verb | Register in `cli/main.py` (routes around runtime) |

This split is why v0.7.2 could ship three new skills without touching the runtime core — only the registry, the classifier, and (where needed) the planner needed updates.

Source: [agentic_swmm/agent/tool_registry.py:80-140]()
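A minimal sketch of what "registering a typed tool" could look like is given here. The decorator, the `TOOL_REGISTRY` dictionary, the type-based schema, and the example tool are all assumptions for illustration; the real `tool_registry.py` may use a different mechanism.

```python
from typing import Callable

# Hypothetical registry: tool name -> (argument schema, handler).
TOOL_REGISTRY: dict[str, tuple[dict[str, type], Callable[..., dict]]] = {}


def register_tool(name: str, schema: dict[str, type]):
    """Decorator that records a handler together with its expected argument types."""
    def decorator(func):
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        TOOL_REGISTRY[name] = (schema, func)
        return func
    return decorator


def dispatch(name: str, **kwargs) -> dict:
    """Validate argument names and types, then call the handler."""
    schema, func = TOOL_REGISTRY[name]
    if set(kwargs) != set(schema):
        raise TypeError(f"{name} expects arguments {sorted(schema)}")
    for key, expected in schema.items():
        if not isinstance(kwargs[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return func(**kwargs)


@register_tool("design_storm", {"return_period_yr": int, "duration_h": float})
def design_storm(return_period_yr: int, duration_h: float) -> dict:
    # Placeholder body: a real tool would generate a hyetograph here.
    return {"return_period_yr": return_period_yr, "duration_h": duration_h}


print(dispatch("design_storm", return_period_yr=10, duration_h=6.0))
```

With this shape, adding a capability touches only the registry entry and its handler, which is the property the table above relies on.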

## Operational Notes

- The `aiswmm` CLI boots the runtime only for agent turns; for non-agent verbs such as diagnostics, `cli/main.py` is the only entry point that may short-circuit it.
- Byte-level reproducibility (v0.6.4) depends on the runtime pinning environment variables before dispatch — do not bypass `executor.py` with ad-hoc subprocess calls.
- If `aiswmm doctor` reports a sessions-store problem, run `aiswmm memory repair-sessions` *before* starting any modelling session, otherwise the planner may plan against an unreadable memory state.

Source: [agentic_swmm/cli/main.py:1-70]()

---

<a id='page-3'></a>

## MCP Servers and Skill Layer

### Related Pages

Related topics: [Agent Runtime and Orchestration](#page-2), [Extensibility, Agent Runtimes, and Cross-Project Integration](#page-11)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [integrations/mcp/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/integrations/mcp/README.md)
- [integrations/skills/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/integrations/skills/README.md)
- [skills/swmm-end-to-end/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-end-to-end/SKILL.md)
- [mcp/swmm-runner/server.js](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/mcp/swmm-runner/server.js)
- [mcp/swmm-builder/server.js](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/mcp/swmm-builder/server.js)
- [mcp/swmm-network/server.js](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/mcp/swmm-network/server.js)
- [skills/swmm-calibrate/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-calibrate/SKILL.md)
</details>

# MCP Servers and Skill Layer

The agentic-swmm-workflow project exposes its stormwater-modelling capabilities to LLM agents through two cooperating layers: a **Model Context Protocol (MCP) server layer** that provides typed, JSON-RPC tools, and a **Skill layer** that packages higher-level procedures (prompts, scripts, and conventions) for the agent to invoke. Together they implement the agent-reachable surface of the workflow, which is the focus of the v0.7.2 release notes describing "agent-reachable calibration & sensitivity, three new skills, design storms, memory observability" `Source: [integrations/mcp/README.md:1-40]()`.

## 1. MCP Server Layer

The `integrations/mcp` directory bundles three MCP servers, each implemented in Node.js as a JSON-RPC stdio process and scoped to a single modelling phase. `Source: [integrations/mcp/README.md:10-38]()`

| Server | Module Path | Responsibility |
| --- | --- | --- |
| `swmm-runner` | `mcp/swmm-runner/server.js` | Executes a prepared `.inp` model, returns diagnostics, and writes outputs into the standard `runs/<date>/<id>/` layout. |
| `swmm-builder` | `mcp/swmm-builder/server.js` | Synthesises and edits SWMM input files (subcatchments, conduits, controls). |
| `swmm-network` | `mcp/swmm-network/server.js` | Generates the drainage topology from OSM + DEM via SWMManywhere and exports spatial artefacts. |

Each server registers tools with explicit JSON Schemas so the agent receives parameter validation and typed responses. For example, `swmm-runner` exposes a `run_simulation` tool whose arguments include `inp_path`, `duration_h`, and `output_dir`; results carry the path to the `.rpt` and `.out` files plus a content hash, which downstream skills use to verify byte-reproducibility. `Source: [mcp/swmm-runner/server.js:40-120]()` Similarly, `swmm-builder` returns an updated `.inp` plus a diff summary rather than mutating state implicitly. `Source: [mcp/swmm-builder/server.js:55-90]()` The `swmm-network` server isolates external dependencies (OSMnx, rasterio, SWMManywhere) behind a single `synthesise_network(bbox, out_dir)` call. `Source: [mcp/swmm-network/server.js:30-80]()`
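For orientation, a tool invocation travels as a JSON-RPC 2.0 message using the MCP `tools/call` method. The sketch below builds such a request for `run_simulation`; the argument names come from the paragraph above, while the helper name and the argument values are illustrative assumptions.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names follow the text above; the values are placeholders.
request = build_tool_call(
    1,
    "run_simulation",
    {"inp_path": "model.inp", "duration_h": 24, "output_dir": "runs/example"},
)
print(request)
```

The server would validate `arguments` against the tool's JSON Schema before executing anything, so a malformed call fails fast instead of reaching the SWMM engine.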

```mermaid
flowchart LR
    A[LLM Agent] -->|JSON-RPC| B(swmm-network)
    A -->|JSON-RPC| C(swmm-builder)
    A -->|JSON-RPC| D(swmm-runner)
    A -->|invoke| E[Skill Layer]
    E --> B
    E --> C
    E --> D
```

## 2. Skill Layer

Sitting one level above the MCP tools, **skills** are the "verbs" the agent is taught to recognise in natural language. Each skill is a markdown document under `skills/<name>/SKILL.md` that pairs a YAML front-matter manifest (name, description, allowed tools) with a prose procedure. `Source: [integrations/skills/README.md:5-45]()` The v0.7.2 release added three new skills covering calibration, sensitivity analysis, and design-storm generation. `Source: [skills/swmm-calibrate/SKILL.md:1-30]()`

The flagship skill is `swmm-end-to-end`, which converts a single WGS84 bounding-box sentence into a complete run:

1. Call `swmm-network.synthesise_network` to obtain the topology. `Source: [skills/swmm-end-to-end/SKILL.md:20-35]()`
2. Delegate subcatchment/conduit assembly to `swmm-builder`. `Source: [skills/swmm-end-to-end/SKILL.md:36-50]()`
3. Hand the resulting `.inp` to `swmm-runner.run_simulation`. `Source: [skills/swmm-end-to-end/SKILL.md:51-68]()`
4. Persist inputs, outputs, and the prompt audit trail under `runs/<date>/<id>/`. `Source: [skills/swmm-end-to-end/SKILL.md:69-82]()`

Skills are deliberately **declarative**: they describe *what* the agent should do, while the MCP servers define *how* a step executes. This separation lets new procedures be added by dropping a markdown file into `skills/`, without modifying any JavaScript code. `Source: [integrations/skills/README.md:46-72]()`
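To make the "manifest plus prose" structure concrete, here is a minimal sketch that splits a skill document into its front matter and body. It handles only flat `key: value` pairs and is not the project's loader; a real implementation would use a proper YAML parser. The example document is invented for illustration.

```python
def parse_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a skill document into (manifest, body).

    Only flat `key: value` pairs are handled in this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    manifest: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return manifest, body
        if ":" in line:
            key, value = line.split(":", 1)
            manifest[key.strip()] = value.strip()
    raise ValueError("front matter is not closed with '---'")


# Hypothetical skill document; field names follow the description above.
EXAMPLE = """---
name: swmm-end-to-end
description: Bounding box sentence to complete SWMM run
---
1. Synthesise the network.
2. Run the simulation.
"""

manifest, body = parse_front_matter(EXAMPLE)
print(manifest["name"])
print(body)
```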

## 3. Determinism, Auditing, and Error Surfaces

Because the paper-grade claims of the project depend on reproducible runs, both layers funnel every action through the `runs/<date>/<id>/` layout and emit content hashes. The MCP servers are required to be stateless: long-lived memory (sessions, observations, repair events) lives in the memory module and is queried through the skill layer rather than cached in the servers themselves. `Source: [mcp/swmm-runner/server.js:1-30]()` When a tool call fails, the server returns a structured error object (`{code, message, retryable}`) that the skill procedure can pattern-match before deciding whether to abort, retry, or repair the session. `Source: [mcp/swmm-builder/server.js:140-170]()`
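A caller can branch on the `retryable` flag of that error object. The sketch below assumes the error is nested under an `"error"` key of the response and that three attempts is a sensible cap; both are illustrative choices, not documented behaviour.

```python
from typing import Callable


def call_with_policy(tool: Callable[[], dict], max_attempts: int = 3) -> dict:
    """Retry a tool call while the server marks the failure as retryable."""
    last_error: dict = {}
    for _ in range(max_attempts):
        response = tool()
        error = response.get("error")
        if error is None:
            return response
        last_error = error
        if not error.get("retryable", False):
            break  # non-retryable failure: abort immediately
    raise RuntimeError(f"{last_error.get('code')}: {last_error.get('message')}")


# Fake tool that fails once with a retryable error, then succeeds.
attempts = {"count": 0}


def flaky_tool() -> dict:
    attempts["count"] += 1
    if attempts["count"] < 2:
        return {"error": {"code": "E_BUSY", "message": "engine busy", "retryable": True}}
    return {"result": "ok"}


print(call_with_policy(flaky_tool))
```

A third branch, repairing the session before retrying, would slot in where the loop currently breaks on non-retryable errors.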

## 4. Operating the Stack

Installers introduced in v0.7.3 provision the full toolchain (Node, Python, SWMM, and the MCP servers) in one command; each installer defaults to the latest published release so the JSON-RPC contracts above stay aligned with what `aiswmm` dispatches at runtime. `Source: [integrations/mcp/README.md:75-95]()` Developers iterating on a single server can launch it in isolation with `node mcp/swmm-runner/server.js` and attach any MCP-compatible client (Claude Desktop, the `aiswmm` agent runtime, or `mcp-cli`) for live testing. `Source: [mcp/swmm-runner/server.js:200-230]()`

In summary, the MCP servers expose narrow, validated primitives, the skill layer composes those primitives into user-visible procedures, and the v0.7.x line ensures the two stay deterministic, auditable, and installable end-to-end.

---

<a id='page-4'></a>

## SWMM Build, Run, and Network Synthesis

### Related Pages

Related topics: [Calibration, Uncertainty, and Water Quality](#page-5), [GIS, Climate, Plot, Report, and Review Skills](#page-6)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/agent/swmm_runtime/run_artifacts.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/run_artifacts.py)
- [agentic_swmm/agent/swmm_runtime/run_manifests.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/run_manifests.py)
- [agentic_swmm/agent/swmm_runtime/preflight.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/preflight.py)
- [agentic_swmm/agent/swmm_runtime/postflight.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/postflight.py)
- [agentic_swmm/agent/swmm_runtime/inp_parsing.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/inp_parsing.py)
- [agentic_swmm/agent/swmm_runtime/rpt_summary.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/rpt_summary.py)
</details>

# SWMM Build, Run, and Network Synthesis

## Purpose and Scope

The SWMM Build, Run, and Network Synthesis subsystem is the execution core of the agentic-swmm-workflow. It transforms a natural-language description of a study area — typically a WGS84 bounding box — into a fully simulated EPA SWMM model whose inputs, outputs, and provenance are written to a deterministic on-disk layout under `runs/<date>/<id>/`. Three responsibilities are layered into this subsystem: (1) synthesising a drainage network from public OSM and DEM data via SWMManywhere, (2) launching the SWMM engine against the produced `.inp` file, and (3) capturing every artifact, manifest, and report into an auditable record.

As highlighted in the v0.7.1 release, a single sentence referring to a bounding box now drives the end-to-end workflow, and v0.7.2 layered agent-reachable calibration, sensitivity, and design-storm skills on top of that foundation. The runtime modules guarantee that the byte-level reproducibility claims made in the companion *AI for Engineering* paper are honoured at the filesystem level.

## Pipeline Architecture

The pipeline follows a strict **preflight → build → run → postflight** sequence. Each stage is independently testable and emits a manifest fragment so that a failure can be attributed to a single phase rather than the whole run.

```mermaid
flowchart LR
    A[Bounding Box + Intent] --> B[preflight.py]
    B --> C[Network Synthesis<br/>SWMManywhere]
    C --> D[.inp generation<br/>inp_parsing.py]
    D --> E[SWMM Engine Run]
    E --> F[.rpt + .out capture<br/>rpt_summary.py]
    F --> G[postflight.py]
    G --> H[run_artifacts.py<br/>+ run_manifests.py]
```

Source: [agentic_swmm/agent/swmm_runtime/preflight.py:1-40](), [agentic_swmm/agent/swmm_runtime/postflight.py:1-40](), [agentic_swmm/agent/swmm_runtime/run_artifacts.py:1-30]()
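The idea of per-phase manifest fragments can be sketched as a small driver that stops at the first failing phase and records which one it was. The `run_pipeline` function, the fragment fields, and the stub phases are hypothetical and only illustrate the attribution pattern.

```python
from typing import Callable

Phase = Callable[[dict], dict]


def run_pipeline(phases: list[tuple[str, Phase]], context: dict) -> list[dict]:
    """Run phases in order; stop at the first failure and record which phase failed."""
    fragments = []
    for name, phase in phases:
        try:
            # Each phase returns new context entries for the phases after it.
            context = {**context, **phase(context)}
            fragments.append({"phase": name, "status": "ok"})
        except Exception as exc:  # attribute the failure to this phase only
            fragments.append({"phase": name, "status": "failed", "error": str(exc)})
            break
    return fragments


def failing_run(context: dict) -> dict:
    raise RuntimeError("engine binary not found")


fragments = run_pipeline(
    [
        ("preflight", lambda ctx: {"checked": True}),
        ("build", lambda ctx: {"inp": "model.inp"}),
        ("run", failing_run),
        ("postflight", lambda ctx: {}),
    ],
    {},
)
print(fragments)
```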

## Network Synthesis (Build Phase)

The build phase produces a `.inp` file from open data. SWMManywhere — the Imperial College synthesis tool — is invoked with the bounding box and an optional subcatchment hint, returning a GeoPackage containing conduits, junctions, subcatchments, and outfalls. The agentic runtime then validates that the synthesised network is hydrologically plausible (minimum conduit count, presence of an outfall, CRS correctness) before writing the SWMM input file.

The resulting `.inp` is the single source of truth for the downstream simulation. `inp_parsing.py` provides structured readers so that agents and skills can inspect options, raingages, and subcatchment geometry without re-parsing the raw text — this is what makes the calibration, sensitivity, and design-storm skills introduced in v0.7.2 deterministic.

Source: [agentic_swmm/agent/swmm_runtime/inp_parsing.py:1-50](), [agentic_swmm/agent/swmm_runtime/preflight.py:20-80]()
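The plausibility checks named above (minimum conduit count, presence of an outfall, CRS correctness) might look like the following sketch. The dictionary layout of `network`, the default threshold, and the function name are assumptions; the expected CRS is passed in by the caller because the text does not state which one the workflow requires.

```python
def validate_network(network: dict, expected_crs: str, min_conduits: int = 1) -> list[str]:
    """Return a list of plausibility problems; an empty list means the checks passed."""
    problems = []
    if len(network.get("conduits", [])) < min_conduits:
        problems.append(f"fewer than {min_conduits} conduit(s)")
    if not network.get("outfalls"):
        problems.append("no outfall present")
    if network.get("crs") != expected_crs:
        problems.append(f"CRS is {network.get('crs')!r}, expected {expected_crs!r}")
    return problems


# Hypothetical network summary with a missing outfall.
example = {"conduits": ["C1", "C2"], "outfalls": [], "crs": "EPSG:32630"}
print(validate_network(example, expected_crs="EPSG:32630"))
```

Returning every problem at once, instead of raising on the first, gives the agent a complete picture before it decides whether to re-synthesise the network.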

## Run Execution

The run phase shells out to the EPA SWMM 5.x engine using a pinned binary or the official Docker image, with the byte-reproducibility guarantees introduced in v0.6.4 enforced through dependency pinning. `preflight.py` confirms the engine binary is reachable, that the working directory is writable, and that disk quotas are not exceeded before launching the simulation. `postflight.py` then re-reads the produced `.rpt` and `.out` files to verify that the engine reported zero continuity errors and that all expected sections are present in the report.

During execution, `run_artifacts.py` snapshots the input file, the synthesised GeoPackage, the engine version, and a SHA-256 digest of each artifact into the run directory. This snapshot is what makes the workflow auditable: a reviewer can later reconstruct the exact `.inp` and engine version that produced a given set of results.

Source: [agentic_swmm/agent/swmm_runtime/preflight.py:40-120](), [agentic_swmm/agent/swmm_runtime/postflight.py:40-120](), [agentic_swmm/agent/swmm_runtime/run_artifacts.py:30-90]()
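A snapshot of SHA-256 digests can be produced with the standard library alone. The sketch below is an illustration of the technique, not the contents of `run_artifacts.py`; the function names and the file names in the demo are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large `.out` files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(run_dir: Path) -> dict[str, str]:
    """Map each file under run_dir (relative POSIX path) to its SHA-256 digest."""
    return {
        path.relative_to(run_dir).as_posix(): sha256_of(path)
        for path in sorted(run_dir.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        (run_dir / "model.inp").write_bytes(b"[TITLE]\nexample\n")
        for name, digest in snapshot(run_dir).items():
            print(name, digest)
```

Two runs are byte-identical exactly when their snapshots are equal, which turns a reproducibility claim into a dictionary comparison.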

## Output Capture and Manifests

Once the engine terminates, `rpt_summary.py` distils the verbose `.rpt` file into a compact summary covering simulation duration, rainfall totals, peak runoff, and any warnings. `run_manifests.py` writes the deterministic `manifest.json` and `audit.json` files that index every artifact under `runs/<date>/<id>/`, satisfying the "deterministic audit dossier" requirement announced in v0.7.1.

The manifest schema is intentionally flat and human-readable, so that downstream skills (calibration, sensitivity, design storms) can locate predecessor artifacts without traversing the agent conversation history. This decoupling is what allows the modelling memory subsystem (introduced across the v0.7.0 series) to revisit a prior run as a black box and re-use its inputs.

Source: [agentic_swmm/agent/swmm_runtime/rpt_summary.py:1-60](), [agentic_swmm/agent/swmm_runtime/run_manifests.py:1-80](), [agentic_swmm/agent/swmm_runtime/run_artifacts.py:60-140]()
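One way to keep a manifest deterministic is to sort every collection and fix the serialisation options, so that equal inputs always yield byte-identical JSON. The field names below (`engine_version`, `artifacts`, `path`, `sha256`) are assumptions for illustration; the real `manifest.json` schema is not reproduced here.

```python
import json


def render_manifest(artifacts: dict[str, str], engine_version: str) -> str:
    """Serialise a flat manifest so that equal inputs give byte-identical output."""
    manifest = {
        "engine_version": engine_version,
        "artifacts": [
            {"path": path, "sha256": digest}
            for path, digest in sorted(artifacts.items())
        ],
    }
    # sort_keys and a fixed indent remove any dependence on insertion order.
    return json.dumps(manifest, sort_keys=True, indent=2) + "\n"


first = render_manifest({"model.rpt": "bb", "model.inp": "aa"}, "5.x")
second = render_manifest({"model.inp": "aa", "model.rpt": "bb"}, "5.x")
print(first == second)
```

Note that the sketch deliberately omits timestamps: any wall-clock value written into the manifest would break byte-level comparison between replays.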

## Summary

The SWMM Build, Run, and Network Synthesis subsystem is the deterministic backbone of the agentic workflow. Network synthesis, runtime execution, and artifact capture are separated into independently verifiable phases so that the byte-reproducibility, auditability, and agent-reachability guarantees published alongside the v0.7.x line all hold at the filesystem level rather than only in documentation.

---

<a id='page-5'></a>

## Calibration, Uncertainty, and Water Quality

### Related Pages

Related topics: [SWMM Build, Run, and Network Synthesis](#page-4), [Modelling Memory and Learning Layer](#page-7)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/agent/calibration_batch.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/calibration_batch.py)
- [agentic_swmm/agent/swmm_runtime/calibration_runner.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/calibration_runner.py)
- [agentic_swmm/agent/swmm_runtime/uncertainty_plan.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/uncertainty_plan.py)
- [agentic_swmm/agent/swmm_runtime/design_storm.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/design_storm.py)
- [agentic_swmm/agent/tool_handlers/swmm_calibration.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/tool_handlers/swmm_calibration.py)
- [agentic_swmm/agent/tool_handlers/swmm_uncertainty.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/tool_handlers/swmm_uncertainty.py)
</details>

# Calibration, Uncertainty, and Water Quality

This page documents the agent-reachable surface of the Agentic SWMM workflow responsible for **calibration**, **uncertainty quantification**, and the **design-storm / water-quality** inputs that drive those analyses. Together these modules close the loop between a synthesised drainage network and a defensible, auditable model output. The v0.7.2 release notes describe this surface as the "agent-reachable calibration & sensitivity" wave, which is the canonical entry point for the components below. `Source: [CHANGELOG](https://github.com/Zhonghao1995/agentic-swmm-workflow/releases/tag/v0.7.2)`

## Scope and high-level role

The calibration and uncertainty subsystem sits between the **swmm_runtime** layer (which executes EPA-SWMM simulations deterministically) and the **tool_handlers** layer (which exposes typed tools to the agent runtime). It is responsible for:

- Preparing a deterministic **plan** of SWMM runs with perturbed parameters.
- Executing that plan in **batches** so that long calibration campaigns can be resumed, retried, and audited.
- Scoring each run against an objective (typically NSE, KGE, or PBIAS on observed flow/quality series).
- Producing an uncertainty envelope and parameter-sensitivity ranking that can be re-consumed by the agent in subsequent turns.
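For reference, the three objective functions named in the list can be written in a few lines. These are the standard textbook definitions (NSE, PBIAS with the convention that positive values indicate underestimation, and the 2009 form of KGE), offered as a sketch; the project's own scoring code may differ in conventions or edge-case handling.

```python
import math


def nse(obs: list[float], sim: list[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs: list[float], sim: list[float]) -> float:
    """Percent bias; positive means the simulation underestimates on average."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def kge(obs: list[float], sim: list[float]) -> float:
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    n = len(obs)
    mu_o, mu_s = sum(obs) / n, sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)
    return 1.0 - math.sqrt((r - 1) ** 2 + (sd_s / sd_o - 1) ** 2 + (mu_s / mu_o - 1) ** 2)


# Illustrative series only; not data from the project.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [0.9, 1.8, 2.7, 3.6]
print(nse(observed, simulated), pbias(observed, simulated), kge(observed, simulated))
```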

The design-storm and water-quality components supply the **forcings** (rainfall hyetographs, pollutant loadings) that those calibration runs operate on, ensuring that the calibrated parameters are tied to a reproducible storm scenario rather than ad-hoc inputs.

## Module map

| Module | Role | Consumes | Produces |
| --- | --- | --- | --- |
| `swmm_runtime/calibration_runner.py` | Executes one calibration iteration (one parameter set × one storm) | A `.inp` template, a parameter vector | A scored run artefact |
| `swmm_runtime/uncertainty_plan.py` | Builds the parameter-space sampling plan (Sobol / LHS / grid) | Parameter priors, bounds, sample count | A `plan.json` + run manifest |
| `swmm_runtime/design_storm.py` | Generates design-storm hyetographs (e.g. SCS Type II, IDF-based) and attaches land-use-based pollutant loading | Area of interest (AOI), return period, duration | `.dat` rainfall + quality inputs |
| `agent/calibration_batch.py` | Orchestrates many `calibration_runner` invocations; handles resume/retry/diff | `plan.json`, observed series | Batch directory + manifest |
| `tool_handlers/swmm_calibration.py` | Exposes calibration to the agent as typed tool calls | Natural-language request | Hand-off to `calibration_batch` |
| `tool_handlers/swmm_uncertainty.py` | Exposes sensitivity / uncertainty tool calls | `plan.json`, scored results | Summary tables + plots |

This separation keeps **deterministic numerics** in `swmm_runtime/` and **agent-facing schemas** in `tool_handlers/`, which is the same boundary used elsewhere in the codebase (v0.6.3a1 consolidated the keyword routing that the agent uses to dispatch into these handlers). `Source: [agentic_swmm/agent/swmm_runtime/calibration_runner.py:1-40](), [agentic_swmm/agent/tool_handlers/swmm_calibration.py:1-40]()`

## Calibration pipeline

A calibration campaign follows four phases:

1. **Plan** — `uncertainty_plan.py` enumerates parameter combinations according to the configured sampler (Latin Hypercube for broad exploration, Sobol for sensitivity indices, or grid for low-dim re-checks). The plan is written to `runs/<date>/<id>/plan.json` so that downstream tools can resume mid-campaign. `Source: [agentic_swmm/agent/swmm_runtime/uncertainty_plan.py:1-60]()`
2. **Execute** — `calibration_batch.py` walks the plan and calls `calibration_runner.py` once per row. Each row produces a self-contained run directory with its own `.inp`, `.rpt`, and `.out`, which is what gives the system its byte-reproducibility story (v0.6.4). `Source: [agentic_swmm/agent/calibration_batch.py:1-80](), [agentic_swmm/agent/swmm_runtime/calibration_runner.py:1-80]()`
3. **Score** — The runner reports one or more objective-function values (configurable in the tool schema exposed by `swmm_calibration.py`). Results are appended to `scores.csv` next to the plan. `Source: [agentic_swmm/agent/tool_handlers/swmm_calibration.py:40-120]()`
4. **Surface** — `swmm_uncertainty.py` consumes `scores.csv` + `plan.json` and emits the sensitivity ranking and uncertainty bands that the agent quotes back to the user. `Source: [agentic_swmm/agent/tool_handlers/swmm_uncertainty.py:1-80]()`
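The Plan phase can be illustrated with a seeded Latin Hypercube sampler written against the standard library. This is a sketch of the sampling technique only; the function name, the parameter names, and the bounds are hypothetical and the real `uncertainty_plan.py` may rely on a dedicated library.

```python
import random


def latin_hypercube(
    bounds: dict[str, tuple[float, float]], n_samples: int, seed: int
) -> list[dict[str, float]]:
    """Draw n_samples rows so that each parameter hits each of n_samples strata once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each stratum, then shuffle strata across rows.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)
        columns[name] = [low + p * (high - low) for p in points]
    return [{name: columns[name][row] for name in bounds} for row in range(n_samples)]


# Hypothetical calibration parameters and ranges.
plan_rows = latin_hypercube(
    {"manning_n": (0.01, 0.03), "pct_imperv": (10.0, 90.0)}, n_samples=4, seed=2024
)
for row in plan_rows:
    print(row)
```

Fixing the seed is what makes the plan itself reproducible: the same bounds, sample count, and seed always enumerate the same rows, so a resumed campaign sees the plan it started with.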

```mermaid
flowchart LR
  A[uncertainty_plan.py] -->|plan.json| B[calibration_batch.py]
  B --> C[calibration_runner.py]
  C -->|scores.csv| D[swmm_uncertainty.py]
  B --> E[design_storm.py]
  E -->|.dat forcing| C
  H[agent tool call] -->|typed schema| F[swmm_calibration.py]
  H -->|typed schema| G[swmm_uncertainty.py]
  F --> B
  G --> D
```

## Design storms and water quality

`design_storm.py` produces the synthetic rainfall and pollutant inputs that the calibration pipeline consumes. Its responsibilities split cleanly:

- **Hydrology** — SCS Type II / Type I hyetographs or user-supplied IDF curves, parameterised by return period and duration. The same generator feeds both calibration (where it acts as a repeatable forcing) and scenario analysis (where it acts as the design event). `Source: [agentic_swmm/agent/swmm_runtime/design_storm.py:1-80]()`
- **Water quality** — Land-use based pollutant build-up / washoff definitions and dry-weather flow profiles are written next to the rainfall file so that a single `.inp` mutation covers both quantity and quality calibration in the same batch. This is the mechanism by which the agent can answer "calibrate against observed TSS at outfall X" without bespoke scripting. `Source: [agentic_swmm/agent/swmm_runtime/design_storm.py:80-160]()`
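As a hedged illustration of how a user-supplied IDF curve can be turned into a hyetograph, the sketch below uses the textbook alternating block method with the common empirical form `i = a / (t + b)**c`. This is a swapped-in technique chosen for brevity, not necessarily what `design_storm.py` implements, and the coefficients `a`, `b`, `c` are placeholders rather than values from the project.

```python
from collections import deque


def idf_intensity(duration_min, a, b, c):
    """Rainfall intensity in mm/h for a duration in minutes (empirical IDF form)."""
    return a / (duration_min + b) ** c


def alternating_block_hyetograph(total_min, step_min, a, b, c):
    """Return rainfall depths (mm) per time step, peaked near the centre."""
    n = total_min // step_min
    # Cumulative depth (mm) for durations step, 2*step, ..., n*step.
    cumulative = [
        idf_intensity(k * step_min, a, b, c) * (k * step_min) / 60.0
        for k in range(1, n + 1)
    ]
    # Incremental block depths, sorted so the largest comes first.
    blocks = [cumulative[0]] + [
        cumulative[k] - cumulative[k - 1] for k in range(1, n)
    ]
    blocks.sort(reverse=True)
    # Place the largest block first, then alternate right and left of it.
    hyetograph = deque()
    for i, depth in enumerate(blocks):
        if i % 2 == 0:
            hyetograph.append(depth)
        else:
            hyetograph.appendleft(depth)
    return list(hyetograph)
```

Because the incremental blocks telescope, the hyetograph total equals the IDF depth for the full storm duration, so the same coefficients always yield the same forcing.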

Because design storms and quality inputs are versioned inside the run directory, any audit dossier produced by the workflow can replay the exact forcing that produced a given sensitivity ranking — a property the project leans on heavily in its reproducibility claims. `Source: [agentic_swmm/agent/calibration_batch.py:80-140]()`

## Notes for users and integrators

- The agent-facing schemas in `swmm_calibration.py` and `swmm_uncertainty.py` are the only stable contract; the `swmm_runtime/` modules may evolve as long as the schemas do not.
- Resuming a campaign is supported by re-invoking the calibration tool with the same `runs/<date>/<id>/` path — the batch driver skips already-scored rows.
- Sensitivity results from `swmm_uncertainty.py` are the recommended input to a subsequent calibration refinement pass, narrowing the prior bounds before the next sampling round.
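
The resume behaviour noted above can be sketched as a comparison between the plan and the scores file. The `row_id` column and the `rows` key are hypothetical names used only for this illustration; the real batch driver may track completion differently.

```python
import csv
import json
from pathlib import Path


def pending_rows(run_dir):
    """Return plan rows that have no entry in scores.csv yet (assumed layout)."""
    run_dir = Path(run_dir)
    plan = json.loads((run_dir / "plan.json").read_text())
    scores_path = run_dir / "scores.csv"
    scored = set()
    if scores_path.exists():
        with scores_path.open(newline="") as fh:
            # Hypothetical convention: each scored row records its plan row_id.
            scored = {int(row["row_id"]) for row in csv.DictReader(fh)}
    return [row for row in plan["rows"] if row["row_id"] not in scored]
```

Re-invoking a campaign then amounts to executing only what `pending_rows` returns, so interrupted work is not repeated.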

---

<a id='page-6'></a>

## GIS, Climate, Plot, Report, and Review Skills

### Related Pages

Related topics: [SWMM Build, Run, and Network Synthesis](#page-4), [Calibration, Uncertainty, and Water Quality](#page-5)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [skills/swmm-gis/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-gis/SKILL.md)
- [skills/swmm-params/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-params/SKILL.md)
- [skills/swmm-climate/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-climate/SKILL.md)
- [skills/swmm-plot/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-plot/SKILL.md)
- [skills/swmm-report/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-report/SKILL.md)
- [skills/swmm-design-review/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/swmm-design-review/SKILL.md)
</details>

# GIS, Climate, Plot, Report, and Review Skills

The Agentic SWMM workflow exposes its capabilities as a set of modular **Agent Skills**, each backed by a `SKILL.md` contract under `skills/<name>/`. Five of these skills (**GIS, Climate, Plot, Report, and Design Review**) bracket the simulation: GIS and Climate prepare spatial and meteorological inputs, while Plot, Report, and Design Review turn raw SWMM runs into auditable spatial artifacts and human-readable deliverables. Together with `swmm-params`, they constitute the "interpret, contextualize, and present" layer of the agent runtime, whose outputs end up in the final `runs/<date>/<id>/` dossier.

## Purpose and Scope

Each skill is an atomic, agent-reachable unit with a narrow, well-defined responsibility. The Agent invokes them through the Model Context Protocol (MCP), passing typed arguments and receiving deterministic outputs written to the run directory. The five skills target distinct concerns:

- **`swmm-gis`** handles geospatial preparation: bounding-box ingestion, OSM/DEM fetches, subcatchment delineation, and storage of vector layers alongside the run.
- **`swmm-climate`** curates meteorological forcing (rainfall time series, design storms, and IDF inputs) and binds it to SWMM's `[TIMESERIES]` and `[RAINGAGES]` sections.
- **`swmm-plot`** renders hydrographs, network maps, and spatial overlays as static images or interactive artifacts for downstream review.
- **`swmm-report`** assembles the deterministic audit dossier, summarizing inputs, decisions, and outputs in a reproducible document.
- **`swmm-design-review`** applies engineering review rules over the assembled run, flagging capacity violations, surcharge, and guideline deviations.

This decomposition lets the Agent compose complex workflows from small, verifiable steps rather than monolithic scripts. Source: [skills/swmm-gis/SKILL.md:1-15](), [skills/swmm-climate/SKILL.md:1-12](), [skills/swmm-plot/SKILL.md:1-10]().

## Skill-by-Skill Architecture

### GIS and Climate: input preparation

The **GIS skill** acts as the spatial front door. Given a WGS84 bounding box, it orchestrates network synthesis (delegating to SWMManywhere as introduced in v0.7.1) and persists GeoJSON / shapefile artifacts inside the run directory for later plotting and review. It is the only skill that performs external fetches (OSM, DEM), and its output schema is consumed by every downstream skill that needs geometry. Source: [skills/swmm-gis/SKILL.md:18-44]().

The **Climate skill** mirrors this role for hydrometeorological inputs. It accepts observed hyetographs, synthetic design storms (introduced in v0.7.2), and user-supplied IDF curves, normalizing them into SWMM-compatible timeseries objects. Determinism is enforced by storing both the raw source and the canonicalized form. Source: [skills/swmm-climate/SKILL.md:20-52]().
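
To make the normalization target concrete, the sketch below renders a list of rainfall values as SWMM `[TIMESERIES]` rows (name, elapsed time, value) and a matching `[RAINGAGES]` row. The helper names and column spacing are assumptions for illustration, not the skill's actual writer.

```python
def timeseries_block(name, step_min, values):
    """Render values as [TIMESERIES] rows using elapsed H:MM times."""
    lines = ["[TIMESERIES]", ";;Name           Time       Value"]
    for i, value in enumerate(values):
        minutes = i * step_min
        stamp = f"{minutes // 60}:{minutes % 60:02d}"
        lines.append(f"{name:<16} {stamp:<10} {value:.3f}")
    return "\n".join(lines)


def raingage_block(gage, series, step_min, fmt="VOLUME"):
    """Render one [RAINGAGES] row that points at a named time series."""
    interval = f"{step_min // 60}:{step_min % 60:02d}"
    return "\n".join([
        "[RAINGAGES]",
        ";;Name  Format  Interval  SCF  Source",
        f"{gage}  {fmt}  {interval}  1.0  TIMESERIES {series}",
    ])


if __name__ == "__main__":
    print(timeseries_block("TS1", 5, [0.0, 2.5, 1.0]))
    print(raingage_block("RG1", "TS1", 5))
```

Formatting values with a fixed number of decimals is one simple way to keep the canonicalized form stable between runs.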

| Skill         | Primary Input        | Primary Output            | Consumed By                 |
|---------------|----------------------|---------------------------|-----------------------------|
| gis           | BBOX or geo file     | Vector layers + network   | plot, report, design-review |
| climate       | Hyetograph / IDF     | SWMM `[TIMESERIES]` block | report, design-review       |
| plot          | Run results + layers | PNG/HTML visualizations   | report, design-review       |
| report        | All run artifacts    | Audit dossier (PDF/HTML)  | End user                    |
| design-review | Run results + rules  | Findings list (JSON/MD)   | report, end user            |

### Plot, Report, and Design Review: output synthesis

The **Plot skill** translates numeric SWMM outputs into visual artifacts. It reads the binary `.out` results, pairs them with GIS layers from `swmm-gis`, and produces hydrographs, longitudinal profiles, and spatial network maps. All plots carry a deterministic seed and a run-id watermark so they are byte-reproducible across machines. Source: [skills/swmm-plot/SKILL.md:14-39]().

The **Report skill** is the integrator. It walks the run directory, aggregates the artifacts produced by the upstream skills (GIS layers, climate inputs, plots, model parameters, simulation logs), and emits the deterministic audit dossier — the same dossier referenced in the v0.7.1 release as "a deterministic audit dossier … rendered alongside a spatial network map." Source: [skills/swmm-report/SKILL.md:10-58]().

The **Design Review skill** applies engineering judgement. It loads the simulation results together with configurable review rules (capacity, freeboard, surcharge depth, flooding extent) and produces a structured findings document. When the v0.7.2 release mentions "agent-reachable calibration & sensitivity," review is the companion step that interprets calibrated outputs against design intent. Source: [skills/swmm-design-review/SKILL.md:12-47]().

## Workflow and Composition

The skills are designed to be invoked sequentially or selectively, depending on the Agent's plan. A typical end-to-end flow follows the data dependency chain: **params → gis → climate → (simulate) → plot → design-review → report**, where `gis` and `climate` depend only on `params` and can run in either order. Each step writes its outputs into the standard `runs/<date>/<id>/` layout, so any step can be re-run independently without side effects on the others, a property that builds on the byte-reproducibility hardening introduced in v0.6.4. Source: [skills/swmm-report/SKILL.md:30-45]().

```mermaid
flowchart LR
    A[swmm-params] --> B[swmm-gis]
    A --> C[swmm-climate]
    B --> D[SWMM Sim]
    C --> D
    D --> E[swmm-plot]
    D --> F[swmm-design-review]
    B --> E
    E --> G[swmm-report]
    F --> G
    C --> G
```

This composition model lets the Agent skip steps when artifacts already exist (for example, re-rendering plots without re-synthesizing the network) while keeping the final dossier deterministic. Source: [skills/swmm-plot/SKILL.md:22-28](), [skills/swmm-design-review/SKILL.md:30-36]().
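
The dependency chain in the flowchart above can be expressed as a small graph and ordered with the standard library's `graphlib`. The step name `swmm-sim` and the `already_done` argument are assumptions introduced for this sketch; how the Agent actually detects existing artifacts is not shown.

```python
from graphlib import TopologicalSorter

# Predecessors of each step, transcribed from the flowchart above.
DEPENDS_ON = {
    "swmm-gis": {"swmm-params"},
    "swmm-climate": {"swmm-params"},
    "swmm-sim": {"swmm-gis", "swmm-climate"},
    "swmm-plot": {"swmm-sim", "swmm-gis"},
    "swmm-design-review": {"swmm-sim"},
    "swmm-report": {"swmm-plot", "swmm-design-review", "swmm-climate"},
}


def plan_steps(already_done=frozenset()):
    """Return the steps still to run, in an order that respects dependencies."""
    order = TopologicalSorter(DEPENDS_ON).static_order()
    return [step for step in order if step not in already_done]


if __name__ == "__main__":
    # Re-render plots and the report without re-synthesizing the network.
    done = {"swmm-params", "swmm-gis", "swmm-climate", "swmm-sim"}
    print(plan_steps(done))
```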

## Cross-cutting Properties

All five skills share three properties that distinguish them from ordinary scripts:

1. **Idempotency** — invoking a skill twice with identical inputs produces byte-identical outputs, enabling CI verification.
2. **Typed contracts** — each `SKILL.md` declares input/output schemas consumed by the MCP tool surface introduced across the v0.7.x line.
3. **Run-scoped I/O** — no skill writes outside `runs/<date>/<id>/`, preserving the audit trail that `swmm-report` later assembles.

These properties are what allow the Agent to treat the skills as reliable building blocks for complex, multi-step stormwater investigations. Source: [skills/swmm-gis/SKILL.md:48-60](), [skills/swmm-climate/SKILL.md:55-66](), [skills/swmm-report/SKILL.md:62-70]().
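
Run-scoped I/O can be enforced with a simple path guard. The following is a minimal sketch under the assumption that every skill resolves its output paths through one helper; the function name is hypothetical and the project's own enforcement may work differently.

```python
from pathlib import Path


def resolve_in_run(run_dir, relative):
    """Return an absolute path inside run_dir, rejecting anything that escapes it."""
    run_dir = Path(run_dir).resolve()
    target = (run_dir / relative).resolve()
    # Path.is_relative_to is available from Python 3.9 onward.
    if not target.is_relative_to(run_dir):
        raise ValueError(f"{relative!r} resolves outside the run directory")
    return target
```

Resolving before comparing means that `..` segments and symlinks cannot be used to write outside the run folder.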

---

<a id='page-7'></a>

## Modelling Memory and Learning Layer

### Related Pages

Related topics: [Audit, Provenance, and Verification](#page-8), [Operations, Workflows, and Common Failure Modes](#page-12)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/memory/jsonl_store.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/jsonl_store.py)
- [agentic_swmm/memory/session_db.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/session_db.py)
- [agentic_swmm/memory/session_repair.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/session_repair.py)
- [agentic_swmm/memory/recall.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/recall.py)
- [agentic_swmm/memory/parametric_memory.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/parametric_memory.py)
- [agentic_swmm/memory/lessons_lifecycle.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/lessons_lifecycle.py)
</details>

# Modelling Memory and Learning Layer

## Overview and Purpose

The Modelling Memory and Learning Layer is the persistence-and-feedback subsystem that lets the agentic SWMM agent accumulate experience across runs. It captures what the agent saw, decided, and produced for each modelling session, then surfaces that history during future invocations so that recurring stormwater modelling tasks become faster, cheaper, and more consistent.

The layer was promoted to a first-class subsystem in **v0.7.0** (release title: "Modeling memory, agent runtime, install UX"), which anchored the memory module as a sibling to the agent runtime and installer. The v0.7.0a2 pre-release introduced `aiswmm memory repair-sessions` together with stronger integrity checks under `aiswmm doctor`, and v0.7.2 later added observability hooks for memory.

Source: [CHANGELOG entry for v0.7.0](https://github.com/Zhonghao1995/agentic-swmm-workflow/releases/tag/v0.7.0)
Source: [CHANGELOG entry for v0.7.0a2](https://github.com/Zhonghao1995/agentic-swmm-workflow/releases/tag/v0.7.0a2)

The subsystem is organised under `agentic_swmm/memory/` and is composed of three responsibilities: durable storage, retrieval, and lifecycle/curation.

## Component Architecture

The layer is split into cooperating modules with narrowly scoped responsibilities.

- `jsonl_store.py` — append-only JSON Lines writer/reader used as the lowest-level event log. JSONL keeps writes crash-safe and streamable, which matters for the long-lived run directories produced by the workflow. Source: [agentic_swmm/memory/jsonl_store.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/jsonl_store.py)
- `session_db.py` — session-oriented adapter that groups JSONL events into a queryable per-session record (sessions are the unit of recall for the agent). Source: [agentic_swmm/memory/session_db.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/session_db.py)
- `session_repair.py` — repair utility backing the `aiswmm memory repair-sessions` verb; it scans the sessions store and restores integrity where events are missing or truncated. Source: [agentic_swmm/memory/session_repair.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/session_repair.py)
- `recall.py` — retrieval API the agent calls to fetch prior relevant context before planning a new run. Source: [agentic_swmm/memory/recall.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/recall.py)
- `parametric_memory.py` — stores tuned model parameters (roughness, subcatchment widths, control rules, calibration coefficients) so they can be re-applied to similar catchments later. Source: [agentic_swmm/memory/parametric_memory.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/parametric_memory.py)
- `lessons_lifecycle.py` — promotes/demotes durable "lessons learned" between short-term, working, and long-term tiers and exposes the curation hooks. Source: [agentic_swmm/memory/lessons_lifecycle.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/lessons_lifecycle.py)

A canonical data flow is:

```mermaid
flowchart LR
  A[Run agent] --> B[session_db.py]
  B --> C[jsonl_store.py]
  C --> D[(sessions/ events)]
  D --> E[recall.py]
  E --> A
  D --> F[parametric_memory.py]
  D --> G[lessons_lifecycle.py]
  D --> H[session_repair.py]
  H --> D
```

## Storage Tiers and Integrity Model

Two storage tiers cooperate:

1. **Event log tier**: `jsonl_store.py` writes each agent action and observation as one JSON object per line. The append-only shape makes partial writes recoverable and aligns with the byte-reproducibility story introduced in v0.6.4.
2. **Sessions tier**: `session_db.py` materialises per-session records on top of the event log. This is the store that `aiswmm doctor` reports on (since v0.7.0a2, with the explicit states `ok / corrupt / unreadable / absent`); when the integrity check fails, the doctor exits with a non-zero code. Source: [v0.7.0a2 changelog](https://github.com/Zhonghao1995/agentic-swmm-workflow/releases/tag/v0.7.0a2)
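
The two tiers can be sketched with a few standard-library functions: an append-only writer, a reader that tolerates a truncated tail, and a grouping step that builds per-session records. The `session_id` field and the function names are assumptions for illustration, not the API of `jsonl_store.py` or `session_db.py`.

```python
import json
from collections import defaultdict
from pathlib import Path


def append_event(path, event):
    """Append one event as a single JSON line (append-only log)."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def read_events(path):
    """Yield events in order, stopping at the first line that fails to parse.

    A crash mid-write can only damage the tail, so the prefix stays usable.
    """
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                break


def group_by_session(events):
    """Group events into per-session lists keyed by a hypothetical session_id."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["session_id"]].append(event)
    return dict(sessions)
```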

When the doctor flags `corrupt` or `unreadable`, `session_repair.py` is the supported recovery path. The CLI surface added in v0.7.0a2 — `aiswmm memory repair-sessions` — invokes that module so that an operator can fix the store without rebuilding it by hand. Source: [agentic_swmm/memory/session_repair.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/session_repair.py)
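
A hedged sketch of how the four states might be derived is shown below. It assumes, purely for illustration, that the sessions store is a single JSONL file; the real layout and the checks performed by `aiswmm doctor` are not documented here and may be stricter.

```python
import json
from pathlib import Path


def sessions_state(path):
    """Classify a store as 'absent', 'unreadable', 'corrupt', or 'ok'."""
    path = Path(path)
    if not path.exists():
        return "absent"
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return "unreadable"
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError:
            return "corrupt"
    return "ok"


def exit_code(state):
    """Non-zero only for the states that indicate possible data loss."""
    return 1 if state in {"corrupt", "unreadable"} else 0
```

Treating `absent` as a zero exit keeps a fresh install from failing CI, while `corrupt` and `unreadable` stop the pipeline before empty results are produced.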

## Learning Surfaces: Parametric Memory and Lessons

The "learning" half of the layer is delivered through two complementary stores:

- **Parametric memory** persists concrete model parameters — calibrated roughness values, subcatchment geometry, control rule settings, hydrology choices — keyed by catchment signature so they can be proposed as defaults on similar new runs. Source: [agentic_swmm/memory/parametric_memory.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/parametric_memory.py)
- **Lessons** are qualitative observations produced after a run (e.g. "this subcatchment saturates control rule X during 10-yr events") that are curated across sessions by `lessons_lifecycle.py`. Lifecycle management ensures stale or contradicted lessons can be demoted and superseded.

Both stores reach the agent exclusively through `recall.py`, which is invoked at the start of a workflow to inject prior context into the prompt and at the end of a workflow to record new observations. Source: [agentic_swmm/memory/recall.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/memory/recall.py)

## Operations and CLI Surface

The layer is operated through the `aiswmm memory` verb family, with integrity reporting exposed through `aiswmm doctor`:

| Verb | Purpose | Backed by |
| --- | --- | --- |
| `aiswmm memory repair-sessions` | Recovers the sessions store after corruption | `session_repair.py` |
| `aiswmm doctor` (sessions row) | Reports sessions store state (`ok / corrupt / unreadable / absent`) and exits non-zero on failure | `session_db.py` + `jsonl_store.py` |

Memory observability improvements shipped in v0.7.2 alongside the calibration/sensitivity skills. Source: [v0.7.2 release notes](https://github.com/Zhonghao1995/agentic-swmm-workflow/releases/tag/v0.7.2)

## Summary

The Modelling Memory and Learning Layer transforms the agent from a stateless workflow runner into a cumulative system. Append-only JSONL gives durable, reproducible event capture; the sessions tier turns those events into queryable units that the doctor can validate; parametric memory and lessons close the loop by feeding prior experience back into new runs. Repair and observability operations added in the 0.7.x line make the layer safe to run unattended in CI while still giving operators an explicit recovery path.

---

<a id='page-8'></a>

## Audit, Provenance, and Verification

### Related Pages

Related topics: [Modelling Memory and Learning Layer](#page-7), [Operations, Workflows, and Common Failure Modes](#page-12)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/audit/provenance_v1_2.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/audit/provenance_v1_2.py)
- [agentic_swmm/audit/run_folder_layout.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/audit/run_folder_layout.py)
- [agentic_swmm/audit/chat_note.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/audit/chat_note.py)
- [agentic_swmm/audit/llm_calls.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/audit/llm_calls.py)
- [agentic_swmm/agent/swmm_runtime/compare.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/agent/swmm_runtime/compare.py)
- [docs/experiment-audit-framework.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/experiment-audit-framework.md)
</details>

# Audit, Provenance, and Verification

The audit subsystem in `agentic-swmm-workflow` is the connective tissue that turns an agent-driven stormwater modelling session into a defensible scientific artifact. Every natural-language request, tool invocation, SWMM run, and comparison step is captured into a deterministic dossier so that a reviewer (or another agent) can reconstruct what happened, on which data, and with which tool versions. This is the auditable-and-reproducible contract advertised in the v0.7.2 release notes and the companion *AI for Engineering* paper. Source: [docs/experiment-audit-framework.md]()

## Goals and Scope

The audit module serves three intertwined goals:

1. **Provenance** — record which inputs (OSM extracts, DEM tiles, rainfall series, parameter files) produced a given `runs/<date>/<id>/` artifact, together with the toolchain version that processed them. Source: [agentic_swmm/audit/provenance_v1_2.py]()
2. **Trace** — log the conversational and tool-call sequence the agent followed, including prompt text, model identifiers, and structured tool arguments, so the reasoning path is inspectable. Source: [agentic_swmm/audit/chat_note.py](), [agentic_swmm/audit/llm_calls.py]()
3. **Verification** — provide deterministic comparison primitives that detect drift between two SWMM runs (baseline vs. candidate) and surface divergences in a way a reviewer can audit. Source: [agentic_swmm/agent/swmm_runtime/compare.py]()

Scope is deliberately bounded to *one run folder at a time*. Cross-run queries (searching past runs, comparing sessions, memory repair) live in the `memory` subsystem; the audit subsystem only guarantees that any single run is self-contained and replayable. This boundary is enforced by the standard `runs/<date>/<id>/` layout contract.

## Run Folder Layout — the Audit Container

Every agent invocation that produces a model result lands under `runs/<date>/<id>/`. The layout itself is a first-class object: `agentic_swmm/audit/run_folder_layout.py` exposes helpers that construct paths, enforce the schema, and refuse to write into a folder that already has a different `provenance.json`. The folder always carries the same minimum set of artifacts:

| Artifact | Purpose | Producer |
|---|---|---|
| `provenance.json` (v1.2) | Toolchain version, input hashes, seed values, SWMManywhere / SWMM versions | `provenance_v1_2.py` |
| `chat_log.jsonl` | Turn-by-turn user/assistant messages | `chat_note.py` |
| `llm_calls.jsonl` | Each provider call: prompt, response, tokens, latency | `llm_calls.py` |
| `inp/<model>.inp` | The exact SWMM input file that was executed | SWMM runtime |
| `rpt/<model>.rpt` | The SWMM report | SWMM runtime |
| `audit/summary.md` | Human-readable dossier synthesised at run end | audit framework |

Source: [agentic_swmm/audit/run_folder_layout.py](), [agentic_swmm/audit/provenance_v1_2.py]()

The folder is *immutable once sealed*: closing a run writes a `SEALED` marker that subsequent tools check before overwriting. This is what makes the v0.6.4 byte-reproducibility claim tractable — a reviewer can hash every file under the folder and compare against a re-execution that consumes the same `provenance.json`.
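
The sealing behaviour can be sketched as a marker-file check performed before every write. The helper names and the exception type are hypothetical; only the `SEALED` marker itself comes from the description above.

```python
from pathlib import Path


class SealedRunError(RuntimeError):
    """Raised when a tool tries to modify a sealed run folder."""


def seal(run_dir):
    """Mark the run folder as closed by creating the SEALED marker."""
    (Path(run_dir) / "SEALED").touch()


def write_artifact(run_dir, relative, data):
    """Write bytes into the run folder unless it has already been sealed."""
    run_dir = Path(run_dir)
    if (run_dir / "SEALED").exists():
        raise SealedRunError(f"{run_dir} is sealed; start a new run instead")
    target = run_dir / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```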

## Provenance Capture (v1.2)

`provenance_v1_2.py` defines the canonical provenance schema. It is deliberately versioned (`v1_2`) so older dossiers can still be parsed while the schema evolves. The module captures:

- The `aiswmm` package version and the commit hash of the installed checkout.
- Versions of synthesiser dependencies (SWMManywhere, pyswmm, swmm-toolkit) and the SWMM engine binary.
- SHA-256 hashes of every external input file (OSM PBF, DEM GeoTIFF, rainfall CSV, calibration targets).
- RNG seeds and any explicit non-determinism flags the agent set.

Because the provenance file lists input hashes *before* the run executes, the audit dossier is forward-checkable: re-running with the same provenance must reproduce byte-identical SWMM outputs (modulo explicitly flagged non-determinism). Source: [agentic_swmm/audit/provenance_v1_2.py]()
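
The input-hashing step can be illustrated with `hashlib`. The dictionary keys below (`schema`, `versions`, `seeds`, `inputs`) are assumed names for this sketch and should not be read as the actual v1.2 schema.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_provenance(inputs, seeds, versions):
    """Assemble an illustrative provenance record before the run executes."""
    return {
        "schema": "1.2",
        "versions": versions,
        "seeds": seeds,
        "inputs": {path: sha256_file(path) for path in sorted(map(str, inputs))},
    }


def inputs_match(provenance):
    """Check that every recorded input still hashes to the recorded digest."""
    return all(
        sha256_file(path) == digest
        for path, digest in provenance["inputs"].items()
    )
```

A re-execution would call `inputs_match` first, so a changed rainfall file or DEM tile is detected before any SWMM output is compared.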

## Conversational and Tool-Call Trace

Two parallel logs keep the reasoning path visible:

- `chat_note.py` records the human-facing dialog: user prompts, assistant responses, slash-command expansions, and any clarifications the agent asked for. Records are append-only JSONL so a partial read still yields a coherent prefix.
- `llm_calls.py` records the *underlying* provider calls: model id, prompt hash, completion text, token counts, latency, and whether the call succeeded, retried, or was rejected by the policy layer. This separation matters because the chat log shows *what the user saw*, while the LLM log shows *what the model actually computed* — including retries and tool-use loops the user never directly observed.

Source: [agentic_swmm/audit/chat_note.py](), [agentic_swmm/audit/llm_calls.py]()

## Verification via Run Comparison

Verification is the "did the model behave as expected?" half of the framework. `agentic_swmm/agent/swmm_runtime/compare.py` provides deterministic comparators that take two run folders (typically a sealed baseline and a fresh candidate) and emit a structured diff:

- **INP diff** — section-by-section comparison of the SWMM input files, useful when a calibration skill rewrote subcatchment parameters.
- **RPT diff** — numerical comparison of node flooding, link surcharge, and outfall totals against tolerances declared in the provenance.
- **Timeseries diff** — node and link time series compared at sample points with an absolute and relative tolerance, summarising max-abs-error and RMSE.

The comparator never silently passes: any tolerance breach is surfaced as a non-empty diff object and the run is marked `VERIFY_FAIL` in the audit dossier. This is what makes the *audit dossier* claim from v0.7.1 meaningful — reviewers do not need to trust the agent's self-report, they re-verify against the provenance and the comparator output. Source: [agentic_swmm/agent/swmm_runtime/compare.py]()
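
A minimal sketch of the timeseries comparison follows. The combined tolerance rule (`abs_tol + rel_tol * |baseline|`), the default tolerances, and the `VERIFY_PASS` label are assumptions for illustration; only `VERIFY_FAIL`, max-abs-error, and RMSE are named in the description above.

```python
import math


def compare_series(baseline, candidate, abs_tol=1e-6, rel_tol=1e-4):
    """Compare two equally sampled series and summarise their divergence."""
    if len(baseline) != len(candidate):
        raise ValueError("series must have the same number of samples")
    errors = [abs(b - c) for b, c in zip(baseline, candidate)]
    # A sample breaches when its error exceeds the combined tolerance.
    breaches = [
        i for i, (b, err) in enumerate(zip(baseline, errors))
        if err > abs_tol + rel_tol * abs(b)
    ]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors)) if errors else 0.0
    return {
        "max_abs_error": max(errors, default=0.0),
        "rmse": rmse,
        "breaches": breaches,
        "status": "VERIFY_FAIL" if breaches else "VERIFY_PASS",
    }
```

Returning the breach indices alongside the summary statistics gives a reviewer the exact samples to inspect rather than a bare pass/fail flag.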

## End-to-End Audit Workflow

```mermaid
flowchart LR
    A[User NL request] --> B[chat_note append]
    B --> C[Agent plan]
    C --> D[llm_calls log]
    C --> E[Tool invocations]
    E --> F[provenance_v1_2 snapshot]
    F --> G[SWMM run in runs/date/id/]
    G --> H[compare vs baseline]
    H --> I[Sealed audit dossier]
    I --> J[Reviewer re-verifies]
```

The audit layer never blocks the agent loop; it observes and persists. That separation is what allows the framework to remain useful both for live calibration sessions (where the agent iterates rapidly) and for the formal reproducibility study released alongside v0.6.4.

## Operational Notes

- `aiswmm doctor` includes a row for the audit store and reports corruption separately from absence, so a missing folder is benign while a partially-written `provenance.json` is not.
- Once a run is `SEALED`, the audit framework treats it as read-only. Re-running the same provenance is permitted and is the supported reproducibility path; mutating a sealed folder is not.
- The audit dossier is the canonical artifact for downstream memory ingestion: the memory subsystem reads provenance and the comparison summary, not the raw LLM transcript, when indexing a session.

Source: [docs/experiment-audit-framework.md](), [agentic_swmm/audit/run_folder_layout.py](), [agentic_swmm/audit/provenance_v1_2.py]()

---

<a id='page-9'></a>

## Installation Paths and Runtime Configuration

### Related Pages

Related topics: [CLI Commands and Doctor Diagnostics](#page-10), [Operations, Workflows, and Common Failure Modes](#page-12)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [docs/installation.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/installation.md)
- [docs/runtime-install-options.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/runtime-install-options.md)
- [docs/api-key-configuration.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/api-key-configuration.md)
- [docs/llm_providers.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/llm_providers.md)
- [scripts/install.sh](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/scripts/install.sh)
- [scripts/install.ps1](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/scripts/install.ps1)
</details>

# Installation Paths and Runtime Configuration

Agentic SWMM is distributed through multiple parallel install paths that converge on the same `aiswmm` CLI entry point. Choosing between them is a trade-off between reproducibility, isolation, and ease of first-run setup, while runtime configuration layers on top of every install path to select the LLM provider and authenticate it.

## Supported Install Paths

The project exposes three primary installation surfaces, each targeting a different user persona.

### 1. PyPI (recommended for most users)

The canonical install is `pip install aiswmm`, which resolves to the current stable release on PyPI. Pre-release channels are available for users who want to dogfood upcoming features:

```bash
pip install aiswmm              # latest stable (e.g. v0.7.x)
pip install aiswmm==0.7.0a1     # pin a specific alpha
pip install --pre aiswmm        # any pre-release
pip install "aiswmm[claude]"    # add the optional Claude Agent SDK provider
```

The optional `[claude]` extra pulls in the Claude Agent SDK provider used by skills that require it. `pip install aiswmm` without extras still installs the core CLI and default providers. Source: [docs/installation.md:1-30]()

### 2. One-line installers (fresh-machine provisioning)

For a brand-new machine that does not yet have the toolchain, the repository ships two scripts: `scripts/install.sh` (Linux/macOS) and `scripts/install.ps1` (Windows). The v0.7.3 release reworked both scripts so that a single command provisions the entire toolchain — Python, pip, the package itself, and supporting tools — and both platforms default to the latest published release. Source: [scripts/install.sh:1-20]()

On Windows, the PowerShell installer explicitly clones into `%LOCALAPPDATA%` rather than the write-protected `C:\Windows\System32`, and sets a process-scope `ExecutionPolicy Bypass` so the clone step itself can run without a prior machine-wide policy change. Source: [scripts/install.ps1:1-40]()

### 3. Docker image (byte-reproducible)

Auto-built Docker images are published alongside every tagged release, primarily to support the byte-level reproducibility claims made in the companion *AI for Engineering* paper. v0.6.4 introduced pinned-dependency Docker builds so that two researchers pulling the same image tag produce identical byte streams for the same input. The Docker path is the recommended choice when audit dossier hashing matters more than iteration speed. Source: [docs/installation.md:45-70](), [docs/runtime-install-options.md:10-25]()

## Version Pinning Semantics

Stable releases follow PEP 440 strictly: `pip install aiswmm` always resolves to the latest non-pre-release tag. Alpha and beta channels (`a1`, `a2`, `b1`) are opt-in only and never shadow the stable default. For example, throughout the 0.7.0a* series `pip install aiswmm` continued to deliver v0.6.4, and every v0.6.4 pin (PyPI / Git tag / Docker image) remained immutable. Source: [docs/installation.md:75-95]()
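
The default-resolution rule can be illustrated with a deliberately simplified version filter. The regular expression below covers only plain releases and `aN` / `bN` / `rcN` suffixes; it is not a full PEP 440 parser, and the helper names are hypothetical.

```python
import re

# Simplified pattern: release segments plus an optional a/b/rc pre-release tag.
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)((?:a|b|rc)\d+)?$")


def parse(version):
    """Split a version into (release tuple, pre-release tag or None)."""
    match = _VERSION.match(version)
    if not match:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    return release, match.group(2)


def latest_stable(versions):
    """Return the highest version without a pre-release tag, or None."""
    stable = [v for v in versions if parse(v)[1] is None]
    return max(stable, key=lambda v: parse(v)[0]) if stable else None


if __name__ == "__main__":
    # Mirrors the 0.7.0a* example: alphas do not shadow the stable release.
    print(latest_stable(["0.6.3", "0.6.4", "0.7.0a1", "0.7.0a2"]))
```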

This lets production deployments pin a specific version with full confidence that no implicit upgrade will land during a reproducibility-sensitive run, while contributors can opt into pre-releases without poisoning downstream installs.

## Runtime Configuration

Once installed, the runtime is configured through a combination of environment variables and an on-disk sessions store. The most common configuration surface is the LLM provider selection.

### Provider selection

The CLI accepts a `--provider` flag (or `AISWMM_PROVIDER` environment variable) that selects which model backend drives the agent loop. Supported providers are documented in `docs/llm_providers.md`, with the Claude Agent SDK provider only available when the `[claude]` extra was installed at install time. Source: [docs/llm_providers.md:1-40]()

### API key configuration

Each provider requires its own credential, supplied via environment variables such as `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `AISWMM_*` equivalents documented in `docs/api-key-configuration.md`. Keys are read at process start; rotating a key requires restarting any long-running agent. The API-key doc explicitly warns against committing keys to the repository and recommends a local `.env` file excluded by `.gitignore`. Source: [docs/api-key-configuration.md:1-50]()
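
A hedged sketch of how provider selection and key lookup could fit together is shown below. The provider identifiers `openai` and `anthropic`, the precedence of the flag over `AISWMM_PROVIDER`, and the function name are assumptions for illustration; consult `docs/llm_providers.md` and `docs/api-key-configuration.md` for the actual names.

```python
import os

# Hypothetical mapping from provider identifier to its credential variable.
KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def load_provider_config(cli_provider=None, env=None):
    """Resolve the provider and its API key once, at process start."""
    env = os.environ if env is None else env
    # Assumed precedence: an explicit flag wins over the environment variable.
    provider = cli_provider or env.get("AISWMM_PROVIDER")
    if not provider:
        raise RuntimeError("no provider selected")
    key_var = KEY_VARS.get(provider)
    if key_var is None:
        raise RuntimeError(f"unknown provider: {provider}")
    api_key = env.get(key_var)
    if not api_key:
        raise RuntimeError(f"{key_var} is not set")
    return {"provider": provider, "api_key_var": key_var, "api_key": api_key}
```

Because the lookup happens once, a rotated key only takes effect after the process restarts, which matches the behaviour described above.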

### Health and integrity checks

`aiswmm doctor` inspects the local install and reports per-subsystem status. v0.7.0a2 added a `Sessions store` row that performs a full integrity check and returns one of four states: `ok`, `corrupt`, `unreadable`, or `absent`. The exit code is non-zero for `corrupt` and `unreadable`, so CI pipelines can fail loudly on data-loss conditions rather than silently producing empty results. A companion verb, `aiswmm memory repair-sessions`, was introduced at the same time to attempt recovery from a corrupt store. Source: [docs/runtime-install-options.md:40-70]()

## Choosing Between Paths

The table below summarises the trade-offs. It is the only table on this page and is intended as a quick-reference decision aid.

| Path | Best for | Reproducibility | First-run friction |
|---|---|---|---|
| `pip install aiswmm` | Day-to-day users, CI | High when pinned | Low |
| One-line installer | Fresh laptops, onboarding | High when pinned | Lowest (single command) |
| Docker image | Audited runs, paper reproduction | Byte-level | Medium (Docker required) |
| Pre-release (`--pre`) | Contributors, dogfooders | Lower until promoted | Low |

For most users, the recommended path is `pip install aiswmm` on a machine that already has Python 3.10+, falling back to the one-line installer on a fresh machine and to Docker only when an audit dossier needs byte-identical reproduction. Runtime configuration (provider + API key) is identical across all of these paths because the CLI reads configuration from the same environment regardless of how the package was installed. Source: [docs/installation.md:100-120](), [scripts/install.sh:1-20](), [scripts/install.ps1:1-40](), [docs/llm_providers.md:1-40](), [docs/api-key-configuration.md:1-50]()

---

<a id='page-10'></a>

## CLI Commands and Doctor Diagnostics

### Related Pages

Related topics: [Installation Paths and Runtime Configuration](#page-9), [Modelling Memory and Learning Layer](#page-7)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [agentic_swmm/cli.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/cli.py)
- [agentic_swmm/commands/doctor.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/commands/doctor.py)
- [agentic_swmm/commands/memory.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/commands/memory.py)
- [agentic_swmm/commands/memory_health.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/commands/memory_health.py)
- [agentic_swmm/commands/expert/__init__.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/commands/expert/__init__.py)
- [agentic_swmm/diagnostics/doctor_report.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/diagnostics/doctor_report.py)
</details>

# CLI Commands and Doctor Diagnostics

The `aiswmm` command-line interface is the user-facing surface of the Agentic SWMM workflow. It exposes a verb-first command tree built around three concerns: invoking the agent runtime, inspecting the persisted "modelling memory" store, and running environment diagnostics. The `aiswmm doctor` subcommand is the canonical health probe, and CI pipelines rely on its exit code to decide whether the local toolchain is trustworthy enough to produce a reproducible run.

## CLI Architecture and Command Tree

The CLI entry point is implemented in `agentic_swmm/cli.py`, which dispatches to grouped submodules under `agentic_swmm/commands/`. The top-level verbs reflect the operational lifecycle of a modelling session: `run` (start the agent), `expert` (ask the expert skill group), `memory` (manage the on-disk session store), and `doctor` (verify environment and store integrity) (Source: [agentic_swmm/cli.py:1-80]()).

Subcommands are organized as thin parsers that delegate to focused implementation modules. For example, `aiswmm memory` is routed to `agentic_swmm/commands/memory.py`, while its observability verbs live alongside it in `agentic_swmm/commands/memory_health.py` so that reporting and mutation logic stay close (Source: [agentic_swmm/commands/memory.py:1-60]()).

The `expert` group aggregates the domain-specific skills surfaced to the agent (calibration, sensitivity, design storms, network synthesis, etc.). It is registered as a namespace through `agentic_swmm/commands/expert/__init__.py`, which keeps the skill roster isolated from the runtime verbs (Source: [agentic_swmm/commands/expert/__init__.py:1-40]()).

This separation — root dispatcher, per-verb module, per-skill namespace — makes the CLI predictable for both human operators and the MCP/Agent SDK wrappers that invoke it programmatically.
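The verb-first layout can be sketched with `argparse` subparsers. This is only an illustration of the dispatch shape described above: the real `agentic_swmm/cli.py` may use a different parsing library, and the help strings and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    """Build a root parser with one subparser per top-level verb."""
    parser = argparse.ArgumentParser(prog="aiswmm")
    verbs = parser.add_subparsers(dest="verb", required=True)

    verbs.add_parser("run", help="start the agent")
    verbs.add_parser("expert", help="ask the expert skill group")
    verbs.add_parser("doctor", help="verify environment and store integrity")

    # `memory` carries its own nested verbs, e.g. `memory repair-sessions`.
    memory = verbs.add_parser("memory", help="manage the on-disk session store")
    memory_verbs = memory.add_subparsers(dest="memory_verb", required=True)
    memory_verbs.add_parser("repair-sessions", help="attempt recovery of a corrupt store")
    return parser


args = build_parser().parse_args(["memory", "repair-sessions"])
print(args.verb, args.memory_verb)  # memory repair-sessions
```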

## The `aiswmm doctor` Diagnostic

`aiswmm doctor` walks a fixed checklist of subsystems and prints a status table. Each row is rendered through `agentic_swmm/diagnostics/doctor_report.py`, which normalizes per-check results into a stable, machine-parseable format suitable for CI logs (Source: [agentic_swmm/diagnostics/doctor_report.py:1-120]()).

The check list is defined in `agentic_swmm/commands/doctor.py` and currently includes the **Sessions store** as a first-class row. The store check reports one of four explicit states, which were introduced in v0.7.0a2 to give CI an unambiguous signal (Source: [agentic_swmm/commands/doctor.py:1-100]()).

| State | Meaning | Exit-code impact |
|---|---|---|
| `ok` | Sessions store readable, schema valid | 0 |
| `absent` | No sessions directory yet (fresh install) | 0 |
| `unreadable` | Directory exists but cannot be opened | non-zero |
| `corrupt` | Records present but fail integrity validation | non-zero |

A non-zero exit on `corrupt` or `unreadable` is the mechanism by which CI catches data-loss conditions before they propagate into a published run. This was a deliberate hardening in the v0.7.0a2 release (Source: [agentic_swmm/diagnostics/doctor_report.py:40-90]()).
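The state table can be encoded as a small enum plus an exit-code mapping. This is a sketch of the contract, not the project's implementation: the source only says "non-zero", so the value `1` and the names `SessionsStoreState` and `exit_code_for` are assumptions.

```python
from enum import Enum


class SessionsStoreState(Enum):
    OK = "ok"
    ABSENT = "absent"
    UNREADABLE = "unreadable"
    CORRUPT = "corrupt"


# States that should make a CI job fail.
FAILING_STATES = frozenset(
    {SessionsStoreState.UNREADABLE, SessionsStoreState.CORRUPT}
)


def exit_code_for(state):
    """Map a store state to a process exit code (1 is an assumed placeholder)."""
    return 1 if state in FAILING_STATES else 0
```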

## Memory Health and Repair Verbs

Because `aiswmm doctor` is read-only, mutation of a degraded store is exposed through a parallel `aiswmm memory repair-sessions` verb. This lets operators diagnose with `doctor`, repair in a separate auditable step, and re-verify with `doctor` — the same three-step pattern used in database operations (Source: [agentic_swmm/commands/memory.py:60-140]()).

`agentic_swmm/commands/memory_health.py` contains the observability helpers that the doctor report calls into, keeping health-introspection logic out of the mutation path. Splitting reads from writes is what allows `doctor` to remain a safe, idempotent probe that can run on every CI job (Source: [agentic_swmm/commands/memory_health.py:1-80]()).
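To make the read-only probe idea concrete, here is a minimal sketch of a classifier that never writes to the store. The on-disk format is not described in the source, so the assumption that the store is a directory of `*.json` files, and the function name `classify_store`, are purely illustrative.

```python
import json
import os
from pathlib import Path


def classify_store(path):
    """Return 'absent', 'unreadable', 'corrupt', or 'ok' without modifying anything."""
    path = Path(path)
    if not path.exists():
        return "absent"
    try:
        # os.listdir raises OSError if the directory cannot be opened.
        for name in sorted(os.listdir(path)):
            if name.endswith(".json"):
                json.loads((path / name).read_text(encoding="utf-8"))
    except OSError:
        return "unreadable"
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        return "corrupt"
    return "ok"
```

Because the function only lists and reads files, calling it repeatedly leaves the store unchanged, which is the property that makes a probe safe to run on every CI job.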

## Operational Patterns and Version Notes

The recommended operator loop is therefore:

1. `aiswmm doctor` — confirm `ok` or `absent` before starting a run.
2. If `corrupt` / `unreadable`, run `aiswmm memory repair-sessions` (added in v0.7.0a2).
3. Re-run `aiswmm doctor` to confirm recovery.
4. Only then invoke the agent runtime.

This loop was promoted to stable in v0.7.0, which folded the v0.7.0a1/a2 prereleases — including the doctor exit-code change and the repair verb — into the default `pip install aiswmm` line (Source: [agentic_swmm/cli.py:40-120]()).
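The diagnose, repair, re-verify loop can be scripted as follows. The sketch assumes `aiswmm` is on `PATH` and relies only on the exit-code contract described above; note that a non-zero `doctor` exit could in principle come from a subsystem other than the sessions store, which this sketch does not distinguish. The helper names are hypothetical.

```python
import subprocess


def run_cli(args):
    """Run an aiswmm command and return its exit code."""
    return subprocess.run(["aiswmm", *args]).returncode


def preflight(run=run_cli):
    """Diagnose, repair once if needed, then re-verify.

    Returns True when it is safe to invoke the agent runtime.
    """
    if run(["doctor"]) == 0:
        return True
    run(["memory", "repair-sessions"])
    return run(["doctor"]) == 0
```

Passing the runner as a parameter keeps the loop testable without a real install, and limiting the repair to a single attempt avoids masking a persistent fault.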

For users on older lines, the relevant pin notes are documented in the v0.7.0a1 and v0.7.0a2 release notes: `pip install aiswmm==0.7.0a1` for the alpha, or `pip install --pre aiswmm` for any prerelease. Byte-reproducibility pinning guidance lives separately in the v0.6.4 release notes and is orthogonal to the doctor diagnostics contract (Source: [agentic_swmm/diagnostics/doctor_report.py:1-60]()).

## Summary

- `aiswmm` exposes a verb-first CLI organised into `run`, `expert`, `memory`, and `doctor` groups.
- `aiswmm doctor` is the canonical CI health probe and emits one of four explicit `Sessions store` states.
- `corrupt` and `unreadable` return non-zero exit codes, enabling CI to gate reproducible runs.
- `aiswmm memory repair-sessions` is the sanctioned repair verb, kept separate from the read-only doctor path.
- The full lifecycle — diagnose, repair, re-verify — became stable in v0.7.0 and is the supported operator workflow today.

---

<a id='page-11'></a>

## Extensibility, Agent Runtimes, and Cross-Project Integration

### Related Pages

Related topics: [MCP Servers and Skill Layer](#page-3), [Installation Paths and Runtime Configuration](#page-9)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [skills/skill-author/SKILL.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/skills/skill-author/SKILL.md)
- [agentic_swmm/integrations/swmmcanada_runner.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/integrations/swmmcanada_runner.py)
- [agentic_swmm/integrations/inp_source.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/agentic_swmm/integrations/inp_source.py)
- [docs/codex-runtime.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/codex-runtime.md)
- [docs/openclaw-execution-path.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/openclaw-execution-path.md)
- [integrations/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/integrations/README.md)
</details>

# Extensibility, Agent Runtimes, and Cross-Project Integration

The `agentic-swmm-workflow` repository treats stormwater modeling as an auditable agent workflow rather than a monolithic CLI. Its extensibility story rests on three orthogonal axes: (1) a **skill system** that lets third parties package new modeling capabilities as discoverable markdown bundles, (2) a **pluggable agent runtime** that swaps the LLM harness behind a stable typed-tool interface, and (3) a thin **integration layer** that bridges external SWMM-related projects (SWMManywhere, SWMM Canada, custom `.inp` sources) into the canonical `runs/<date>/<id>/` layout. Together these axes let the same agent drive an end-to-end workflow whether the user talks to it through Codex, OpenClaw, or the Claude Agent SDK, while external projects contribute data and runners without forking the core.

## Skill System: Authoring New Capabilities

Skills are the unit of capability extension. Each skill lives under `skills/<name>/SKILL.md` and follows a front-matter contract that the loader parses at runtime. The `skill-author` skill is itself a skill whose purpose is to scaffold new ones, closing the loop on extensibility.

The contract enforced by `skills/skill-author/SKILL.md` covers scope, inputs, outputs, and an invocation example. A new skill declares its triggers, the verbs it adds to the agent's toolkit, and the artefacts it materialises in the run directory. Because the agent indexes skills at startup, adding a directory under `skills/` is sufficient to make a capability available — no core code changes are required. This pattern is what enables the v0.7.2 release notes to advertise "three new skills" without re-architecting the dispatcher. Source: [skills/skill-author/SKILL.md]().
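Directory-based discovery of this kind can be sketched in a few lines. The real loader and its front-matter schema are not shown in the source, so the simple `key: value` parser and the names `read_front_matter` and `discover_skills` are assumptions; a production loader would more likely use a YAML parser.

```python
from pathlib import Path


def read_front_matter(text):
    """Parse simple 'key: value' lines between leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root):
    """Map each <root>/<name>/SKILL.md to its parsed front matter."""
    skills = {}
    for skill_file in sorted(Path(root).glob("*/SKILL.md")):
        skills[skill_file.parent.name] = read_front_matter(
            skill_file.read_text(encoding="utf-8")
        )
    return skills
```

Under this scheme, adding a new directory with a `SKILL.md` is enough for the next `discover_skills("skills")` call to pick it up.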

## Agent Runtimes: Codex, OpenClaw, and the Claude Agent SDK

The runtime layer is intentionally thin so that different agent harnesses can host the same skills and tools. Two runtimes are documented as first-class execution paths.

**Codex runtime.** The Codex adapter (`docs/codex-runtime.md`) describes how the agent is driven inside an OpenAI Codex-style loop. It maps the core typed tools onto Codex's tool-calling schema, preserves the deterministic run-layout convention, and forwards the audit dossier generation unchanged. The runtime is responsible only for I/O and credential plumbing; reasoning, planning, and tool selection remain with the model.

**OpenClaw execution path.** `docs/openclaw-execution-path.md` documents a second harness with different concurrency and tool-isolation guarantees. It exposes the same verbs to the agent but runs them through OpenClaw's executor, which is relevant for users who need stricter process isolation between SWMM runs and the LLM control plane.

The Claude Agent SDK is delivered as an optional extra: `pip install "aiswmm[claude]==..."`. The PyPI extras mechanism keeps the default install lightweight while still allowing the same skill bundle to be hosted on Anthropic's SDK. The v0.7.0a1 release notes reference the same pattern: the Claude provider is gated behind an extra so that `pip install aiswmm` stays on v0.6.x for users who have not opted into the 0.7.x line. Source: [docs/codex-runtime.md](), [docs/openclaw-execution-path.md]().

Because each runtime consumes the same typed-tool surface, a workflow authored against one runtime is portable across the others; the skill markdown is the source of truth, not the harness.

## Cross-Project Integration Layer

External SWMM-adjacent projects are integrated through two narrow Python modules rather than through forking or vendoring.

**`agentic_swmm/integrations/inp_source.py`** is the abstraction for *where an `.inp` file comes from*. It normalises several upstream sources — local files, generated networks, and synthesised drainage from SWMManywhere — into a single object the rest of the pipeline can consume. The v0.7.1 release notes describe the headline use case: a single natural-language sentence referring to a WGS84 bounding box is enough to drive synthesis via SWMManywhere (Imperial College) and produce a valid `.inp` without the user supplying one. Source: [agentic_swmm/integrations/inp_source.py]().
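Before a bounding box is handed to network synthesis, it is worth validating it. The sketch below is a hypothetical value type, not part of `inp_source.py`; the field order (`min_lon, min_lat, max_lon, max_lat`) is an assumption, and boxes that cross the antimeridian are deliberately rejected for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """WGS84 bounding box in decimal degrees (assumed field order)."""

    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def __post_init__(self):
        if not (-180.0 <= self.min_lon < self.max_lon <= 180.0):
            raise ValueError("need -180 <= min_lon < max_lon <= 180")
        if not (-90.0 <= self.min_lat < self.max_lat <= 90.0):
            raise ValueError("need -90 <= min_lat < max_lat <= 90")


# Illustrative coordinates only.
box = BoundingBox(min_lon=-0.2, min_lat=51.4, max_lon=0.1, max_lat=51.6)
```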

**`agentic_swmm/integrations/swmmcanada_runner.py`** wraps the SWMM Canada runner, allowing Canadian datasets and configuration conventions to flow through the same audit and dossier pipeline as native runs. It exists because the canonical run layout is the integration contract: any runner that can deposit a `.inp`, an `.rpt`, and an `.out` under `runs/<date>/<id>/` participates in the workflow without further coupling.
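The run-layout contract lends itself to a simple conformance check. The sketch below assumes the three artefacts sit directly inside the run directory rather than in subfolders; that placement and the helper name `missing_artifacts` are assumptions.

```python
from pathlib import Path

# File types named by the run-layout contract.
REQUIRED_SUFFIXES = (".inp", ".rpt", ".out")


def missing_artifacts(run_dir):
    """Return the required suffixes that have no file in run_dir."""
    present = {p.suffix.lower() for p in Path(run_dir).iterdir() if p.is_file()}
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]
```

An external runner could call this at the end of its run and treat a non-empty result as a failed hand-off.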

The integrations directory also hosts a human-readable entry point: `integrations/README.md` lists supported external projects and their status, providing the discovery surface users actually navigate. Source: [integrations/README.md]().

## How the Three Axes Compose

The following sequence shows how a user request travels through the extensibility layers in a typical session:

```mermaid
flowchart LR
  U[User prompt] --> R{Runtime}
  R -->|Codex| A1[Codex adapter]
  R -->|OpenClaw| A2[OpenClaw executor]
  R -->|Claude SDK| A3[Claude adapter]
  A1 --> S[Skill loader<br/>skills/*/SKILL.md]
  A2 --> S
  A3 --> S
  S --> T[Typed tools]
  T --> I{Integration}
  I --> P1[inp_source.py]
  I --> P2[swmmcanada_runner.py]
  I --> P3[Custom runner]
  P1 --> L[runs/&lt;date&gt;/&lt;id&gt;/]
  P2 --> L
  P3 --> L
  L --> D[Audit dossier + map]
```

A new capability typically arrives in one of three ways: a contributor drops a new `SKILL.md` (skill axis), a downstream user installs a different runtime extra (runtime axis), or an upstream project ships a runner that conforms to the run-layout contract (integration axis). The byte-reproducibility guarantees introduced in v0.6.4 apply across all three because the deterministic layout is enforced below the extensibility surface, not above it.

## Practical Guidance for Extending the Project

- To add a capability, write a new `skills/<verb>/SKILL.md` and follow the contract in `skills/skill-author/SKILL.md`; no Python changes are required.
- To target a different LLM harness, implement the typed-tool adapter described in `docs/codex-runtime.md` or `docs/openclaw-execution-path.md` and gate any heavy dependency behind a PyPI extra.
- To plug in an external data source, return an `.inp` (directly or through `inp_source.py`) and place runner outputs under `runs/<date>/<id>/`; consult `integrations/README.md` for the supported matrix and conventions.

This separation — skills as capability, runtimes as harness, integrations as data/runners — is what makes the workflow auditable, reproducible, and open to community contribution without compromising the deterministic core.

---

<a id='page-12'></a>

## Operations, Workflows, and Common Failure Modes

### Related Pages

Related topics: [Modelling Memory and Learning Layer](#page-7), [Installation Paths and Runtime Configuration](#page-9), [CLI Commands and Doctor Diagnostics](#page-10)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [examples/tecnopolo/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/examples/tecnopolo/README.md)
- [examples/calibration/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/examples/calibration/README.md)
- [examples/tuflow-swmm-module03/README.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/examples/tuflow-swmm-module03/README.md)
- [docs/byte-identical-reproducibility.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/byte-identical-reproducibility.md)
- [docs/swmm-anywhere-quickstart.md](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/docs/swmm-anywhere-quickstart.md)
- [scripts/acceptance/run_acceptance.py](https://github.com/Zhonghao1995/agentic-swmm-workflow/blob/main/scripts/acceptance/run_acceptance.py)
</details>

# Operations, Workflows, and Common Failure Modes

## Scope and Purpose

The `agentic-swmm-workflow` project wraps stormwater modelling around an agent runtime that accepts natural-language intents and dispatches typed tools against SWMM (Storm Water Management Model). From an operator's perspective, three concerns dominate: how a run is initiated and laid out on disk, how the agent reasons across calibration, sensitivity, and design-storm steps, and how the system surfaces or recovers from failure states. This page consolidates the operational surface exposed by the `aiswmm` CLI, the canonical workflow patterns demonstrated in the example folders, and the failure modes the maintainers have explicitly hardened against since v0.6.x. Source: [README.md:1-40]().

## CLI Operations and Run Layout

Day-to-day operations are anchored on the `aiswmm` command and the `runs/<date>/<id>/` directory convention. Every workflow produces a deterministic folder whose contents can be re-hashed and compared across machines. The CLI exposes verbs such as `aiswmm doctor` and `aiswmm memory repair-sessions` that are explicitly intended for operational health checks rather than modelling work. Source: [scripts/acceptance/run_acceptance.py:1-80]().

The two most important operational verbs are:

- **`aiswmm doctor`**: runs an integrity sweep that prints a `Sessions store` row reporting one of `ok`, `corrupt`, `unreadable`, or `absent`. Exit codes are non-zero for `corrupt` and `unreadable`, so CI pipelines and cron jobs can fail fast when the session store degrades. Source: [README.md:40-90]().
- **`aiswmm memory repair-sessions`**: a maintenance verb introduced alongside the `Sessions store` check for recovering or rebuilding the sessions index without dropping user data. Source: [README.md:90-140]().

Installation itself is treated as an operational concern. v0.7.3 reworked the one-line installers so a single command provisions the toolchain on a fresh machine, with the Windows path cloning into `%LOCALAPPDATA%` rather than `C:\Windows\System32` and applying a process-scope `ExecutionPolicy Bypass`. Defaulting to the latest published release keeps unattended installs predictable. Source: [README.md:1-40]().

## Canonical Workflows

The example folders document three recurring patterns that operators should recognise.

**Natural-language end-to-end run.** A single sentence that names a WGS84 bounding box is sufficient to synthesise the drainage network from public OSM and DEM data via SWMManywhere (Imperial College), execute SWMM, write a deterministic audit dossier, and render a spatial network map — all written under `runs/<date>/<id>/`. This is the workflow targeted by the v0.7.1 release notes. Source: [docs/swmm-anywhere-quickstart.md:1-60]().

**Calibration and sensitivity.** The calibration example pairs observed hydrographs with SWMM parameters and exercises the agent-reachable calibration and sensitivity tools introduced in v0.7.2 alongside the *AI for Engineering* paper. Operators typically invoke a sequence of `run → calibrate → sensitivity → design-storm` steps, each leaving an audit trail in the same run folder. Source: [examples/calibration/README.md:1-60]().

**Coupled 1D/2D modelling.** The TUFLOW ↔ SWMM coupling in `examples/tuflow-swmm-module03/` illustrates a multi-engine workflow in which the agent orchestrates boundary exchange between TUFLOW and SWMM rather than running either in isolation. Source: [examples/tuflow-swmm-module03/README.md:1-80]().

For repeatability, the project pins dependencies and ships auto-built Docker images; v0.6.4 introduced this hardening so that re-running the same intent on a clean container reproduces byte-identical outputs. Source: [docs/byte-identical-reproducibility.md:1-80]().
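One way to compare run folders across machines is to hash every file in a stable order. The sketch below is a generic SHA-256 directory digest, not the project's audit-dossier hashing scheme; the function name `hash_run_dir` is hypothetical.

```python
import hashlib
from pathlib import Path


def hash_run_dir(run_dir):
    """Return a SHA-256 digest over relative paths and file contents."""
    root = Path(run_dir)
    # Sort by POSIX-style relative path so ordering is identical on every OS.
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        data = path.read_bytes()
        digest.update(rel.encode("utf-8") + b"\0")
        # A length prefix keeps adjacent files from blurring into each other.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two runs are byte-identical under this check when `hash_run_dir` returns the same hex string for both folders.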

## Common Failure Modes

Operational incidents fall into a small, well-characterised set. The table below maps each mode to its observable signal and the documented recovery path.

| Failure mode | Observable signal | Recovery / mitigation |
|---|---|---|
| Sessions store corruption | `aiswmm doctor` reports `corrupt` (exit ≠ 0) | Run `aiswmm memory repair-sessions`; re-run `doctor` to confirm `ok`. Source: [README.md:40-90]() |
| Sessions store unreadable | `doctor` reports `unreadable` (exit ≠ 0) | Inspect file permissions on the sessions store path; rerun `doctor`. Source: [README.md:40-90]() |
| Installer writes to a protected directory | Clone or `pip install` fails under `C:\Windows\System32` on Windows | Use the v0.7.3 one-line installer, which targets `%LOCALAPPDATA%` and applies `ExecutionPolicy Bypass`. Source: [README.md:1-40]() |
| Dependency drift across machines | SWMM outputs differ between machines despite identical intent | Pin via `pip install aiswmm==<version>` or use the auto-built Docker images documented for byte-reproducibility. Source: [docs/byte-identical-reproducibility.md:1-80]() |
| Pre-release confusion | `pip install aiswmm` returns a stable build while a newer pre-release exists | Use `pip install --pre aiswmm` or pin explicitly (`aiswmm==0.7.0a1`). Source: [README.md:1-40]() |
| Missing example inputs | Workflow aborts mid-synthesis because OSM/DEM tiles are unavailable | Pre-stage inputs under the example folder or run inside the published Docker image. Source: [examples/tecnopolo/README.md:1-60]() |

The acceptance runner at `scripts/acceptance/run_acceptance.py` codifies these failure paths into executable checks, which is why CI in the repository can fail builds on a `corrupt` or `unreadable` sessions state rather than silently passing. Source: [scripts/acceptance/run_acceptance.py:1-80]().
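For the dependency-drift row in particular, a batch script can verify the installed version before starting a run. This sketch uses the standard-library `importlib.metadata` module; the helper names and the idea of wiring it into a batch script are assumptions, not part of the acceptance runner.

```python
from importlib import metadata


def installed_version(dist="aiswmm"):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_pin(expected, lookup=installed_version):
    """Raise if the installed version differs from the expected pin."""
    found = lookup()
    if found != expected:
        raise RuntimeError(f"expected aiswmm=={expected}, found {found}")
    return found
```

Calling `check_pin("0.6.4")` at the top of a reproducibility-sensitive job turns silent drift into an immediate, explicit failure.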

## Observability and Memory

Memory operations are first-class verbs because the agent re-uses prior runs to keep calibration histories and design-storm results coherent. v0.7.0 promoted "modelling memory" to a stable feature, and v0.7.0a2 added the safer `repair-sessions` verb specifically to avoid destructive maintenance steps. Operators should treat `doctor` as the canonical pre-flight check before any batch run and treat `memory repair-sessions` as an interactive recovery action rather than a routine step. Source: [README.md:40-140]().

---

## Pitfall Log

Project: Zhonghao1995/agentic-swmm-workflow

Summary: Found 8 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: runtime_trace
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Repro command: `docker run --rm -v "$PWD/runs:/app/runs" ghcr.io/zhonghao1995/agentic-swmm-workflow:v0.7.4 acceptance`
- Evidence: identity.distribution | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 2. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 3. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 4. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 5. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 7. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

<!-- canonical_name: Zhonghao1995/agentic-swmm-workflow; human_manual_source: deepwiki_human_wiki -->
