agentic-swmm-workflow Manual

Doramagic Project Pack · Human Manual

agentic-swmm-workflow

Agentic SWMM is an automated, auditable, and memory-informed framework for reproducible stormwater modelling. It integrates QGIS and EPA SWMM through the aiswmm runtime, reusable Skills, and MCP interfaces, and adds QA verification, provenance tracking, calibration support, and compatibility with Codex, Hermes, Claude Code, and OpenClaw.

Project Overview and Goals

Related topics: Agent Runtime and Orchestration, Extensibility, Agent Runtimes, and Cross-Project Integration

Purpose and Scope

Agentic SWMM is an auditable, reproducible stormwater-modelling workflow that pairs EPA SWMM with an agent runtime and the Model Context Protocol (MCP). The project is published under the PyPI distribution name aiswmm and is described in its companion paper *"Agentic SWMM: Auditable and Reproducible Stormwater Modelling Workflow with Agent Skills and Model Context Protocol"* in *AI for Engineering* (DOI: 10.3390/aieng1010005). Source: README.md:1-40 Source: CITATION.cff:1-20

The package's stated mission is to collapse the gap between a natural-language request and a deterministic, byte-stable SWMM run. A single sentence that names a WGS84 bounding box can drive the full pipeline: synthesise the drainage network from public OSM and DEM data, execute SWMM, write an audit dossier, and render a spatial network map, all deposited under a canonical runs/<date>/<id>/ layout. Source: CHANGELOG.md:1-30 Source: README.md:42-90

The codebase is organised so that every run is inspectable after the fact. Hashes, configuration snapshots, prompts, tool calls, and SWMM I/O are written alongside the model, making each artefact independently verifiable. Source: docs/index.md:1-60

Core Capabilities

The v0.7.x line consolidates six user-visible capability groups:

Capability	Description
Natural-language synthesis	One sentence + bounding box → end-to-end SWMM run
Agent skills & MCP tools	Typed-tool surface exposed to the runtime
Calibration & sensitivity	Agent-reachable parameter sweeps and fits
Design storms	Synthetic event generation for planning studies
Memory observability	`aiswmm doctor` and `aiswmm memory repair-sessions`
Install UX	One-line installers for Windows, macOS, and Linux

Source: CHANGELOG.md:30-120 Source: src/aiswmm/cli.py:1-80

A single CLI entry point, aiswmm, is registered through pyproject.toml and dispatches to verbs such as doctor, memory, and skill invocations. The Claude Agent SDK is offered as an optional extra via pip install "aiswmm[claude]==…", isolating the heavy agent dependency from the default install. Source: pyproject.toml:1-80 Source: src/aiswmm/__init__.py:1-40

Architecture and Workflow

flowchart LR
    A[NL request + WGS84 bbox] --> B[Skill router]
    B --> C[SWMManywhere synthesis]
    C --> D[SWMM engine run]
    D --> E[Audit dossier]
    D --> F[Spatial network map]
    B -.tools.-> G[(MCP typed tools)]
    D --> H[(runs/date/id/)]
    E --> H
    F --> H

The user-facing flow starts with a free-form sentence; the skill router matches it against consolidated keyword tables and invokes the appropriate MCP tools. Network synthesis is delegated to SWMManywhere (Imperial College), and the resulting INP file is executed by an embedded SWMM binary. Determinism is enforced through pinned dependency versions and auto-built Docker images that are immutable once tagged. Source: CHANGELOG.md:120-200 Source: src/aiswmm/cli.py:80-160

Every run writes a self-describing directory so external reviewers can replay, diff, or audit a study without contacting the original analyst. Source: docs/index.md:60-140

Release Lineage and Stability Guarantees

The project follows a stable-track / pre-release split expressed through PEP 440 version identifiers. According to the release notes, pip install aiswmm resolves to the latest stable release, because pip skips pre-releases unless one is requested explicitly, while pip install aiswmm==0.7.0a1 (and similar alpha pins) locks to a specific pre-release. The v0.6.4 tag is the first non-pre-release on the 0.6.x line and underwrites the byte-reproducibility claims of the companion paper. Source: CHANGELOG.md:200-280 Source: pyproject.toml:80-140

Maintenance signals visible to operators include the aiswmm doctor health check, which classifies the Sessions store into ok / corrupt / unreadable / absent and exits non-zero on data-loss conditions so CI pipelines can catch regressions. The companion aiswmm memory repair-sessions verb was introduced specifically to make memory stores self-healing rather than write-only. Source: CHANGELOG.md:280-360 Source: src/aiswmm/cli.py:160-240

In summary, the project positions itself as the reproducible, agent-driven control plane over a traditional hydrological-and-hydraulic simulator, with installation, observability, and determinism treated as first-class concerns rather than afterthoughts. Source: README.md:90-160 Source: docs/index.md:140-220

Source: https://github.com/Zhonghao1995/agentic-swmm-workflow / Human Manual

Agent Runtime and Orchestration

Related topics: MCP Servers and Skill Layer, Operations, Workflows, and Common Failure Modes

The Agent Runtime is the control plane of aiswmm. It sits between the user-facing REPL or programmatic entry points and the deterministic SWMM execution layer, deciding *what* the user wants, *how* to break that intent into typed tool calls, and *when* to persist state so that every modelling run is auditable and reproducible. v0.7.0 promoted the runtime to a stable, documented subsystem and v0.7.2 extended it to reach calibration and sensitivity tools.

Source: agentic_swmm/agent/runtime.py:1-40

High-Level Responsibilities

The runtime coordinates five concerns that the rest of the system depends on:

Intent grounding — converting a free-form natural-language request into a structured intent.
Planning — turning the intent into an ordered list of tool invocations and side-effects.
Dispatch — handing each step to a typed tool implementation registered in the tool registry.
Persistence — recording session state, run manifests, and audit artefacts in runs/<date>/<id>/.
Self-checks — surfacing environment diagnostics through aiswmm doctor.

Source: agentic_swmm/agent/runtime.py:42-110

Request Lifecycle

A single turn follows a predictable pipeline. The REPL captures input and forwards it to the runtime; the runtime never trusts raw text.

flowchart LR
    U[User input] --> R[REPL<br/>repl.py]
    R --> IC[Intent Classifier<br/>intent_classifier.py]
    IC --> P[Planner<br/>planner.py]
    P --> E[Executor<br/>executor.py]
    E --> TR[Tool Registry<br/>tool_registry.py]
    TR --> S[Session Store<br/>memory/sessions.py]
    E --> A[Audit artefacts<br/>runs/&lt;date&gt;/&lt;id&gt;/]
    A --> U

REPL intake — repl.py owns greeting, warm-intro caching (fixed in v0.6.2a1 to fire once per session rather than every greeting), and command parsing. It also dispatches non-agent verbs such as aiswmm doctor or aiswmm memory repair-sessions directly without invoking the runtime. Source: agentic_swmm/agent/repl.py:30-90
Intent classification — intent_classifier.py extracts the request type (network synthesis, calibration, sensitivity, design storm, etc.) and any constraints (bounding box, return period, output path). As of v0.6.3a1, the keyword-matching logic was consolidated into this single module instead of being scattered across six callers. Source: agentic_swmm/agent/intent_classifier.py:1-70
Planning — planner.py produces an ordered, typed step list. Each step references a tool name and pre-validated arguments. The planner is the only component that decides ordering; downstream consumers treat the plan as immutable. Source: agentic_swmm/agent/planner.py:25-95
Execution — executor.py walks the plan, resolves tools through the registry, executes them in a sandboxed subprocess for SWMM calls, and streams results back to the REPL. It is also responsible for failure handling and run-dossier generation. Source: agentic_swmm/agent/executor.py:1-80
Tool resolution — tool_registry.py is a typed catalogue. v0.7.2 introduced three new skills — calibration, sensitivity, and design storms — which had to be registered here before the planner could reach them. Source: agentic_swmm/agent/tool_registry.py:1-60
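
The lifecycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the keyword table, the `PLANS` mapping, the `Step` class, and the tool names `write_dossier` and `render_map` are hypothetical, and only the classify → plan → execute ordering is taken from the text.

```python
from dataclasses import dataclass

# Hypothetical keyword table; the real one lives in intent_classifier.py.
INTENT_KEYWORDS = {
    "calibration": ("calibrate", "calibration"),
    "sensitivity": ("sensitivity",),
    "design_storm": ("design storm", "return period"),
    "network_synthesis": ("bounding box", "bbox", "synthesise"),
}

# Hypothetical plans: ordered tool names per intent.
PLANS = {
    "network_synthesis": (
        "synthesise_network", "run_simulation", "write_dossier", "render_map",
    ),
}


@dataclass(frozen=True)
class Step:
    """One immutable plan entry: a tool name plus pre-validated arguments."""
    tool: str
    args: tuple = ()  # (key, value) pairs, kept as a tuple so the step stays immutable


def classify(text: str) -> str:
    """Return the first intent whose keywords appear in the text, else 'unknown'."""
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "unknown"


def plan(intent: str, **shared_args) -> tuple:
    """Build an ordered, immutable step list; unknown intents yield an empty plan."""
    args = tuple(sorted(shared_args.items()))
    return tuple(Step(tool, args) for tool in PLANS.get(intent, ()))


def execute(steps, registry) -> list:
    """Walk the plan in order, resolving each tool through the registry."""
    results = []
    for step in steps:
        tool = registry[step.tool]  # KeyError: the plan names an unregistered tool
        results.append(tool(**dict(step.args)))
    return results
```

Freezing `Step` and returning tuples mirrors the statement that downstream consumers treat the plan as immutable; the executor only reads it.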

Session and Run Persistence

The runtime writes deterministic artefacts for every turn. The layout under runs/<date>/<id>/ is the canonical audit dossier referenced by the v0.7.1 release notes, produced by the sequence synthesise network → run SWMM → write dossier → render map. The sessions store is independently validated by aiswmm doctor, which reports one of four states (ok, corrupt, unreadable, absent) and exits non-zero on corrupt or unreadable so CI catches data-loss conditions. A companion aiswmm memory repair-sessions verb (v0.7.0a2) attempts to restore damaged stores before they are trusted by future runs.

Source: agentic_swmm/agent/runtime.py:140-210

Source: agentic_swmm/memory/sessions.py:45-120
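
The four-state health check can be illustrated with a short sketch. It assumes, purely for illustration, that the sessions store is a single JSON file; the on-disk format, the `StoreState` enum, and both function names are hypothetical. Only the four states and the non-zero exit on corrupt or unreadable come from the text above.

```python
import json
from enum import Enum
from pathlib import Path


class StoreState(Enum):
    OK = "ok"
    CORRUPT = "corrupt"
    UNREADABLE = "unreadable"
    ABSENT = "absent"


def classify_store(path: Path) -> StoreState:
    """Classify a sessions store (assumed here to be one JSON file)."""
    if not path.exists():
        return StoreState.ABSENT
    try:
        raw = path.read_bytes()
    except OSError:
        # Permission problems, or the path is a directory rather than a file.
        return StoreState.UNREADABLE
    try:
        json.loads(raw)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return StoreState.CORRUPT
    return StoreState.OK


def exit_code(state: StoreState) -> int:
    """Non-zero only for the data-loss states, so CI fails on them."""
    return 1 if state in (StoreState.CORRUPT, StoreState.UNREADABLE) else 0
```

Treating an absent store as a zero exit follows the text, which names only corrupt and unreadable as failure states; a fresh install would otherwise fail its first health check.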

Extension Points

Operators add capabilities by registering new tools, not by patching the runtime. The contract is:

Concern	Where to edit
New typed tool	Add an entry to `tool_registry.py`
New intent category	Extend `intent_classifier.py` keywords
New planning strategy	Subclass `planner.py`
New shell verb	Register in `cli/main.py` (routes around runtime)

This split is why v0.7.2 could ship three new skills without touching the runtime core — only the registry, the classifier, and (where needed) the planner needed updates.

Source: agentic_swmm/agent/tool_registry.py:80-140
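
As a rough picture of what "registering a tool" could look like, the sketch below uses a decorator-based registry with a simple argument check before dispatch. The decorator, the `call_tool` helper, and the `design_storm` stub are hypothetical and are not taken from tool_registry.py.

```python
import inspect
from typing import Any, Callable

_REGISTRY: dict[str, Callable[..., Any]] = {}


def register_tool(name: str):
    """Decorator that adds a function to the registry under a unique name."""
    def decorator(func):
        if name in _REGISTRY:
            raise ValueError(f"tool {name!r} is already registered")
        _REGISTRY[name] = func
        return func
    return decorator


def call_tool(name: str, **kwargs):
    """Validate arguments against the tool's signature, then invoke it."""
    func = _REGISTRY[name]
    # bind() raises TypeError on missing or unexpected arguments.
    bound = inspect.signature(func).bind(**kwargs)
    for param, value in bound.arguments.items():
        expected = func.__annotations__.get(param)
        # Strict check: an int passed where float is annotated is rejected.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(f"{name}.{param} expects {expected.__name__}")
    return func(**bound.arguments)


@register_tool("design_storm")
def design_storm(return_period_yr: int, duration_h: float) -> dict:
    """Hypothetical stub: echoes its validated arguments."""
    return {"return_period_yr": return_period_yr, "duration_h": duration_h}
```

With this shape, adding a capability is one decorated function and no change to the dispatch code, which is the property the extension contract above relies on.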

Operational Notes

The aiswmm CLI boots the runtime by default; cli/main.py is the only entry point that may short-circuit it for non-agent verbs.
Byte-level reproducibility (v0.6.4) depends on the runtime pinning environment variables before dispatch — do not bypass executor.py with ad-hoc subprocess calls.
If aiswmm doctor reports a sessions-store problem, run aiswmm memory repair-sessions *before* starting any modelling session, otherwise the planner may plan against an unreadable memory state.

Source: agentic_swmm/cli/main.py:1-70

MCP Servers and Skill Layer

Related topics: Agent Runtime and Orchestration, Extensibility, Agent Runtimes, and Cross-Project Integration

The agentic-swmm-workflow project exposes its stormwater-modelling capabilities to LLM agents through two cooperating layers: a Model Context Protocol (MCP) server layer that provides typed, JSON-RPC tools, and a Skill layer that packages higher-level procedures (prompts, scripts, and conventions) for the agent to invoke. Together they implement the agent-reachable surface of the workflow, which is the focus of the v0.7.2 release notes describing "agent-reachable calibration & sensitivity, three new skills, design storms, memory observability" Source: integrations/mcp/README.md:1-40.

1. MCP Server Layer

The integrations/mcp directory bundles three MCP servers, each implemented in Node.js as a JSON-RPC stdio process and scoped to a single modelling phase. Source: integrations/mcp/README.md:10-38

Server	Module Path	Responsibility
`swmm-runner`	`mcp/swmm-runner/server.js`	Executes a prepared `.inp` model, returns diagnostics, and writes outputs into the standard `runs/<date>/<id>/` layout.
`swmm-builder`	`mcp/swmm-builder/server.js`	Synthesises and edits SWMM input files (subcatchments, conduits, controls).
`swmm-network`	`mcp/swmm-network/server.js`	Generates the drainage topology from OSM + DEM via SWMManywhere and exports spatial artefacts.

Each server registers tools with explicit JSON Schemas so the agent receives parameter validation and typed responses. For example, swmm-runner exposes a run_simulation tool whose arguments include inp_path, duration_h, and output_dir; results carry the path to the .rpt and .out files plus a content hash, which downstream skills use to verify byte-reproducibility. Source: mcp/swmm-runner/server.js:40-120 Similarly, swmm-builder returns an updated .inp plus a diff summary rather than mutating state implicitly. Source: mcp/swmm-builder/server.js:55-90 The swmm-network server isolates external dependencies (OSMnx, rasterio, SWMManywhere) behind a single synthesise_network(bbox, out_dir) call. Source: mcp/swmm-network/server.js:30-80
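
For orientation, a client-side call to run_simulation could be framed as a JSON-RPC 2.0 `tools/call` request, which is the method MCP uses for tool invocation. The helper name and the argument values below are illustrative assumptions; the tool name and the three argument keys are the ones listed above.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique per connection


def tools_call_request(tool: str, arguments: dict) -> str:
    """Encode one MCP tool invocation as a single-line JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, which suits newline-delimited stdio framing.
    return json.dumps(message)


# Illustrative values only; real paths follow the runs/<date>/<id>/ layout.
request = tools_call_request(
    "run_simulation",
    {"inp_path": "model.inp", "duration_h": 6, "output_dir": "out"},
)
```

The server would validate `arguments` against the tool's JSON Schema before running anything, so a malformed call fails before SWMM is started.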

flowchart LR
    A[LLM Agent] -->|JSON-RPC| B(swmm-network)
    A -->|JSON-RPC| C(swmm-builder)
    A -->|JSON-RPC| D(swmm-runner)
    A -->|invoke| E[Skill Layer]
    E --> B
    E --> C
    E --> D

2. Skill Layer

Sitting one level above the MCP tools, skills are the "verbs" the agent is taught to recognise in natural language. Each skill is a markdown document under skills/<name>/SKILL.md that pairs a YAML front-matter manifest (name, description, allowed tools) with a prose procedure. Source: integrations/skills/README.md:5-45 The v0.7.2 release added three new skills covering calibration, sensitivity analysis, and design-storm generation. Source: skills/swmm-calibrate/SKILL.md:1-30

The flagship skill is swmm-end-to-end, which converts a single WGS84 bounding-box sentence into a complete run:

Call swmm-network.synthesise_network to obtain the topology. Source: skills/swmm-end-to-end/SKILL.md:20-35
Delegate subcatchment/conduit assembly to swmm-builder. Source: skills/swmm-end-to-end/SKILL.md:36-50
Hand the resulting .inp to swmm-runner.run_simulation. Source: skills/swmm-end-to-end/SKILL.md:51-68
Persist inputs, outputs, and the prompt audit trail under runs/<date>/<id>/. Source: skills/swmm-end-to-end/SKILL.md:69-82

Skills are deliberately declarative: they describe *what* the agent should do, while the MCP servers define *how* a step executes. This separation lets new procedures be added by dropping a markdown file into skills/, without modifying any JavaScript code. Source: integrations/skills/README.md:46-72
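
To make the SKILL.md shape concrete, the sketch below splits a skill document into its front-matter manifest and its prose procedure. It handles only flat `key: value` lines, so it is not a YAML parser, and the `allowed-tools` key name is an assumption: the source states that the manifest lists allowed tools but does not give the exact key.

```python
def split_front_matter(text: str):
    """Return (manifest, body) for a document that opens with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # no closing delimiter: treat everything as body
    manifest = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'key: value'
            manifest[key.strip()] = value.strip()
    return manifest, "\n".join(lines[end + 1:])


def allowed_tools(manifest: dict) -> list:
    """Split the (assumed) comma-separated 'allowed-tools' entry into names."""
    raw = manifest.get("allowed-tools", "")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

A loader built this way can check a skill's tool list against the registered MCP servers before the agent follows the procedure.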

3. Determinism, Auditing, and Error Surfaces

Because the paper-grade claims of the project depend on reproducible runs, both layers funnel every action through the runs/<date>/<id>/ layout and emit content hashes. The MCP servers are required to be stateless: long-lived memory (sessions, observations, repair events) lives in the memory module and is queried through the skill layer rather than cached in the servers themselves. Source: mcp/swmm-runner/server.js:1-30 When a tool call fails, the server returns a structured error object ({code, message, retryable}) that the skill procedure can pattern-match before deciding whether to abort, retry, or repair the session. Source: mcp/swmm-builder/server.js:140-170
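
A skill-side handler for the structured error object might look like the following sketch. The response envelope (a dict with an optional `error` key), the `ToolError` class, and the retry policy are assumptions; only the `{code, message, retryable}` fields come from the text.

```python
import time


class ToolError(Exception):
    """Raised when a tool call fails and should not, or can no longer, be retried."""

    def __init__(self, code: str, message: str, retryable: bool):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.retryable = retryable


def call_with_retry(call, max_attempts: int = 3, backoff_s: float = 0.0):
    """Invoke call() until it succeeds, retrying only errors flagged retryable."""
    for attempt in range(1, max_attempts + 1):
        response = call()
        error = response.get("error")
        if error is None:
            return response
        retryable = bool(error.get("retryable"))
        if not retryable or attempt == max_attempts:
            raise ToolError(error["code"], error["message"], retryable)
        time.sleep(backoff_s * attempt)  # linear backoff between attempts
```

Because the servers are stateless, repeating a retryable call is safe in this sketch; a non-retryable error surfaces immediately so the skill can decide whether to abort or repair the session.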

4. Operating the Stack

Installers introduced in v0.7.3 provision the full toolchain (Node, Python, SWMM, and the MCP servers) in one command; each installer defaults to the latest published release so the JSON-RPC contracts above stay aligned with what aiswmm dispatches at runtime. Source: integrations/mcp/README.md:75-95 Developers iterating on a single server can launch it in isolation with node mcp/swmm-runner/server.js and attach any MCP-compatible client (Claude Desktop, the aiswmm agent runtime, or mcp-cli) for live testing. Source: mcp/swmm-runner/server.js:200-230

In summary, the MCP servers expose narrow, validated primitives, the skill layer composes those primitives into user-visible procedures, and the v0.7.x line ensures the two stay deterministic, auditable, and installable end-to-end.

SWMM Build, Run, and Network Synthesis

Related topics: Calibration, Uncertainty, and Water Quality, GIS, Climate, Plot, Report, and Review Skills

Purpose and Scope

The SWMM Build, Run, and Network Synthesis subsystem is the execution core of the agentic-swmm-workflow. It transforms a natural-language description of a study area — typically a WGS84 bounding box — into a fully simulated EPA SWMM model whose inputs, outputs, and provenance are written to a deterministic on-disk layout under runs/<date>/<id>/. Three responsibilities are layered into this subsystem: (1) synthesising a drainage network from public OSM and DEM data via SWMManywhere, (2) launching the SWMM engine against the produced .inp file, and (3) capturing every artifact, manifest, and report into an auditable record.

As highlighted in the v0.7.1 release, a single sentence referring to a bounding box now drives the end-to-end workflow, and v0.7.2 layered agent-reachable calibration, sensitivity, and design-storm skills on top of that foundation. The runtime modules guarantee that the byte-level reproducibility claims made in the companion *AI for Engineering* paper are honoured at the filesystem level.

Pipeline Architecture

The pipeline follows a strict preflight → build → run → postflight sequence. Each stage is independently testable and emits a manifest fragment so that a failure can be attributed to a single phase rather than the whole run.

flowchart LR
    A[Bounding Box + Intent] --> B[preflight.py]
    B --> C[Network Synthesis<br/>SWMManywhere]
    C --> D[.inp generation<br/>inp_parsing.py]
    D --> E[SWMM Engine Run]
    E --> F[.rpt + .out capture<br/>rpt_summary.py]
    F --> G[postflight.py]
    G --> H[run_artifacts.py<br/>+ run_manifests.py]

Source: agentic_swmm/agent/swmm_runtime/preflight.py:1-40, agentic_swmm/agent/swmm_runtime/postflight.py:1-40, agentic_swmm/agent/swmm_runtime/run_artifacts.py:1-30
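
The idea of attributing a failure to a single phase can be sketched as a small stage runner. The `StageResult` record and `run_pipeline` function are hypothetical; only the stage order and the notion of one manifest fragment per stage are taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StageResult:
    """Manifest fragment emitted by one stage."""
    stage: str
    ok: bool
    detail: str = ""


def run_pipeline(stages: list, context: dict) -> list:
    """Run (name, callable) stages in order, stopping at the first failure."""
    fragments = []
    for name, stage in stages:
        try:
            stage(context)  # each stage reads and updates the shared context
        except Exception as exc:
            fragments.append(StageResult(name, False, f"{type(exc).__name__}: {exc}"))
            break  # later stages never run, so the failure is attributable
        fragments.append(StageResult(name, True))
    return fragments
```

The last fragment in the returned list names the failing stage, so a reviewer does not need the agent transcript to see where a run stopped.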

Network Synthesis (Build Phase)

The build phase produces a .inp file from open data. SWMManywhere — the Imperial College synthesis tool — is invoked with the bounding box and an optional subcatchment hint, returning a GeoPackage containing conduits, junctions, subcatchments, and outfalls. The agentic runtime then validates that the synthesised network is hydrologically plausible (minimum conduit count, presence of an outfall, CRS correctness) before writing the SWMM input file.

The resulting .inp is the single source of truth for the downstream simulation. inp_parsing.py provides structured readers so that agents and skills can inspect options, raingages, and subcatchment geometry without re-parsing the raw text — this is what makes the calibration, sensitivity, and design-storm skills introduced in v0.7.2 deterministic.

Source: agentic_swmm/agent/swmm_runtime/inp_parsing.py:1-50, agentic_swmm/agent/swmm_runtime/preflight.py:20-80

Run Execution

The run phase shells out to the EPA SWMM 5.x engine using a pinned binary or the official Docker image, with the byte-reproducibility guarantees introduced in v0.6.4 enforced through dependency pinning. preflight.py confirms the engine binary is reachable, that the working directory is writable, and that disk quotas are not exceeded before launching the simulation. postflight.py then re-reads the produced .rpt and .out files to verify that the engine reported zero continuity errors and that all expected sections are present in the report.

During execution, run_artifacts.py snapshots the input file, the synthesised GeoPackage, the engine version, and a SHA-256 digest of each artifact into the run directory. This snapshot is what makes the workflow auditable: a reviewer can later reconstruct the exact .inp and engine version that produced a given set of results.

Source: agentic_swmm/agent/swmm_runtime/preflight.py:40-120, agentic_swmm/agent/swmm_runtime/postflight.py:40-120, agentic_swmm/agent/swmm_runtime/run_artifacts.py:30-90
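
The digest snapshot that makes a run auditable can be approximated with the standard library alone. The function names and the relative-path keying are assumptions made for this sketch, not the behaviour of run_artifacts.py.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large .out files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(run_dir: Path) -> dict:
    """Map each file under run_dir (by POSIX-style relative path) to its digest."""
    return {
        path.relative_to(run_dir).as_posix(): sha256_file(path)
        for path in sorted(run_dir.rglob("*"))
        if path.is_file()
    }


def verify(run_dir: Path, recorded: dict) -> list:
    """Return the sorted names of files that were added, removed, or changed."""
    current = snapshot(run_dir)
    names = recorded.keys() | current.keys()
    return sorted(name for name in names if recorded.get(name) != current.get(name))
```

An empty list from `verify` means the run directory still matches the recorded digests byte for byte; any other result names exactly which artefacts drifted.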

Output Capture and Manifests

Once the engine terminates, rpt_summary.py distils the verbose .rpt file into a compact summary covering simulation duration, rainfall totals, peak runoff, and any warnings. run_manifests.py writes the deterministic manifest.json and audit.json files that index every artifact under runs/<date>/<id>/, satisfying the "deterministic audit dossier" requirement announced in v0.7.1.

The manifest schema is intentionally flat and human-readable, so that downstream skills — calibration, sensitivity, design storms — can locate predecessor artifacts without traversing the agent conversation history. This decoupling is what allows the modeling memory subsystem (introduced across the v0.7.0 series) to revisit a prior run as a black box and re-use its inputs.

Source: agentic_swmm/agent/swmm_runtime/rpt_summary.py:1-60, agentic_swmm/agent/swmm_runtime/run_manifests.py:1-80, agentic_swmm/agent/swmm_runtime/run_artifacts.py:60-140

Summary

The SWMM Build, Run, and Network Synthesis subsystem is the deterministic backbone of the agentic workflow. Network synthesis, runtime execution, and artifact capture are separated into independently verifiable phases so that the byte-reproducibility, auditability, and agent-reachability guarantees published alongside the v0.7.x line all hold at the filesystem level rather than only in documentation.

Calibration, Uncertainty, and Water Quality

Related topics: SWMM Build, Run, and Network Synthesis, Modelling Memory and Learning Layer

This page documents the agent-reachable surface of the Agentic SWMM workflow responsible for calibration, uncertainty quantification, and the design-storm / water-quality inputs that drive those analyses. Together these modules close the loop between a synthesised drainage network and a defensible, auditable model output. The v0.7.2 release notes describe this surface as the "agent-reachable calibration & sensitivity" wave, which is the canonical entry point for the components below. Source: CHANGELOG

Scope and high-level role

The calibration and uncertainty subsystem sits between the swmm_runtime layer (which executes EPA-SWMM simulations deterministically) and the tool_handlers layer (which exposes typed tools to the agent runtime). It is responsible for:

Preparing a deterministic plan of SWMM runs with perturbed parameters.
Executing that plan in batches so that long calibration campaigns can be resumed, retried, and audited.
Scoring each run against an objective (typically NSE, KGE, or PBIAS on observed flow/quality series).
Producing an uncertainty envelope and parameter-sensitivity ranking that can be re-consumed by the agent in subsequent turns.

The design-storm and water-quality components supply the forcings (rainfall hyetographs, pollutant loadings) that those calibration runs operate on, ensuring that the calibrated parameters are tied to a reproducible storm scenario rather than ad-hoc inputs.
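
The three objective functions named above have standard textbook definitions, sketched here in plain Python. This is not the project's scoring code; the PBIAS sign convention used (positive means the model underestimates) is one common choice, and the project may use the opposite sign.

```python
import math
from statistics import fmean, pstdev


def _check(obs, sim):
    if len(obs) != len(sim):
        raise ValueError("observed and simulated series must have equal length")


def nse(obs, sim) -> float:
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    _check(obs, sim)
    mean_obs = fmean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim) -> float:
    """Percent bias, 100 * sum(obs - sim) / sum(obs)."""
    _check(obs, sim)
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim) -> float:
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    _check(obs, sim)
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    std_obs, std_sim = pstdev(obs), pstdev(sim)
    cov = fmean((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

All three divide by a statistic of the series, so a constant observed series (or a constant simulated series for KGE) raises `ZeroDivisionError`; a production scorer would need to handle those cases explicitly.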

Module map

Module	Role	Consumes	Produces
`swmm_runtime/calibration_runner.py`	Executes one calibration iteration (one parameter set × one storm)	A `.inp` template, a parameter vector	A scored run artefact
`swmm_runtime/uncertainty_plan.py`	Builds the parameter-space sampling plan (Sobol / LHS / grid)	Parameter priors, bounds, sample count	A `plan.json` + run manifest
`swmm_runtime/design_storm.py`	Generates design-storm hyetographs (e.g. SCS Type II, IDF-based) and links pollutant land-use loading	AOI, return period, duration	`.dat` rainfall + quality inputs
`agent/calibration_batch.py`	Orchestrates many `calibration_runner` invocations; handles resume/retry/diff	`plan.json`, observed series	Batch directory + manifest
`tool_handlers/swmm_calibration.py`	Exposes calibration to the agent as typed tool calls	Natural-language request	Hand-off to `calibration_batch`
`tool_handlers/swmm_uncertainty.py`	Exposes sensitivity / uncertainty tool calls	`plan.json`, scored results	Summary tables + plots

This separation keeps deterministic numerics in swmm_runtime/ and agent-facing schemas in tool_handlers/, which is the same boundary used elsewhere in the codebase (v0.6.3a1 consolidated the keyword routing that the agent uses to dispatch into these handlers). Source: agentic_swmm/agent/swmm_runtime/calibration_runner.py:1-40, agentic_swmm/agent/tool_handlers/swmm_calibration.py:1-40

Calibration pipeline

A calibration campaign follows four phases:

Plan — uncertainty_plan.py enumerates parameter combinations according to the configured sampler (Latin Hypercube for broad exploration, Sobol for sensitivity indices, or grid for low-dim re-checks). The plan is written to runs/<date>/<id>/plan.json so that downstream tools can resume mid-campaign. Source: agentic_swmm/agent/swmm_runtime/uncertainty_plan.py:1-60
Execute — calibration_batch.py walks the plan and calls calibration_runner.py once per row. Each row produces a self-contained run directory with its own .inp, .rpt, and .out, which is what gives the system its byte-reproducibility story (v0.6.4). Source: agentic_swmm/agent/calibration_batch.py:1-80, agentic_swmm/agent/swmm_runtime/calibration_runner.py:1-80
Score — The runner reports one or more objective-function values (configurable in the tool schema exposed by swmm_calibration.py). Results are appended to scores.csv next to the plan. Source: agentic_swmm/agent/tool_handlers/swmm_calibration.py:40-120
Surface — swmm_uncertainty.py consumes scores.csv + plan.json and emits the sensitivity ranking and uncertainty bands that the agent quotes back to the user. Source: agentic_swmm/agent/tool_handlers/swmm_uncertainty.py:1-80

flowchart LR
  A[uncertainty_plan.py] -->|plan.json| B[calibration_batch.py]
  B --> C[calibration_runner.py]
  C -->|scores.csv| D[swmm_uncertainty.py]
  B --> E[design_storm.py]
  E -->|.dat forcing| C
  H[agent tool call] -->|typed schema| F[swmm_calibration.py]
  H -->|typed schema| G[swmm_uncertainty.py]
  F --> B
  G --> D
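
The plan phase can be illustrated with a seeded Latin Hypercube sampler written against the standard library. It is a sketch of the sampling technique only; the function name, the `row_id` field, and the parameter names in the usage line are hypothetical and do not reflect the actual plan.json schema.

```python
import random


def latin_hypercube(bounds: dict, n: int, seed: int) -> list:
    """Draw n parameter sets with exactly one sample per equal-width stratum.

    bounds maps a parameter name to its (low, high) range. A fixed seed makes
    the plan reproducible, which a resumable campaign depends on.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each of the n strata of [low, high).
        points = [low + (high - low) * (i + rng.random()) / n for i in range(n)]
        rng.shuffle(points)  # decouple stratum order between parameters
        columns[name] = points
    return [
        {"row_id": i, **{name: columns[name][i] for name in bounds}}
        for i in range(n)
    ]


# Hypothetical parameter names and ranges, for illustration only.
plan = latin_hypercube({"n_imperv": (0.01, 0.03), "pct_imperv": (20.0, 80.0)}, n=8, seed=1)
```

Serialising the returned list with `json.dumps` would give a file in the spirit of plan.json: each row carries an identifier, so the batch driver can tell scored rows from pending ones.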

Design storms and water quality

design_storm.py produces the synthetic rainfall and pollutant inputs that the calibration pipeline consumes. Its responsibilities split cleanly:

Hydrology — SCS Type II / Type I hyetographs or user-supplied IDF curves, parameterised by return period and duration. The same generator feeds both calibration (where it acts as a repeatable forcing) and scenario analysis (where it acts as the design event). Source: agentic_swmm/agent/swmm_runtime/design_storm.py:1-80
Water quality — Land-use based pollutant build-up / washoff definitions and dry-weather flow profiles are written next to the rainfall file so that a single .inp mutation covers both quantity and quality calibration in the same batch. This is the mechanism by which the agent can answer "calibrate against observed TSS at outfall X" without bespoke scripting. Source: agentic_swmm/agent/swmm_runtime/design_storm.py:80-160

Because design storms and quality inputs are versioned inside the run directory, any audit dossier produced by the workflow can replay the exact forcing that produced a given sensitivity ranking — a property the project leans on heavily in its reproducibility claims. Source: agentic_swmm/agent/calibration_batch.py:80-140

Notes for users and integrators

The agent-facing schemas in swmm_calibration.py and swmm_uncertainty.py are the only stable contract; the swmm_runtime/ modules may evolve as long as the schemas do not.
Resuming a campaign is supported by re-invoking the calibration tool with the same runs/<date>/<id>/ path — the batch driver skips already-scored rows.
Sensitivity results from swmm_uncertainty.py are the recommended input to a subsequent calibration refinement pass, narrowing the prior bounds before the next sampling round.
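
The resume behaviour described above amounts to filtering the plan against the rows already present in scores.csv. A minimal sketch, assuming a hypothetical `row_id` column that links the two files:

```python
import csv
from pathlib import Path


def scored_ids(scores_csv: Path) -> set:
    """Collect the row ids that already have a score; empty if no file exists yet."""
    if not scores_csv.exists():
        return set()
    with scores_csv.open(newline="", encoding="utf-8") as handle:
        return {row["row_id"] for row in csv.DictReader(handle)}


def pending_rows(plan: list, scores_csv: Path) -> list:
    """Return plan rows that still need to be run, preserving plan order."""
    done = scored_ids(scores_csv)
    # csv yields strings, so compare on the string form of the id.
    return [row for row in plan if str(row["row_id"]) not in done]
```

Because scores are appended one row at a time, an interrupted campaign loses at most the run that was in flight, and re-invoking the tool picks up from the next unscored row.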

GIS, Climate, Plot, Report, and Review Skills

Related topics: SWMM Build, Run, and Network Synthesis, Calibration, Uncertainty, and Water Quality

The Agentic SWMM workflow exposes its capabilities as a set of modular Agent Skills, each backed by a SKILL.md contract under skills/<name>/. Five of these skills (GIS, Climate, Plot, Report, and Design Review) bracket the simulation: GIS and Climate prepare defensible input data before the run, while Plot, Report, and Design Review turn raw SWMM results into auditable spatial artifacts and human-readable deliverables. Together with swmm-params, they constitute the "interpret, contextualize, and present" layer of the agent runtime, wrapped around the simulation engine and feeding the final runs/<date>/<id>/ dossier.

Purpose and Scope

Each skill is an atomic, agent-reachable unit with a narrow, well-defined responsibility. The Agent invokes them through the Model Context Protocol (MCP), passing typed arguments and receiving deterministic outputs written to the run directory. The five skills target distinct concerns:

swmm-gis handles geospatial preparation: bounding-box ingestion, OSM/DEM fetches, subcatchment delineation, and storage of vector layers alongside the run.
swmm-climate curates meteorological forcing — rainfall time series, design storms, and IDF inputs — and binds them to SWMM's [TIMESERIES] and [RAINGAGE] sections.
swmm-plot renders hydrographs, network maps, and spatial overlays as static images or interactive artifacts for downstream review.
swmm-report assembles the deterministic audit dossier, summarizing inputs, decisions, and outputs in a reproducible document.
swmm-design-review applies engineering review rules over the assembled run, flagging capacity violations, surcharge, and guideline deviations.

This decomposition lets the Agent compose complex workflows from small, verifiable steps rather than monolithic scripts. Source: skills/swmm-gis/SKILL.md:1-15, skills/swmm-climate/SKILL.md:1-12, skills/swmm-plot/SKILL.md:1-10.

Skill-by-Skill Architecture

GIS and Climate: input preparation

The GIS skill acts as the spatial front door. Given a WGS84 bounding box, it orchestrates network synthesis (delegating to SWMManywhere as introduced in v0.7.1) and persists GeoJSON / shapefile artifacts inside the run directory for later plotting and review. It is the only skill that performs external fetches (OSM, DEM), and its output schema is consumed by every downstream skill that needs geometry. Source: skills/swmm-gis/SKILL.md:18-44.

The Climate skill mirrors this role for hydrometeorological inputs. It accepts observed hyetographs, synthetic design storms (introduced in v0.7.2), and user-supplied IDF curves, normalizing them into SWMM-compatible timeseries objects. Determinism is enforced by storing both the raw source and the canonicalized form. Source: skills/swmm-climate/SKILL.md:20-52.

Skill	Primary Input	Primary Output	Consumed By
gis	BBOX or geo file	Vector layers + network	plot, report, design-review
climate	Hyetograph / IDF	SWMM `[TIMESERIES]` block	report, design-review
plot	Run results + layers	PNG/HTML visualizations	report, design-review
report	All run artifacts	Audit dossier (PDF/HTML)	End user
design-review	Run results + rules	Findings list (JSON/MD)	report, end user

Plot, Report, and Design Review: output synthesis

The Plot skill translates numeric SWMM outputs into visual artifacts. It reads the binary .out results, pairs them with GIS layers from swmm-gis, and produces hydrographs, longitudinal profiles, and spatial network maps. All plots carry a deterministic seed and a run-id watermark so they are byte-reproducible across machines. Source: skills/swmm-plot/SKILL.md:14-39.

The Report skill is the integrator. It walks the run directory, aggregates the artifacts produced by the upstream skills (GIS layers, climate inputs, plots, model parameters, simulation logs), and emits the deterministic audit dossier — the same dossier referenced in the v0.7.1 release as "a deterministic audit dossier … rendered alongside a spatial network map." Source: skills/swmm-report/SKILL.md:10-58.

The Design Review skill applies engineering judgement. It loads the simulation results together with configurable review rules (capacity, freeboard, surcharge depth, flooding extent) and produces a structured findings document. When the v0.7.2 release mentions "agent-reachable calibration & sensitivity," review is the companion step that interprets calibrated outputs against design intent. Source: skills/swmm-design-review/SKILL.md:12-47.

Workflow and Composition

The skills are designed to be invoked sequentially or selectively, depending on the Agent's plan. A typical end-to-end flow follows the data dependency chain: params → gis → climate → (simulate) → plot → design-review → report. Each step writes its outputs into the standard runs/<date>/<id>/ layout, so any step can be re-run independently without side effects on the others, a property formalized by the byte-reproducibility hardening in v0.6.4. Source: skills/swmm-report/SKILL.md:30-45.

flowchart LR
    A[swmm-params] --> B[swmm-gis]
    A --> C[swmm-climate]
    B --> D[SWMM Sim]
    C --> D
    D --> E[swmm-plot]
    D --> F[swmm-design-review]
    B --> E
    E --> G[swmm-report]
    F --> G
    C --> G

This composition model lets the Agent skip steps when artifacts already exist (for example, re-rendering plots without re-synthesizing the network) while keeping the final dossier deterministic. Source: skills/swmm-plot/SKILL.md:22-28, skills/swmm-design-review/SKILL.md:30-36.
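
The dependency chain in the flowchart can be written as a small graph and ordered with Python's standard graphlib module. This is only an illustrative sketch: the edges are transcribed from the diagram, while the step name `simulate`, the `plan_steps` helper, and the idea of passing a set of already-finished steps are assumptions, not the project's scheduler.

```python
from graphlib import TopologicalSorter

# Each key depends on the listed steps (edges transcribed from the flowchart).
# "simulate" is a placeholder name for the SWMM run itself.
DEPENDS_ON = {
    "swmm-params": set(),
    "swmm-gis": {"swmm-params"},
    "swmm-climate": {"swmm-params"},
    "simulate": {"swmm-gis", "swmm-climate"},
    "swmm-plot": {"simulate", "swmm-gis"},
    "swmm-design-review": {"simulate"},
    "swmm-report": {"swmm-plot", "swmm-design-review", "swmm-climate"},
}


def plan_steps(done=frozenset()):
    """Return the steps still to run, in a dependency-respecting order.

    `done` is a hypothetical set of steps whose artifacts already exist.
    """
    order = TopologicalSorter(DEPENDS_ON).static_order()
    return [step for step in order if step not in done]
```

Calling `plan_steps(done={"swmm-params", "swmm-gis", "swmm-climate", "simulate"})` would leave only the plotting, review, and report steps, which mirrors the "re-render plots without re-synthesizing the network" case.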

Cross-cutting Properties

All five skills share three properties that distinguish them from ordinary scripts:

Idempotency — invoking a skill twice with identical inputs produces byte-identical outputs, enabling CI verification.
Typed contracts — each SKILL.md declares input/output schemas consumed by the MCP tool surface introduced across the v0.7.x line.
Run-scoped I/O — no skill writes outside runs/<date>/<id>/, preserving the audit trail that swmm-report later assembles.

These properties are what allow the Agent to treat the skills as reliable building blocks for complex, multi-step stormwater investigations. Source: skills/swmm-gis/SKILL.md:48-60, skills/swmm-climate/SKILL.md:55-66, skills/swmm-report/SKILL.md:62-70.
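
Two of these properties are easy to check mechanically. The sketch below is a hypothetical illustration, not code from the repository: `resolve_in_run` rejects paths that would escape a run directory, and `is_idempotent` calls a skill-like function twice and compares output hashes.

```python
import hashlib
from pathlib import Path


def resolve_in_run(run_dir: Path, relative: str) -> Path:
    """Resolve `relative` under `run_dir`, refusing paths that escape it."""
    root = run_dir.resolve()
    target = (root / relative).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"{relative!r} escapes the run directory")
    return target


def sha256_bytes(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_idempotent(skill, inputs) -> bool:
    """Call `skill` twice with the same inputs and compare output hashes."""
    return sha256_bytes(skill(inputs)) == sha256_bytes(skill(inputs))
```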

Source: https://github.com/Zhonghao1995/agentic-swmm-workflow / Human Manual

Modelling Memory and Learning Layer

Related topics: Audit, Provenance, and Verification, Operations, Workflows, and Common Failure Modes

Overview and Purpose

The Modelling Memory and Learning Layer is the persistence-and-feedback subsystem that lets the agentic SWMM agent accumulate experience across runs. It captures what the agent saw, decided, and produced for each modelling session, then surfaces that history during future invocations so that recurring stormwater modelling tasks become faster, cheaper, and more consistent.

The layer was promoted to a first-class subsystem in the v0.7.0 release ("Modeling memory, agent runtime, install UX"), which anchored the memory module as a sibling to the agent runtime and installer. The v0.7.0a2 pre-release had already introduced aiswmm memory repair-sessions together with stronger integrity checks under aiswmm doctor, and v0.7.2 later added observability hooks for memory.

Source: CHANGELOG entry for v0.7.0 Source: CHANGELOG entry for v0.7.0a2

The subsystem is organised under agentic_swmm/memory/ and is composed of three responsibilities: durable storage, retrieval, and lifecycle/curation.

Component Architecture

The layer is split into cooperating modules with narrowly scoped responsibilities.

jsonl_store.py — append-only JSON Lines writer/reader used as the lowest-level event log. JSONL keeps writes crash-safe and streamable, which matters for the long-lived run directories produced by the workflow. Source: agentic_swmm/memory/jsonl_store.py
session_db.py — session-oriented adapter that groups JSONL events into a queryable per-session record (sessions are the unit of recall for the agent). Source: agentic_swmm/memory/session_db.py
session_repair.py — repair utility backing the aiswmm memory repair-sessions verb; it scans the sessions store and restores integrity where events are missing or truncated. Source: agentic_swmm/memory/session_repair.py
recall.py — retrieval API the agent calls to fetch prior relevant context before planning a new run. Source: agentic_swmm/memory/recall.py
parametric_memory.py — stores tuned model parameters (roughness, subcatchment widths, control rules, calibration coefficients) so they can be re-applied to similar catchments later. Source: agentic_swmm/memory/parametric_memory.py
lessons_lifecycle.py — promotes/demotes durable "lessons learned" between short-term, working, and long-term tiers and exposes the curation hooks. Source: agentic_swmm/memory/lessons_lifecycle.py

A canonical data flow is:

flowchart LR
  A[Run agent] --> B[session_db.py]
  B --> C[jsonl_store.py]
  C --> D[(sessions/ events)]
  D --> E[recall.py]
  E --> A
  D --> F[parametric_memory.py]
  D --> G[lessons_lifecycle.py]
  D --> H[session_repair.py]
  H --> D
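
The bottom of this flow, an append-only JSON Lines log grouped into sessions, can be sketched in a few lines. The function names and the `session_id` key are assumptions made for illustration; the real jsonl_store.py and session_db.py may be organised differently.

```python
import json
from collections import defaultdict
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    line = json.dumps(event, sort_keys=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")


def read_events(log_path: Path) -> list[dict]:
    """Read all complete events, stopping at the first line that fails to parse."""
    events = []
    with log_path.open("r", encoding="utf-8") as fh:
        for raw in fh:
            try:
                events.append(json.loads(raw))
            except json.JSONDecodeError:
                break  # likely a truncated final write; keep the valid prefix
    return events


def group_by_session(events: list[dict]) -> dict[str, list[dict]]:
    """Group events into per-session lists using a hypothetical `session_id` key."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["session_id"]].append(event)
    return dict(sessions)
```

Stopping at the first unparseable line is one way to get the "coherent prefix" behaviour that an append-only log allows after a crash.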

Storage Tiers and Integrity Model

Two storage tiers cooperate:

Event log tier: jsonl_store.py writes each agent action and observation as one JSON object per line. The append-only shape makes partial writes recoverable and aligns with the byte-reproducibility story introduced in v0.6.4.
Sessions tier: session_db.py materialises per-session records on top of the event log. This is the store that aiswmm doctor checks: since v0.7.0a2 it reports one of four explicit states (ok / corrupt / unreadable / absent) and exits with a non-zero code when the state is corrupt or unreadable. Source: v0.7.0a2 changelog

When the doctor flags corrupt or unreadable, session_repair.py is the supported recovery path. The CLI surface added in v0.7.0a2 — aiswmm memory repair-sessions — invokes that module so that an operator can fix the store without rebuilding it by hand. Source: agentic_swmm/memory/session_repair.py
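
To make the recovery idea concrete, here is a minimal sketch of what repairing a line-oriented store could look like: keep every parseable line, drop the rest, and leave a backup of the damaged file. This is an assumption-laden illustration (the `repair_jsonl` name and the `.bak` backup convention are invented here) and does not describe the actual logic of session_repair.py.

```python
import json
from pathlib import Path


def repair_jsonl(log_path: Path) -> int:
    """Rewrite `log_path` keeping only parseable lines; return lines dropped."""
    kept, dropped = [], 0
    for raw in log_path.read_text(encoding="utf-8").splitlines():
        try:
            json.loads(raw)
        except json.JSONDecodeError:
            dropped += 1
            continue
        kept.append(raw)
    # Keep a copy of the damaged original next to the repaired file.
    backup = log_path.with_name(log_path.name + ".bak")
    backup.write_bytes(log_path.read_bytes())
    log_path.write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return dropped
```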

Learning Surfaces: Parametric Memory and Lessons

The "learning" half of the layer is delivered through two complementary stores:

Parametric memory persists concrete model parameters — calibrated roughness values, subcatchment geometry, control rule settings, hydrology choices — keyed by catchment signature so they can be proposed as defaults on similar new runs. Source: agentic_swmm/memory/parametric_memory.py
Lessons are qualitative observations produced after a run (e.g. "this subcatchment saturates control rule X during 10-yr events") that are curated across sessions by lessons_lifecycle.py. Lifecycle management ensures stale or contradicted lessons can be demoted and superseded.

Both stores reach the agent exclusively through recall.py, which is invoked at the start of a workflow to inject prior context into the prompt and at the end of a workflow to record new observations. Source: agentic_swmm/memory/recall.py
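
As a rough illustration of "keyed by catchment signature", the sketch below hashes a few coarse catchment attributes into a stable key and uses it to look up previously tuned parameters. The attribute names, the `ParametricMemory` class, and the hashing scheme are all hypothetical; the source does not say how parametric_memory.py derives its signature.

```python
import hashlib
import json
from typing import Optional


def catchment_signature(attrs: dict) -> str:
    """Derive a stable key from coarse catchment attributes (hypothetical fields)."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


class ParametricMemory:
    """In-memory stand-in for a store of tuned parameters per signature."""

    def __init__(self):
        self._store = {}

    def remember(self, attrs: dict, params: dict) -> None:
        """Record tuned parameters under the signature of `attrs`."""
        self._store[catchment_signature(attrs)] = dict(params)

    def propose_defaults(self, attrs: dict) -> Optional[dict]:
        """Return previously tuned parameters for a matching signature, if any."""
        found = self._store.get(catchment_signature(attrs))
        return dict(found) if found is not None else None
```

Serialising with sorted keys means the signature does not depend on the order in which attributes were supplied.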

Operations and CLI Surface

The layer is operated through the aiswmm memory verb family, with the sessions row of aiswmm doctor as its read-only health check:

Verb	Purpose	Backed by
`aiswmm memory repair-sessions`	Recovers the sessions store after corruption	`session_repair.py`
`aiswmm doctor` (sessions row)	Reports sessions store state (`ok / corrupt / unreadable / absent`) and exits non-zero on failure	`session_db.py` + `jsonl_store.py`

Memory observability improvements shipped in v0.7.2 alongside the calibration/sensitivity skills. Source: v0.7.2 release notes

Summary

The Modelling Memory and Learning Layer transforms the agent from a stateless workflow runner into a cumulative system. Append-only JSONL gives durable, reproducible event capture; the sessions tier turns those events into queryable units that the doctor can validate; parametric memory and lessons close the loop by feeding prior experience back into new runs. Repair and observability operations added in the 0.7.x line make the layer safe to run unattended in CI while still giving operators an explicit recovery path.

Audit, Provenance, and Verification

Related topics: Modelling Memory and Learning Layer, Operations, Workflows, and Common Failure Modes

The audit subsystem in agentic-swmm-workflow is the connective tissue that turns an agent-driven stormwater modelling session into a defensible scientific artifact. Every natural-language request, tool invocation, SWMM run, and comparison step is captured into a deterministic dossier so that a reviewer (or another agent) can reconstruct what happened, on which data, and with which tool versions. This is the auditable-and-reproducible contract advertised in the v0.7.2 release notes and the companion *AI for Engineering* paper. Source: docs/experiment-audit-framework.md

Goals and Scope

The audit module serves three intertwined goals:

Provenance — record which inputs (OSM extracts, DEM tiles, rainfall series, parameter files) produced a given runs/<date>/<id>/ artifact, together with the toolchain version that processed them. Source: agentic_swmm/audit/provenance_v1_2.py
Trace — log the conversational and tool-call sequence the agent followed, including prompt text, model identifiers, and structured tool arguments, so the reasoning path is inspectable. Source: agentic_swmm/audit/chat_note.py, agentic_swmm/audit/llm_calls.py
Verification — provide deterministic comparison primitives that detect drift between two SWMM runs (baseline vs. candidate) and surface divergences in a way a reviewer can audit. Source: agentic_swmm/agent/swmm_runtime/compare.py

Scope is deliberately bounded to *one run folder at a time*. Cross-run queries (searching past runs, comparing sessions, memory repair) live in the memory subsystem; the audit subsystem only guarantees that any single run is self-contained and replayable. This boundary is enforced by the standard runs/<date>/<id>/ layout contract.

Run Folder Layout — the Audit Container

Every agent invocation that produces a model result lands under runs/<date>/<id>/. The layout itself is a first-class object: agentic_swmm/audit/run_folder_layout.py exposes helpers that construct paths, enforce the schema, and refuse to write into a folder that already has a different provenance.json. The folder always carries the same minimum set of artifacts:

Artifact	Purpose	Producer
`provenance.json` (v1.2)	Toolchain version, input hashes, seed values, SWMManywhere / SWMM versions	`provenance_v1_2.py`
`chat_log.jsonl`	Turn-by-turn user/assistant messages	`chat_note.py`
`llm_calls.jsonl`	Each provider call: prompt, response, tokens, latency	`llm_calls.py`
`inp/<model>.inp`	The exact SWMM input file that was executed	SWMM runtime
`rpt/<model>.rpt`	The SWMM report	SWMM runtime
`audit/summary.md`	Human-readable dossier synthesised at run end	audit framework

Source: agentic_swmm/audit/run_folder_layout.py, agentic_swmm/audit/provenance_v1_2.py

The folder is *immutable once sealed*: closing a run writes a SEALED marker that subsequent tools check before overwriting. This is what makes the v0.6.4 byte-reproducibility claim tractable — a reviewer can hash every file under the folder and compare against a re-execution that consumes the same provenance.json.
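
The reviewer-side check described here, hashing every file and refusing writes after sealing, might look like the sketch below. Only the SEALED marker name comes from the text; the helper names and the choice to leave the marker out of the manifest are assumptions.

```python
import hashlib
from pathlib import Path

SEALED_MARKER = "SEALED"  # marker file name taken from the description above


def hash_run_folder(run_dir: Path) -> dict[str, str]:
    """Map each file's relative POSIX path to its SHA-256 digest."""
    manifest = {}
    for path in sorted(run_dir.rglob("*")):
        if path.is_file() and path.name != SEALED_MARKER:
            rel = path.relative_to(run_dir).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def seal(run_dir: Path) -> None:
    """Mark the run as closed by creating the marker file."""
    (run_dir / SEALED_MARKER).touch()


def write_artifact(run_dir: Path, relative: str, data: bytes) -> None:
    """Write a file under the run folder unless the run has been sealed."""
    if (run_dir / SEALED_MARKER).exists():
        raise PermissionError(f"{run_dir} is sealed")
    target = run_dir / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
```

Two manifests produced by `hash_run_folder` can be compared with plain `==` to decide whether a re-execution reproduced the sealed run byte for byte.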

Provenance Capture (v1.2)

provenance_v1_2.py defines the canonical provenance schema. It is deliberately versioned (v1_2) so older dossiers can still be parsed while the schema evolves. The module captures:

The aiswmm package version and the commit hash of the installed checkout.
Versions of synthesiser dependencies (SWMManywhere, pyswmm, swmm-toolkit) and the SWMM engine binary.
SHA-256 hashes of every external input file (OSM PBF, DEM GeoTIFF, rainfall CSV, calibration targets).
RNG seeds and any explicit non-determinism flags the agent set.

Because the provenance file lists input hashes *before* the run executes, the audit dossier is forward-checkable: re-running with the same provenance must reproduce byte-identical SWMM outputs (modulo explicitly flagged non-determinism). Source: agentic_swmm/audit/provenance_v1_2.py
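
A provenance record of this kind can be sketched as a plain dictionary that is serialised canonically. The key names below are illustrative guesses, not the actual v1.2 schema; the point is only that input hashes are computed up front and that equal records serialise to identical bytes.

```python
import hashlib
import json
from pathlib import Path


def build_provenance(package_version, input_files, seeds, tool_versions):
    """Assemble a provenance record; all key names here are illustrative."""
    return {
        "schema": "1.2",
        "package_version": package_version,
        "tool_versions": dict(sorted(tool_versions.items())),
        "inputs": {
            Path(p).name: hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in sorted(input_files)
        },
        "seeds": dict(sorted(seeds.items())),
    }


def dump_canonical(record: dict) -> str:
    """Serialise with sorted keys so equal records give identical text."""
    return json.dumps(record, sort_keys=True, indent=2) + "\n"
```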

Conversational and Tool-Call Trace

Two parallel logs keep the reasoning path visible:

chat_note.py records the human-facing dialog: user prompts, assistant responses, slash-command expansions, and any clarifications the agent asked for. Records are append-only JSONL so a partial read still yields a coherent prefix.
llm_calls.py records the *underlying* provider calls: model id, prompt hash, completion text, token counts, latency, and whether the call succeeded, retried, or was rejected by the policy layer. This separation matters because the chat log shows *what the user saw*, while the LLM log shows *what the model actually computed* — including retries and tool-use loops the user never directly observed.

Source: agentic_swmm/audit/chat_note.py, agentic_swmm/audit/llm_calls.py

Verification via Run Comparison

Verification is the "did the model behave as expected?" half of the framework. agentic_swmm/agent/swmm_runtime/compare.py provides deterministic comparators that take two run folders (typically a sealed baseline and a fresh candidate) and emit a structured diff:

INP diff — section-by-section comparison of the SWMM input files, useful when a calibration skill rewrote subcatchment parameters.
RPT diff — numerical comparison of node flooding, link surcharge, and outfall totals against tolerances declared in the provenance.
Timeseries diff — node and link time series compared at sample points with an absolute and relative tolerance, summarising max-abs-error and RMSE.

The comparator never silently passes: any tolerance breach is surfaced as a non-empty diff object and the run is marked VERIFY_FAIL in the audit dossier. This is what makes the *audit dossier* claim from v0.7.1 meaningful — reviewers do not need to trust the agent's self-report, they re-verify against the provenance and the comparator output. Source: agentic_swmm/agent/swmm_runtime/compare.py
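
The timeseries comparison can be illustrated with a small pure-Python function. This is a sketch under stated assumptions: the combined tolerance rule (`abs_tol + rel_tol * |baseline|`), the result keys, and the `VERIFY_OK` label are invented for the example; only `VERIFY_FAIL`, max-abs-error, and RMSE come from the description above.

```python
import math


def compare_series(baseline, candidate, abs_tol=1e-6, rel_tol=1e-6):
    """Compare two equal-length series; report max abs error, RMSE, breaches."""
    if len(baseline) != len(candidate):
        raise ValueError("series must have the same number of samples")
    errors = [c - b for b, c in zip(baseline, candidate)]
    # A sample breaches when its error exceeds the combined tolerance.
    breaches = [
        i
        for i, (b, e) in enumerate(zip(baseline, errors))
        if abs(e) > abs_tol + rel_tol * abs(b)
    ]
    n = len(errors)
    return {
        "max_abs_error": max((abs(e) for e in errors), default=0.0),
        "rmse": math.sqrt(sum(e * e for e in errors) / n) if n else 0.0,
        "breaches": breaches,
        "status": "VERIFY_FAIL" if breaches else "VERIFY_OK",
    }
```

Returning the indices of breaching samples, rather than a bare boolean, keeps the diff object non-empty whenever a tolerance is exceeded.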

End-to-End Audit Workflow

flowchart LR
    A[User NL request] --> B[chat_note append]
    B --> C[Agent plan]
    C --> D[llm_calls log]
    C --> E[Tool invocations]
    E --> F[provenance_v1_2 snapshot]
    F --> G[SWMM run in runs/date/id/]
    G --> H[compare vs baseline]
    H --> I[Sealed audit dossier]
    I --> J[Reviewer re-verifies]

The audit layer never blocks the agent loop; it observes and persists. That separation is what allows the framework to remain useful both for live calibration sessions (where the agent iterates rapidly) and for the formal reproducibility study released alongside v0.6.4.

Operational Notes

aiswmm doctor includes a row for the audit store and reports corruption separately from absence, so a missing folder is benign while a partially-written provenance.json is not.
Once a run is SEALED, the audit framework treats it as read-only. Re-running the same provenance is permitted and is the supported reproducibility path; mutating a sealed folder is not.
The audit dossier is the canonical artifact for downstream memory ingestion: the memory subsystem reads provenance and the comparison summary, not the raw LLM transcript, when indexing a session.

Source: docs/experiment-audit-framework.md, agentic_swmm/audit/run_folder_layout.py, agentic_swmm/audit/provenance_v1_2.py

Installation Paths and Runtime Configuration

Related topics: CLI Commands and Doctor Diagnostics, Operations, Workflows, and Common Failure Modes

Agentic SWMM is distributed through multiple parallel install paths that converge on the same aiswmm CLI entry point. Choosing between them is a trade-off between reproducibility, isolation, and ease of first-run setup, while runtime configuration layers on top of every install path to select the LLM provider and authenticate it.

Supported Install Paths

The project exposes three primary installation surfaces, each targeting a different user persona.

1. PyPI (recommended for most users)

The canonical install is pip install aiswmm, which resolves to the current stable release on PyPI. Pre-release channels are available for users who want to dogfood upcoming features:

pip install aiswmm              # latest stable (e.g. v0.7.x)
pip install aiswmm==0.7.0a1     # pin a specific alpha
pip install --pre aiswmm        # any pre-release
pip install "aiswmm[claude]"    # add the optional Claude Agent SDK provider

The optional [claude] extra pulls in the Claude Agent SDK provider used by skills that require it. pip install aiswmm without extras still installs the core CLI and default providers. Source: docs/installation.md:1-30

2. One-line installers (fresh-machine provisioning)

For a brand-new machine that does not yet have the toolchain, the repository ships two scripts: scripts/install.sh (Linux/macOS) and scripts/install.ps1 (Windows). The v0.7.3 release reworked both scripts so that a single command provisions the entire toolchain — Python, pip, the package itself, and supporting tools — and both platforms default to the latest published release. Source: scripts/install.sh:1-20

On Windows, the PowerShell installer explicitly clones into %LOCALAPPDATA% rather than the write-protected C:\Windows\System32, and sets a process-scope ExecutionPolicy Bypass so the clone step itself can run without a prior machine-wide policy change. Source: scripts/install.ps1:1-40

3. Docker image (byte-reproducible)

Auto-built Docker images are published alongside every tagged release, primarily to support the byte-level reproducibility claims made in the companion *AI for Engineering* paper. v0.6.4 introduced pinned-dependency Docker builds so that two researchers pulling the same image tag produce identical byte streams for the same input. The Docker path is the recommended choice when audit dossier hashing matters more than iteration speed. Source: docs/installation.md:45-70, docs/runtime-install-options.md:10-25

Version Pinning Semantics

Stable releases follow PEP 440 strictly: pip install aiswmm always resolves to the latest non-pre-release tag. Alpha and beta channels (a1, a2, b1) are opt-in only and never shadow the stable default. For example, throughout the 0.7.0a* series pip install aiswmm continued to deliver v0.6.4, and every v0.6.4 pin (PyPI / Git tag / Docker image) remained immutable. Source: docs/installation.md:75-95
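
The "pre-releases never shadow the stable default" rule can be mimicked with a deliberately simplified model of version selection. This sketch does not reimplement pip or the full PEP 440 ordering: it treats any version that is not purely dotted integers as a pre-release and ignores post-releases, epochs, and local versions.

```python
import re

# Release segment only, e.g. "0.6.4". Anything else (a1, b1, rc1, .dev0) is
# treated as a pre-release in this simplified check.
_FINAL = re.compile(r"^\d+(\.\d+)*$")


def is_prerelease(version: str) -> bool:
    """True when the version carries any suffix beyond dotted integers."""
    return _FINAL.match(version) is None


def release_key(version: str) -> tuple:
    """Numeric sort key for final releases such as '0.6.4'."""
    return tuple(int(part) for part in version.split("."))


def latest_stable(available):
    """Pick the highest final release, or None if only pre-releases exist."""
    finals = [v for v in available if not is_prerelease(v)]
    return max(finals, key=release_key) if finals else None
```

With the versions available during the alpha series, `latest_stable(["0.6.3", "0.6.4", "0.7.0a1", "0.7.0a2"])` selects `"0.6.4"`, matching the behaviour described above.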

This lets production deployments pin a specific version with full confidence that no implicit upgrade will land during a reproducibility-sensitive run, while contributors can opt into pre-releases without poisoning downstream installs.

Runtime Configuration

Once installed, the runtime is configured through a combination of environment variables and an on-disk sessions store. The most common configuration surface is the LLM provider selection.

Provider selection

The CLI accepts a --provider flag (or AISWMM_PROVIDER environment variable) that selects which model backend drives the agent loop. Supported providers are documented in docs/llm_providers.md, with the Claude Agent SDK provider only available when the [claude] extra was installed at install time. Source: docs/llm_providers.md:1-40

API key configuration

Each provider requires its own credential, supplied via environment variables such as OPENAI_API_KEY, ANTHROPIC_API_KEY, or AISWMM_* equivalents documented in docs/api-key-configuration.md. Keys are read at process start; rotating a key requires restarting any long-running agent. The API-key doc explicitly warns against committing keys to the repository and recommends a local .env file excluded by .gitignore. Source: docs/api-key-configuration.md:1-50
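
A minimal sketch of how provider and credential lookup could fit together is shown below. The variable names AISWMM_PROVIDER, OPENAI_API_KEY, and ANTHROPIC_API_KEY come from the text; the provider identifiers `openai` and `anthropic`, the mapping, and the rule that the CLI flag wins over the environment are assumptions for illustration only.

```python
import os

# Provider names and this mapping are hypothetical; only the environment
# variable names are taken from the documentation.
KEY_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def resolve_provider(cli_flag=None, env=None):
    """Return (provider, api_key); here the CLI flag wins over the environment."""
    env = os.environ if env is None else env
    provider = cli_flag or env.get("AISWMM_PROVIDER")
    if provider is None:
        raise RuntimeError("no provider: pass --provider or set AISWMM_PROVIDER")
    key_var = KEY_VARS.get(provider)
    if key_var is None:
        raise RuntimeError(f"unknown provider {provider!r}")
    key = env.get(key_var)
    if not key:
        raise RuntimeError(f"{key_var} is not set")
    return provider, key
```

Passing `env` explicitly keeps the function testable without touching the real process environment.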

Health and integrity checks

aiswmm doctor inspects the local install and reports per-subsystem status. v0.7.0a2 added a Sessions store row that performs a full integrity check, returning one of four states: ok, corrupt, unreadable, or absent. The exit code is non-zero for corrupt and unreadable so CI pipelines can fail loudly on data-loss conditions rather than silently producing empty results. A companion verb, aiswmm memory repair-sessions, was introduced at the same time to attempt recovery from a corrupt store. Source: docs/runtime-install-options.md:40-70

Choosing Between Paths

The table below summarises the trade-offs as a quick-reference decision aid.

Path	Best for	Reproducibility	First-run friction
`pip install aiswmm`	Day-to-day users, CI	High when pinned	Low
One-line installer	Fresh laptops, onboarding	High when pinned	Lowest (single command)
Docker image	Audited runs, paper reproduction	Byte-level	Medium (Docker required)
Pre-release (`--pre`)	Contributors, dogfooders	Lower until promoted	Low

For most users, the recommended path is pip install aiswmm on a machine that already has Python 3.10+, falling back to the one-line installer on a fresh machine and to Docker only when an audit dossier needs byte-identical reproduction. Runtime configuration (provider + API key) is identical across all three paths because the CLI reads configuration from the same environment regardless of how the package was installed. Source: docs/installation.md:100-120, scripts/install.sh:1-20, scripts/install.ps1:1-40, docs/llm_providers.md:1-40, docs/api-key-configuration.md:1-50

CLI Commands and Doctor Diagnostics

Related topics: Installation Paths and Runtime Configuration, Modelling Memory and Learning Layer

The aiswmm command-line interface is the user-facing surface of the Agentic SWMM workflow. It exposes a verb-first command tree built around three concerns: invoking the agent runtime, inspecting the persisted "modeling memory" store, and running environment diagnostics. The aiswmm doctor subcommand is the canonical health probe and is the contract CI pipelines use to decide whether the local toolchain is trustworthy enough to produce a reproducible run.

CLI Architecture and Command Tree

The CLI entry point is implemented in agentic_swmm/cli.py, which dispatches to grouped submodules under agentic_swmm/commands/. The top-level verbs reflect the operational lifecycle of a modelling session: run (start the agent), expert (ask the expert skill group), memory (manage the on-disk session store), and doctor (verify environment and store integrity) (Source: agentic_swmm/cli.py:1-80).

Subcommands are organized as thin parsers that delegate to focused implementation modules. For example, aiswmm memory is routed to agentic_swmm/commands/memory.py, while the read-only observability helpers live in the sibling module agentic_swmm/commands/memory_health.py, so that reporting stays near the mutation logic without sharing its code path (Source: agentic_swmm/commands/memory.py:1-60).

The expert group aggregates the domain-specific skills surfaced to the agent (calibration, sensitivity, design storms, network synthesis, etc.). It is registered as a namespace through agentic_swmm/commands/expert/__init__.py, which keeps the skill roster isolated from the runtime verbs (Source: agentic_swmm/commands/expert/__init__.py:1-40).

This separation — root dispatcher, per-verb module, per-skill namespace — makes the CLI predictable for both human operators and the MCP/Agent SDK wrappers that invoke it programmatically.

The `aiswmm doctor` Diagnostic

aiswmm doctor walks a fixed checklist of subsystems and prints a status table. Each row is rendered through agentic_swmm/diagnostics/doctor_report.py, which normalizes per-check results into a stable, machine-parseable format suitable for CI logs (Source: agentic_swmm/diagnostics/doctor_report.py:1-120).

The check list is defined in agentic_swmm/commands/doctor.py and currently includes the Sessions store as a first-class row. The store check reports one of four explicit states, which were introduced in v0.7.0a2 to give CI an unambiguous signal (Source: agentic_swmm/commands/doctor.py:1-100).

State	Meaning	Exit-code impact
`ok`	Sessions store readable, schema valid	0
`absent`	No sessions directory yet (fresh install)	0
`unreadable`	Directory exists but cannot be opened	non-zero
`corrupt`	Records present but fail integrity validation	non-zero

A non-zero exit on corrupt or unreadable is the mechanism by which CI catches data-loss conditions before they propagate into a published run. This was a deliberate hardening in the v0.7.0a2 release (Source: agentic_swmm/diagnostics/doctor_report.py:40-90).
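
The four states and their exit-code mapping can be illustrated with a small classifier. The on-disk layout assumed here (a directory of `*.jsonl` files) and the function names are hypothetical; the sketch shows only how the table above could translate into code, not how doctor.py is implemented.

```python
import json
from pathlib import Path

NONZERO_STATES = {"corrupt", "unreadable"}


def sessions_state(store_dir: Path) -> str:
    """Classify a directory of *.jsonl session files into one of four states."""
    if not store_dir.exists():
        return "absent"
    if not store_dir.is_dir():
        return "unreadable"
    try:
        for path in sorted(store_dir.glob("*.jsonl")):
            for line in path.read_text(encoding="utf-8").splitlines():
                json.loads(line)  # raises on a malformed record
    except OSError:
        return "unreadable"
    except (json.JSONDecodeError, UnicodeDecodeError):
        return "corrupt"
    return "ok"


def exit_code(state: str) -> int:
    """Map a store state to a process exit code as in the table above."""
    return 1 if state in NONZERO_STATES else 0
```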

Memory Health and Repair Verbs

Because aiswmm doctor is read-only, mutation of a degraded store is exposed through a parallel aiswmm memory repair-sessions verb. This lets operators diagnose with doctor, repair in a separate auditable step, and re-verify with doctor — the same three-step pattern used in database operations (Source: agentic_swmm/commands/memory.py:60-140).

agentic_swmm/commands/memory_health.py contains the observability helpers that the doctor report calls into, keeping health-introspection logic out of the mutation path. Splitting reads from writes is what allows doctor to remain a safe, idempotent probe that can run on every CI job (Source: agentic_swmm/commands/memory_health.py:1-80).

Operational Patterns and Version Notes

The recommended operator loop is therefore:

aiswmm doctor — confirm ok or absent before starting a run.
If corrupt / unreadable, run aiswmm memory repair-sessions (added in v0.7.0a2).
Re-run aiswmm doctor to confirm recovery.
Only then invoke the agent runtime.

This loop was promoted to stable in v0.7.0, which folded the v0.7.0a1/a2 prereleases — including the doctor exit-code change and the repair verb — into the default pip install aiswmm line (Source: agentic_swmm/cli.py:40-120).
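
The diagnose, repair, re-verify loop can be scripted as a pre-flight gate. The two command lines below are the ones named in this section; the `preflight` wrapper, the injectable `run` callable, and the decision to attempt exactly one repair are assumptions. A non-zero doctor exit may also come from checks other than the sessions store, so a real wrapper would want to inspect the report before repairing.

```python
import subprocess


def run_cli(args):
    """Run a command and return its exit code."""
    return subprocess.run(args, check=False).returncode


def preflight(run=run_cli) -> bool:
    """Diagnose, repair once if needed, then re-verify. True means safe to run."""
    if run(["aiswmm", "doctor"]) == 0:
        return True
    run(["aiswmm", "memory", "repair-sessions"])
    return run(["aiswmm", "doctor"]) == 0
```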

For users on older lines, the relevant pin notes are documented in the v0.7.0a1 and v0.7.0a2 release notes: pip install aiswmm==0.7.0a1 for the alpha, or pip install --pre aiswmm for any prerelease. Byte-reproducibility pinning guidance lives separately in the v0.6.4 release notes and is orthogonal to the doctor diagnostics contract (Source: agentic_swmm/diagnostics/doctor_report.py:1-60).

Summary

aiswmm exposes a verb-first CLI organised into run, expert, memory, and doctor groups.
aiswmm doctor is the canonical CI health probe and emits one of four explicit Sessions store states.
corrupt and unreadable return non-zero exit codes, enabling CI to gate reproducible runs.
aiswmm memory repair-sessions is the sanctioned repair verb, kept separate from the read-only doctor path.
The full lifecycle — diagnose, repair, re-verify — became stable in v0.7.0 and is the supported operator workflow today.

Extensibility, Agent Runtimes, and Cross-Project Integration

Related topics: MCP Servers and Skill Layer, Installation Paths and Runtime Configuration

The agentic-swmm-workflow repository treats stormwater modeling as an auditable agent workflow rather than a monolithic CLI. Its extensibility story rests on three orthogonal axes: (1) a skill system that lets third parties package new modeling capabilities as discoverable markdown bundles, (2) a pluggable agent runtime that swaps the LLM harness behind a stable typed-tool interface, and (3) a thin integration layer that bridges external SWMM-related projects (SWMManywhere, SWMM Canada, custom .inp sources) into the canonical runs/<date>/<id>/ layout. Together these axes let the same agent drive an end-to-end workflow whether the user talks to it through Codex, OpenClaw, or the Claude Agent SDK, while external projects contribute data and runners without forking the core.

Skill System: Authoring New Capabilities

Skills are the unit of capability extension. Each skill lives under skills/<name>/SKILL.md and follows a front-matter contract that the loader parses at runtime. The skill-author skill is itself a skill whose purpose is to scaffold new ones, closing the loop on extensibility.

The contract enforced by skills/skill-author/SKILL.md covers scope, inputs, outputs, and an invocation example. A new skill declares its triggers, the verbs it adds to the agent's toolkit, and the artefacts it materialises in the run directory. Because the agent indexes skills at startup, adding a directory under skills/ is sufficient to make a capability available — no core code changes are required. This pattern is what enables the v0.7.2 release notes to advertise "three new skills" without re-architecting the dispatcher. Source: skills/skill-author/SKILL.md.
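
Directory-based discovery of this kind can be sketched as follows. The front-matter format is an assumption: the example treats it as a leading block delimited by `---` lines containing simple `key: value` pairs, and the function names are invented. The project's actual loader and field names may differ.

```python
from pathlib import Path


def parse_front_matter(text: str) -> dict:
    """Parse a leading '---' block of simple 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front-matter block
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(skills_root: Path) -> dict:
    """Index every <skills_root>/<name>/SKILL.md by directory name."""
    index = {}
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        index[skill_md.parent.name] = parse_front_matter(
            skill_md.read_text(encoding="utf-8")
        )
    return index
```

Because the index is built by globbing, a directory without a SKILL.md is simply ignored, and adding a new one requires no change to the loader.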

Agent Runtimes: Codex, OpenClaw, and the Claude Agent SDK

The runtime layer is intentionally thin so that different agent harnesses can host the same skills and tools. Two runtimes are documented as first-class execution paths, and a third provider ships as an optional extra.

Codex runtime. The Codex adapter (docs/codex-runtime.md) describes how the agent is driven inside an OpenAI Codex-style loop. It maps the core typed tools onto Codex's tool-calling schema, preserves the deterministic run-layout convention, and forwards the audit dossier generation unchanged. The runtime is responsible only for I/O and credential plumbing; reasoning, planning, and tool selection remain with the model.

OpenClaw execution path. docs/openclaw-execution-path.md documents a second harness with different concurrency and tool-isolation guarantees. It exposes the same verbs to the agent but runs them through OpenClaw's executor, which is relevant for users who need stricter process isolation between SWMM runs and the LLM control plane.

The Claude Agent SDK is delivered as an optional extra: pip install "aiswmm[claude]==...". The PyPI extras mechanism keeps the default install lightweight while still allowing the same skill bundle to be hosted on Anthropic's SDK. This is the same pattern referenced in the v0.7.0a1 release notes, where the Claude provider is gated behind an extra so that pip install aiswmm stays stable on v0.6.x for users who have not opted into the 0.7.x line. Source: docs/codex-runtime.md, Source: docs/openclaw-execution-path.md.

Because each runtime consumes the same typed-tool surface, a workflow authored against one runtime is portable across the others; the skill markdown is the source of truth, not the harness.

Cross-Project Integration Layer

External SWMM-adjacent projects are integrated through two narrow Python modules rather than through forking or vendoring.

agentic_swmm/integrations/inp_source.py is the abstraction for *where an .inp file comes from*. It normalises several upstream sources — local files, generated networks, and synthesised drainage from SWMManywhere — into a single object the rest of the pipeline can consume. The v0.7.1 release notes describe the headline use case: a single natural-language sentence referring to a WGS84 bounding box is enough to drive synthesis via SWMManywhere (Imperial College) and produce a valid .inp without the user supplying one. Source: agentic_swmm/integrations/inp_source.py.

agentic_swmm/integrations/swmmcanada_runner.py wraps the SWMM Canada runner, allowing Canadian datasets and configuration conventions to flow through the same audit and dossier pipeline as native runs. It exists because the canonical run layout is the integration contract: any runner that can deposit a .inp, an .rpt, and an .out under runs/<date>/<id>/ participates in the workflow without further coupling.
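The run-layout contract described above could be checked with something like the following sketch. The helper name missing_artefacts is hypothetical, and the check assumes only what the text states: a conforming run folder holds a .inp, an .rpt, and an .out file.

```python
from pathlib import Path

# Artefact suffixes named in the text above.
REQUIRED_SUFFIXES = (".inp", ".rpt", ".out")


def missing_artefacts(run_dir: str) -> list[str]:
    """Return the required suffixes that have no matching file in run_dir."""
    present = {p.suffix.lower() for p in Path(run_dir).iterdir() if p.is_file()}
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]
```

An empty list means the folder satisfies this minimal reading of the contract; the project's own checks may require more than file presence.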

The integrations directory also hosts a human-readable entry point: integrations/README.md lists supported external projects and their status, providing the discovery surface users actually navigate. Source: integrations/README.md.

How the Three Axes Compose

The following sequence shows how a user request travels through the extensibility layers in a typical session:

flowchart LR
  U[User prompt] --> R{Runtime}
  R -->|Codex| A1[Codex adapter]
  R -->|OpenClaw| A2[OpenClaw executor]
  R -->|Claude SDK| A3[Claude adapter]
  A1 --> S[Skill loader<br/>skills/*/SKILL.md]
  A2 --> S
  A3 --> S
  S --> T[Typed tools]
  T --> I{Integration}
  I --> P1[inp_source.py]
  I --> P2[swmmcanada_runner.py]
  I --> P3[Custom runner]
  P1 --> L[runs/&lt;date&gt;/&lt;id&gt;/]
  P2 --> L
  P3 --> L
  L --> D[Audit dossier + map]

A new capability typically arrives in one of three ways: a contributor drops a new SKILL.md (skill axis), a downstream user installs a different runtime extra (runtime axis), or an upstream project ships a runner that conforms to the run-layout contract (integration axis). The byte-reproducibility guarantees introduced in v0.6.4 apply across all three because the deterministic layout is enforced below the extensibility surface, not above it.

Practical Guidance for Extending the Project

To add a capability, write a new skills/<verb>/SKILL.md and follow the contract in skills/skill-author/SKILL.md; no Python changes are required.
To target a different LLM harness, implement the typed-tool adapter described in docs/codex-runtime.md or docs/openclaw-execution-path.md and gate any heavy dependency behind a PyPI extra.
To plug in an external data source, return an .inp (directly or through inp_source.py) and place runner outputs under runs/<date>/<id>/; consult integrations/README.md for the supported matrix and conventions.

This separation — skills as capability, runtimes as harness, integrations as data/runners — is what makes the workflow auditable, reproducible, and open to community contribution without compromising the deterministic core.

Source: https://github.com/Zhonghao1995/agentic-swmm-workflow / Human Manual

Operations, Workflows, and Common Failure Modes

Related topics: Modelling Memory and Learning Layer, Installation Paths and Runtime Configuration, CLI Commands and Doctor Diagnostics

Scope and Purpose

The agentic-swmm-workflow project wraps an agent runtime around stormwater modelling: the runtime accepts natural-language intents and dispatches typed tools against SWMM (Storm Water Management Model). From an operator's perspective, three concerns dominate: how a run is initiated and laid out on disk, how the agent reasons across calibration, sensitivity, and design-storm steps, and how the system surfaces or recovers from failure states. This page consolidates the operational surface exposed by the aiswmm CLI, the canonical workflow patterns demonstrated in the example folders, and the failure modes the maintainers have explicitly hardened against since v0.6.x. Source: README.md:1-40.

CLI Operations and Run Layout

Day-to-day operations are anchored on the aiswmm command and the runs/<date>/<id>/ directory convention. Every workflow produces a deterministic folder whose contents can be re-hashed and compared across machines. The CLI exposes verbs such as aiswmm doctor and aiswmm memory repair-sessions that are explicitly intended for operational health checks rather than modelling work. Source: scripts/acceptance/run_acceptance.py:1-80.
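Re-hashing a run folder so that it can be compared across machines might look like the sketch below. The function name run_manifest is hypothetical and the choice of SHA-256 is an assumption; the source does not say which hash the project uses.

```python
import hashlib
from pathlib import Path


def run_manifest(run_dir: str) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    root = Path(run_dir)
    manifest = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        manifest[path.relative_to(root).as_posix()] = digest
    return manifest
```

Two machines that produce equal manifests for the same intent hold the same files with the same bytes; any differing key or digest points at the artefact to inspect.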

The two most important operational verbs are:

aiswmm doctor — runs an integrity sweep that prints a Sessions store row reporting one of ok, corrupt, unreadable, or absent. Exit codes are non-zero for CORRUPT and UNREADABLE, so CI pipelines and cron jobs can fail fast when the session store degrades. Source: README.md:40-90.
aiswmm memory repair-sessions — a safer maintenance verb introduced alongside doctor for recovering or rebuilding the sessions index without dropping user data. Source: README.md:90-140.
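Because doctor signals a degraded store through its exit code, a pre-flight gate in a CI job or cron wrapper can be sketched as follows. The helper name preflight is hypothetical; the only behaviour relied on is the one described above, namely that a non-zero exit status means the check failed.

```python
import subprocess
import sys


def preflight(cmd: list[str]) -> bool:
    """Run a health-check command and return True only if it exits with 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the command's own output so the failure is diagnosable.
        print(result.stdout, result.stderr, file=sys.stderr)
    return result.returncode == 0
```

With the aiswmm CLI on PATH, a batch script could call `preflight(["aiswmm", "doctor"])` and stop before any modelling work when it returns False.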

Installation itself is treated as an operational concern. v0.7.3 reworked the one-line installers so a single command provisions the toolchain on a fresh machine, with the Windows path cloning into %LOCALAPPDATA% rather than C:\Windows\System32 and applying a process-scope ExecutionPolicy Bypass. The installers default to the latest published release, which keeps unattended installs predictable. Source: README.md:1-40.

Canonical Workflows

The example folders document three recurring patterns that operators should recognise.

Natural-language end-to-end run. A single sentence that names a WGS84 bounding box is sufficient to synthesise the drainage network from public OSM and DEM data via SWMManywhere (Imperial College), execute SWMM, write a deterministic audit dossier, and render a spatial network map — all written under runs/<date>/<id>/. This is the workflow targeted by the v0.7.1 release notes. Source: docs/swmm-anywhere-quickstart.md:1-60.
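Since the whole run hinges on one bounding box, it can help to sanity-check the coordinates before synthesis starts. The sketch below is illustrative only: validate_bbox is a hypothetical helper, and the (min_lon, min_lat, max_lon, max_lat) argument order is an assumption rather than the order aiswmm or SWMManywhere expects.

```python
def validate_bbox(
    min_lon: float, min_lat: float, max_lon: float, max_lat: float
) -> tuple[float, float, float, float]:
    """Check a WGS84 bounding box given in decimal degrees and return it unchanged."""
    if not (-180.0 <= min_lon < max_lon <= 180.0):
        raise ValueError("longitudes must satisfy -180 <= min_lon < max_lon <= 180")
    if not (-90.0 <= min_lat < max_lat <= 90.0):
        raise ValueError("latitudes must satisfy -90 <= min_lat < max_lat <= 90")
    return (min_lon, min_lat, max_lon, max_lat)
```

This simple form rejects boxes that cross the antimeridian, which is a deliberate simplification for the sketch.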

Calibration and sensitivity. The calibration example pairs observed hydrographs with SWMM parameters and exercises the agent-reachable calibration and sensitivity tools introduced in v0.7.2 alongside the *AI for Engineering* paper. Operators typically invoke a sequence of run → calibrate → sensitivity → design-storm steps, each leaving an audit trail in the same run folder. Source: examples/calibration/README.md:1-60.

Coupled 1D/2D modelling. The TUFLOW ↔ SWMM coupling in examples/tuflow-swmm-module03/ illustrates a multi-engine workflow in which the agent orchestrates boundary exchange between TUFLOW and SWMM rather than running either in isolation. Source: examples/tuflow-swmm-module03/README.md:1-80.

For repeatability, the project pins dependencies and ships auto-built Docker images; v0.6.4 introduced this hardening so that re-running the same intent on a clean container reproduces byte-identical outputs. Source: docs/byte-identical-reproducibility.md:1-80.
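When two run folders are available on the same machine, for example one from the host and one copied out of a container, byte-identity can be checked directly. The helper differing_files below is a hypothetical sketch built on the standard library's filecmp module, not a tool shipped by the project.

```python
import filecmp
from pathlib import Path


def differing_files(run_a: str, run_b: str) -> list[str]:
    """List relative paths whose bytes differ or that exist in only one run."""
    a, b = Path(run_a), Path(run_b)
    names_a = {p.relative_to(a).as_posix() for p in a.rglob("*") if p.is_file()}
    names_b = {p.relative_to(b).as_posix() for p in b.rglob("*") if p.is_file()}
    only_in_one = names_a ^ names_b
    common = sorted(names_a & names_b)
    # shallow=False forces a byte-by-byte comparison instead of trusting os.stat().
    _, mismatch, errors = filecmp.cmpfiles(a, b, common, shallow=False)
    return sorted(only_in_one | set(mismatch) | set(errors))
```

An empty result means every file is present in both runs with identical bytes.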

Common Failure Modes

Operational incidents fall into a small, well-characterised set. The table below maps each mode to its observable signal and the documented recovery path.

| Failure mode | Observable signal | Recovery / mitigation |
| --- | --- | --- |
| Sessions store corruption | `aiswmm doctor` reports `corrupt` (exit ≠ 0) | Run `aiswmm memory repair-sessions`; re-run `doctor` to confirm `ok`. Source: README.md:40-90 |
| Sessions store unreadable | `doctor` reports `unreadable` (exit ≠ 0) | Inspect file permissions on the sessions store path; rerun `doctor`. Source: README.md:40-90 |
| Installer writes to a protected directory | Clone or `pip install` fails under `C:\Windows\System32` on Windows | Use the v0.7.3 one-line installer, which targets `%LOCALAPPDATA%` and applies `ExecutionPolicy Bypass`. Source: README.md:1-40 |
| Dependency drift across agents | SWMM outputs differ between machines despite identical intent | Pin via `pip install aiswmm==<version>` or use the auto-built Docker images documented for byte-reproducibility. Source: docs/byte-identical-reproducibility.md:1-80 |
| Pre-release confusion | `pip install aiswmm` returns a stable build while a newer pre-release exists | Use `pip install --pre aiswmm` or pin explicitly (`aiswmm==0.7.0a1`). Source: README.md:1-40 |
| Missing example inputs | Workflow aborts mid-synthesise because OSM/DEM tiles are unavailable | Pre-stage inputs under the example folder or run inside the published Docker image. Source: examples/tecnopolo/README.md:1-60 |

The acceptance runner at scripts/acceptance/run_acceptance.py codifies these failure paths into executable checks, which is why CI in the repository can fail builds on a CORRUPT or UNREADABLE sessions state rather than silently passing. Source: scripts/acceptance/run_acceptance.py:1-80.

Observability and Memory

Memory operations are first-class verbs because the agent re-uses prior runs to keep calibration histories and design-storm results coherent. v0.7.0 promoted "modelling memory" to a stable feature, and v0.7.0a2 added the safer repair-sessions verb specifically to avoid destructive maintenance steps. Operators should treat doctor as the canonical pre-flight check before any batch run and treat memory repair-sessions as an interactive recovery action rather than a routine step. Source: README.md:40-140.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 8 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: identity.distribution | https://github.com/Zhonghao1995/agentic-swmm-workflow

2. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/Zhonghao1995/agentic-swmm-workflow

3. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/Zhonghao1995/agentic-swmm-workflow

4. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

5. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/Zhonghao1995/agentic-swmm-workflow

6. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/Zhonghao1995/agentic-swmm-workflow

7. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

8. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/Zhonghao1995/agentic-swmm-workflow

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using agentic-swmm-workflow with real data or production workflows.

v0.7.3 - github / github_release
v0.7.2 — agent-reachable calibration & sensitivity, three new skills, de - github / github_release
v0.7.1 - github / github_release
v0.7.0 - github / github_release
v0.7.0a2 - github / github_release
v0.7.0a1 - github / github_release
v0.6.4 — Byte-reproducibility hardening - github / github_release
v0.6.3a1 - github / github_release
v0.6.2a1 - github / github_release
Installation risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence