# https://github.com/dagster-io/dagster Project Manual

Generated at: 2026-06-22 08:49:58 UTC

## Table of Contents

- [Dagster Overview and Core Concepts](#page-1)
- [System Architecture and Deployment](#page-2)
- [Integrations and Extensibility](#page-3)
- [Operations, Observability, and Community Roadmap](#page-4)

<a id='page-1'></a>

## Dagster Overview and Core Concepts

### Related Pages

Related topics: [System Architecture and Deployment](#page-2), [Integrations and Extensibility](#page-3), [Operations, Observability, and Community Roadmap](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/dagster-io/dagster/blob/main/README.md)
- [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/main/python_modules/dagster/README.md)
- [examples/README.md](https://github.com/dagster-io/dagster/blob/main/examples/README.md)
- [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/main/examples/quickstart_etl/README.md)
- [examples/assets_dbt_python/README.md](https://github.com/dagster-io/dagster/blob/main/examples/assets_dbt_python/README.md)
- [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_fully_featured/README.md)
- [examples/development_to_production/README.md](https://github.com/dagster-io/dagster/blob/main/examples/development_to_production/README.md)
- [examples/docs_projects/project_ask_ai_dagster/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_ask_ai_dagster/README.md)
- [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_atproto_dashboard/README.md)
- [examples/assets_modern_data_stack/README.md](https://github.com/dagster-io/dagster/blob/main/examples/assets_modern_data_stack/README.md)
- [python_modules/automation/README.md](https://github.com/dagster-io/dagster/blob/main/python_modules/automation/README.md)
- [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/README.md)

</details>

# Dagster Overview and Core Concepts

## Purpose and Scope

Dagster is an open-source data orchestration framework that enables the declarative definition, execution, and observability of data pipelines. As described in the top-level [README.md](https://github.com/dagster-io/dagster/blob/main/README.md), Dagster is designed to integrate with popular data tools and deploy to the user's existing infrastructure. The project is Apache 2.0 licensed and the Python package is located at [python_modules/dagster](https://github.com/dagster-io/dagster/blob/main/python_modules/dagster/README.md).

Dagster's value proposition is twofold: it provides a unified programming model for data assets and their dependencies, and it ships with a web-based UI (Dagit / the Dagster webserver) for inspecting pipeline lineage, running jobs, and observing materialization events. The framework supports both batch and streaming workflows and is intentionally extensible through integrations such as [dagster-dbt](https://github.com/dagster-io/dagster/blob/main/python_modules/libraries/dagster-dbt/README.md), [dagster-pagerduty](https://github.com/dagster-io/dagster/blob/main/python_modules/libraries/dagster-pagerduty/README.md), [dagster-github](https://github.com/dagster-io/dagster/blob/main/python_modules/libraries/dagster-github/README.md), and [dagster-papertrail](https://github.com/dagster-io/dagster/blob/main/python_modules/libraries/dagster-papertrail/README.md).

## Core Concepts

Dagster is built around a small set of first-class abstractions. These abstractions appear consistently across the example projects and form the vocabulary a developer needs to learn.

### Assets, Ops, Jobs, and Resources

The most prominent concept is the **Software-Defined Asset** (often just "asset"). An asset represents a logical data product, such as a table, file, or model artifact, that a function knows how to produce. The [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/main/examples/quickstart_etl/README.md) demonstrates assets like `hackernews_topstories` and `hackernews_topstories_word_cloud`, where each function declares its output and Dagster infers the dependency graph. The example project also attaches arbitrary `MetadataValue` to materializations, which is then displayed in the **Asset Details** page and **Activity** tab of the UI.
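
To make "Dagster infers the dependency graph" concrete, the following library-free sketch derives upstream dependencies from function parameter names and then runs the functions in dependency order. The `asset` decorator, `REGISTRY`, `materialize_all`, and the placeholder data are hypothetical stand-ins for illustration; this is not Dagster's implementation, and the real example fetches its data from the HackerNews API.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry standing in for a collection of asset definitions.
REGISTRY = {}


def asset(fn):
    """Toy decorator: register the function under its own name."""
    REGISTRY[fn.__name__] = fn
    return fn


@asset
def hackernews_topstories():
    # Placeholder titles instead of a real API call.
    return ["dagster ships assets", "assets need lineage"]


@asset
def hackernews_topstories_word_cloud(hackernews_topstories):
    # The parameter name marks hackernews_topstories as an upstream asset.
    counts = {}
    for title in hackernews_topstories:
        for word in title.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def materialize_all():
    """Run every registered function after the functions it depends on."""
    # Map each asset name to the set of upstream names in its signature.
    graph = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in REGISTRY.items()
    }
    results = {}
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = REGISTRY[name](**upstream)
    return results


results = materialize_all()
```

Because the graph is computed from the signatures, adding a third function that takes `hackernews_topstories_word_cloud` as a parameter would extend the lineage without any extra wiring.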

**Ops** are the lower-level compute units: individual Python functions that read inputs, perform work, and produce outputs. Ops are composed into **Jobs**, which are executable graphs of ops or assets. The [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_fully_featured/README.md) example organizes its assets into three asset groups (`core`, `recommender`, and `activity_analytics`), which shows one way to structure the assets of a larger project.

**Resources** provide pluggable interfaces for external systems (databases, APIs, blob storage). [examples/development_to_production/README.md](https://github.com/dagster-io/dagster/blob/main/examples/development_to_production/README.md) explicitly highlights "swappable resources" as a mechanism for ensuring production data is not overwritten during local development. **IOManagers** are a related abstraction that governs how asset inputs are read and outputs are written; the fully featured project example also references them.
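
The idea behind swappable resources can be sketched without any Dagster imports: pipeline logic depends only on an interface, and the caller decides which backend to inject. Everything named below (`Storage`, `DictStorage`, `publish_report`, and the two backend dicts) is hypothetical and only illustrates the pattern.

```python
from typing import Dict, Protocol


class Storage(Protocol):
    """Interface the pipeline code depends on (hypothetical)."""

    def write(self, key: str, value: str) -> None: ...

    def read(self, key: str) -> str: ...


class DictStorage:
    """Stores values in whichever dict it is handed."""

    def __init__(self, backend: Dict[str, str]) -> None:
        self._backend = backend

    def write(self, key: str, value: str) -> None:
        self._backend[key] = value

    def read(self, key: str) -> str:
        return self._backend[key]


def publish_report(storage: Storage) -> str:
    # Pipeline logic only sees the interface, never a concrete backend.
    storage.write("daily_report", "42 rows")
    return storage.read("daily_report")


# Stand-ins for a production store and a throwaway local one.
production_backend = {"daily_report": "real production data"}
local_backend: Dict[str, str] = {}

# Local development: inject the local resource; production stays untouched.
local_result = publish_report(DictStorage(local_backend))
```

Running the same `publish_report` with `DictStorage(production_backend)` would be the "production" wiring; the function body does not change between environments.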

### Schedules, Sensors, and Partitions

Dagster provides declarative primitives for triggering pipeline runs. **Schedules** run jobs on a cron expression; **Sensors** react to external state changes (for example, the appearance of a new file or the freshness of a downstream dynamic table). The Snowflake Cortex example at [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/main/examples/snowflake_cortex/dagster_snowflake/README.md) describes schedules that run daily at 2 AM EST and weekly on Sundays at 3 AM EST, along with a sensor that monitors dynamic-table freshness.
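
As a rough illustration of what those two schedules mean in practice, the sketch below computes the next tick for "daily at 2 AM" and "Sundays at 3 AM" (in standard cron notation, `0 2 * * *` and `0 3 * * 0`). It assumes a fixed UTC-5 offset for EST and uses a hypothetical helper, `next_tick`; it is not how Dagster evaluates cron expressions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: EST modeled as a fixed UTC-5 offset, ignoring daylight saving.
EST = timezone(timedelta(hours=-5), "EST")


def next_tick(now: datetime, hour: int, weekday: Optional[int] = None) -> datetime:
    """Return the first top-of-the-hour tick strictly after `now`.

    `weekday` follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    if weekday is not None:
        # Walk forward one day at a time until the weekday matches.
        while candidate.weekday() != weekday:
            candidate += timedelta(days=1)
    return candidate


# Wednesday, 3 January 2024, 10:00 EST.
now = datetime(2024, 1, 3, 10, 0, tzinfo=EST)
daily = next_tick(now, hour=2)               # next "0 2 * * *" tick
weekly = next_tick(now, hour=3, weekday=6)   # next "0 3 * * 0" tick
```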

**Partitions** divide an asset into slices (commonly time-based windows). The quickstart ETL partitions the HackerNews fetch by hour, and [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_atproto_dashboard/README.md) uses *dynamic partitions* together with declarative automation and concurrency limits for the ATProto ingestion pipeline. A recurring community request — issue [#17005](https://github.com/dagster-io/dagster/issues/17005), "Partitioned asset checks" (69 comments) — asks for checks to be evaluated per-partition rather than per-asset, which would let users validate each slice independently.
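
A time-based partition scheme boils down to mapping a time window onto a list of keys, one per slice. The sketch below enumerates hourly keys for a half-open window; the helper name `hourly_partition_keys` and the key format are assumptions made for this example, not values taken from the quickstart project.

```python
from datetime import datetime, timedelta
from typing import List

# Assumed key format for this sketch.
KEY_FORMAT = "%Y-%m-%d-%H:%M"


def hourly_partition_keys(start: datetime, end: datetime) -> List[str]:
    """Return one key per complete hour in the half-open window [start, end)."""
    keys = []
    cursor = start
    # Only emit a partition once its full hour fits inside the window.
    while cursor + timedelta(hours=1) <= end:
        keys.append(cursor.strftime(KEY_FORMAT))
        cursor += timedelta(hours=1)
    return keys


keys = hourly_partition_keys(datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 3))
```

Each key identifies one slice that can be materialized, retried, or checked on its own, which is the granularity the per-partition asset check request is asking for.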

## Project Layout and Examples

The repository is organized into several top-level directories. [examples/README.md](https://github.com/dagster-io/dagster/blob/main/examples/README.md) describes the example projects as small demonstrations of how Dagster is used in practice; some are flagged `UNMAINTAINED` in their READMEs, such as [examples/assets_modern_data_stack/README.md](https://github.com/dagster-io/dagster/blob/main/examples/assets_modern_data_stack/README.md) and [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_fully_featured/README.md). Actively maintained, smaller examples include `quickstart_etl`, `assets_dbt_python`, `development_to_production`, and the docs projects under `examples/docs_projects/`.

Documentation tooling lives in `js_modules/dg-docs-components/`, which provides React components for both the `dg docs` standalone site and embedded documentation surfaces inside the Dagster app ([js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/README.md)). The `python_modules/automation/` directory contains Docker and PyPI publishing tooling ([python_modules/automation/README.md](https://github.com/dagster-io/dagster/blob/main/python_modules/automation/README.md)).

```mermaid
flowchart LR
    A[Asset / Op Definition] --> B[Job Graph]
    B --> C{Trigger Source}
    C -->|Manual| D[CLI / UI / Python API]
    C -->|Schedule| E[Cron Expression]
    C -->|Sensor| F[External State Change]
    D --> G[Executor]
    E --> G
    F --> G
    G --> H[IOManager / Resource]
    H --> I[(Storage: DB, S3, etc.)]
    G --> J[Dagit Webserver Observability]
```

## Getting Started and Community

The recommended path for new users is the `quickstart_etl` example. As documented in [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/main/examples/quickstart_etl/README.md), the user scaffolds the project, installs dependencies with `pip install -e ".[dev]"`, and starts the UI via `dagster dev` (or, in newer examples such as [examples/docs_projects/project_ask_ai_dagster/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_ask_ai_dagster/README.md), `dg dev`). The UI is served at `http://localhost:3000`, where the user can materialize assets, view logs, and inspect the lineage graph. The project can also be deployed to Dagster Cloud, the managed offering referenced in [examples/assets_dbt_python/README.md](https://github.com/dagster-io/dagster/blob/main/examples/assets_dbt_python/README.md).

The latest release at the time of writing is **1.13.10** (core) and **0.29.10** (libraries), which includes a bugfix for `DagsterInstance.get_latest_materialization_event` returning stale pre-wipe materializations. Longer-running feature requests from the community include Authentication and RBAC in Dagit ([#2219](https://github.com/dagster-io/dagster/issues/2219), 87 comments), multi-threaded/async executors ([#4041](https://github.com/dagster-io/dagster/issues/4041), 17 comments), OpenTelemetry trace IDs in log lines ([#12353](https://github.com/dagster-io/dagster/issues/12353)), and a [Sqlmesh](https://github.com/dagster-io/dagster/issues/21655) integration modeled on the existing dbt integration. These issues indicate active community interest in stronger observability, parallel execution, and broader third-party tool coverage.

## See Also

- [Dagster Assets and Materialization](dagster-assets-and-materialization.md)
- [Dagster Schedules, Sensors, and Automation](dagster-schedules-sensors-automation.md)
- [Dagster Resources and IOManagers](dagster-resources-iomanagers.md)
- [Dagster Integrations Library](dagster-integrations-library.md)

---

<a id='page-2'></a>

## System Architecture and Deployment

### Related Pages

Related topics: [Dagster Overview and Core Concepts](#page-1), [Integrations and Extensibility](#page-3), [Operations, Observability, and Community Roadmap](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/dagster-io/dagster/blob/master/README.md)
- [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/README.md)
- [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)
- [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_atproto_dashboard/README.md)
- [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)
- [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/master/examples/snowflake_cortex/dagster_snowflake/README.md)
- [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/README.md)
- [examples/docs_projects/project_prompt_eng/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_prompt_eng/README.md)
- [examples/experimental/README.md](https://github.com/dagster-io/dagster/blob/master/examples/experimental/README.md)
- [examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)
- [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/README.md)
- [js_modules/dg-docs-components/package.json](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/package.json)

</details>

# System Architecture and Deployment

Dagster is an open-source data orchestration platform that unifies data pipelines, observability, and asset-centric development into a single framework. As of the 1.13.x release line, Dagster offers a layered deployment topology that ranges from fully local development to managed cloud platforms, giving teams flexibility on where their orchestration code and execution happen. Source: [README.md](https://github.com/dagster-io/dagster/blob/master/README.md)

This page documents the high-level system architecture, the supported deployment topologies, and the runtime components that production Dagster installations rely on.

## High-Level Architecture

A Dagster deployment is composed of three logical tiers: the **user-defined code** containing assets, ops, jobs, schedules, and sensors; the **Dagster core services** that load and execute that code; and the **storage layer** that persists run history, event logs, and asset materializations.

```mermaid
flowchart LR
    A[User Code<br/>assets, ops, jobs, schedules] -->|load| B(Dagster Webserver / Dagit)
    A -->|load| C(Dagster Daemon)
    C -->|schedules/sensors| D[Run Coordinator]
    D -->|launch| E[Run Worker / Executor]
    E -->|read/write| F[(Instance Storage<br/>event log, runs, asset catalog)]
    B -->|query| F
    C -->|query| F
```

The webserver provides the GraphQL API and the React-based UI used to inspect runs and assets. The daemon is a long-running process that ticks schedules and sensors, evaluates declarative automations, and reconciles run state. Both services point at the same `DagsterInstance` storage backend, so they share a consistent view of the world. Source: [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/README.md)

## Deployment Topologies

Dagster supports three primary deployment shapes, each described in the official deployment documentation.

| Topology | Where code runs | Where execution runs | Best fit |
|---|---|---|---|
| **OSS / self-hosted** | Customer-managed code locations | Customer-managed run workers | Teams that need full control over infrastructure |
| **Dagster+ Hybrid** | Customer-managed code locations (Dagster+ hosts the control plane) | Customer-managed run workers | Organizations with data-residency or compliance requirements |
| **Dagster+ Serverless** | Dagster-managed serverless code locations | Dagster-managed serverless run workers | Teams that want fully managed infrastructure with branching CI/CD |

In all three topologies, the conceptual components above (webserver, daemon, run workers, storage) exist; what changes is who operates them and where the user code lives. Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)

### OSS / Self-Hosted Deployment

An OSS deployment is fully under the operator's control. A workspace file (`workspace.yaml`) declares one or more code locations, each pointing at a Python module that defines assets and jobs. A typical local OSS workflow looks like:

```bash
pip install -e ".[dev]"
dagster dev
```

The `dagster dev` command launches both the webserver and the daemon together, pointed at the workspace defined in the current directory. Source: [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)

For component-based projects generated with `dg`, the equivalent command is `dg dev`, which performs the same role for a components-compatible code location. Source: [examples/docs_projects/project_prompt_eng/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_prompt_eng/README.md)

Production OSS deployments typically separate the webserver, daemon, and run workers into different processes or containers, all pointing at a shared instance storage backend (Postgres, MySQL, or a cloud equivalent). The `fully_featured` reference project illustrates this pattern by loading different assets based on the `DAGSTER_DEPLOYMENT` environment variable (`prod`, `staging`, or local). Source: [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/README.md)
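
A minimal sketch of environment-driven configuration, assuming the `DAGSTER_DEPLOYMENT` variable described above: the settings table, its values, and the helper names are hypothetical, and falling back to `local` for unknown values is a design choice of this sketch rather than documented behavior of the reference project.

```python
import os
from typing import Mapping, Optional

# Hypothetical per-deployment settings; names and values are illustrative.
SETTINGS_BY_DEPLOYMENT = {
    "prod": {"database": "ANALYTICS", "io": "warehouse"},
    "staging": {"database": "ANALYTICS_STAGING", "io": "warehouse"},
    "local": {"database": "local.duckdb", "io": "filesystem"},
}


def current_deployment(environ: Optional[Mapping[str, str]] = None) -> str:
    """Read DAGSTER_DEPLOYMENT, treating unset or unknown values as local."""
    environ = os.environ if environ is None else environ
    name = environ.get("DAGSTER_DEPLOYMENT", "local")
    return name if name in SETTINGS_BY_DEPLOYMENT else "local"


def settings_for(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Return the settings block for the active deployment."""
    return SETTINGS_BY_DEPLOYMENT[current_deployment(environ)]


prod_settings = settings_for({"DAGSTER_DEPLOYMENT": "prod"})
default_settings = settings_for({})
```

Passing the environment as an argument keeps the lookup testable without mutating `os.environ`.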

### Dagster+ Hybrid

In the Hybrid deployment model, user code stays inside the customer's environment, while Dagster+ provides the control plane: the webserver, the GraphQL API, and the coordination of schedules and sensors. The user maintains the code locations and the run workers that materialize assets, which keeps data resident inside their own network. The `multi_tenant` example project demonstrates this shape with three independent code locations (`harbor_outfitters`, `summit_financial`, `beacon_hq`) sharing a single workspace configuration. Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)

The project also demonstrates per-location runtime isolation by bundling optional marker packages (`catalog_coach_runtime`, `risk_reviewer_runtime`, `briefing_writer_runtime`) under `vendor/` that can be installed into isolated per-location environments. Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)

### Dagster+ Serverless

Serverless deployments push the code location into a fully managed build and deploy pipeline. Each push to a Git branch results in a new serverless deployment, and Dagster+ handles scheduling, execution, and storage. The `atproto_dashboard` reference project is a good example: it combines an ingestion pipeline, a dbt project, and a BI layer, and is designed to be deployed end-to-end on a managed runtime. Source: [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_atproto_dashboard/README.md)

## Core Runtime Components

### Code Locations and Workspaces

A code location is a Python package that exports a `Definitions` object. Multiple code locations are grouped by a `workspace.yaml` file at the deployment root. In a Dagster+ Hybrid context, code location entries are mirrored by a `dagster_cloud.yaml` file that adds deployment-specific metadata. Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)

### Schedules, Sensors, and Declarative Automation

Long-running orchestration requires a process that can fire schedules and evaluate sensors even when no user is logged in. This is the role of the `dagster-daemon` service. Daemon-managed automation includes cron-style schedules, polling sensors, and the newer declarative automation policies attached directly to assets. Source: [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/master/examples/snowflake_cortex/dagster_snowflake/README.md)
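
The polling pattern a sensor follows can be sketched as a pure function that is called on every tick with the previous cursor and returns the new work plus an updated cursor. The function name, the file names, and the assumption that file names sort in arrival order are all hypothetical; the loop stands in for the daemon and is not Dagster's sensor API.

```python
from typing import Iterable, List, Optional, Tuple


def evaluate_file_sensor(
    seen_files: Iterable[str], cursor: Optional[str]
) -> Tuple[List[str], Optional[str]]:
    """One tick: return run keys for files newer than the cursor.

    Assumes file names sort lexicographically in arrival order.
    """
    new_files = sorted(f for f in seen_files if cursor is None or f > cursor)
    next_cursor = new_files[-1] if new_files else cursor
    return new_files, next_cursor


# Simulate three ticks of a long-running process polling a directory.
listings = [
    ["2024-01-01.csv"],
    ["2024-01-01.csv", "2024-01-02.csv"],
    ["2024-01-01.csv", "2024-01-02.csv"],  # nothing new on this tick
]
cursor: Optional[str] = None
launched: List[str] = []
for listing in listings:
    run_keys, cursor = evaluate_file_sensor(listing, cursor)
    launched.extend(run_keys)
```

Persisting the cursor between ticks is what keeps the third tick from launching duplicate runs for files that were already processed.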

### Execution and Compute Logs

Each run is launched by a run coordinator and executed by an executor inside a worker. The default executor runs steps in subprocesses; community discussion has highlighted that multi-threaded / async execution would require evolving the `ComputeLogManager` to operate at process scope rather than step scope. Source: [community context — #4041](https://github.com/dagster-io/dagster/issues/4041)

### Instance Storage

Every Dagster deployment needs a persistent storage backend for run history, event logs, and asset materialization records. The 1.13.10 release, for example, fixed a bug where `DagsterInstance.get_latest_materialization_event` could return a stale pre-wipe materialization, which illustrates that the storage layer is the system of record for asset state. Source: [community context: release notes 1.13.10]
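
The expected behavior after a wipe can be expressed as a toy model: in an append-only log, a wipe should hide every earlier materialization of the same asset. The `Event` type and `latest_materialization` function are hypothetical and say nothing about how `DagsterInstance` actually stores or wipes records; they only pin down the invariant the bugfix restores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    kind: str        # "materialization" or "wipe"
    asset_key: str
    run_id: str = ""


def latest_materialization(log: List[Event], asset_key: str) -> Optional[Event]:
    """Scan newest-first; a wipe hides every earlier materialization."""
    for event in reversed(log):
        if event.asset_key != asset_key:
            continue
        if event.kind == "wipe":
            return None
        if event.kind == "materialization":
            return event
    return None


log = [
    Event("materialization", "orders", run_id="run-1"),
    Event("wipe", "orders"),
]
after_wipe = latest_materialization(log, "orders")

log.append(Event("materialization", "orders", run_id="run-2"))
after_rerun = latest_materialization(log, "orders")
```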

## Local Development Patterns

### Quickstart ETL

The `quickstart_etl` example walks through the canonical local development loop:

1. Install the project with `pip install -e ".[dev]"`.
2. Run `dagster dev` to start both the webserver and the daemon.
3. Open `http://localhost:3000` to inspect the three demo assets (`hackernews_topstory_ids`, `hackernews_topstories`, `hackernews_topstories_word_cloud`).
4. Click **Reload definition** after editing asset code so Dagster picks up the latest changes. Source: [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)

### Components-Based Projects

Newer projects generated with `dg` use a components-compatible layout where definitions are assembled from YAML component configuration rather than purely Python code. The migration guide tests in `docs_snippets` validate that existing projects can be made components-compatible without breaking existing definitions. Source: [examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)

## Observability and Forward-Looking Concerns

Dagster's webserver visualizes each run's execution as a trace-like structure. The community has requested that this structure be exported as OpenTelemetry spans, with trace and span identifiers attached to log lines, so that the execution shown in the UI can also be piped into external observability platforms. Source: [community context: #12353](https://github.com/dagster-io/dagster/issues/12353)
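
To show what "trace/span identifiers attached to log lines" could look like, here is a standard-library sketch that stamps every log record with a trace ID and span ID through a `logging.Filter`. It is an illustration of the requested behavior, not an existing Dagster feature; the class and function names are hypothetical, and the identifiers are random hex strings sized like W3C Trace Context IDs rather than values from a real tracer.

```python
import io
import logging
import secrets


class TraceContextFilter(logging.Filter):
    """Attach fixed trace/span identifiers to every record passing through."""

    def __init__(self, trace_id: str, span_id: str) -> None:
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True  # never drop the record, only annotate it


def build_logger(stream, trace_id: str, span_id: str) -> logging.Logger:
    """Create a logger whose lines carry the given identifiers."""
    logger = logging.getLogger(f"run.{trace_id}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
        )
    )
    handler.addFilter(TraceContextFilter(trace_id, span_id))
    logger.addHandler(handler)
    return logger


# 16-byte trace ID and 8-byte span ID, hex-encoded.
trace_id = secrets.token_hex(16)
span_id = secrets.token_hex(8)
buffer = io.StringIO()
build_logger(buffer, trace_id, span_id).info("step started")
log_line = buffer.getvalue()
```

With identifiers on every line, an external platform can join log output to the span that produced it.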

For multi-user deployments, authentication and role-based access control for Dagit is a long-standing feature request tracked in the issue tracker. Source: [community context — #2219](https://github.com/dagster-io/dagster/issues/2219)

## Common Failure Modes and Caveats

- **Definition drift.** Code edits are only visible after a reload; in production this means restarting or updating the code location. Source: [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)
- **Experimental APIs.** Some examples under `examples/experimental` use private interfaces that can change without notice. Source: [examples/experimental/README.md](https://github.com/dagster-io/dagster/blob/master/examples/experimental/README.md)
- **Package naming collisions.** Bundled example projects sometimes have naming conflicts with installed packages; documentation in each example notes any required install tweaks. Source: [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/master/examples/snowflake_cortex/dagster_snowflake/README.md)
- **Stale materialization reads.** The 1.13.10 release fixed a bug where wiping an asset's partitions could leave the wiped materialization reported as the asset's latest; upgrade to at least that release if you rely on `get_latest_materialization_event`. Source: [community context — release notes 1.13.10]

## See Also

- [Dagster README](https://github.com/dagster-io/dagster/blob/master/README.md)
- [Quickstart ETL walkthrough](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)
- [Multi-tenant reference project](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)
- [ATProto dashboard serverless reference](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_atproto_dashboard/README.md)
- [Components and `dg` dev workflow](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_prompt_eng/README.md)
- [Docs snippet testing infrastructure](https://github.com/dagster-io/dagster/blob/master/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)

---

<a id='page-3'></a>

## Integrations and Extensibility

### Related Pages

Related topics: [Dagster Overview and Core Concepts](#page-1), [System Architecture and Deployment](#page-2), [Operations, Observability, and Community Roadmap](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/dagster-io/dagster/blob/main/README.md)
- [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/main/python_modules/dagster/README.md)
- [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/main/examples/quickstart_etl/README.md)
- [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_atproto_dashboard/README.md)
- [examples/docs_projects/project_prompt_eng/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_prompt_eng/README.md)
- [examples/docs_projects/project_ask_ai_dagster/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_ask_ai_dagster/README.md)
- [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/main/examples/snowflake_cortex/dagster_snowflake/README.md)
- [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_multi_tenant/README.md)
- [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_fully_featured/README.md)
- [examples/experimental/README.md](https://github.com/dagster-io/dagster/blob/main/examples/experimental/README.md)
- [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/README.md)
- [js_modules/dg-docs-components/package.json](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/package.json)
- [js_modules/dg-docs-components/src/ComponentHeader.tsx](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/src/ComponentHeader.tsx)
- [js_modules/ui-core/package.json](https://github.com/dagster-io/dagster/blob/main/js_modules/ui-core/package.json)
- [examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_migrating_definitions/my-existing-project/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_migrating_definitions/my-existing-project/README.md)
- [examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_adding_components_to_existing_project/my-existing-project/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_adding_components_to_existing_project/my-existing-project/README.md)
- [examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)

</details>

# Integrations and Extensibility

Dagster is designed as a "data orchestrator" that connects to a wide ecosystem of external systems — warehouses, transformation tools, ML platforms, LLMs, and bespoke user code — while remaining extensible through a component model and an external-pipeline protocol. The top-level `README.md` positions Dagster as a unified orchestrator for both scheduled batch jobs and event-driven workflows, and most of the in-repo example projects exist specifically to demonstrate that breadth of integration [Source: [README.md](https://github.com/dagster-io/dagster/blob/main/README.md)].

## Library and Example Integrations

Dagster ships a wide catalog of first-party integrations, each maintained as its own library and demonstrated by an example project in `examples/`. The `examples/` directory is the canonical surface for showing how a particular integration is intended to be consumed:

- **dbt** — demonstrated in the production-style `project_fully_featured` example, which combines Python assets with dbt models, S3, and Snowflake [Source: [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_fully_featured/README.md)].
- **Snowflake / Snowflake Cortex AI** — `examples/snowflake_cortex/dagster_snowflake/README.md` illustrates multiple phases: structured entity extraction with `SNOWFLAKE.CORTEX.COMPLETE`, `AI_AGG` aggregations, dynamic tables as external assets, and environment-driven authentication (password vs. PEM/DER private key) [Source: [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/main/examples/snowflake_cortex/dagster_snowflake/README.md)].
- **Anthropic** — `project_prompt_eng` uses Anthropic prompt engineering to look up alternative fuel stations, with `uv sync` + `dg dev` as the standard bootstrap path [Source: [examples/docs_projects/project_prompt_eng/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_prompt_eng/README.md)].
- **OpenAI + Pinecone** — `project_ask_ai_dagster` implements a RAG support bot, with the asset lineage rendered in the README showing how ingestion, embedding, and retrieval are stitched together [Source: [examples/docs_projects/project_ask_ai_dagster/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_projects/project_ask_ai_dagster/README.md)].
- **HackerNews + Pandas** — `quickstart_etl` is the smallest end-to-end reference, fetching from the HackerNews API, transforming with Pandas, and producing a word-cloud asset; it is also the recommended scaffold for Dagster Cloud Serverless onboarding [Source: [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/main/examples/quickstart_etl/README.md)].

The `examples/experimental/` directory is reserved for integrations that depend on undocumented or private APIs, with an explicit warning that those APIs "are liable to change at any time without warning" [Source: [examples/experimental/README.md](https://github.com/dagster-io/dagster/blob/main/examples/experimental/README.md)].

## Components: The Extensibility Surface

For users who want to extend Dagster itself — or wrap their own services as reusable building blocks — the project exposes a **components model** plus a dedicated React component library for documentation surfaces.

The components layer is exercised by two test fixtures under `examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/`:

- `test_adding_components_to_existing_project/` — demonstrates how an existing code location can be made "components-compatible" by adopting the new scaffolding.
- `test_migrating_definitions/` — covers the migration path for definitions that pre-date the components model.

Both fixtures use a shared `my-existing-project/README.md` sample to drive the doc-snippet tests [Source: [examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_adding_components_to_existing_project/my-existing-project/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_adding_components_to_existing_project/my-existing-project/README.md), [examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_migrating_definitions/my-existing-project/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/guides/dg/test_migrating_definitions/my-existing-project/README.md)].

On the web/UI side, `@dagster-io/dg-docs-components` is a dedicated package for rendering documentation both inside the Dagster app and as a standalone site. Its package manifest declares a tight dependency footprint: `react-markdown`, `remark-gfm`, `strip-markdown`, `highlight.js`, and `clsx`, with peer dependencies on React 18 [Source: [js_modules/dg-docs-components/package.json](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/package.json)]. The `ComponentHeader.tsx` source file shows how individual components render Markdown descriptions, support a `truncated` vs. `full` description style, and emit Blueprint-style headings — confirming the docs surface is itself built from composable, user-extendable UI primitives [Source: [js_modules/dg-docs-components/src/ComponentHeader.tsx](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/src/ComponentHeader.tsx), [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/README.md)].

The `ui-core` package pulls `@dagster-io/dg-docs-components` in as a workspace dependency, which means user-defined docs components are first-class citizens of the main Dagster UI build [Source: [js_modules/ui-core/package.json](https://github.com/dagster-io/dagster/blob/main/js_modules/ui-core/package.json)].

## Multi-Tenancy and Cross-Cutting Concerns

Several community-tracked gaps shape the practical limits of the current extensibility story:

| Area | Community Issue | Implication for Integrations |
|------|-----------------|------------------------------|
| Auth & RBAC in Dagit | #2219 (87 comments) | Per-tenant isolation in deployments like `project_multi_tenant` currently relies on per-code-location auth, not Dagit-level RBAC [Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_multi_tenant/README.md)] |
| Multi-threaded / async executor | #4041 (17 comments) | Limits how aggressively an integration can fan out work inside a single run |
| OpenTelemetry traces in log lines | #12353 (6 comments) | Limits observability when integrating with OTel-based systems |
| Sqlmesh integration | #21655 (51 comments) | A frequently-requested sibling to the existing dbt integration |

The `project_multi_tenant` example illustrates one pattern for working within those limits: separate code locations (`beacon_hq`, `harbor_outfitters`, `summit_financial`) each pin their own LLM model and runtime marker package, and an embedded backend can be swapped for Ollama via the `LLM_BACKEND` environment variable [Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/main/examples/project_multi_tenant/README.md)].
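The backend switch described above can be sketched in plain Python. This is a minimal illustration, not the example project's code: only the `LLM_BACKEND` variable name comes from the README, while the class names, the `complete` method, and the accepted values `"embedded"` and `"ollama"` are assumptions made for the sketch.

```python
import os
from typing import Dict, Mapping, Protocol, Type


class LLMBackend(Protocol):
    """Minimal interface shared by both hypothetical backends."""

    name: str

    def complete(self, prompt: str) -> str: ...


class EmbeddedBackend:
    """Hypothetical in-process backend that needs no external service."""

    name = "embedded"

    def complete(self, prompt: str) -> str:
        return f"[embedded] {prompt}"


class OllamaBackend:
    """Hypothetical stand-in; a real one would call an Ollama server."""

    name = "ollama"

    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"


# Assumed mapping from LLM_BACKEND values to backend classes.
_BACKENDS: Dict[str, Type] = {
    "embedded": EmbeddedBackend,
    "ollama": OllamaBackend,
}


def select_backend(env: Mapping[str, str] = os.environ) -> LLMBackend:
    """Pick a backend from LLM_BACKEND, defaulting to the embedded one."""
    key = env.get("LLM_BACKEND", "embedded").strip().lower()
    if key not in _BACKENDS:
        raise ValueError(f"Unknown LLM_BACKEND value: {key!r}")
    return _BACKENDS[key]()
```

Passing the environment as a mapping keeps the selection logic testable without mutating `os.environ`.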

## Doc-Snippet Testing Pipeline

Because integration examples double as documentation, the repo includes a dedicated snapshot-test pipeline at `examples/docs_snippets/docs_snippets_tests/snippet_checks/`. It validates that the contents of files in `./docs_snippets/` match the expected output and can regenerate them via `tox -e docs_snapshot_update`. A `DAGSTER_GIT_REPO_DIR` environment variable and a `make dev_install` in the repo root are prerequisites [Source: [examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)]. This pipeline is what keeps the integration guides in lockstep with CLI and library output across releases — the latest being **1.13.10 (core) / 0.29.10 (libraries)**, which fixed stale materialization bugs in `DagsterInstance.get_latest_materialization_event` and `get_asset_records`.
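As a rough illustration of the prerequisites above, the following sketch checks the environment before the snapshot tests are run. The function name and messages are hypothetical and not part of the repository; it only verifies that `DAGSTER_GIT_REPO_DIR` is set and points to a directory, and it cannot tell whether `make dev_install` has been run.

```python
import os
from pathlib import Path
from typing import List, Mapping


def snapshot_prerequisite_problems(
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems: List[str] = []
    repo_dir = env.get("DAGSTER_GIT_REPO_DIR")
    if not repo_dir:
        problems.append("DAGSTER_GIT_REPO_DIR is not set")
    elif not Path(repo_dir).is_dir():
        problems.append(
            f"DAGSTER_GIT_REPO_DIR does not point to a directory: {repo_dir}"
        )
    return problems


if __name__ == "__main__":
    for problem in snapshot_prerequisite_problems():
        print(f"prerequisite missing: {problem}")
```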

## See Also

- [Dagster README](https://github.com/dagster-io/dagster/blob/main/README.md)
- [python_modules/dagster README](https://github.com/dagster-io/dagster/blob/main/python_modules/dagster/README.md)
- [Examples directory overview](https://github.com/dagster-io/dagster/tree/main/examples)
- [dg-docs-components package](https://github.com/dagster-io/dagster/blob/main/js_modules/dg-docs-components/README.md)
- [Doc snippet testing guide](https://github.com/dagster-io/dagster/blob/main/examples/docs_snippets/docs_snippets_tests/snippet_checks/README.md)

---

<a id='page-4'></a>

## Operations, Observability, and Community Roadmap

### Related Pages

Related topics: [Dagster Overview and Core Concepts](#page-1), [System Architecture and Deployment](#page-2), [Integrations and Extensibility](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/dagster-io/dagster/blob/master/README.md)
- [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/README.md)
- [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)
- [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/README.md)
- [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)
- [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_atproto_dashboard/README.md)
- [examples/feature_graph_backed_assets/README.md](https://github.com/dagster-io/dagster/blob/master/examples/feature_graph_backed_assets/README.md)
- [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/README.md)
- [examples/docs_projects/project_ask_ai_dagster/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_ask_ai_dagster/README.md)
- [examples/docs_projects/project_prompt_eng/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_prompt_eng/README.md)
- [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/master/examples/snowflake_cortex/dagster_snowflake/README.md)
</details>

## Overview

Dagster is a data orchestration platform that emphasizes software-engineering best practices for data pipelines. The project ships with a core Python package, a web-based UI (Dagit/webserver), and a growing catalog of integrations, examples, and reusable components. Operations, observability, and a community-driven roadmap form three intertwined pillars: users need to *run* their pipelines reliably, *understand* what is happening when they fail or slow down, and *influence* the direction of upstream features.

This page synthesizes the public documentation, example projects, and the most-engaged community issues into a single reference for operators and contributors.

Source: [README.md](https://github.com/dagster-io/dagster/blob/master/README.md), [python_modules/dagster/README.md](https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/README.md)

## Operations: Running and Managing Pipelines

Dagster pipelines are organized around **assets**, **jobs**, **schedules**, and **sensors**, and the project provides multiple deployment patterns depending on scale.

### Local and CI-style development

The quickest way to operate Dagster is the `dagster dev` (or `dg dev`) command, which launches a local webserver and reloads definitions on code change. The quickstart ETL project demonstrates the standard flow: install with `pip install -e ".[dev]"`, run `dagster dev`, then navigate to `http://localhost:3000` to see the asset lineage and toggle schedules.

Source: [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md)

The quickstart project defines a daily schedule that can be turned on with a switch in the UI; the README notes that "congratulations, you now have a daily job running in production" once the schedule is enabled. This pattern — define once in code, control at runtime — is the central operational primitive for many users.

### Multi-environment and multi-tenant deployments

Larger workloads are demonstrated by the `project_fully_featured` example, which loads a single codebase into three deployments (production, staging, local) by switching the `DAGSTER_DEPLOYMENT` environment variable. Production and staging point at S3 and Snowflake under different prefixes; local falls back to the filesystem and DuckDB.

Source: [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/README.md)
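The deployment switch can be pictured with a small stdlib-only sketch. The `DAGSTER_DEPLOYMENT` variable and the storage and warehouse pairings come from the description above; the accepted values (`"production"`, `"staging"`, `"local"`), the prefixes, the `local` default, and the `DeploymentConfig` type are assumptions for illustration and may differ from the example project.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical summary of where a deployment reads and writes data."""

    storage: str
    warehouse: str
    prefix: str


# Assumed deployment names and prefixes; only the pairings follow the text.
_CONFIGS: Dict[str, DeploymentConfig] = {
    "production": DeploymentConfig("s3", "snowflake", "prod"),
    "staging": DeploymentConfig("s3", "snowflake", "staging"),
    "local": DeploymentConfig("filesystem", "duckdb", ""),
}


def config_for_deployment(
    env: Mapping[str, str] = os.environ,
) -> DeploymentConfig:
    """Resolve the active deployment, falling back to local development."""
    name = env.get("DAGSTER_DEPLOYMENT", "local")
    if name not in _CONFIGS:
        raise ValueError(f"Unknown DAGSTER_DEPLOYMENT value: {name!r}")
    return _CONFIGS[name]
```

Keeping every deployment's settings in one table means the asset code stays identical across environments and only the resolved configuration changes.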

For organizations running isolated business units, `project_multi_tenant` shows a workspace with multiple code locations (`harbor_outfitters`, `summit_financial`, `beacon_hq`) sharing a common LLM resource, with per-location runtime marker packages and a `workspace.yaml` that mirrors the Dagster+ multi-location layout. The example ships with an embedded LLM backend so it runs without Docker.

Source: [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md)

### Graph-backed and composable assets

When a single Python function is not enough, operators compose **ops** into **graphs** and wrap the result as a **graph-backed asset**. The `feature_graph_backed_assets` example parses airline passenger data by separating each transformation into an op and chaining them, exposing a clean asset boundary while still benefiting from step-level observability.

Source: [examples/feature_graph_backed_assets/README.md](https://github.com/dagster-io/dagster/blob/master/examples/feature_graph_backed_assets/README.md)

## Observability: Logs, Health, and Lineage

This section covers three observability surfaces: asset lineage visualized in the UI, reusable documentation components, and integration-level signals such as cost and freshness.

### Lineage and asset views

Example projects in the catalog typically ship with a lineage diagram (PNG or SVG) showing the dependency graph between assets. For instance, `project_atproto_dashboard` shows an end-to-end flow from ATProto ingestion through dbt models to a Power BI dashboard, and explicitly enumerates the Dagster features used (dynamic partitions, declarative automation, concurrency limits).

Source: [examples/docs_projects/project_atproto_dashboard/README.md](https://github.com/dagster-io/dagster/blob/master/examples/docs_projects/project_atproto_dashboard/README.md)

### Reusable docs components

The `dg-docs-components` package is a small React/TypeScript component library that renders component metadata (name, tags, description) in both the standalone `dg docs` site and the in-app documentation panel. The header component (`ComponentHeader.tsx`) supports truncated or full description modes and renders an icon for the component type, illustrating how the same building blocks power both authoring and runtime observability surfaces.

Source: [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/README.md), [js_modules/dg-docs-components/src/ComponentHeader.tsx](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/src/ComponentHeader.tsx)

### Integration-level observability

Integration examples, such as the Snowflake Cortex project, expose query-history cost tracking and freshness sensors for dynamic tables, giving operators signals that go beyond simple success/failure. The Snowflake project also includes password and private-key authentication patterns, illustrating that operational concerns (auth, cost, freshness) are first-class design considerations.

Source: [examples/snowflake_cortex/dagster_snowflake/README.md](https://github.com/dagster-io/dagster/blob/master/examples/snowflake_cortex/dagster_snowflake/README.md)

The diagram below summarizes how the operational and observability layers interact with community-driven features.

```mermaid
flowchart LR
    A[Author: code-defined assets, jobs, schedules] --> B[Dagit / webserver UI]
    B --> C[Run history and lineage]
    B --> D[Asset health & freshness]
    A --> E[Compute logs]
    E --> F[External telemetry<br/>e.g. OpenTelemetry]
    B --> G[Alerts and sensors]
    G --> H[PagerDuty / Slack / email]
    subgraph Community Roadmap
        R1[Auth & RBAC]
        R2[Partitioned asset checks]
        R3[Async / multi-threaded executor]
        R4[OpenTelemetry traces]
    end
    R1 -. gap .-> B
    R2 -. gap .-> D
    R3 -. gap .-> A
    R4 -. gap .-> E
```
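The gap between compute logs and OpenTelemetry in the diagram above can be approximated in user code by stamping a trace identifier onto every log record. The sketch below uses only the standard `logging` module; it is not a Dagster or OpenTelemetry API, and the `TraceIdFilter` and `build_traced_logger` names are hypothetical. The sample identifier is simply a 32-character hexadecimal string in the shape OpenTelemetry uses for trace IDs.

```python
import io
import logging
from typing import TextIO


class TraceIdFilter(logging.Filter):
    """Attach a fixed trace ID to every record that passes through."""

    def __init__(self, trace_id: str) -> None:
        super().__init__()
        self.trace_id = trace_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self.trace_id
        return True  # never drop records, only annotate them


def build_traced_logger(name: str, trace_id: str, stream: TextIO) -> logging.Logger:
    """Create a logger whose lines carry trace_id=<id> for correlation."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter(trace_id))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_traced_logger("demo_run", "0af7651916cd43dd8448eb211c80319c", buffer)
log.info("step started")
print(buffer.getvalue(), end="")
# prints: INFO trace_id=0af7651916cd43dd8448eb211c80319c step started
```

A log aggregator can then join these lines with spans from other systems on the shared `trace_id` field.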

## Community Roadmap

The most-engaged open issues in the Dagster repository cluster around three themes: operations, observability, and ecosystem integrations.

| Issue | Theme | Why it matters |
|---|---|---|
| #2219 — Auth & RBAC in Dagit | Operations | Operators need authentication and role-based access for production deployments; the issue has 87 comments and is still tracked as future work. |
| #17005 — Partitioned asset checks | Observability | Users want per-partition asset checks, not just per-asset, to detect bad slices in large tables. |
| #4041 — Multi-threaded / async executor | Operations | Running multiple ops in the same process unlocks throughput for I/O-bound workloads; the prerequisite is a process-scope `ComputeLogManager`. |
| #12353 — OpenTelemetry trace IDs in logs | Observability | Forwarding Dagster's natural trace structure to OTel tooling would unify debugging across systems. |
| #21655 — sqlmesh integration | Ecosystem | A first-class adapter (mirroring the dbt integration) is requested to register sqlmesh models as Dagster assets. |

Source: community context from repository issues.
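To see why per-partition checks (#17005) matter, consider a null-rate check over a partitioned table. The sketch below is plain Python rather than Dagster's asset-check API, and the function names, sample data, and threshold are all assumptions; it shows a check that passes for the asset as a whole while one partition is clearly bad.

```python
from typing import Dict, List, Optional

Partition = List[Optional[float]]


def null_rate(values: Partition) -> float:
    """Fraction of missing values; an empty partition counts as clean."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def check_whole_asset(partitions: Dict[str, Partition], max_null_rate: float) -> bool:
    """One verdict for the asset: pool all partitions, then check."""
    pooled = [v for values in partitions.values() for v in values]
    return null_rate(pooled) <= max_null_rate


def check_each_partition(
    partitions: Dict[str, Partition], max_null_rate: float
) -> Dict[str, bool]:
    """One verdict per partition key."""
    return {
        key: null_rate(values) <= max_null_rate
        for key, values in partitions.items()
    }


# Hypothetical daily partitions: the last day is small but mostly null.
daily = {
    "2024-01-01": [1.0] * 10,
    "2024-01-02": [2.0] * 10,
    "2024-01-03": [None, None, None, 3.0],
}

print(check_whole_asset(daily, 0.25))     # 3 nulls out of 24 rows
print(check_each_partition(daily, 0.25))  # flags only the bad day
```

Pooled, the table has 3 nulls in 24 rows (12.5%) and passes a 25% threshold, while the `2024-01-03` partition alone is 75% null and fails.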

The latest stable release (1.13.10 core / 0.29.10 libraries) shipped a fix for stale materialization events in `DagsterInstance.get_latest_materialization_event`, which is directly relevant to observability: operators querying "what is the latest run for this asset?" now get post-wipe data instead of ghost records.

Source: community release notes in repository context.

## See Also

- [examples/quickstart_etl/README.md](https://github.com/dagster-io/dagster/blob/master/examples/quickstart_etl/README.md) — minimal end-to-end pipeline.
- [examples/project_fully_featured/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/README.md) — multi-environment reference.
- [examples/project_multi_tenant/README.md](https://github.com/dagster-io/dagster/blob/master/examples/project_multi_tenant/README.md) — workspace and code locations.
- [js_modules/dg-docs-components/README.md](https://github.com/dagster-io/dagster/blob/master/js_modules/dg-docs-components/README.md) — shared UI components for docs and in-app help.

---


## Pitfall Log

Project: dagster-io/dagster

Summary: Found 13 structured pitfall items, including 1 high-severity (blocking) item. Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/24989

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/29674

## 3. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/33946

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/33945

## 5. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/dagster-io/dagster

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/33944

## 7. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/29693

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/dagster-io/dagster/issues/33943

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/dagster-io/dagster

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/dagster-io/dagster

## 11. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/dagster-io/dagster

## 12. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/dagster-io/dagster

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/dagster-io/dagster

<!-- canonical_name: dagster-io/dagster; human_manual_source: deepwiki_human_wiki -->
