# https://github.com/Netflix/metaflow Project Manual

Generated at: 2026-06-19 19:55:48 UTC

## Table of Contents

- [Framework Overview and Core Concepts](#page-1)
- [Datastore, Metadata, and Storage Backends](#page-2)
- [Production Orchestration, Schedulers, and Cloud Execution](#page-3)
- [Secrets, Cards, Extensions, and Ecosystem Integrations](#page-4)

<a id='page-1'></a>

## Framework Overview and Core Concepts

### Related Pages

Related topics: [Datastore, Metadata, and Storage Backends](#page-2), [Production Orchestration, Schedulers, and Cloud Execution](#page-3), [Secrets, Cards, Extensions, and Ecosystem Integrations](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)
- [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)
- [metaflow/plugins/cards/ui/src/types.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/types.ts)
- [metaflow/plugins/cards/ui/src/utils.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/utils.ts)
- [metaflow/plugins/cards/ui/src/store.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/store.ts)
- [metaflow/plugins/cards/ui/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md)
- [metaflow/plugins/cards/ui/package.json](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/package.json)
- [metaflow/tutorials/00-helloworld/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/00-helloworld/README.md)
- [metaflow/tutorials/04-playlist-plus/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/04-playlist-plus/README.md)
- [metaflow/tutorials/07-worldview/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/07-worldview/README.md)
- [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
- [R/README.md](https://github.com/Netflix/metaflow/blob/main/R/README.md)
- [R/inst/tutorials/README.md](https://github.com/Netflix/metaflow/blob/main/R/inst/tutorials/README.md)
- [stubs/README.md](https://github.com/Netflix/metaflow/blob/main/stubs/README.md)
</details>

# Framework Overview and Core Concepts

## Purpose and Scope

Metaflow is a human-centric framework for building and managing real-life AI and ML systems, originally developed at Netflix and now supported by Outerbounds. It unifies code, data, and compute across the full development lifecycle, from rapid prototyping in notebooks to production deployment on workflow orchestrators such as Argo Workflows and AWS Step Functions (Source: [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)).

The framework is delivered as both a Python package and an R package, with R bindings exposing the Python library as a backend so that flows can be authored entirely in R (Source: [R/README.md](https://github.com/Netflix/metaflow/blob/main/R/README.md), [R/inst/tutorials/README.md](https://github.com/Netflix/metaflow/blob/main/R/inst/tutorials/README.md)). Type stubs are also published separately to `metaflow-stubs` on PyPI to provide IDE and language-server hints (Source: [stubs/README.md](https://github.com/Netflix/metaflow/blob/main/stubs/README.md)).

The repository organizes its surface area into several top-level directories:

| Directory | Purpose |
|---|---|
| `metaflow/` | Core Python library: flowspec, parameters, decorators, datastores, client API |
| `metaflow/plugins/` | Pluggable subsystems (cards UI, environments, orchestrators, metadata providers) |
| `metaflow/tutorials/` | Episode-based Python tutorials (00-helloworld through 07-worldview) |
| `R/` | R language bindings and R tutorials |
| `devtools/` | Local Kubernetes dev stack (Minikube + Tilt) |
| `stubs/` | PyPI-stub package for editor type hints |

## Core Architecture

A Metaflow flow is a Python class whose step methods form a directed acyclic graph (DAG). Each step method is marked with the `@step` decorator and can carry further decorators that configure how it runs. A task is one execution of a step; it produces persisted artifacts and metadata events (Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)).

The client API in `metaflow/client/core.py` exposes a hierarchical object model:

- `Flow`: the root entity; owns the runs of one flow
- `Run`: one execution of a `Flow`; owns steps, metadata, and tags
- `Step`: owns the tasks executed for that step
- `Task`: exposes `data` (a `MetaflowData`), `artifacts` (a `MetaflowArtifacts`), `code` (a `MetaflowCode`), `metadata`, `stdout`, `stderr`, `finished_at`, and `runtime_name` (Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py))
- `DataArtifact`: a named, typed payload persisted to the datastore

The `Task` class documents that `MetaflowData` downloads all artifacts on access (slower), while `MetaflowArtifacts` returns a container of individual `DataArtifact` objects for selective retrieval. The class registry `_CLASSES` is populated at the bottom of the module, after every entity class has been defined, so that classes can refer to their parent and child types by name (Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)).
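The trade-off between the two containers can be illustrated with a small stand-alone model. This is a sketch, not Metaflow's implementation: `ArtifactHandle`, `ArtifactContainer`, `load_everything`, and `make_loader` are hypothetical names, and the loaders are plain callables standing in for datastore reads.

```python
from typing import Callable, Dict


class ArtifactHandle:
    """Hypothetical stand-in for one artifact; fetches its payload on demand."""

    def __init__(self, name: str, loader: Callable[[], object]) -> None:
        self.name = name
        self._loader = loader

    @property
    def data(self) -> object:
        # Every access calls the loader; a real client would cache the result.
        return self._loader()


class ArtifactContainer:
    """Selective access: attribute lookup returns a handle, not the payload."""

    def __init__(self, handles: Dict[str, ArtifactHandle]) -> None:
        self._handles = dict(handles)

    def __getattr__(self, name: str) -> ArtifactHandle:
        # Only called when normal attribute lookup fails.
        handles = self.__dict__.get("_handles", {})
        if name in handles:
            return handles[name]
        raise AttributeError(name)


def load_everything(handles: Dict[str, ArtifactHandle]) -> Dict[str, object]:
    """Eager access: fetch every payload up front, which is the slower path."""
    return {name: handle.data for name, handle in handles.items()}


fetch_log = []  # records which artifacts were actually fetched


def make_loader(name: str, value: object) -> Callable[[], object]:
    def load() -> object:
        fetch_log.append(name)
        return value
    return load


handles = {
    "model": ArtifactHandle("model", make_loader("model", "weights")),
    "dataset": ArtifactHandle("dataset", make_loader("dataset", [1, 2, 3])),
}
container = ArtifactContainer(handles)
selected = container.model.data  # fetches only the "model" payload
```

After the selective access, `fetch_log` contains a single entry; calling `load_everything(handles)` appends one entry per artifact, which is the cost the docstring warns about.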

```mermaid
flowchart LR
  Flow --> Run
  Run --> Step
  Step --> Task
  Task --> Artifact1[DataArtifact]
  Task --> Artifact2[DataArtifact]
  Task --> Code[MetaflowCode]
  Task --> Meta[Metadata Events]
  Task --> Logs[stdout / stderr]
```
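Each level of this hierarchy is addressed by a slash-separated pathspec such as `HelloFlow/12/start/345`. The following sketch shows how the number of path components maps to an entity type. `PathSpec` and `parse_pathspec` are hypothetical helpers written for this page, not part of the Metaflow client.

```python
from dataclasses import dataclass
from typing import Optional

# Entity types ordered by pathspec depth (1 component = Flow, 5 = DataArtifact).
_LEVELS = ("Flow", "Run", "Step", "Task", "DataArtifact")


@dataclass(frozen=True)
class PathSpec:
    flow: str
    run_id: Optional[str] = None
    step: Optional[str] = None
    task_id: Optional[str] = None
    artifact: Optional[str] = None

    @property
    def level(self) -> str:
        """Name of the entity type this pathspec points at."""
        parts = (self.flow, self.run_id, self.step, self.task_id, self.artifact)
        depth = sum(part is not None for part in parts)
        return _LEVELS[depth - 1]


def parse_pathspec(pathspec: str) -> PathSpec:
    """Split 'Flow/run/step/task/artifact' into its named components."""
    parts = pathspec.strip("/").split("/")
    if len(parts) > len(_LEVELS) or any(part == "" for part in parts):
        raise ValueError(f"not a valid pathspec: {pathspec!r}")
    return PathSpec(*parts)


task_spec = parse_pathspec("HelloFlow/12/start/345")
```

Here `task_spec.level` evaluates to `"Task"`, while a bare `"HelloFlow"` resolves to `"Flow"`.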

## Cards UI and Visualization

Metaflow ships a Svelte/TypeScript-based "cards" UI that renders step results in a standalone HTML file. The UI is built and bundled from `metaflow/plugins/cards/ui/` and consumed by the `current.card` runtime object (Source: [metaflow/plugins/cards/ui/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md)).

The card component model is described by `types.ts` and includes these building blocks: `section`, `page`, `image`, `title`, `subtitle`, `text`, `progressBar`, `heading`, `table`, `artifacts`, `dag`, `log`, `markdown`, `valueBox`, `vegaChart`, `pythonCode`, and `eventsTimeline` (Source: [metaflow/plugins/cards/ui/src/types.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/types.ts)). Pages and sections form a navigable hierarchy computed at render time by `getPageHierarchy()` in `utils.ts` (Source: [metaflow/plugins/cards/ui/src/utils.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/utils.ts)).
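To make the page/section hierarchy concrete, here is a Python sketch of one way such a hierarchy could be derived from a component tree. It is not a port of `getPageHierarchy()`: the function name `page_hierarchy`, the dictionary-based components, and the `contents` field for children are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def page_hierarchy(components: List[dict]) -> Dict[str, List[str]]:
    """Map each page title to the titles of the sections nested inside it."""
    pages: Dict[str, List[str]] = {}

    def visit(component: dict, sections: Optional[List[str]]) -> None:
        kind = component.get("type")
        if kind == "page":
            # A page starts its own list of section titles.
            sections = pages.setdefault(component["title"], [])
        elif kind == "section" and component.get("title") and sections is not None:
            # Untitled sections and sections outside any page are skipped.
            sections.append(component["title"])
        for child in component.get("contents", []):
            visit(child, sections)

    for component in components:
        visit(component, None)
    return pages


components = [
    {"type": "page", "title": "Overview", "contents": [
        {"type": "section", "title": "Metrics",
         "contents": [{"type": "text", "text": "ok"}]},
        {"type": "section", "contents": []},
        {"type": "section", "title": "Logs", "contents": []},
    ]},
    {"type": "page", "title": "Details", "contents": []},
]
hierarchy = page_hierarchy(components)
```

With this input, `hierarchy` maps `"Overview"` to `["Metrics", "Logs"]` and `"Details"` to an empty list.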

The UI is built on `svelte`, with `svelte-markdown` and `@iconify/svelte` for Markdown content and icons, and `svelte-vega`, `vega`, `vega-embed`, and `vega-lite` for charting; `cypress` provides end-to-end testing. The build pipeline is driven by `vite`, with a `prebuild` script that runs `svelte-check` and `eslint` (Source: [metaflow/plugins/cards/ui/package.json](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/package.json)). The store module (`store.ts`) keeps a writable `cardData` store and exposes a global `window.metaflow_card_update()` function so the host page can update the rendered tree in place (Source: [metaflow/plugins/cards/ui/src/store.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/store.ts)).

## Extensibility, Deployment, and Community

Metaflow is built around plug-in points for environments, datastores, metadata providers, secrets, and orchestrators. The release history shows the project introducing a pluggable artifact serializer framework in 2.19.26, optional cronjob schedules in 2.19.32, and a StepMutator-based late-attached decorator mechanism (regressed in 2.19.33, see issue #3258) for card-attaching custom decorators. Recent fixes also touch the AWS Secrets Manager secrets provider, where there is currently no configuration path for custom credentials or session variables distinct from the S3 datastore (issue #3275) (Source: [README.md](https://github.com/Netflix/metaflow/blob/main/README.md), community context).

For local development, the `devtools/` directory provides a Metaflow Devstack: a Minikube + Tilt environment that brings up MinIO (S3-compatible storage), PostgreSQL, a metadata service, the Metaflow UI, Argo Workflows, Argo Events, JobSet, and a local AWS Batch emulator. `make up` launches an interactive service picker, and `make shell` opens a shell with Metaflow config pre-loaded (Source: [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)).

The tutorial episodes demonstrate the framework's breadth:

- `00-helloworld` — linear workflow basics and step decorators (Source: [metaflow/tutorials/00-helloworld/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/00-helloworld/README.md))
- `04-playlist-plus` — conda-based per-step dependency isolation (Source: [metaflow/tutorials/04-playlist-plus/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/04-playlist-plus/README.md))
- `07-worldview` — notebook-based dashboards using the client API (Source: [metaflow/tutorials/07-worldview/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/07-worldview/README.md))

The R package mirrors the same episode numbering and is installed with `devtools::install_github("Netflix/metaflow", subdir="R")`, with tutorials fetched locally via `metaflow::pull_tutorials()` (Source: [R/README.md](https://github.com/Netflix/metaflow/blob/main/R/README.md), [R/inst/tutorials/README.md](https://github.com/Netflix/metaflow/blob/main/R/inst/tutorials/README.md)).

Long-standing community requests — Kubernetes/Argo integration (issues #16, #50), artifact-as-microservice hosting (#3), in-memory large-dataframe processing (#4), and R support (which is now partially delivered via the R package) — reflect Metaflow's evolution from a Netflix-internal tool into a general-purpose AI/ML orchestration framework (Source: [README.md](https://github.com/Netflix/metaflow/blob/main/README.md), community context).

## See Also

- [Cards UI Internals](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md)
- [Metaflow Devstack](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
- [Python Tutorials Index](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/00-helloworld/README.md)
- [R Package and Tutorials](https://github.com/Netflix/metaflow/blob/main/R/README.md)

---

<a id='page-2'></a>

## Datastore, Metadata, and Storage Backends

### Related Pages

Related topics: [Framework Overview and Core Concepts](#page-1), [Production Orchestration, Schedulers, and Cloud Execution](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)
- [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py)
- [metaflow/datastore/flow_datastore.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/flow_datastore.py)
- [metaflow/datastore/content_addressed_store.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/content_addressed_store.py)
- [metaflow/metadata_provider/metadata.py](https://github.com/Netflix/metaflow/blob/main/metaflow/metadata_provider/metadata.py)
- [metaflow/metaflow_config.py](https://github.com/Netflix/metaflow/blob/main/metaflow/metaflow_config.py)
- [metaflow/plugins/__init__.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/__init__.py)
- [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)
- [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
</details>

# Datastore, Metadata, and Storage Backends

## Overview

Metaflow separates **metadata** (the structured record of what ran, where, and how) from **datastores** (the byte-level storage of artifacts, logs, and pickled state). The two layers are pluggable: shipped providers can be swapped at runtime through configuration, while custom providers can be registered as plugins. The [README.md](https://github.com/Netflix/metaflow/blob/main/README.md) frames the system as unifying "code, data, and compute at every stage", and the storage layer is the mechanism that makes artifacts addressable across runs, steps, and tasks. 

The client side, exposed through `metaflow.client`, surfaces a uniform read API on top of these backends. The `Task` object returned by the client hides whether a metadata event came from local files on disk or a remote HTTP service, and whether an artifact was pulled from S3, Azure Blob Storage, Google Cloud Storage, or the local filesystem. This decoupling is what allows the same flow code to execute unchanged locally and on cloud orchestrators.

```mermaid
flowchart LR
    A[Flow Code] --> B[Step Runtime]
    B --> C[Metadata Provider]
    B --> D[Datastore Storage]
    C --> E[(Local files / Service)]
    D --> F[(S3 / Azure / GS / Local FS)]
    G[metaflow.client] -->|reads| C
    G -->|reads| D
    H[FileCache] -->|caches| D
    G --> H
```

## Metadata Provider System

The metadata layer tracks every step, task, artifact, and runtime event as a discrete record. In `metaflow/client/core.py` the public entry points are `metadata()`, `get_metadata()`, and `default_metadata()`, which together let the runtime and client negotiate which backend is in scope. Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py).

The `metadata(ms)` function accepts three forms of selection:

| Form | Meaning |
|---|---|
| `<path>` | Local metadata rooted at that filesystem path |
| `http(s)://<url>` | Remote metadata service (REST) |
| `<type>@<info>` | Explicit provider name and configuration string |

Selecting a provider has global effect: the function rebinds the module-level `current_metadata` pointer. If no provider matches, the call prints a diagnostic and returns the current one rather than raising, which keeps flows resilient during early setup. Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py).
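The three selector forms in the table can be told apart with simple string checks. The sketch below is an illustration of that dispatch, not the code in `core.py`: `classify_metadata_selector` is a hypothetical helper, and the provider type names `local` and `service` are assumed defaults.

```python
from typing import Sequence, Tuple


def classify_metadata_selector(
    ms: str, known_types: Sequence[str] = ("local", "service")
) -> Tuple[str, str]:
    """Return (provider_type, info) for a metadata selector string."""
    prefix, separator, rest = ms.partition("@")
    if separator and prefix in known_types:
        # Explicit form: <type>@<info>
        return prefix, rest
    if ms.startswith(("http://", "https://")):
        # URL form: remote metadata service
        return "service", ms
    # Anything else is treated as a local filesystem path.
    return "local", ms


explicit = classify_metadata_selector("service@https://metadata.example/api")
by_url = classify_metadata_selector("https://metadata.example/api")
by_path = classify_metadata_selector("/tmp/metaflow-home")
```

Checking the explicit form only against known type names keeps URLs that happen to contain `@` from being misread as `<type>@<info>`.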

The default provider is resolved by scanning the registered `METADATA_PROVIDERS` list for one whose `TYPE` matches `DEFAULT_METADATA`, and the chosen profile is overridable through the `METAFLOW_PROFILE` environment variable. The same module imports `MAX_ATTEMPTS` from `metaflow.metaflow_config` so retry budgets are centrally controlled. Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py), [metaflow/metaflow_config.py](https://github.com/Netflix/metaflow/blob/main/metaflow/metaflow_config.py).

Each task record exposes a `Metadata` named tuple of `(name, value, created_at, type, task)` plus a convenience `metadata_dict` that reduces the list to its latest value per key, as documented in the `Task` class attributes. Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py).
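The reduction from a list of metadata events to a latest-value-per-key dictionary can be sketched as follows. The tuple fields match the ones listed above; the helper `latest_per_key` and the sample records are hypothetical, and ordering by `created_at` is an assumption about how "latest" is decided.

```python
from collections import namedtuple
from typing import Dict, Iterable

Metadata = namedtuple("Metadata", ["name", "value", "created_at", "type", "task"])


def latest_per_key(records: Iterable[Metadata]) -> Dict[str, str]:
    """Keep only the most recently created value for each metadata name."""
    latest: Dict[str, str] = {}
    # Sorting oldest-first means later assignments overwrite earlier ones.
    for record in sorted(records, key=lambda r: r.created_at):
        latest[record.name] = record.value
    return latest


records = [
    Metadata("attempt", "0", 100, "attempt", "HelloFlow/1/start/1"),
    Metadata("python_version", "3.11", 101, "runtime", "HelloFlow/1/start/1"),
    Metadata("attempt", "1", 200, "attempt", "HelloFlow/1/start/1"),
]
latest = latest_per_key(records)
```

The full event list stays useful for auditing retries, while the reduced dictionary answers the common question of what the current value is.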

## Datastore and FileCache

The datastore layer owns the bytes. Tasks persist artifacts by writing to a content-addressed store, and the client side reads them through `FileCache`, defined in `metaflow/client/filecache.py`. The cache sits in front of the slower storage backends and is bounded by three configuration knobs from `metaflow.metaflow_config`: `CLIENT_CACHE_PATH`, `CLIENT_CACHE_MAX_SIZE`, and `CLIENT_CACHE_MAX_FLOWDATASTORE_COUNT`. Source: [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py), [metaflow/metaflow_config.py](https://github.com/Netflix/metaflow/blob/main/metaflow/metaflow_config.py).

Internally the cache holds two ordered structures: a blob cache for artifact content and a metadata cache keyed by `(ds_type, ds_root, attempt, flow_name, run_id, step_name, task_id, name)`. Artifact retrieval goes through `FlowDataStore` (in `metaflow/datastore/flow_datastore.py`) and the lower-level `BlobCache` exposed by `metaflow/datastore/content_addressed_store.py`. The `get_artifact` and `get_all_artifacts` helpers compose these into a single streaming interface for the client. Source: [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py), [metaflow/datastore/flow_datastore.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/flow_datastore.py), [metaflow/datastore/content_addressed_store.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/content_addressed_store.py).
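The idea behind a content-addressed store is that the key is derived from the bytes themselves, so identical payloads are stored once. The following in-memory sketch illustrates only that idea; `InMemoryContentAddressedStore` is a hypothetical class, and the use of SHA-1 hex digests as keys is an assumption for the example rather than a statement about Metaflow's storage format.

```python
import hashlib
from typing import Dict


class InMemoryContentAddressedStore:
    """Toy store whose keys are hashes of the stored bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save(self, blob: bytes) -> str:
        key = hashlib.sha1(blob).hexdigest()
        # Saving the same content twice is a no-op: the key already exists.
        self._blobs.setdefault(key, blob)
        return key

    def load(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = InMemoryContentAddressedStore()
first_key = store.save(b"model-weights")
second_key = store.save(b"model-weights")  # duplicate content
other_key = store.save(b"other-artifact")
```

Because keys depend only on content, a client-side cache can key its blob entries the same way and reuse them across runs that produced identical artifacts.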

Cache writes use an atomic temp-file-then-rename pattern: a `NamedTemporaryFile` is flushed and then `os.rename`-d into place, after which size accounting is updated and `_garbage_collect()` is invoked to enforce the configured limit. Reading falls back gracefully if a file has been concurrently evicted by another process, returning `None` instead of raising `IOError`. Source: [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py).
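The write and read behavior described above follows a common pattern that can be sketched with the standard library alone. This is not the `FileCache` code: `atomic_write` and `read_or_none` are hypothetical helpers, and `os.replace` is used in place of `os.rename` so that overwriting an existing file also works on Windows.

```python
import os
import tempfile
from typing import Optional


def atomic_write(path: str, payload: bytes) -> None:
    """Write to a temp file in the target directory, then move it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Same directory as the target, so the final move stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile(dir=directory, delete=False)
    try:
        with tmp:
            tmp.write(payload)
            tmp.flush()
        os.replace(tmp.name, path)
    except BaseException:
        # Do not leave a partial temp file behind if anything fails.
        if os.path.exists(tmp.name):
            os.unlink(tmp.name)
        raise


def read_or_none(path: str) -> Optional[bytes]:
    """Return the file contents, or None if the file has disappeared."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError:
        return None


with tempfile.TemporaryDirectory() as cache_dir:
    target = os.path.join(cache_dir, "blobs", "ab", "abcdef")
    atomic_write(target, b"payload")
    round_trip = read_or_none(target)
    missing = read_or_none(os.path.join(cache_dir, "evicted"))
    leftovers = os.listdir(os.path.dirname(target))
```

Readers never observe a half-written file because the target name only appears once the temp file is complete, and a concurrently evicted file shows up as `None` rather than an exception.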

The module also defines `NEW_FILE_QUARANTINE = 10`, a short window (in seconds) during which newly created files are not eligible for eviction. This prevents race conditions when a process is still streaming data into a freshly written object. Source: [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py).
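A size-bounded eviction pass that respects such a quarantine window might look like the sketch below. It is a simplified model under stated assumptions: `CachedFile` and `pick_evictions` are hypothetical names, the quarantine is treated as seconds, and eviction order is oldest-first by creation time.

```python
from dataclasses import dataclass
from typing import Iterable, List

NEW_FILE_QUARANTINE = 10  # assumed to be seconds


@dataclass(frozen=True)
class CachedFile:
    path: str
    size: int          # bytes
    created_at: float  # timestamp in seconds


def pick_evictions(files: Iterable[CachedFile], max_size: int, now: float) -> List[str]:
    """Choose the oldest files to delete until the cache fits in max_size."""
    ordered = sorted(files, key=lambda f: f.created_at)
    total = sum(f.size for f in ordered)
    evicted: List[str] = []
    for cached in ordered:
        if total <= max_size:
            break
        if now - cached.created_at < NEW_FILE_QUARANTINE:
            # Files are ordered oldest-first, so every remaining file
            # is at least as new and is also still quarantined.
            break
        evicted.append(cached.path)
        total -= cached.size
    return evicted


files = [
    CachedFile("a", 50, 0.0),
    CachedFile("b", 50, 95.0),
    CachedFile("c", 50, 50.0),
]
evicted = pick_evictions(files, max_size=60, now=100.0)
```

Passing `now` explicitly keeps the function deterministic and easy to test. Note that the cache can temporarily exceed its limit when the only remaining candidates are still quarantined.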

## Backend Selection and Common Failure Modes

Backends are registered through `metaflow.plugins`, which exports `DATASTORES` and `METADATA_PROVIDERS` collections that the client and runtime consume. The README emphasises that production deployment requires configuring Metaflow and the underlying infrastructure, including the storage layer, before flows can run on cloud orchestrators. Source: [metaflow/plugins/__init__.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/__init__.py), [README.md](https://github.com/Netflix/metaflow/blob/main/README.md).

The local development stack documented in [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md) shows the canonical wiring: a `minio` service (S3-compatible) exposes ports 9000/9001 for object storage, a `postgresql` instance backs the `metadata-service` on port 8080, and the `ui` service depends on both `postgresql` and `minio`. This mirrors the production split between a blob store and a metadata database, and is the configuration that local CI runs against.

Community issues that surface in this area include cross-account credentials ([#3275](https://github.com/Netflix/metaflow/issues/3275)), where the AWS Secrets Manager provider cannot yet accept the same custom session/credential overrides that the S3 datastore already supports, and an S3 empty-input guard added in 2.19.30 ([#3194](https://github.com/Netflix/metaflow/pull/3194)) to stop `s3._put_many_files` and `_read_many_files` from failing on empty iterables. Users configuring a non-default AWS profile for one backend while leaving the other on the default credentials chain should expect this gap until a unified client-options path lands.

## See Also

- [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py) — Client API and metadata provider switching
- [metaflow/client/filecache.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/filecache.py) — Local artifact cache
- [metaflow/datastore/flow_datastore.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/flow_datastore.py) — Flow-level datastore API
- [metaflow/datastore/content_addressed_store.py](https://github.com/Netflix/metaflow/blob/main/metaflow/datastore/content_addressed_store.py) — Blob storage primitives
- [metaflow/metadata_provider/metadata.py](https://github.com/Netflix/metaflow/blob/main/metaflow/metadata_provider/metadata.py) — Metadata provider base
- [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md) — Local stack wiring for minio/postgres/metadata-service

---

<a id='page-3'></a>

## Production Orchestration, Schedulers, and Cloud Execution

### Related Pages

Related topics: [Framework Overview and Core Concepts](#page-1), [Datastore, Metadata, and Storage Backends](#page-2), [Secrets, Cards, Extensions, and Ecosystem Integrations](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [metaflow/plugins/argo/argo_workflows.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/argo/argo_workflows.py)
- [metaflow/plugins/argo/argo_workflows_decorator.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/argo/argo_workflows_decorator.py)
- [metaflow/plugins/argo/argo_workflows_deployer.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/argo/argo_workflows_deployer.py)
- [metaflow/plugins/aws/step_functions/step_functions.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions.py)
- [metaflow/plugins/aws/step_functions/step_functions_decorator.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions_decorator.py)
- [metaflow/plugins/aws/step_functions/step_functions_deployer.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions_deployer.py)
- [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
- [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)
- [metaflow/tutorials/05-hello-cloud/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/05-hello-cloud/README.md)
- [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)
</details>

# Production Orchestration, Schedulers, and Cloud Execution

## Overview

Metaflow decouples *flow logic* from *execution infrastructure* so that the same Python flow file can run locally during development and on an orchestrator in production. The framework separates three concerns: a **scheduler** that owns the DAG lifecycle (Argo Workflows, AWS Step Functions, Airflow, or simple cron triggers), a **compute runtime** that runs each step on a specific backend (`@kubernetes`, `@batch`, `@conda`, `@nvidia`), and a **client API** that lets users inspect results of deployed flows. At Netflix, Metaflow currently supports over 3,000 AI/ML projects and executes hundreds of millions of data operations. Source: [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)

The pattern is intentionally uniform: each orchestrator plugin is split across three modules. A `*_decorator.py` module augments the FlowSpec at compile time, a core module (`argo_workflows.py`, `step_functions.py`) translates the FlowSpec into the orchestrator's native resource descriptor, and a `*_deployer.py` module handles the create/update/trigger lifecycle against the orchestrator's API. Source: [metaflow/plugins/argo/argo_workflows.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/argo/argo_workflows.py), [metaflow/plugins/aws/step_functions/step_functions.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions.py), [metaflow/plugins/aws/step_functions/step_functions_deployer.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions_deployer.py)

```mermaid
graph LR
  A[flow.py] --> B[FlowSpec]
  B --> C["Scheduler-specific decorator"]
  C --> D[Compile-time augmentation]
  D --> E[Deployer]
  E --> F[Argo Workflow CRD / SFN State Machine]
  F --> G[Task Runtime: k8s Pod / Batch Job]
  G --> H[Metadata + Artifact Store]
  H --> I[metaflow.client]
```
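The step from a FlowSpec to a native resource descriptor can be pictured as a pure function from the step graph to a list of task entries with explicit dependencies. The sketch below is a simplified illustration, not the output of Metaflow's Argo or Step Functions compilers: `compile_to_dag_tasks` and the `name`/`dependencies` entry shape are assumptions made for the example.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def compile_to_dag_tasks(next_steps: Dict[str, List[str]]) -> List[dict]:
    """Turn a step -> next-steps mapping into dependency-ordered task entries."""
    # Invert the edges: for each step, collect the steps that must finish first.
    parents: Dict[str, Set[str]] = {step: set() for step in next_steps}
    for step, children in next_steps.items():
        for child in children:
            parents.setdefault(child, set()).add(step)
    # TopologicalSorter expects node -> predecessors, which is what we built.
    order = TopologicalSorter(parents).static_order()
    return [
        {"name": step, "dependencies": sorted(parents[step])}
        for step in order
    ]


flow_graph = {
    "start": ["a", "b"],   # branch
    "a": ["join"],
    "b": ["join"],
    "join": ["end"],       # join waits for both branches
    "end": [],
}
tasks = compile_to_dag_tasks(flow_graph)
names = [task["name"] for task in tasks]
```

A deployer would then serialize entries like these into the target system's format and submit them through that system's API.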

## Scheduler Plugins

### Argo Workflows (Kubernetes-native)

The Argo plugin under `metaflow/plugins/argo/` targets Kubernetes and emits native `Workflow` CRDs. Argo is the most actively hardened scheduler in recent releases: `2.19.34` shipped `fix(argo): order conditional input fallback by DAG`, which ensures that conditional parameter defaults are resolved in DAG order rather than declaration order. Long-running community interest is documented in issue #50 ("Support for Kubernetes (with Argo)", 17 comments). Source: [metaflow/plugins/argo/argo_workflows_decorator.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/argo/argo_workflows_decorator.py), [Release 2.19.34](https://github.com/Netflix/metaflow/releases/tag/2.19.34), [Issue #50](https://github.com/Netflix/metaflow/issues/50)

### AWS Step Functions (AWS-native)

The Step Functions plugin under `metaflow/plugins/aws/step_functions/` compiles flows into AWS Step Functions state machines, keeping execution entirely within AWS-managed services. It is typically paired with `@batch` for task execution. Source: [metaflow/plugins/aws/step_functions/step_functions_decorator.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions_decorator.py), [metaflow/plugins/aws/step_functions/step_functions.py](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/aws/step_functions/step_functions.py)

### Airflow and Custom Schedulers

Airflow support was requested in community issue #16 ("Support for Airflow and Kubernetes", 20 comments). The repository includes an Airflow plugin under `metaflow/plugins/airflow/` that compiles a flow into an Airflow DAG file. Because of the decorator/deployer extension pattern, teams can plug in additional orchestrators without modifying the core Metaflow runtime. Source: [Issue #16](https://github.com/Netflix/metaflow/issues/16)

## Compute Runtimes and the Local Devstack

Once a flow is scheduled, individual steps are dispatched to a compute backend by runtime decorators. The `@kubernetes` decorator maps steps to Kubernetes Pods and is the focus of tutorial `05-hello-cloud`. `@batch` maps steps to AWS Batch jobs. `@conda` packages a reproducible Python environment for the step, `@nvidia` adds GPU support, and `@retry` provides transparent transient-failure tolerance — commonly stacked on top of the cloud decorators. Source: [metaflow/tutorials/05-hello-cloud/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/05-hello-cloud/README.md), [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)

For local end-to-end testing of these pipelines, the repository ships a complete Kubernetes-based devstack under `devtools/` driven by Minikube and Tilt. Its services mirror the production stack:

| Service | Depends on | Host port | Purpose |
|---|---|---|---|
| `argo-workflows` | — | 2746 | Argo controller + server |
| `argo-events` | argo-workflows | 12000 | Event-driven triggers |
| `jobset` | — | — | Kubernetes JobSet controller |
| `localbatch` | minio | (internal) | Local AWS Batch emulator |
| `metadata-service` | postgresql | 8080 | Metaflow metadata API |
| `ui` | postgresql, minio | 3000 | Metaflow UI |
| `minio` | — | 9000 / 9001 | S3-compatible object storage |
| `postgresql` | — | 5432 | Metadata database |

`make up`, `make shell`, `make all-up`, and `SERVICES_OVERRIDE=argo-workflows,minio make up` give a fast iteration loop. The `localbatch` service is especially useful — it emulates AWS Batch locally so a flow can be tested with `@batch` semantics without an AWS account. Source: [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
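The "Depends on" column of the table above can be read as a dependency graph, which makes it easy to reason about what a partial selection of services will pull in. The sketch below encodes the table as data and derives a valid startup order. It is only an illustration of the dependency relationships, not how Tilt or the devstack Makefile resolves them; `with_dependencies` and `startup_order` are hypothetical helpers.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set

# Transcribed from the "Depends on" column of the service table.
DEPENDS_ON: Dict[str, List[str]] = {
    "argo-workflows": [],
    "argo-events": ["argo-workflows"],
    "jobset": [],
    "localbatch": ["minio"],
    "metadata-service": ["postgresql"],
    "ui": ["postgresql", "minio"],
    "minio": [],
    "postgresql": [],
}


def with_dependencies(selected: Iterable[str]) -> Set[str]:
    """Expand a selection to include everything it transitively depends on."""
    needed: Set[str] = set()
    stack = list(selected)
    while stack:
        service = stack.pop()
        if service not in needed:
            needed.add(service)
            stack.extend(DEPENDS_ON[service])
    return needed


def startup_order(selected: Iterable[str]) -> List[str]:
    """Return the needed services with dependencies listed before dependents."""
    needed = with_dependencies(selected)
    graph = {service: DEPENDS_ON[service] for service in needed}
    return list(TopologicalSorter(graph).static_order())


ui_order = startup_order(["ui"])
events_order = startup_order(["argo-events"])
```

For example, selecting only `ui` expands to `postgresql`, `minio`, and `ui`, with `ui` started last.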

## Known Issues, Recent Changes, and the Client API

Several recent releases have focused on hardening scheduler and execution paths:

- **Issue #3275 — AWS Secrets Manager credentials gap.** The AWS Secrets Manager secrets provider currently has no way to receive custom credentials or session variables, even though the S3 datatools client does. This is a known limitation when the secrets provider must authenticate against a different AWS account than the default datastore. Source: [Issue #3275](https://github.com/Netflix/metaflow/issues/3275)
- **Issue #3258 (affecting 2.19.33) — `StepMutator` regression.** Changes to `_process_late_attached_decorator` in PR #3238 caused card-attaching custom decorators to be initialized twice, producing warnings like `Multiple @card decorator have been annotated with duplicate ids`. Source: [Issue #3258](https://github.com/Netflix/metaflow/issues/3258)
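- **2.19.32: optional cronjob schedules.** The release makes cronjob schedules optional. Time-based triggers themselves are declared with the `@schedule` decorator (for example through its `cron` argument); Metaflow has no separate `@cron` decorator. Source: [Release 2.19.32](https://github.com/Netflix/metaflow/releases/tag/2.19.32)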
- **2.19.26 — Pluggable artifact serializer framework.** Artifact (de)serialization is now decoupled from the storage layer, simplifying custom serializers. Source: [Release 2.19.26](https://github.com/Netflix/metaflow/releases/tag/2.19.26)
- **2.19.30 — Monitor metric tagging.** Step and project context are now sent to the monitor for tagging emitted metrics. Source: [Release 2.19.30](https://github.com/Netflix/metaflow/releases/tag/2.19.30)

Once flows are running in production, the `metaflow.client` package is the canonical inspection surface. It exposes `Run`, `Step`, `Task`, and `DataArtifact` classes, with `MetaflowData` lazily downloading artifact contents and `MetaflowArtifacts` exposing them as `DataArtifact` objects. The internal `_get_matching_pathspecs` method enables targeted metadata queries across all runs that match a given metadata pattern. Source: [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)

## See Also

- [Kubernetes + Argo (Issue #50)](https://github.com/Netflix/metaflow/issues/50)
- [Airflow + Kubernetes Support (Issue #16)](https://github.com/Netflix/metaflow/issues/16)
- [AWS Secrets Manager credentials gap (Issue #3275)](https://github.com/Netflix/metaflow/issues/3275)
- [`StepMutator` decorator regression (Issue #3258)](https://github.com/Netflix/metaflow/issues/3258)
- [Local Devstack Guide (devtools/README.md)](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
- [Tutorial 05 — Hello Cloud](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/05-hello-cloud/README.md)
- [Metaflow Production Documentation](https://docs.metaflow.org/production/introduction)

---

<a id='page-4'></a>

## Secrets, Cards, Extensions, and Ecosystem Integrations

### Related Pages

Related topics: [Framework Overview and Core Concepts](#page-1), [Production Orchestration, Schedulers, and Cloud Execution](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [metaflow/plugins/cards/ui/src/types.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/types.ts)
- [metaflow/plugins/cards/ui/src/utils.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/utils.ts)
- [metaflow/plugins/cards/ui/src/store.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/store.ts)
- [metaflow/plugins/cards/ui/src/global.d.ts](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/global.d.ts)
- [metaflow/plugins/cards/ui/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md)
- [metaflow/plugins/cards/ui/package.json](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/package.json)
- [metaflow/client/core.py](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py)
- [README.md](https://github.com/Netflix/metaflow/blob/main/README.md)
- [stubs/README.md](https://github.com/Netflix/metaflow/blob/main/stubs/README.md)
- [R/README.md](https://github.com/Netflix/metaflow/blob/main/R/README.md)
- [R/inst/tutorials/README.md](https://github.com/Netflix/metaflow/blob/main/R/inst/tutorials/README.md)
- [metaflow/tutorials/00-helloworld/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/00-helloworld/README.md)
- [metaflow/tutorials/07-worldview/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/07-worldview/README.md)
- [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md)
</details>

Metaflow ships a small set of opinionated extension surfaces — visualization, programmatic introspection, language bindings, type hints, and a local dev stack — that let users integrate Metaflow into broader workflows without forking the core. This page covers how those surfaces are structured in the repository, what they expose, and where the most common integration pitfalls have surfaced in community discussions.

## Cards Visualization System

The Cards plugin is a Svelte-based UI that is compiled to a standalone HTML/JS bundle and embedded inside Metaflow run artifacts. Source: [metaflow/plugins/cards/ui/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md).

### Component Model

The UI's public contract with Python is encoded as a TypeScript discriminated union of component types. Source: [metaflow/plugins/cards/ui/src/types.ts:1-180](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/types.ts).

| Component | `type` value | Purpose |
| --- | --- | --- |
| `SectionComponent` | `"section"` | Groups children into titled, multi-column blocks |
| `PageComponent` | `"page"` | Top-level navigable page; `title` doubles as its DOM id |
| `TableComponent` | `"table"` | Tabular data with optional `vertical` layout |
| `VegaChartComponent` | `"vega"` (via `svelte-vega`) | Declarative Vega-Lite visualizations |
| `ImageComponent` | `"image"` | Static or runtime-generated image |
| `ProgressBarComponent` | `"progressBar"` | Numeric progress indicator |
| `HeadingComponent` | `"heading"` | Section header |
| `MarkdownComponent` / `TextComponent` | `"markdown"` / `"text"` | Prose blocks |

The `Status` type (`"success" | "error" | "idle" | "in-progress"`) drives task-state badges, while `PathSpecObject` (`flowname / runid / stepname / taskid`) is the canonical addressing scheme used across the runtime. Source: [metaflow/plugins/cards/ui/src/types.ts:7-15](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/types.ts).
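
To make the addressing scheme concrete, here is a minimal Python sketch that splits a pathspec string into the four fields named above. The `PathSpec` class and `parse_pathspec` function are hypothetical helpers written for illustration; they are not part of Metaflow or of the card UI.

```python
from typing import NamedTuple


class PathSpec(NamedTuple):
    """Hypothetical Python mirror of the four PathSpecObject fields."""
    flowname: str
    runid: str
    stepname: str
    taskid: str


def parse_pathspec(pathspec: str) -> PathSpec:
    """Split 'flowname/runid/stepname/taskid' into its four components."""
    parts = pathspec.strip("/").split("/")
    # Reject strings with the wrong number of segments or empty segments.
    if len(parts) != 4 or not all(parts):
        raise ValueError(
            f"expected flowname/runid/stepname/taskid, got {pathspec!r}"
        )
    return PathSpec(*parts)


spec = parse_pathspec("HelloFlow/12/start/345")
print(spec.stepname)  # start
```

Keeping every segment as a string avoids assuming that run or task ids are numeric, which may not hold for every backend.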

### Runtime Store and Live Updates

Cards that are updated mid-flow rely on a global Svelte writable store. The window exposes a `metaflow_card_update` hook that mutates the component tree in place, so subsequent renders pick up new data without rebuilding the bundle. Source: [metaflow/plugins/cards/ui/src/store.ts:1-50](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/src/store.ts).

```mermaid
flowchart LR
  A[Python flow step] --> B[current.card append/mutate]
  B --> C[Card payload as JSON]
  C --> D[metaflow_card_update hook]
  D --> E[Svelte writable store]
  E --> F[DOM re-render]
```
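
The update path in the diagram can be approximated in Python to show the idea. This is only an analogy, not the code in `store.ts`: the `WritableStore` class imitates the subscribe/update behavior of a Svelte writable store, and the payload shape (a mapping from component id to changed fields), the `id` key, and the `contents` key are all assumptions made for this sketch.

```python
import json


class WritableStore:
    """Small analogue of a Svelte writable store (illustrative only)."""

    def __init__(self, value):
        self._value = value
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)
        callback(self._value)  # subscribers receive the current value at once

    def update(self, fn):
        self._value = fn(self._value)
        for callback in self._subscribers:
            callback(self._value)


def apply_update(tree, changes):
    """Merge changes[id] into every component whose 'id' matches, in place."""
    comp_id = tree.get("id")
    if comp_id in changes:
        tree.update(changes[comp_id])
    for child in tree.get("contents", []):  # 'contents' is an assumed key
        apply_update(child, changes)
    return tree


store = WritableStore({
    "type": "page",
    "title": "Run",
    "contents": [{"type": "progressBar", "id": "pb1", "value": 10}],
})

renders = []  # stands in for DOM re-renders
store.subscribe(lambda tree: renders.append(tree["contents"][0]["value"]))

payload = json.dumps({"pb1": {"value": 55}})  # hypothetical update payload
store.update(lambda tree: apply_update(tree, json.loads(payload)))
print(renders)  # [10, 55]
```

Because the tree is mutated rather than replaced, only the changed fields travel in the payload, which is the property the live-update hook relies on.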

### Local Development

Developers iterate on the UI without re-running Python flows:

1. `npm install` once at [metaflow/plugins/cards/ui](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/).
2. `npm run dev` starts a Vite watch server on http://localhost:8080 that loads `public/card-example.json`.
3. After edits to `.svelte` or `.css` files, run `npm run lint` (enforced by `prebuild`) and `npm run build` to emit `main.js` and `bundle.css` to the directory configured in `package.json`. Source: [metaflow/plugins/cards/ui/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/plugins/cards/ui/README.md).

### Known Card Pitfalls

Community reports show that recent changes to `_process_late_attached_decorator` (release `2.19.33`) caused card-attaching decorators to initialize twice, producing `[@card WARNING] Multiple @card decorator have been annotated with duplicate ids` and breaking `current.card['id']` lookups. Source: issue [#3258](https://github.com/Netflix/metaflow/issues/3258).

## Programmatic Access via the Client API

The `metaflow.client` package is the read-only introspection surface for runs already executed. The `Task` class exposes:

- `data` — lazy container that downloads all artifacts on access.
- `artifacts` — fine-grained `MetaflowArtifacts` view, excluding private ids prefixed with `_`.
- `metadata` / `metadata_dict` — chronological and latest-keyed event streams.
- `successful`, `exception`, `finished_at`, `runtime_name` — execution diagnostics.
- `code` — bundled `MetaflowCode` snapshot.
- `environment_info` — dictionary describing the runtime.

Source: [metaflow/client/core.py:1-120](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py).
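
As a rough sketch of how the diagnostic properties above might be consumed, the helper below gathers them into a plain dictionary. `summarize_task` is a hypothetical function, and the example runs against a stand-in object so that it works without Metaflow installed or any recorded runs. It also assumes the object carries a `pathspec` attribute.

```python
from datetime import datetime
from types import SimpleNamespace


def summarize_task(task) -> dict:
    """Collect execution diagnostics from a Task-like object."""
    return {
        "pathspec": task.pathspec,
        "successful": task.successful,
        "finished_at": task.finished_at,
        "runtime_name": task.runtime_name,
        # Store the exception as text so the summary is easy to serialize.
        "error": None if task.exception is None else str(task.exception),
    }


# Stand-in for a real client object. With Metaflow installed and a run
# recorded, a Task loaded by pathspec could be passed in its place.
fake_task = SimpleNamespace(
    pathspec="HelloFlow/12/start/345",
    successful=True,
    finished_at=datetime(2024, 1, 1, 12, 0),
    runtime_name="dev",
    exception=None,
)

summary = summarize_task(fake_task)
print(summary["successful"], summary["error"])  # True None
```

The sketch deliberately avoids `data`, since that property is described above as downloading all artifacts on access.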

A module-level `metadata(...)` function lets callers switch metadata providers. Individual metadata events are exposed as `Metadata` records with the fields `name`, `value`, `created_at`, `type`, and `task`. Source: [metaflow/client/core.py:60-100](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py).

The Card UI and the client API share the same `flowname/runid/stepname/taskid` pathspec scheme, which lets notebooks (e.g. `07-worldview`) drive dashboards directly from historical runs. Source: [metaflow/tutorials/07-worldview/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/07-worldview/README.md).

## Ecosystem Extensions

### Type Hints via `metaflow-stubs`

A separately distributed stub package provides type hints for IDEs (VSCode) and language servers (Pylance). Install with:

```sh
pip install metaflow-stubs
```

Source: [stubs/README.md](https://github.com/Netflix/metaflow/blob/main/stubs/README.md).

### R Language Bindings

`R/` ships a thin R wrapper around the Python core. Installation from GitHub:

```R
devtools::install_github("Netflix/metaflow", subdir = "R")
metaflow::install_metaflow()
metaflow::pull_tutorials()
```

Source: [R/README.md](https://github.com/Netflix/metaflow/blob/main/R/README.md) and [R/inst/tutorials/README.md](https://github.com/Netflix/metaflow/blob/main/R/inst/tutorials/README.md). Long-running community requests for first-class R support (issue [#1](https://github.com/Netflix/metaflow/issues/1)) are the background for this package.

### Local DevStack

`devtools/` is a Minikube + Tilt environment that mirrors production dependencies locally. Service summary:

| Service | Depends on | Host port(s) |
| --- | --- | --- |
| `minio` (S3-compatible) | — | 9000, 9001 |
| `postgresql` | — | 5432 |
| `metadata-service` | postgresql | 8080 |
| `ui` | postgresql, minio | 3000 |
| `argo-workflows` | — | 2746 |
| `argo-events` | argo-workflows | 12000 |
| `localbatch` (AWS Batch emulator) | minio | — |

Bring it up with `make up` (interactive picker) or `SERVICES_OVERRIDE=localbatch,minio make up` for a subset. Source: [devtools/README.md](https://github.com/Netflix/metaflow/blob/main/devtools/README.md).
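
When choosing a subset for `SERVICES_OVERRIDE`, it can help to work out which dependencies a service pulls in. The sketch below encodes the "Depends on" column of the table and computes a dependency-first start order with the standard-library `graphlib` module (Python 3.9+). The `start_order` helper is hypothetical, and whether `make up` resolves dependencies in this way is not stated in the source.

```python
from graphlib import TopologicalSorter

# Dependencies copied from the service table above.
DEPENDS_ON = {
    "minio": [],
    "postgresql": [],
    "metadata-service": ["postgresql"],
    "ui": ["postgresql", "minio"],
    "argo-workflows": [],
    "argo-events": ["argo-workflows"],
    "localbatch": ["minio"],
}


def start_order(selected):
    """Return selected services plus transitive dependencies, dependencies first."""
    needed = set()
    stack = list(selected)
    while stack:
        service = stack.pop()
        if service not in DEPENDS_ON:
            raise KeyError(f"unknown service: {service}")
        if service not in needed:
            needed.add(service)
            stack.extend(DEPENDS_ON[service])
    # TopologicalSorter takes {node: predecessors} and yields predecessors first.
    graph = {service: DEPENDS_ON[service] for service in needed}
    return list(TopologicalSorter(graph).static_order())


print(start_order(["ui"]))  # 'ui' comes last, after postgresql and minio
```

The relative order of independent services (for example `postgresql` and `minio`) is not significant in this sketch.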

### Tutorial Episodes as Integrations

The `tutorials/` directory doubles as integration documentation:

- `00-helloworld`: linear flow built with the `@step` decorator.
- `04-playlist-plus`: dependency management via the `@conda` decorator; requires Miniconda as a prerequisite.
- `07-worldview`: notebook dashboard over the client API.

Source: [metaflow/tutorials/00-helloworld/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/00-helloworld/README.md) and [metaflow/tutorials/04-playlist-plus/README.md](https://github.com/Netflix/metaflow/blob/main/metaflow/tutorials/04-playlist-plus/README.md).

## Common Failure Modes and Community Pain Points

Three patterns recur in issues and release notes:

1. **Secret-provider authentication drift**: Issue [#3275](https://github.com/Netflix/metaflow/issues/3275) notes that the AWS Secrets Manager provider cannot yet accept custom session variables or client parameters, forcing users to share credentials between the secrets backend and the default S3 datastore even when those should differ.
2. **Decorator double-init under StepMutators**: Issue [#3258](https://github.com/Netflix/metaflow/issues/3258) documents how `_process_late_attached_decorator` changes in `2.19.33` regressed card attachment.
3. **Dependency hygiene**: Patches in releases `2.19.27` through `2.19.34` repeatedly fix `pip` config parsing, conda decorator version detection, and async card-process timeout semantics (e.g. moving from `time.time()` to `time.monotonic()` in PR #3225). Source: release notes for [2.19.34](https://github.com/Netflix/metaflow/releases/tag/2.19.34), [2.19.32](https://github.com/Netflix/metaflow/releases/tag/2.19.32), and [2.19.27](https://github.com/Netflix/metaflow/releases/tag/2.19.27).
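
The switch from `time.time()` to `time.monotonic()` mentioned in item 3 matters because wall-clock time can jump when the system clock is adjusted, while a monotonic clock never goes backwards. The following is a generic sketch of a timeout loop built on a monotonic deadline. It is not the code from the referenced PR, and `wait_until` is a hypothetical helper.

```python
import time


def wait_until(predicate, timeout_s, poll_s=0.01):
    """Poll `predicate` until it returns True or `timeout_s` seconds elapse."""
    # A monotonic deadline is unaffected by NTP corrections or manual
    # changes to the system clock.
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


start = time.monotonic()
print(wait_until(lambda: time.monotonic() - start > 0.05, timeout_s=1.0))  # True
print(wait_until(lambda: False, timeout_s=0.05))  # False
```

With `time.time()`, a backwards clock adjustment during the wait could extend the timeout indefinitely, and a forwards adjustment could end it early.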

The default metadata provider can be overridden via the `METADATA_PROVIDERS` registry imported at [metaflow/client/core.py:55-65](https://github.com/Netflix/metaflow/blob/main/metaflow/client/core.py). The `MAX_ATTEMPTS` constant imported alongside it caps the number of attempts per task, which in turn bounds the attempt index the client accepts when addressing a specific task attempt.

## See Also

- [Metaflow client API reference](https://docs.metaflow.org/metaflow/client) — programmatic introspection of runs.
- [Cards guide](https://docs.metaflow.org/metaflow/visualizing-results) — Python-side authoring of card components.
- [Devstack setup](https://docs.metaflow.org/infrastructure/metaflow-on-aws) — production-equivalent local Kubernetes environment.

---

## Pitfall Log

Project: Netflix/metaflow

Summary: Found 8 structured pitfall items, including 1 high/blocking item. Top priority: security or permission risk that requires verification.

## 1. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/Netflix/metaflow/issues/3275

## 2. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/Netflix/metaflow

## 3. Runtime risk - Runtime risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/Netflix/metaflow/issues/3258

## 4. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Netflix/metaflow

## 5. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/Netflix/metaflow

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/Netflix/metaflow

## 7. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Netflix/metaflow

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/Netflix/metaflow

<!-- canonical_name: Netflix/metaflow; human_manual_source: deepwiki_human_wiki -->
