Doramagic Project Pack · Human Manual
kernels
Build compute kernels and load them from the Hub.
Project Overview and System Architecture
Related topics: Loading Kernels with the kernels Python Package, Building Kernels with kernel-builder and the Nix Builder, Example Kernels and Backend Variants
The huggingface/kernels project provides a unified stack for distributing and consuming hardware-specific compute kernels as ordinary Python packages. It is designed around three roles: a Python runtime that loads kernels from the Hub, a kernel-builder CLI that turns upstream C++/CUDA/HIP/XPU/TVM-FFI sources into Hub-ready artifacts, and a kernels-data shared library that holds the canonical build configuration and metadata schema. The project is "kernel repo type" aware on the Hub, meaning kernels are first-class repositories with their own repo_type="kernel", versioning, and discovery surface (nix-builder/README.md).
High-Level Architecture
The system is best understood as a pipeline: a kernel author writes build.toml + sources, the kernel-builder CLI compiles per-backend variants and uploads them to the Hub, and an end user resolves a kernel through the kernels Python package. The diagram below shows the major components and the data that flows between them.
flowchart LR
A[Kernel source<br/>build.toml + C++/CUDA] --> B[kernel-builder CLI]
B -->|pyproject + setup.py| C[Per-backend wheels<br/>build/torch-cuda, ...]
C -->|upload| D[(Hugging Face Hub<br/>repo_type=kernel)]
D -->|get_kernel| E[kernels Python package]
E -->|torch/TVM-FFI| F[User model]
B -.-> G[kernels-data<br/>config + metadata]
E -.-> G
G -->|Backend enum| B
The shared kernels-data crate defines the Backend enum (Cann, Cpu, Cuda, Metal, Neuron, Rocm, Xpu) and the build.toml schema used by every component, ensuring that the CLI, the Python loader, and the Hub metadata stay in lock-step (kernels-data/src/lib.rs, kernels-data/bindings/python/src/lib.rs).
Build System: `kernel-builder`
The Rust-based kernel-builder CLI exposes the full kernel lifecycle through subcommands. Reading kernel-builder/src/main.rs, the available commands include Init (scaffold a new kernel), CheckConfig / CheckBuilds (validate build.toml and outputs), CreatePyproject (render setup.py and pyproject.toml for a specific backend), Devshell (drop into a Nix dev shell), FillCard (render the Hub model card), and Upload.
`build.toml` and kernel identification
Every kernel declares its identity in build.toml. The [general] section lists the supported backends, the Python name, license, and an optional [general.hub] block with the destination repo-id and branch. The kernels-data parser materializes this into a typed Build structure and rejects invalid configurations (kernels-data/src/config/v1.rs).
At build time each artifact is suffixed with a unique identifier to avoid module name collisions when multiple versions of the same kernel are loaded side by side. KernelIdentifier::new derives a Git short hash from the source tree and falls back to a random string when Git is unavailable, then composes identifiers of the form _<name>_<backend>_<unique_id> (kernel-builder/src/pyproject/ops_identifier.rs).
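The identifier scheme above can be sketched in Python. This is an illustration only, not the Rust implementation: `make_unique_id` and `ops_identifier` are hypothetical names, and the length of the random fallback is an assumption.

```python
import secrets
import subprocess


def make_unique_id(source_dir: str) -> str:
    """Return a Git short hash for source_dir, or a random string as a fallback."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=source_dir,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # Git is missing or source_dir is not a repository: use a random suffix.
        # The 10-character length is an assumption made for this sketch.
        return secrets.token_hex(5)


def ops_identifier(name: str, backend: str, unique_id: str) -> str:
    """Compose an identifier of the form _<name>_<backend>_<unique_id>."""
    return f"_{name}_{backend}_{unique_id}"
```

Because the suffix changes with the source revision, two builds of the same kernel get distinct module names and can be imported into one process without clashing.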
Per-backend templates
The CLI generates a backend-specific setup.py from Jinja templates. The Torch CPU/noarch template implements a custom BuildKernel setuptools command that reads build.toml, intersects the requested --backends with those declared in [general], and invokes the per-backend builder under build/ (kernel-builder/src/pyproject/templates/torch/noarch/setup.py). A parallel tvm_ffi template renders the same scaffolding for the experimental TVM-FFI framework (kernel-builder/src/pyproject/tvm_ffi/mod.rs). The nix-builder wraps the same flow inside a reproducible Nix expression, leveraging the Hugging Face binary cache for fast incremental builds (nix-builder/README.md).
Uploading and model cards
upload resolves the target repo_id and branch from CLI args, the build.toml [general.hub] block, or the per-variant metadata.json files. This fallback chain lets authors pin a build to a specific branch such as build-toml-branch while still allowing CI overrides (kernel-builder/src/upload.rs). The companion fill-card command renders CARD.md from a template that includes usage snippets, available functions, optional layers, and benchmark instructions (kernel-builder/src/card.rs, kernel-builder/src/init/templates/CARD.md).
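The fallback chain can be pictured as a "first value that is set wins" lookup. The sketch below is a hypothetical helper (`resolve_setting` is not a kernel-builder function) and assumes each source yields either a string or nothing.

```python
from typing import Optional, Sequence


def resolve_setting(
    cli_value: Optional[str],
    build_toml_value: Optional[str],
    metadata_values: Sequence[Optional[str]],
) -> Optional[str]:
    """Return the first value that is set, checking sources in priority order."""
    # Priority: CLI argument, then [general.hub] in build.toml,
    # then any value found in the per-variant metadata.json files.
    for candidate in (cli_value, build_toml_value, *metadata_values):
        if candidate:
            return candidate
    return None


# A CI job can override the branch pinned in build.toml by passing a CLI value.
branch = resolve_setting("ci-override", "build-toml-branch", [])
```

The same helper applies to both the repo id and the branch, since the text describes one chain for both.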
Python Runtime: the `kernels` Package
The runtime is the consumer-facing half of the project. It centers on get_kernel, which resolves a repo_id plus version to a metadata.json, picks the best variant for the current PyTorch build, downloads the wheel, and registers it as a loadable Python module — even from a path outside PYTHONPATH (kernels/src/kernels/utils.py). The KERNELS_CACHE environment variable configures the cache directory, and LOCAL_KERNELS allows overriding repo IDs with local build directories for development without uploading.
Versioning, lockfiles, and reproducibility
Starting with v0.15.1, version is mandatory when calling get_kernel; bare repo lookups are no longer accepted (see v0.15.1 release notes). The runtime ships a lockfile mechanism that pins every file in a variant to its LFS/Blob SHA, which is essential for reproducibility (kernels/src/kernels/lockfile.py). This is complemented by a proposal to record the kernel-builder Git SHA plus a dirty flag in the build metadata so that consumers can detect builds produced from uncommitted sources (issue #648).
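To illustrate why pinning every file to a hash supports reproducibility, here is a minimal sketch of verifying a downloaded variant against a set of pins. It is not the lockfile module's code: `verify_pinned_files` and the `{relative_path: sha256}` layout are hypothetical, and using SHA-256 for every file is a simplifying assumption.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_files(variant_dir: Path, pinned: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or differ from their pin."""
    mismatches = []
    for relative_path, expected in pinned.items():
        target = variant_dir / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(relative_path)
    return mismatches
```

An empty result means every pinned file is present and byte-identical to what was locked.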
Layers, functions, and benchmarks
Beyond raw module loading, the package exposes higher-level integration patterns:
| Construct | Purpose | Source |
|---|---|---|
| FuncRepository | Reference a single function (e.g. silu_and_mul) inside a kernel repo; supports can_torch_compile / can_backward (v0.15.2) | kernels/src/kernels/layer/func.py |
| use_kernel_func_from_hub | Decorator factory that makes a function kernel-pluggable | kernels/src/kernels/layer/func.py |
| Benchmark | Base class for kernels benchmark <repo> scripts with setup, verify_*, and timing hooks | kernels/src/kernels/benchmark.py |
Community and Operations
A few community-driven concerns have shaped the architecture. The Hub now treats kernels as a first-class "kernel" repo type, with an overview page at huggingface.co/kernels that lets users filter by backend (v0.14.0). Trust gating uses the Hub API to check publishers, addressing concerns around untrusted kernels. Security analysis reports for kernels-community repos are tracked in issue #657, and additional integration requests (FlagOS, causal-conv1d, XPU skill packaging) are openly discussed in the tracker (kernel-builder/skills/xpu-kernels/README.md, issue #130, issue #317).
Source: https://github.com/huggingface/kernels / Human Manual
Loading Kernels with the `kernels` Python Package
Related topics: Project Overview and System Architecture, Building Kernels with kernel-builder and the Nix Builder, Example Kernels and Backend Variants
The kernels package is a thin runtime layer that lets Python applications and libraries pull pre-built compute kernels (CUDA, ROCm, XPU, Metal, CPU, Neuron, CANN) directly from the Hugging Face Hub and load them as if they were regular Python modules. Unlike a normal pip-installed extension, a Hub-loaded kernel is portable (it can be loaded from paths outside PYTHONPATH), unique (multiple versions can coexist in a single process), and compatible (it works across recent Python versions and multiple PyTorch ABIs) Source: [kernels/README.md:1-12].
High-Level Loading Flow
When a caller invokes get_kernel("owner/repo", version=N), the package resolves the requested Hub revision, enumerates the available per-backend build variants, selects the variant that matches the host's current backend (e.g. cuda, xpu, cpu, metal, rocm, neuron, cann), downloads the wheel into a local cache, and imports the resulting Python module. The selected module is then registered in a process-global registry so it can be inspected with get_loaded_kernels() Source: [kernels/src/kernels/utils.py:1-120].
flowchart LR
A[User calls get_kernel] --> B[Resolve Hub repo + version]
B --> C[Fetch variant list via Hub API]
C --> D{Backend matches host?}
D -- No --> E[Raise unsupported backend]
D -- Yes --> F[Download wheel to KERNELS_CACHE]
F --> G[Import module from cache path]
G --> H[Register LoadedKernel in _loaded_kernels]
H --> I[Return module to caller]
Core Public API
The package exposes a small, focused set of entry points. The table below summarises the loaders and their parameters as defined in the source.
| Function | Source | Key arguments | Returns |
|---|---|---|---|
| get_kernel(repo_id, version=..., revision=..., backend=..., trust_remote_code=...) | kernels/src/kernels/utils.py | repo_id ("owner/name"), version (required integer, see v0.15.1 below), revision (branch/tag/sha), backend (auto-detected), trust_remote_code | Imported ModuleType |
| get_local_kernel(path, ...) | kernels/src/kernels/utils.py | Local Path to a built kernel tree (e.g. build/) | Imported ModuleType |
| load_kernel(repo_id, lockfile=..., backend=..., revision=...) | kernels/src/kernels/utils.py:170-200 | Mutually exclusive lockfile or revision; if both absent the locked SHA is read from caller package metadata | Imported ModuleType |
| get_loaded_kernels() | kernels/src/kernels/utils.py:60-80 | None | list[LoadedKernel] snapshot |
| get_local_kernel_overrides() (via LOCAL_KERNELS) | kernels/src/kernels/utils.py:90-120 | Colon-separated name=path entries | Mapping of repo name → local path |
A minimal end-to-end example
import torch
from kernels import get_kernel
# `version` is required since v0.15.1
activation = get_kernel("kernels-community/activation", version=1)
x = torch.randn((10, 10), dtype=torch.float16, device="cuda")
y = torch.empty_like(x)
activation.gelu_fast(y, x)
Source: [kernels/README.md:21-39]
The `LoadedKernel` Data Model
Every successfully imported kernel is wrapped in a LoadedKernel dataclass that captures both the runtime handle and the descriptive metadata needed for introspection, logging, and reproducibility checks Source: [kernels/src/kernels/utils.py:30-80].
| Field | Type | Meaning |
|---|---|---|
| metadata | Metadata | Backend-agnostic descriptor: id, name, version, license, upstream, source, python_depends, backend |
| module | ModuleType | The imported Python module exposing the kernel ops |
| repo_info | `RepoInfo \| None` | (repo_id, revision) for Hub loads; None for get_local_kernel / load_kernel / get_locked_kernel |
Metadata and the Backend enum are produced by the Rust core (kernels-data) and re-exported to Python through PyO3 bindings. The supported backends are CANN, CPU, CUDA, Metal, Neuron, ROCm, and XPU Source: [kernels-data/bindings/python/src/lib.rs:1-60].
Configuration & Environment Variables
Two environment variables shape the runtime behaviour of the loader:
- KERNELS_CACHE: overrides the directory where downloaded wheels are stored. If unset, the package falls back to its default cache location. Source: [kernels/src/kernels/utils.py:82-88].
- LOCAL_KERNELS: a colon-separated list of repo_name=path entries that take precedence over Hub downloads. This is the recommended way to point an application at a freshly built kernel during development, and is exactly how the kernel-builder init example script tests a local build. Source: [kernels/src/kernels/utils.py:90-120; kernel-builder/src/init/templates/example.py:1-20].
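The LOCAL_KERNELS format described above (colon-separated name=path entries) can be parsed in a few lines. This is a sketch under that stated format only: `parse_local_kernels` is a hypothetical helper, and how the real loader treats malformed entries is an assumption.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def parse_local_kernels(environ: Mapping[str, str] = os.environ) -> Dict[str, Path]:
    """Parse LOCAL_KERNELS ("name=path:name=path") into a name -> path mapping."""
    overrides: Dict[str, Path] = {}
    raw = environ.get("LOCAL_KERNELS", "")
    for entry in raw.split(":"):
        if not entry:
            continue  # tolerate empty segments such as a trailing colon
        name, separator, path = entry.partition("=")
        if not separator or not name or not path:
            # Rejecting malformed entries is a choice made for this sketch.
            raise ValueError(f"expected name=path, got {entry!r}")
        overrides[name] = Path(path)
    return overrides
```

For example, `LOCAL_KERNELS=relu=/work/relu/build` would map the name `relu` to that build directory.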
The trust_remote_code flag (passed to get_kernel) controls whether the loader will execute repository code that is not signed by a trusted Hub publisher. As of v0.14.1 the package consults the Hub API to verify publisher trust rather than relying solely on a local allow-list Source: [kernels/src/kernels/utils.py:130-160].
Integration Patterns: `FuncRepository` and Decorators
For library authors who want to *map* PyTorch modules onto Hub-backed kernel functions, the package ships a layer system. FuncRepository references a single function inside a Hub kernel repo and exposes it as a torch.nn.Module subclass; LocalFuncRepository does the same for a function inside a locally built kernel directory. Both classes override __hash__ / __eq__ so they can be used as dictionary keys in registry-style dispatch tables Source: [kernels/src/kernels/layer/func.py:1-120].
The companion decorator use_kernel_func_from_hub(func_name) rewrites a plain Python function so that, when called on a torch.nn.Module, it is replaced by the Hub kernel version — provided the caller has explicitly opted in via trust_remote_code. v0.15.2 added first-class support for can_torch_compile and can_backward flags on FuncRepository, allowing the loader to skip functions that would not survive torch.compile or autograd tracing Source: [kernels/src/kernels/layer/func.py:120-180; release v0.15.2].
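Overriding `__hash__` and `__eq__` together is what lets two separately constructed references to the same function resolve to the same dictionary entry. The class below is a hypothetical stand-in (`FuncRefSketch` is not part of the package), and its choice of key fields is an assumption.

```python
class FuncRefSketch:
    """Value-like reference to one function in a kernel repo (hypothetical)."""

    def __init__(self, repo_id: str, func_name: str, version: int) -> None:
        self.repo_id = repo_id
        self.func_name = func_name
        self.version = version

    def _key(self):
        # Equality and hashing must agree, so both derive from the same tuple.
        return (self.repo_id, self.func_name, self.version)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, FuncRefSketch):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())


# Registry-style dispatch table keyed by the reference itself.
dispatch = {FuncRefSketch("kernels-community/activation", "silu_and_mul", 1): "hub"}
lookup = FuncRefSketch("kernels-community/activation", "silu_and_mul", 1)
```

Here `dispatch[lookup]` finds the entry even though `lookup` is a different object from the key that was stored.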
Benchmarking Loaded Kernels
The CLI subcommand kernels benchmark <repo_id> runs user-defined benchmarks against a loaded kernel. Users subclass kernels.benchmark.Benchmark, implement setup() plus one or more benchmark_*() methods, and (optionally) verify_*() methods that return a reference tensor. The runner handles device synchronisation across CUDA, XPU, and MPS, supports warmup/iteration counts, and can upload results to the Hub via POST /api/kernels/{repo_id}/benchmarks Source: [kernels/src/kernels/benchmark.py:1-50; kernels/src/kernels/cli/benchmark.py:1-120].
import torch
from kernels import Benchmark

class SiluBenchmark(Benchmark):
    def setup(self):
        self.x = torch.randn(128, 1024, device=self.device, dtype=torch.float16)

    def benchmark_silu(self):
        self.kernel.silu_and_mul(self.x)
Source: [kernels/src/kernels/benchmark.py:10-30]
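The warmup and iteration behaviour mentioned above can be sketched as a small timing loop. This is not the runner's code: `time_callable` is a hypothetical helper, and the default counts are arbitrary values picked for illustration.

```python
import time
from typing import Callable, List, Optional


def time_callable(
    fn: Callable[[], object],
    warmup: int = 3,
    iterations: int = 10,
    synchronize: Optional[Callable[[], None]] = None,
) -> List[float]:
    """Return per-iteration wall-clock durations in seconds."""

    def run_once() -> None:
        fn()
        if synchronize is not None:
            synchronize()  # wait for queued device work before reading the clock

    for _ in range(warmup):
        run_once()  # warmup runs are executed but not timed

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return samples
```

On a CUDA device one would typically pass `torch.cuda.synchronize` as `synchronize`, because kernel launches are asynchronous and the clock would otherwise measure only the launch.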
Common Failure Modes
| Symptom | Likely cause | Fix |
|---|---|---|
| ValueError: version is required | Pre-v0.15.1 call without version=. Since v0.15.1, specifying the kernel version is mandatory. | Pass an explicit version=N. Source: [release v0.15.1] |
| RuntimeError: no matching backend | Host PyTorch is built for a backend that the repo does not publish (e.g. loading a CUDA-only kernel on an XPU machine). | Pick a repo that publishes the matching variant, or set backend= explicitly. Source: [kernels/src/kernels/utils.py:130-160] |
| ValueError: lockfile and revision both cannot be specified | Passed both to load_kernel. | Use exactly one. Source: [kernels/src/kernels/utils.py:170-200] |
| Stale local build being ignored | A wheel from a previous run is still in the cache and is preferred over LOCAL_KERNELS. | Clear KERNELS_CACHE or set LOCAL_KERNELS so local paths win. Source: [kernels/src/kernels/utils.py:82-120] |
| "Dirty" build warning | Build emitted by an uncommitted kernel-builder checkout. | Pin a released kernel-builder version. Source: [issue #648] |
See Also
- kernels/README.md: package overview and quick start
- kernel-builder/README.md: building and uploading kernels
- kernels/src/kernels/utils.py: loader implementation
- kernels/src/kernels/layer/func.py: FuncRepository and decorator
- kernels/src/kernels/benchmark.py: benchmarking base class
- Hub kernels overview: searchable registry of published kernels
- Release notes: v0.15.2, v0.15.1, v0.14.1, v0.14.0
Source: https://github.com/huggingface/kernels / Human Manual
Building Kernels with `kernel-builder` and the Nix Builder
Related topics: Project Overview and System Architecture, Loading Kernels with the kernels Python Package, Example Kernels and Backend Variants
Overview and Purpose
The kernel-builder CLI (with its companion Nix package in nix-builder/) is the upstream half of the Hugging Face Kernels system. It scaffolds, compiles, validates, and uploads compute kernels that are then loaded at runtime by the kernels Python package. According to nix-builder/README.md, the builder exists to guarantee three properties:
- Portable: kernels can be loaded from paths outside PYTHONPATH.
- Unique: multiple versions of the same kernel can coexist in a single Python process.
- Compatible: kernels support recent Python versions and the various PyTorch build configurations (different CUDA versions and C++ ABIs).
Under the hood, the builder is a Rust binary defined in kernel-builder/src/main.rs that orchestrates a Nix-based build via the nix CLI, while a separate Python setup.py template drives the per-backend compilation. The whole pipeline produces Hub-compatible kernel repositories that can later be installed with kernels.install_kernel (see kernels/src/kernels/__init__.py).
The Kernel-Builder CLI
Subcommands
The CLI exposes its commands via the Cli enum in kernel-builder/src/main.rs. The headline commands are:
| Command | Purpose |
|---|---|
| init | Scaffold a new kernel project from a template. |
| build / build-and-copy | Compile the kernel variants via Nix. |
| devshell | Spawn a Nix development shell for hacking on the kernel. |
| create-pyproject | Render the generated CMake/pyproject.toml for inspection. |
| check-config | Validate build.toml. |
| check-abi | Verify ABI compatibility of an extension. |
| check-builds | Validate already-built artifacts. |
| upload | Push build artifacts to the Hugging Face Hub. |
| build-and-upload | Build and upload in one step. |
| fill-card | Render the CARD.md template for the kernel. |
Scaffolding a Kernel
kernel-builder init parses positional arguments through the InitArgs struct in kernel-builder/src/init.rs. It accepts an optional --name OWNER/REPO flag and a --backends list whose default values come from default_init_backends(). The BackendSelection enum supports the literal "all" (matching every supported backend) or any value accepted by Backend::from_str, which is defined in kernels-data/src/config/mod.rs. The supported backends are listed below.
Build, Upload, and Reproducibility
run_build and run_build_and_copy in kernel-builder/src/build.rs prepare a Nix flake and dispatch either nix build (per-variant attribute redistributable.{variant}) or nix run against the build-and-copy attribute. Each produced kernel carries a unique identifier assembled by KernelIdentifier::to_string_for_backend in kernel-builder/src/pyproject/ops_identifier.rs, formatted as _{name}_{backend}_{unique_id}. The identifier is derived from a Git short hash when available and from a random string otherwise. This ties into the reproducibility signal discussed in issue #648, which proposes embedding a dirty boolean and the commit SHA of kernel-builder itself in the build metadata.
run_upload in kernel-builder/src/upload.rs reads --repo-id and --branch either from CLI flags or from [general.hub] in build.toml and metadata.json. If neither is set, it falls back to detect_branch_from_metadata, which inspects each variant directory for metadata.json to derive the version branch.
Build Configuration and Backend Support
Backend Matrix
kernels-data/src/config/mod.rs defines Backend::all() as the canonical seven-backend set: Cann, Cpu, Cuda, Metal, Neuron, Rocm, Xpu. nix-builder/README.md summarises the current support tier:
| Backend | Kernels runtime | Kernel-builder | CI validated | Tier |
|---|---|---|---|---|
| CUDA | ✓ | ✓ | ✓ | 1 |
| ROCm | ✓ | ✓ | ✗ | 2 |
| XPU | ✓ | ✓ | ✗ | 2 |
| Metal | ✓ | ✓ | ✗ | 2 |
| Huawei NPU | ✓ | ✗ | ✗ | 3 |
| Neuron | ✓ (experimental) | ✗ | ✗ | 3 |
The same file warns that Neuron support is experimental and currently requires pre-release packages.
FFI Backends: Torch and TVM FFI
Per-backend scaffolding is generated by two sibling modules. kernel-builder/src/pyproject/torch/mod.rs handles Torch FFI builds, embedding CMake helpers such as build-variants.cmake, kernel.cmake, and backend-specific toolchains (e.g. compile-metal.cmake, hipify.py, metallib_to_header.py). kernel-builder/src/pyproject/tvm_ffi/mod.rs does the same for the newer TVM FFI backend, including a CUDA capability-detection script.
Both modules feed the same render_kernel_components function in kernel-builder/src/pyproject/kernel.rs, which switches on the Kernel variant (Cpu, Cuda, Rocm, Metal, Xpu) and emits the correct set of source paths for each kernel.
Setup and `build.toml`
After scaffolding, the project ships a build.toml and a generated setup.py. The template at kernel-builder/src/pyproject/templates/torch/noarch/setup.py defines a BuildKernel command that:
- Reads build.toml with tomllib (Python 3.11+) or tomli.
- Reads general.backends and intersects with --backends=… if provided.
- Creates a build/ directory and invokes build_backend for each requested backend.
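The first two steps above can be sketched as follows. This is not the template's code: `select_backends` is a hypothetical helper and the TOML values are illustrative, although the key names (general.name, general.license, general.backends, general.hub.repo-id) are the ones this manual describes.

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib

# Illustrative configuration; the values are made up for this sketch.
EXAMPLE_BUILD_TOML = """
[general]
name = "relu"
license = "Apache-2.0"
backends = ["cpu", "cuda", "metal"]

[general.hub]
repo-id = "kernels-community/relu"
"""


def select_backends(build_toml_text, requested=None):
    """Return declared backends, narrowed to `requested` when it is given."""
    config = tomllib.loads(build_toml_text)
    declared = config["general"]["backends"]
    if requested is None:
        return list(declared)
    # Intersect while preserving the declaration order from build.toml.
    return [backend for backend in declared if backend in requested]
```

Requesting a backend that build.toml does not declare simply drops it from the result, which is what an intersection implies; whether the real command also warns in that case is not covered here.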
A typical Nix invocation against an example project is shown in the README:
cd examples/relu
nix run .#build-and-copy \
--max-jobs 2 \
--cores 8 \
-L
To accelerate rebuilds, the README recommends enabling the Hugging Face binary cache with cachix use huggingface.
Uploading, Loading, and Tooling
Once the artifacts exist, kernel-builder upload (or the combined build-and-upload) pushes each variant to the Hub. After upload, downstream users consume kernels through the Python API surface listed in kernels/src/kernels/__init__.py: get_kernel, get_kernel_variants, install_kernel, get_local_kernel, get_locked_kernel, and helpers such as use_kernel_func_from_hub and replace_kernel_forward_from_hub. As announced in the v0.15.1 release notes, specifying a kernel version is now mandatory:
import kernels

# No longer valid since v0.15.1 (version omitted):
# activation = kernels.get_kernel("kernels-community/activation")

# Required form:
activation = kernels.get_kernel("kernels-community/activation", version=1)
For iteration and benchmarking, kernels/src/kernels/benchmark.py defines a Benchmark base class that auto-loads the kernel from a repo_id, runs setup(), and exposes benchmark_* / verify_* methods for the kernels benchmark runner. The CLI reference for kernel-builder is itself a recent pain point (see docs issue #621), and the Builder README notes that contributors can provision an EC2 development workspace via the scripts in terraform/README.md, which seeds a nix develop shell and a 1 TiB data volume for kernel work.
Source: https://github.com/huggingface/kernels / Human Manual
Example Kernels and Backend Variants
Related topics: Project Overview and System Architecture, Loading Kernels with the kernels Python Package, Building Kernels with kernel-builder and the Nix Builder
The examples/kernels/ directory in the repository contains reference kernel projects that demonstrate how to structure, configure, build, and distribute kernels for the Hugging Face Hub. Each example pairs a simple kernel implementation (ReLU) with a different execution backend or FFI layer, giving kernel authors a working template they can copy and adapt.
The three example projects currently shipped are:
- examples/kernels/relu: a baseline ReLU kernel built with the default Torch extension pipeline
- examples/kernels/relu-torch-stable-abi: a ReLU kernel built against PyTorch's stable C++ ABI
- examples/kernels/relu-tvm-ffi: a ReLU kernel exposed through the TVM-FFI layer
Supported Backends
The build pipeline resolves backend targets from the general.backends array in build.toml. The full set of supported backends is enumerated in the shared Backend enum used across the data model and CLI (Source: kernels-data/src/config/mod.rs:60-100):
| Backend | Description |
|---|---|
| cpu | CPU reference implementation |
| cuda | NVIDIA CUDA |
| metal | Apple Metal |
| rocm | AMD ROCm / HIP |
| xpu | Intel XPU |
| neuron | AWS Trainium / Neuron (NKI) |
| cann | Huawei CANN |
Each backend produces a separate artifact under the repository's build/ directory, identified by a directory name such as torch-cuda, torch-xpu, or torch-metal. The variant string and CMake/pyproject generation for these backends is implemented in the Torch pyproject module (Source: kernel-builder/src/pyproject/torch/mod.rs:1-30).
FFI Variants
Beyond the execution backend, the example kernels illustrate two distinct foreign-function interface (FFI) layers used to bridge C++ kernels to Python:
- Torch C++ extension: the default FFI used by the relu and relu-torch-stable-abi examples. Source files are compiled into a CPython extension and the functions are registered through torch::Library / TORCH_LIBRARY macros. The setup.py template used for backend-agnostic builds lives at kernel-builder/src/pyproject/templates/torch/noarch/setup.py.
- TVM FFI: used by the relu-tvm-ffi example. Kernels are registered through the TVM-FFI object system instead of torch::Library, which makes them consumable by any TVM-based runtime. The corresponding pyproject generation is implemented in kernel-builder/src/pyproject/tvm_ffi/mod.rs (Source: kernel-builder/src/pyproject/tvm_ffi/mod.rs:1-40), including a dedicated tvm_ffi/setup.py template.
Selecting a non-default FFI is driven by the build.toml schema parsed in kernels-data/src/config/v1.rs (Source: kernels-data/src/config/v1.rs:1-50), which the builder migrates into the current internal representation before rendering templates.
Common Project Layout
All three examples follow the same scaffold produced by kernel-builder init and described in the Nix-builder README (Source: nix-builder/README.md:1-40):
- build.toml: declarative build configuration. Required fields are general.name, general.license, and general.backends; the optional general.hub section supplies repo-id and branch for Hub distribution.
- flake.nix: Nix expression that drops the user into a reproducible development shell with the kernel-builder CLI on PATH and a writable Cachix cache for pre-built dependencies.
- CARD.md: Jinja2 template for the Hub model card. The master template lives at kernel-builder/src/init/templates/CARD.md and is filled in at upload time by kernel-builder/src/card.rs, which inspects the kernel's torch-ext/<module>/__init__.py and layers/__init__.py to enumerate functions and layers.
- torch-ext/<module>/: directory containing the C++ / CUDA / Metal / HIP source for the kernel, together with an __init__.py declaring the exposed functions and (optionally) a layers/__init__.py declaring nn.Module wrappers.
The init command is also where authors declare which backends to enable when scaffolding a new project; the supported values are described in kernel-builder/src/init.rs and map one-to-one to the Backend enum values listed above.
Build and Distribution Flow
The canonical build path for an example kernel is:
nix develop path:examples/kernels/relu
kernel-builder build
kernel-builder build reads build.toml, generates a per-backend setup.py / CMake configuration, and produces one wheel per backend under build/. Uploading those wheels to the Hub is handled by kernel-builder upload, which infers the destination repo-id and branch from CLI arguments, the general.hub section of build.toml, or the variant metadata.json (in that order of precedence) (Source: kernel-builder/src/upload.rs:1-40).
Once a kernel is on the Hub, it can be loaded from Python with an explicit version (required as of kernels v0.15.1):
from kernels import get_kernel
activation = get_kernel("kernels-community/relu", version=1)
relu = activation.relu
The current environment is matched against the available variants by get_kernel_variants, which returns a sorted list of compatibility decisions with the most preferred variant first (Source: kernels/src/kernels/utils.py:1-60). The matching criteria combine Python version, Torch version, the active CUDA/XPU/Metal backend, and the kernel's declared backends array.
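The idea of ranking variants with the most preferred first can be sketched as a filter followed by a sort. Everything here is hypothetical: `VariantSketch`, `rank_variants`, the rule that a variant must not be built for a newer Torch than the host, and the "closest Torch version first" ordering are assumptions made for illustration, not the criteria used by get_kernel_variants.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class VariantSketch:
    name: str                       # illustrative label, e.g. "torch-cuda"
    backend: str                    # "cuda", "xpu", "metal", ...
    torch_version: Tuple[int, int]  # (major, minor) the variant targets


def rank_variants(
    variants: Iterable[VariantSketch],
    host_backend: str,
    host_torch: Tuple[int, int],
) -> List[VariantSketch]:
    """Keep variants the host could load and put the closest Torch match first."""
    compatible = [
        variant
        for variant in variants
        if variant.backend == host_backend and variant.torch_version <= host_torch
    ]
    # Tuples compare element-wise, so (2, 6) sorts ahead of (2, 5) when reversed.
    return sorted(compatible, key=lambda variant: variant.torch_version, reverse=True)
```

An empty result corresponds to the "no matching backend" failure described in the loading chapter.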
See Also
- kernel-builder CLI reference
- build.toml schema reference (v1 migration handled in kernels-data/src/config/v1.rs)
- Hub kernel repository type (introduced in v0.14.0)
- Issue #648: tracking dirty-build reproducibility metadata that future versions of the example kernels will surface.
Source: https://github.com/huggingface/kernels / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 9 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/huggingface/kernels/issues/651
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/huggingface/kernels/issues/648
3. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/huggingface/kernels
4. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/huggingface/kernels
5. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/huggingface/kernels
6. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/huggingface/kernels
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/huggingface/kernels/issues/657
8. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/huggingface/kernels
9. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/huggingface/kernels
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using kernels with real data or production workflows.
- Push security analysis reports on the Hub for kernels-community kernel r - github / github_issue
- Guide users without kernel publishing access to open a discussion on ker - github / github_issue
- Signal when a kernel build comes from an uncommitted (dirty) kernel-buil - github / github_issue
- v0.15.2 - github / github_release
- v0.15.1 - github / github_release
- v0.14.1 - github / github_release
- v0.14.0 - github / github_release
- v0.14.0.dev1 - github / github_release
- v0.14.0.dev0 - github / github_release
- v0.13.0 - github / github_release
- v0.12.3 - github / github_release
- v0.12.2 - github / github_release
Source: Project Pack community evidence and pitfall evidence