Match the project to your task before installing it.
Agent SDK and Runtime · Preview
promptfoo
Agent SDK project for checking tool calls, state, handoffs, traces, evaluation, and permission boundaries.
What it can do: Agent runtime preflights, tool permissions, state/handoff boundaries, trace acceptance, and evaluation checks. Review the portable capability path.
Before continuing: Verify in a sandbox. Do not treat a preview pack as a proven local install.
GitHub snapshot: 22k stars · 2.0k forks · 299 contributors
Doramagic.ai · Last verification date: 2026-06-21 · Verification method: source evidence, semantic profile, public page gate, and static build acceptance.
Preview status · 2026-06-21
What is promptfoo?
- promptfoo is an Agent SDK or runtime for tool calls, state, handoffs, tracing, and evaluation boundaries.
- Best fit: Developers building observable, testable, multi-tool agent applications.
- Not for: single-prompt use, simple API calls, or environments that cannot isolate tool permissions.
- Capability added to an AI workflow: Agent runtime preflights, tool permissions, state/handoff boundaries, trace acceptance, and evaluation checks
- First safe verification step: Verify one minimal agent loop with fake tools and temporary credentials first.
- Verification state: source, Quick Start, and sandbox install checks are recorded as passed.
- Top risk: An upgrade or migration may change expected behavior (release 0.121.8).
- Evidence base: https://github.com/promptfoo/promptfoo, https://github.com/promptfoo/promptfoo#readme, Human Manual, Pitfall Log
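The first safe verification step above (one minimal agent loop with fake tools and temporary credentials) can be sketched as follows. This is an illustrative sketch only, not promptfoo's API: `Sandbox`, `run_agent`, `fake_search`, `fake_delete`, and the placeholder credential are all hypothetical names, and the "agent" replays a scripted plan instead of calling a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Tuple

# All names below are hypothetical; nothing here is promptfoo's API.


@dataclass
class Sandbox:
    """Holds fake tools, an allowlist, a throwaway credential, and a trace."""
    tools: Dict[str, Callable[[str, str], str]]
    allowed: FrozenSet[str]
    credential: str = "temp-key-not-real"
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def call(self, name: str, arg: str) -> str:
        # Enforce the permission boundary before any tool runs.
        if name not in self.allowed:
            self.trace.append((name, arg, "denied"))
            return "denied"
        result = self.tools[name](arg, self.credential)
        # The trace records the call and its status, never the credential.
        self.trace.append((name, arg, "ok"))
        return result


def fake_search(query: str, credential: str) -> str:
    # Stub: returns canned text and touches no network.
    return f"stub result for {query}"


def fake_delete(path: str, credential: str) -> str:
    # Stub: only reports a deletion, it never touches the filesystem.
    return f"deleted {path}"


def run_agent(plan: List[Tuple[str, str]], sandbox: Sandbox) -> List[str]:
    """Replay scripted (tool, argument) steps in place of a real model."""
    return [sandbox.call(name, arg) for name, arg in plan]


sandbox = Sandbox(
    tools={"search": fake_search, "delete": fake_delete},
    allowed=frozenset({"search"}),
)
outputs = run_agent([("search", "docs"), ("delete", "/tmp/x")], sandbox)
```

Inspecting `sandbox.trace` afterwards shows which calls ran and which were denied, which is the kind of evidence worth collecting before any real tool or credential is wired in.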
01
Quick decision
Use this section to decide whether the project is worth a deeper read.
Agent SDK project for checking tool calls, state, handoffs, traces, evaluation, and permission boundaries.
02
What it can do
Translate the upstream project into concrete capabilities the user can judge before installing.
Core Evaluation Engine & Architecture
Related topics: LLM Provider Ecosystem & Custom Integrations, Web UI, Code Scanning, Server & Deployment
Source: https://github.com/promptfoo/promptfoo / Human Manual
LLM Provider Ecosystem & Custom Integrations
Related topics: Core Evaluation Engine & Architecture, Red Teaming & Adversarial Security Testing
Source: https://github.com/promptfoo/promptfoo / Human Manual
Red Teaming & Adversarial Security Testing
Related topics: LLM Provider Ecosystem & Custom Integrations, Web UI, Code Scanning, Server & Deployment
Source: https://github.com/promptfoo/promptfoo / Human Manual
Web UI, Code Scanning, Server & Deployment
Related topics: Core Evaluation Engine & Architecture, Red Teaming & Adversarial Security Testing
Source: https://github.com/promptfoo/promptfoo / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Source: Doramagic discovery, validation, and Project Pack records
Sources: https://github.com/promptfoo/promptfoo, Human Manual, Project Pack evidence, and downstream validation signals.
03
Community Discussion Evidence
Project-level external discussion stays visible on the detail page, not only inside the manual.
12 source-linked items
Review these external discussions before using promptfoo with real data or production workflows. They are review inputs, not standalone proof that the project is production-ready.
- 01 · Per-test-case `repeat` option to control how many times individual tests · github / github_issue
- 02 · code-scan-action: 0.1.8 · github / github_release
- 03 · 0.121.17 · github / github_release
- 04 · 0.121.16 · github / github_release
- 05 · 0.121.15 · github / github_release
- 06 · 0.121.14 · github / github_release
- 07 · code-scan-action: 0.1.7 · github / github_release
- 08 · 0.121.13 · github / github_release
- 09 · code-scan-action: 0.1.6 · github / github_release
- 10 · 0.121.12 · github / github_release
- 11 · 0.121.11 · github / github_release
- 12 · 0.121.10 · github / github_release
04
How to start
Only source-backed commands are shown here. Verify them in an isolated environment first.
- Try the prompt first (preview): Test the workflow without installing the upstream project.
- Read the Human Manual (manual): Understand inputs, outputs, limits, and failure modes.
- Take context to your AI host (context): Use the compiled assets in your preferred AI environment.
- Run sandbox verification (verify): Confirm install commands and rollback before using a primary environment.
Official start command: `npm install -g promptfoo` · https://github.com/promptfoo/promptfoo#readme · verified: yes
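One way to confirm the install command and its rollback before touching a primary environment is to point npm's global install root at a throwaway directory. The sketch below is an assumption-laden illustration: it relies on npm's `--prefix` option, the helper names `install_command` and `verify_in_sandbox` are hypothetical, and it defaults to a dry run so nothing is installed unless you opt in.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List


def install_command(prefix: Path) -> List[str]:
    # `--prefix` redirects npm's global install root to the given directory.
    return ["npm", "install", "-g", "promptfoo", "--prefix", str(prefix)]


def verify_in_sandbox(dry_run: bool = True) -> dict:
    """Build the install command for a temp prefix, optionally run it, then roll back."""
    prefix = Path(tempfile.mkdtemp(prefix="promptfoo-sandbox-"))
    command = install_command(prefix)
    report = {"command": command, "ran": False, "returncode": None}
    try:
        # Only execute when explicitly requested and npm is on PATH.
        if not dry_run and shutil.which("npm"):
            completed = subprocess.run(command, capture_output=True, text=True)
            report["ran"] = True
            report["returncode"] = completed.returncode
    finally:
        # Rollback: remove what was written under the prefix.
        # npm's download cache lives elsewhere and is not cleaned here.
        shutil.rmtree(prefix, ignore_errors=True)
    report["rolled_back"] = not prefix.exists()
    return report


report = verify_in_sandbox(dry_run=True)
```

Calling `verify_in_sandbox(dry_run=False)` inside a disposable container or VM would run the real install; a non-zero `returncode` is a signal to stop and consult the upstream README before proceeding.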
05
Human Manual
The English page must expose the real manual, not a short placeholder.
8+ sections · Human Manual
promptfoo Manual
Promptfoo is described in its manifest as an "LLM eval & testing toolkit" distributed as a Node.js ES module with dual entry points for import and require, and ships CLI binaries promptfoo...
Open the full manual
- https://github.com/promptfoo/promptfoo Project Manual
- Table of Contents
- Core Evaluation Engine & Architecture
- Related Pages
- Purpose and Scope
- MCP Tool Surface
- Provider and Assertion Architecture
- Redteam Subsystem
06
AI Context Pack and portable assets
After deciding to continue, take the project context into your own AI host.
Complete pack plus user-owned assets
These files are planning and verification assets for Claude Code, Codex, Gemini, Cursor, ChatGPT, and other AI hosts.
07
Preflight checks
Treat this page as a planning asset, not proof that your local environment is ready.
- The manual is generated from source-linked project files and Doramagic validation signals.
- Community evidence warnings stay visible instead of being converted into marketing claims.
- This preview remains noindex and excluded from sitemap/llms citation targets until English quality and index gates pass.
- Use the upstream repository as the final authority for installation commands, license, and version-specific behavior.
08
Pitfall Log and verification risks
Doramagic surfaces high-risk items before users treat a candidate capability as verified.
- Installation risk requires verification: Upgrade or migration may change expected behavior: 0.121.8
- Installation risk requires verification: Upgrade or migration may change expected behavior: code-scan-action: 0.1.6
- Configuration risk requires verification: May increase setup, validation, or first-run risk for the user.
- Configuration risk requires verification: Upgrade or migration may change expected behavior: 0.121.15
- Configuration risk requires verification: Developers may misconfigure credentials, environment, or host setup: Per-test-case `repeat` option to control how many times individual tests run
- Configuration risk requires verification: May increase setup, validation, or first-run risk for the user.
- Capability evidence risk requires verification: May increase setup, validation, or first-run risk for the user.
- Runtime risk requires verification: Upgrade or migration may change expected behavior: 0.121.12
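Because several risk entries name specific promptfoo releases, a small helper can show which of them an upgrade would cross. The sketch below is hypothetical tooling, not part of promptfoo or Doramagic: the `FLAGGED` list is copied from the entries on this page (the `code-scan-action` release is left out because it is a separate package), and `parse` and `flagged_between` are illustrative names.

```python
from typing import List, Tuple

# Release numbers copied from the risk entries on this page.
FLAGGED = ["0.121.8", "0.121.12", "0.121.15"]


def parse(version: str) -> Tuple[int, ...]:
    """Turn '0.121.8' into (0, 121, 8) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def flagged_between(current: str, target: str,
                    flagged: List[str] = FLAGGED) -> List[str]:
    """List flagged releases after `current`, up to and including `target`."""
    low, high = parse(current), parse(target)
    return [v for v in flagged if low < parse(v) <= high]


# Upgrading from 0.121.10 to 0.121.17 crosses two of the flagged releases.
crossed = flagged_between("0.121.10", "0.121.17")
```

Versions are compared as integer tuples rather than strings because a plain string comparison would order "0.121.8" after "0.121.12".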