Doramagic.ai

Agent SDK and Runtime · Preview

promptfoo

Agent SDK project for checking tool calls, state, handoffs, traces, evaluation, and permission boundaries.

Agent SDK · Tool calls · Handoffs · Tracing · Evaluation boundaries

Last verification date: 2026-06-21
Verification method: source evidence, semantic profile, public page gate, and static build acceptance.

Preview status · 2026-06-21

What is promptfoo?

01

Quick decision

Use this section to decide whether the project is worth a deeper read.
Best for: Developers building observable, testable, multi-tool agent applications.

Match the project to your task before installing it.

Capability: Agent runtime preflights, tool permissions, state/handoff boundaries, trace acceptance, and evaluation checks

Repository: promptfoo/promptfoo

22k stars · 2.0k forks

02

What it can do

Translate the upstream project into concrete capabilities the user can judge before installing.

1. Core Evaluation Engine & Architecture

Related topics: LLM Provider Ecosystem & Custom Integrations, Web UI, Code Scanning, Server & Deployment

Source: https://github.com/promptfoo/promptfoo / Human Manual

2. LLM Provider Ecosystem & Custom Integrations

Related topics: Core Evaluation Engine & Architecture, Red Teaming & Adversarial Security Testing

Source: https://github.com/promptfoo/promptfoo / Human Manual

3. Red Teaming & Adversarial Security Testing

Related topics: LLM Provider Ecosystem & Custom Integrations, Web UI, Code Scanning, Server & Deployment

Source: https://github.com/promptfoo/promptfoo / Human Manual

4. Web UI, Code Scanning, Server & Deployment

Related topics: Core Evaluation Engine & Architecture, Red Teaming & Adversarial Security Testing

Source: https://github.com/promptfoo/promptfoo / Human Manual

5. Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Source: Doramagic discovery, validation, and Project Pack records

Sources: https://github.com/promptfoo/promptfoo, Human Manual, Project Pack evidence, and downstream validation signals.

03

Community Discussion Evidence

Project-level external discussion stays visible on the detail page, not only inside the manual.
Stars: 22k
Forks: 2.0k
Contributors: 299
License: unknown

12 source-linked items

Review these external discussions before using promptfoo with real data or production workflows. They are review inputs, not standalone proof that the project is production-ready.

04

How to start

Only source-backed commands are shown here. Verify them in an isolated environment first.

1. Try the prompt first

Test the workflow without installing the upstream project.

preview

2. Read the Human Manual

Understand inputs, outputs, limits, and failure modes.

manual

3. Take context to your AI host

Use the compiled assets in your preferred AI environment.

context

4. Run sandbox verification

Confirm install commands and rollback before using a primary environment.

verify
npm install -g promptfoo

Official start command · https://github.com/promptfoo/promptfoo#readme · verified: yes
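
Step 4 asks for the install command and its rollback to be confirmed outside the primary environment. One way to do that, sketched here under assumptions, is to install into a throwaway npm prefix rather than globally. The `SANDBOX` and `DRY_RUN` variables and the `run` helper are illustrative names, not part of promptfoo or npm. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: sandboxed install check with rollback.
# SANDBOX, DRY_RUN and run() are illustrative, not promptfoo or npm features.
set -eu

SANDBOX="$(mktemp -d)"        # throwaway directory used as the npm prefix
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands only, 0 = execute them

run() {
  # Print the command; execute it only when DRY_RUN is not 1.
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# A non-global install with --prefix keeps the package out of the
# primary global node_modules; binaries land in node_modules/.bin.
run npm install --prefix "$SANDBOX" promptfoo
run "$SANDBOX/node_modules/.bin/promptfoo" --version

# Rollback: deleting the prefix removes everything installed above.
echo "+ rm -rf $SANDBOX"
rm -rf "$SANDBOX"
```

If the version check succeeds with `DRY_RUN=0`, the official `npm install -g promptfoo` command can then be run in the primary environment; `npm uninstall -g promptfoo` is the matching rollback for a global install.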

05

Human Manual

The English page must expose the real manual, not a short placeholder.

8+ sections · Human Manual

promptfoo Manual

Promptfoo is described in its manifest as an "LLM eval & testing toolkit" distributed as a Node.js ES module with dual entry points for import and require, and ships CLI binaries promptfoo...

Open the full manual
  1. https://github.com/promptfoo/promptfoo Project Manual
  2. Table of Contents
  3. Core Evaluation Engine & Architecture
  4. Related Pages
  5. Purpose and Scope
  6. MCP Tool Surface
  7. Provider and Assertion Architecture
  8. Redteam Subsystem

06

AI Context Pack and portable assets

After deciding to continue, take the project context into your own AI host.

Complete pack plus user-owned assets

These files are planning and verification assets for Claude Code, Codex, Gemini, Cursor, ChatGPT, and other AI hosts.

07

Preflight checks

Treat this page as a planning asset, not proof that your local environment is ready.

08

Pitfall Log and verification risks

Doramagic surfaces high-risk items before users treat a candidate capability as verified.
medium

Installation risk requires verification

Upgrade or migration may change expected behavior: 0.121.8

medium

Installation risk requires verification

Upgrade or migration may change expected behavior: code-scan-action: 0.1.6

medium

Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

medium

Configuration risk requires verification

Upgrade or migration may change expected behavior: 0.121.15

medium

Configuration risk requires verification

Developers may misconfigure credentials, environment, or host setup: Per-test-case `repeat` option to control how many times individual tests run

medium

Configuration risk requires verification

May increase setup, validation, or first-run risk for the user.

medium

Capability evidence risk requires verification

May increase setup, validation, or first-run risk for the user.

medium

Runtime risk requires verification

Upgrade or migration may change expected behavior: 0.121.12
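
Several entries above flag that behavior may change between releases. A minimal sketch of one mitigation is to record the release you validated and detect drift before running anything important. The `PINNED` variable and `check_pin` function are hypothetical names; `0.121.8` is used only because it appears in the log, and the script assumes `promptfoo --version` prints a bare version string.

```shell
#!/bin/sh
# Sketch: detect drift between the installed CLI and a validated release.
# PINNED and check_pin are illustrative names, not promptfoo features.
PINNED="${PINNED:-0.121.8}"   # illustrative; use the release you validated

check_pin() {
  # $1: version string reported by the installed CLI
  if [ "$1" = "$PINNED" ]; then
    echo "ok: $1 matches pin"
  else
    echo "drift: found '$1', expected $PINNED"
    return 1
  fi
}

# Assumption: `promptfoo --version` prints only the version number.
installed="$(promptfoo --version 2>/dev/null || echo missing)"
check_pin "$installed" || echo "re-run sandbox verification before continuing"
```

A specific release can be installed with npm's standard `package@version` syntax, for example `npm install -g promptfoo@<version>`.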