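Which users is this skill suited for, and what tasks can it handle?

It suits engineers who need strongly typed, structured output from an LLM: information extraction, form parsing, JSON APIs that return Pydantic models directly, parsing agent tool-call arguments, and similar tasks. It exposes one consistent API across 20 providers. See doramagic.ai/r/instructor for the full set of use cases.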

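What environment and dependencies are required?

Python 3.9+ (instructor declares >=3.9 in its pyproject). Pydantic v2 is effectively required: function_calls.py uses model_validate_json and TypeAdapter, both of which are v2-only, and v1 raises AttributeError on the Partial path.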
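The requirement above can be checked at startup. The following is a minimal sketch using only the standard library; the helper names is_supported and installed_pydantic_version are hypothetical and are not part of instructor.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def is_supported(python_version: Tuple[int, int], pydantic_version: Optional[str]) -> bool:
    """Return True when the interpreter is 3.9+ and Pydantic is v2 or newer."""
    if python_version < (3, 9):
        return False
    if pydantic_version is None:  # Pydantic is not installed at all
        return False
    major = int(pydantic_version.split(".")[0])
    return major >= 2


def installed_pydantic_version() -> Optional[str]:
    """Look up the installed Pydantic version without importing the package."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    ok = is_supported(sys.version_info[:2], installed_pydantic_version())
    print("environment OK" if ok else "need Python 3.9+ and Pydantic v2")
```

Failing fast here is cheaper than discovering the v1 AttributeError later on the Partial path.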

Instructor Structured Output

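What pitfalls are there, and how does this skill guard against them?

This skill ships with 47 built-in constraints (4 fatal). Typical pitfalls: (1) the failed_attempts XML grows linearly with each retry, so max_retries=5 can exceed the context window; (2) from_openai validates mode with assert, which python -O strips silently;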

Instructor: declare a Pydantic BaseModel and get typed instances back from 20 LLM providers. At its core, a monkey-patch (instructor.patch / from_*) intercepts create(), injects schema-aware kwargs, and adds tenacity retries +

API · AI · Machine Learning

✓ 0 success reports · v0.1.0 · Updated 2026-04-25

Crystal Overview

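Instructor is a Python framework that binds a Pydantic BaseModel directly to LLM output (github.com/jxnl/instructor). Its core mechanism is a monkey-patch (instructor.patch / instructor.from_*) that intercepts the provider client's create() call, injects schema-aware kwargs, runs inside a tenacity retry loop, and validates the JSON response against the model. On a ValidationError it rewrites the prompt with failed_attempts serialized as XML and tries again.

It supports 20 providers × 36 Mode enum values = 720 (provider, mode) combinations, dispatched through two dict tables. OpenAI is the default monkey-patch target (Mode.TOOLS by default); Anthropic, Google (gemini / vertexai / genai), and 9 SaaS providers each have their own from_* factory.

This skill ships with 47 constraints (4 of them fatal) covering the typical pitfalls: the failed_attempts XML grows linearly with each retry (max_retries=5 can exceed the context window); from_openai validates mode with assert (silently stripped under python -O); and ollama / azure_openai / google / litellm fall through to Provider.UNKNOWN (neither the assert nor ModeError fires).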
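The validate-and-reask loop described above can be sketched without any provider SDK. This is an illustration under stated assumptions, not instructor's implementation: validate_user stands in for Pydantic validation, fake_llm is a hypothetical model, the XML tag format is invented for the example, and max_retries here counts total attempts.

```python
import json
from typing import Callable, Dict, List, Tuple


def validate_user(raw: str) -> Dict[str, object]:
    """Stand-in for model validation: requires a string name and an int age."""
    data = json.loads(raw)
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("age must be an integer")
    return data


def create_with_reask(
    llm: Callable[[str], str], prompt: str, max_retries: int = 3
) -> Tuple[Dict[str, object], List[int]]:
    """Call llm, validate, and re-ask with the accumulated failures on error."""
    failed_attempts: List[str] = []
    prompt_sizes: List[int] = []
    for _ in range(max_retries):
        # Every earlier failure is appended, so the prompt grows with each retry.
        history = "".join(
            f"<failed_attempt n='{i}'>{text}</failed_attempt>"
            for i, text in enumerate(failed_attempts, start=1)
        )
        full_prompt = prompt + history
        prompt_sizes.append(len(full_prompt))
        raw = llm(full_prompt)
        try:
            return validate_user(raw), prompt_sizes
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            failed_attempts.append(f"{raw} -> {exc}")
    raise RuntimeError(f"validation failed after {max_retries} attempts")


def fake_llm(prompt: str) -> str:
    """Hypothetical model: returns a bad age until it has seen two failures."""
    if prompt.count("<failed_attempt") < 2:
        return '{"name": "Ada", "age": "thirty-six"}'
    return '{"name": "Ada", "age": 36}'


user, sizes = create_with_reask(fake_llm, "Extract the user as JSON.")
```

The sizes list records the prompt length on each attempt and increases every time, which is the growth pattern that can push a long retry chain past the context window.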

Blueprint Source

finance-bp-139

jxnl/instructor 3f1d6dd1 source file

Constraints

4 total

4 fatal

4 must-not-violate

Evidence Quality

Confidence 90%

High confidence — strong evidence base

4 Must-Not-Violate Constraints

FATAL · domain_rule · instructor-C-001

WHEN: Deploying instructor with from_openai (or routes that converge on it: OpenAI / OpenRouter / Anyscale / Together / Databricks) to production

ACTION (must not): Run the Python interpreter with the -O optimization flag, because from_openai validates the (provider, mode) pair via Python assert statements that -O strips silently

CONSEQUENCE: Under python -O, the assert mode in {...} blocks in from_openai are removed; invalid (provider, mode) combinations reach the LLM call, producing malformed kwargs, undefined provider responses, or silently wrong-shaped completions across 5 OpenAI-family base_urls

domain-constraint
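A minimal sketch of the difference between the two validation styles follows. The Mode subset and the OPENAI_FAMILY_MODES allow-list are assumptions made for illustration and do not reproduce instructor's actual tables.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical subset of modes, for illustration only."""
    TOOLS = "tools"
    JSON = "json"
    ANTHROPIC_TOOLS = "anthropic_tools"


# Assumed allow-list for an OpenAI-family client; the real set may differ.
OPENAI_FAMILY_MODES = {Mode.TOOLS, Mode.JSON}


def check_mode_with_assert(mode: Mode) -> None:
    # Removed entirely when Python runs with -O, so a bad mode slips through.
    assert mode in OPENAI_FAMILY_MODES, f"unsupported mode: {mode}"


def check_mode_explicit(mode: Mode) -> None:
    # A plain if/raise is ordinary control flow and survives -O.
    if mode not in OPENAI_FAMILY_MODES:
        raise ValueError(f"unsupported mode for OpenAI-family client: {mode}")


def running_optimized() -> bool:
    # __debug__ is False when the interpreter was started with -O or -OO.
    return not __debug__
```

Calling check_mode_explicit before constructing the client, or refusing to start when running_optimized() is true, keeps the check in place regardless of interpreter flags.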

FATAL · domain_rule · instructor-C-003

WHEN: Pointing instructor at self-hosted OpenAI-compatible endpoints (vLLM / TGI / Ollama / LiteLLM proxy) or at providers whose base_url is not in the 16-substring table (azure_openai / google / litellm / ollama)

ACTION (must not): Rely on instructor's automatic mode validation, because get_provider() will return Provider.UNKNOWN; neither the from_openai assert blocks nor the raise ModeError branches fire, leaving the (provider, mode) pair entirely unchecked

CONSEQUENCE: Self-hosted endpoints fall to Provider.UNKNOWN; the assert blocks dispatch on Provider enum values (OPENROUTER / ANYSCALE / TOGETHER / OPENAI / DATABRICKS), so all assertions silently pass and provider-specific optimizations are skipped. Debugging wrong-shaped responses then requires reading the dispatch table source

domain-constraint
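The fall-through can be illustrated with a simplified stand-in for substring-based provider detection. The table entries, detect_provider, and require_known_provider below are hypothetical; they only mirror the shape of the behavior described in this constraint.

```python
from enum import Enum


class Provider(Enum):
    OPENAI = "openai"
    OPENROUTER = "openrouter"
    TOGETHER = "together"
    UNKNOWN = "unknown"


# Illustrative stand-in for the substring table; the entries are assumptions.
BASE_URL_HINTS = {
    "api.openai.com": Provider.OPENAI,
    "openrouter.ai": Provider.OPENROUTER,
    "together.xyz": Provider.TOGETHER,
}


def detect_provider(base_url: str) -> Provider:
    """Return the first provider whose hint occurs in base_url, else UNKNOWN."""
    for hint, provider in BASE_URL_HINTS.items():
        if hint in base_url:
            return provider
    return Provider.UNKNOWN


def require_known_provider(base_url: str) -> Provider:
    """Fail loudly instead of letting an unrecognized endpoint go unchecked."""
    provider = detect_provider(base_url)
    if provider is Provider.UNKNOWN:
        raise ValueError(
            f"no mode validation available for {base_url!r}; validate the mode yourself"
        )
    return provider
```

A local endpoint such as http://localhost:11434/v1 matches no hint, so detect_provider returns Provider.UNKNOWN and any check keyed on a known provider is skipped; the explicit guard turns that silent skip into an error.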

FATAL · domain_rule · instructor-C-007

WHEN: Passing max_retries to instructor (especially via from_provider)

ACTION (must not): Treat max_retries as a single semantic. It appears at three independent code points with different defaults: patch.py default=1 (reask only), Instructor.create default=3 (reask only), and auto_client.py:180-185, which transparently passes it to openai.OpenAI(max_retries=...), the SDK's HTTP-level retry (network only). A single max_retries=5 passed to from_provider can yield 5 reasks × 5 SDK HTTP retries = 25 worst-case API calls

CONSEQUENCE: Passing one max_retries through from_provider transparently amplifies into both the instructor reask layer and the SDK HTTP retry layer, producing up to N×N API calls; on rate-limited or pay-per-call providers this drains the cost budget and triggers vendor throttling cascades within a single user request

domain-constraint
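The multiplication in this constraint can be reproduced with a small simulation. It is a sketch only: count_worst_case_calls is a hypothetical helper that models two nested retry layers in which every request fails, and it counts attempts per layer the same way the constraint's 5 × 5 arithmetic does.

```python
def count_worst_case_calls(reask_attempts: int, http_attempts: int) -> int:
    """Simulate every HTTP attempt failing inside every reask attempt."""
    calls = 0
    for _ in range(reask_attempts):      # validation-level retry loop
        for _ in range(http_attempts):   # transport-level retry loop
            calls += 1                   # each iteration is one API request
    return calls


shared = count_worst_case_calls(5, 5)      # one value feeding both layers
separated = count_worst_case_calls(5, 1)   # HTTP retries capped separately
```

Here shared evaluates to 25 and separated to 5, which is why it helps to configure the reask budget and the SDK's HTTP retry budget as two distinct settings.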

FAQ

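Update History

v0.1.0 · 2026-04-25 · Contributor: tangweigang-jpg

v0.1.0: First release on Doramagic.ai. A Pydantic structured-output framework based on jxnl/instructor, bilingual (Chinese and English), with 47 anti-pattern constraints (4 fatal) and 3 FAQ entries.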
