# neo4j-graphrag-python - Doramagic AI Context Pack

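> Positioning: a pre-install trial and decision asset. It helps the host AI get off to a good start, but it does not mean the target project has been installed, executed, or verified.

## Sufficiency Principle

- **Sufficiency, not compression**: An AI Context Pack should be thorough enough for the host AI to understand the project's value, capability boundaries, entry points, risks, and evidence sources before work begins. It may be organized in layers, but the shortest possible summary is not the goal.
- **Compression strategy**: Compress only noise and repetition, never context that affects judgment or the quality of the initial work.

## How the Host AI Should Use This

You are reading the AI Context Pack that Doramagic compiled for neo4j-graphrag-python. Treat it as pre-work context: help the user understand who the project suits, what it can do, how to get started, what must be verified after installation, and where the risks lie. Do not claim that you have installed, run, or executed the target project.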

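## Claim Consumption Rules

- **Sources of fact**: Repo Evidence + Claim/Evidence Graph; the Human Wiki only supplies salience, terminology, and narrative structure.
- **Minimum status for facts**: `supported`
- `supported`: May be used as a project fact, but the answer must cite the claim_id and the evidence path.
- `weak`: Only a low-confidence lead; the user must be asked to verify it further.
- `inferred`: May be used only for risk notes or open questions, and must not be presented as a project fact.
- `unverified`: Must not be used as fact; state clearly that the evidence is insufficient.
- `contradicted`: The conflicting sources must be shown; do not pick a version on the user's behalf.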
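
The status rules above can be sketched as a small lookup. This is a minimal illustration, not part of the pack: the `Claim` dataclass, the usage labels, and the rule that a `supported` claim without an evidence path is downgraded are all assumptions made for the example; only the five status names come from the rules themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; the real pack schema may differ."""
    claim_id: str
    status: str
    confidence: float
    evidence_paths: tuple[str, ...] = ()


# Maps each status from the rules above to an illustrative usage label.
USAGE_BY_STATUS = {
    "supported": "fact",
    "weak": "low_confidence_lead",
    "inferred": "risk_or_open_question",
    "unverified": "insufficient_evidence",
    "contradicted": "show_conflicting_sources",
}


def usage_for(claim: Claim) -> str:
    """Return how a claim may be used in an answer."""
    usage = USAGE_BY_STATUS.get(claim.status, "insufficient_evidence")
    # Assumption: a supported claim with no evidence path cannot be cited,
    # so it is treated as insufficient evidence.
    if usage == "fact" and not claim.evidence_paths:
        return "insufficient_evidence"
    return usage


def cite(claim: Claim) -> str:
    """Format the claim_id and evidence path citation a fact must carry."""
    paths = ", ".join(claim.evidence_paths)
    return f"[{claim.claim_id} {claim.status} {claim.confidence:.2f}; evidence: {paths}]"
```

For example, `usage_for(Claim("clm_0005", "inferred", 0.45))` yields the risk-only label, so that claim would never be phrased as a project fact.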

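## Who It Suits Best

- **Developers using a host AI such as Claude, Codex, Cursor, or Gemini**: The README or plugin configuration mentions multiple host AIs. Evidence: `README.md` Claim: `clm_0002` supported 0.86

## What It Can Do

- **Command-line startup or installation flow** (requires post-install verification): The project documentation contains executable commands; real use requires running them locally or in the host environment. Evidence: `README.md` Claim: `clm_0001` supported 0.86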

## How to Get Started

- `pip install neo4j-graphrag` Evidence: `README.md` Claim: `clm_0003` supported 0.86, `clm_0004` supported 0.86
- `pip install "neo4j-graphrag[openai]"` Evidence: `README.md` Claim: `clm_0004` supported 0.86
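
Because this pack advises against running install commands in a primary environment, the sketch below only builds the commands for a trial install inside a throwaway virtual environment; it does not execute anything. The function name `isolated_install_commands` and the `trial-env` directory are hypothetical; the package names are the ones listed above.

```python
import sys
from pathlib import Path


def isolated_install_commands(env_dir: Path, extra: str = "") -> list[list[str]]:
    """Build (but do not run) commands for a trial install in a fresh venv."""
    package = f"neo4j-graphrag[{extra}]" if extra else "neo4j-graphrag"
    # Virtual environments keep executables in Scripts/ on Windows, bin/ elsewhere.
    bin_dir = env_dir / ("Scripts" if sys.platform == "win32" else "bin")
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [str(bin_dir / "python"), "-m", "pip", "install", package],
    ]


if __name__ == "__main__":
    for command in isolated_install_commands(Path("trial-env"), extra="openai"):
        print(" ".join(command))
```

Printing the commands rather than running them leaves the decision, and any required approval, with the user.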

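## Pre-Continue Decision Card

- **Current recommendation**: Administrator/security approval required
- **Why**: Continuing may involve keys, accounts, external services, or sensitive context, so obtain administrator or security approval first.

### 30-Second Decision

- **What to do now**: Obtain administrator/security approval
- **Minimal safe next step**: Run Prompt Preview first; if credentials or an enterprise environment are involved, get approval before a trial install
- **Do not trust yet**: Real output quality cannot be trusted before installation.
- **Continuing will touch**: Command execution, host AI configuration, the local environment, or project files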

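### What You Can Trust Now

- **Audience signal: developers using a host AI such as Claude, Codex, Cursor, or Gemini** (supported): Backed by a supported claim or project evidence, but that still does not equal real post-install results. Evidence: `README.md` Claim: `clm_0002` supported 0.86
- **Capability exists: command-line startup or installation flow** (supported): You can trust that the project contains signals of this capability; whether it fits your specific task still needs a trial or post-install verification. Evidence: `README.md` Claim: `clm_0001` supported 0.86
- **Quick Start / install command signals exist** (supported): You can trust that the project documentation contains a startup or install entry point; do not take that as a reason to run it directly in your primary environment. Evidence: `README.md` Claim: `clm_0003` supported 0.86, `clm_0004` supported 0.86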

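### What You Cannot Trust Yet

- **Real output quality cannot be trusted before installation.** (unverified): Prompt Preview can only show how the project guides the workflow; it cannot prove result quality in a real project.
- **Host AI version compatibility cannot be trusted before installation.** (unverified): Loading rules and version differences across hosts such as Claude, Cursor, Codex, and Gemini must be verified in a real environment.
- **Do not assume it leaves existing host AI behavior untouched.** (inferred): Skills, plugins, and AGENTS/CLAUDE/GEMINI instructions may change the host AI's default behavior. Evidence: `AGENTS.md`
- **Do not assume a safe rollback.** (unverified): Unless the project explicitly provides uninstall and restore instructions, verify in an isolated environment first.
- **After a real install, is it compatible with the user's current host AI version?** (unverified): Compatibility can only be verified in the actual host environment.
- **Does the project's output quality meet the user's specific task?** (unverified): A pre-install preview can only show the workflow and boundaries; it cannot replace a real evaluation.
- **Do the install commands require network access, elevated permissions, or global writes?** (unverified): This affects installation risk in both enterprise and personal environments. Evidence: `README.md`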

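### What Continuing Will Touch

- **Command execution**: Package managers, network downloads, local plugin directories, project configuration, or the user's home directory. Reason: Running even the first command can change the environment, so decide first whether it is worth running. Evidence: `README.md`
- **Host AI configuration**: Plugin, Skill, or rule-loading configuration for hosts such as Claude, Codex, Cursor, Gemini, and OpenCode. Reason: Host configuration changes how the AI works afterwards and may conflict with the user's existing rules. Evidence: `AGENTS.md`
- **Local environment or project files**: Install output, plugin caches, project configuration, or local dependency directories. Reason: The write scope and rollback method cannot be proven before installation, so isolated verification is needed. Evidence: `README.md`
- **Environment variables / API keys**: The project's entry documentation explicitly mentions API key, token, secret, or account credential configuration. Reason: If a real install needs credentials, use test credentials first and go through a permission/compliance review. Evidence: `README.md`, `examples/build_graph/from_config_files/simple_kg_pipeline_config.json`, `examples/build_graph/from_config_files/simple_kg_pipeline_config_url.json`
- **Host AI context**: The AI Context Pack, Prompt Preview, Skill routing, risk rules, and project facts. Reason: Imported context influences the host AI's later judgments, so unverified items must not be presented as facts.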

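### Minimal Safe Next Steps

- **Run Prompt Preview first**: Use the interactive pre-install trial to judge whether the way of working fits; it needs no authorization and changes nothing in the environment. (Applies to: any project, especially when output quality is unknown.)
- **Trial-install only in an isolated directory or with a test account**: Keeps install commands from polluting the primary host AI, real projects, or the user's home directory. (Applies when: there are signals of command execution, plugin configuration, or local writes.)
- **Back up the host AI configuration first**: Skills, plugins, and rule files may change the default behavior of Claude, Cursor, or Codex. (Applies when: a plugin manifest, Skill, or host rule entry point exists.)
- **Do not use real production credentials**: Once environment variables or API keys enter the host or toolchain, they can create account and compliance risk. (Applies when: environment signals such as API, TOKEN, KEY, or SECRET appear.)
- **After installing, verify only one minimal task**: Verify loading, compatibility, output quality, and rollback first, then decide whether to adopt it more deeply. (Applies when: moving from trial into a real workflow.)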
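
As one way to act on the credential item above, the following sketch lists environment variable names that contain the clue words API, TOKEN, KEY, or SECRET, so they can be reviewed before a trial install. The function name is hypothetical, and plain substring matching is an assumption that will produce false positives and can miss credentials with other names (for example passwords).

```python
import os
import re
from typing import Mapping, Optional

# Clue words taken from the checklist above; matching is case-insensitive.
SENSITIVE_NAME = re.compile(r"API|TOKEN|KEY|SECRET", re.IGNORECASE)


def sensitive_env_names(environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the sorted names (never the values) of suspicious variables."""
    env = os.environ if environ is None else environ
    return sorted(name for name in env if SENSITIVE_NAME.search(name))


if __name__ == "__main__":
    for name in sensitive_env_names():
        print(f"review before trial install: {name}")
```

Only names are reported, so the check itself does not copy secrets into logs or into the host AI's context.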

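### Exit Paths

- **Preserve the pre-install state**: Record the original host configuration and project state; only then can you later judge whether it is recoverable.
- **Be ready to remove the host plugin / Skill / rule entry point**: If behavior is abnormal after the trial install, the host AI can be restored to its pre-trial state.
- **Record the install commands and write paths**: Without explicit uninstall instructions, you at least need to know which directories or configuration require manual cleanup.
- **Be ready to revoke test API keys or tokens**: If test credentials leak or are misused, the damage can be contained quickly.
- **If there is no rollback path, stay out of the primary environment**: Lack of rollback is a blocker before continuing; do not proceed on trust or luck.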
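
Recording write paths can be approximated by snapshotting a directory before and after a trial install and diffing the two. The sketch below is an illustration under that assumption; `snapshot` and `written_paths` are hypothetical helper names, and the approach only detects newly created files, not files that were modified in place.

```python
from pathlib import Path


def snapshot(root: Path) -> set[str]:
    """Collect the relative paths of all files currently under root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def written_paths(before: set[str], after: set[str]) -> list[str]:
    """Return files present after the install that were absent before it."""
    return sorted(after - before)
```

A typical use would be to call `snapshot` on the isolated trial directory, run the install manually, call `snapshot` again, and keep the `written_paths` result as the manual cleanup list.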

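## Preview-Only Items

- Explaining who the project suits and what it can do
- Demonstrating a typical conversation flow based on the project documentation
- Helping the user judge whether it is worth installing or researching further

## What Must Be Verified After Installation

- Actually installing the Skill, plugin, or CLI
- Executing scripts, modifying local files, or accessing external services
- Verifying real output quality, performance, and compatibility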

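## Boundary and Risk Card

- **Mistaking the pre-install preview for a real run**: The user may overestimate how much configuration, permission, and compatibility verification the project has already completed. Mitigation: Clearly distinguish prompt_preview_can_do from runtime_required. Claim: `clm_0005` inferred 0.45
- **Command execution modifies the local environment**: Install commands may write to the user's home directory, the host plugin directory, or project configuration. Mitigation: Run them first in an isolated environment or with a test account. Evidence: `README.md` Claim: `clm_0006` supported 0.86
- **To confirm**: After a real install, is it compatible with the user's current host AI version? Reason: Compatibility can only be verified in the actual host environment.
- **To confirm**: Does the project's output quality meet the user's specific task? Reason: A pre-install preview can only show the workflow and boundaries; it cannot replace a real evaluation.
- **To confirm**: Do the install commands require network access, elevated permissions, or global writes? Reason: This affects installation risk in both enterprise and personal environments.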

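## Pre-Work Working Context

### Loading Order

- First read how_to_use.host_ai_instruction to establish the boundaries of this pre-install decision asset.
- Read claim_graph_summary to confirm that facts come from the Claim/Evidence Graph rather than the Human Wiki narrative.
- Then read intended_users, capabilities, and quick_start_candidates to judge whether the user is a match.
- When a specific task needs to be carried out, check role_skill_index first, then evidence_index.
- For questions about real installation, file modification, network access, performance, or compatibility, switch to risk_card and boundaries.runtime_required.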
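
The first three loading steps can be sketched as an ordered read over the pack. The section names come from the list above; representing the pack as a nested `dict` addressed by dotted keys is an assumption made for illustration, as are the helper names.

```python
from typing import Any

# Baseline sections, in the order the list above asks for them.
LOAD_ORDER = [
    "how_to_use.host_ai_instruction",
    "claim_graph_summary",
    "intended_users",
    "capabilities",
    "quick_start_candidates",
]


def lookup(pack: dict, dotted_key: str) -> Any:
    """Resolve a dotted key in a nested dict; return None if any part is missing."""
    node: Any = pack
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def load_in_order(pack: dict) -> list[tuple[str, Any]]:
    """Return (section name, content) pairs in the prescribed reading order."""
    return [(key, lookup(pack, key)) for key in LOAD_ORDER]
```

A missing section comes back as `None` rather than raising, so the caller can report insufficient evidence instead of guessing at the content.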

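### Task Routing

- **Command-line startup or installation flow**: First state that this capability requires post-install verification, then provide a pre-install checklist. Boundary: Must be verified by a real install or run. Evidence: `README.md` Claim: `clm_0001` supported 0.86

### Context Scale

- Total files: 262
- Key file coverage: 40/262
- Evidence index entries: 77
- Role / Skill entries: 9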

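### Handling Insufficient Evidence

- **missing_evidence**: State that the evidence is insufficient and ask the user for the target file, README section, or a post-install verification record; do not fill in facts.
- **out_of_scope_request**: State that the task is beyond the evidence scope of the current AI Context Pack, and suggest that the user consult the Human Manual or verify after a real install.
- **runtime_request**: Provide the pre-install checklist and the source of the commands, but do not execute commands for the user or claim to have executed them.
- **source_conflict**: Show the conflicting sources side by side and mark the item as pending verification; do not pick a version.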
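
These four cases amount to a dispatch table. The sketch below shows one possible shape; the four keys come from the list above, while the response wording and the function name `fallback_response` are illustrative assumptions.

```python
# Keys mirror the four cases above; the response text is illustrative only.
FALLBACKS = {
    "missing_evidence": (
        "Evidence is insufficient. Please provide the target file, README "
        "section, or a post-install verification record."
    ),
    "out_of_scope_request": (
        "This task is outside the evidence scope of the current AI Context "
        "Pack. Consult the Human Manual or verify after a real install."
    ),
    "runtime_request": (
        "Here is a pre-install checklist and where the commands come from. "
        "No command has been executed."
    ),
    "source_conflict": (
        "The sources conflict. Both are shown and the item is marked as "
        "pending verification."
    ),
}


def fallback_response(kind: str) -> str:
    """Return the canned response for a request kind, rejecting unknown kinds."""
    try:
        return FALLBACKS[kind]
    except KeyError:
        raise ValueError(f"unknown request kind: {kind!r}") from None
```

Rejecting unknown kinds explicitly keeps a new, unhandled situation from silently falling through to a response that implies facts the pack does not support.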

## Prompt Recipes

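### Fit Assessment

- Goal: Judge whether this project suits the user's current task.
- Expected output: A fit verdict, key reasons, evidence citations, what can be previewed before install, what must be verified after install, and suggested next steps.

```text
Based on the AI Context Pack for neo4j-graphrag-python, first ask me 3 essential questions, then judge whether it suits my task. The answer must cover: who it suits, what it can do, what it cannot do, whether it is worth installing, and where the evidence comes from. Every project fact must cite evidence_refs, source_paths, or a claim_id.
```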

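### Pre-Install Trial

- Goal: Let the user get a feel for the core workflow before installing, without presenting the preview as a real capability or a marketing promise.
- Expected output: A trial script with boundary labels, a post-install verification checklist, and a cautious recommendation; no promises about real runs and no strong marketing language.

```text
Treat neo4j-graphrag-python as a pre-install trial asset, not as an installed tool or a real runtime environment.

Output strictly in four parts:
1. First ask me 3 essential questions.
2. Give a "trial script": use the three labels [Previewable before install], [Must verify after install], and [Insufficient evidence] to show how it might guide the workflow.
3. Give a post-install verification checklist: list which capabilities can only be confirmed after a real install, real host loading, and a real project run.
4. Give a cautious recommendation: say only "worth further research / a trial install", "provide more information before deciding", or "not recommended to continue"; do not endorse the project.

Hard boundaries:
- Do not claim to have installed, run, executed tests, modified files, or produced real results.
- Do not write promissory phrases such as "adapts automatically", "guaranteed to pass", "perfect fit", or "strongly recommend installing".
- When describing how it works after installation, use a conditional such as "If the install succeeds and the host loads the Skill correctly, it may ...".
- Write the trial script only as "sample lines / a hypothetical flow": use "may ask / may suggest / may show", not "has written, has generated, has passed, is running, is generating".
- Prompt Preview is not responsible for giving install commands; if the user is ready for a trial install, only advise reading the Quick Start and Risk Card first and verifying in an isolated environment.
- Every project fact must come from a supported claim, evidence_refs, or source_paths; inferred/unverified items may serve only as risks or open questions.

```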

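### Role / Skill Selection

- Goal: Pick the best-matching assets from the project's roles or Skills.
- Expected output: A list of candidate roles or Skills, each with its applicable scenario, evidence path, risk boundaries, and whether post-install verification is needed.

```text
Read role_skill_index and recommend the 3-5 roles or Skills most relevant to my target task. For each recommendation, explain the applicable scenario, likely output, risk boundaries, and evidence_refs.
```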

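### Risk Pre-Check

- Goal: Identify environment, permission, rule-conflict, and quality risks before installing or adopting the project.
- Expected output: A checklist covering environment, permissions, dependencies, licensing, host conflicts, quality risks, and unknowns.

```text
Based on risk_card, boundaries, and quick_start_candidates, give me a pre-install risk checklist. Do not execute commands for me; just explain what I should check, why, and what the impact of a failure would be.
```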

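### Host AI Kickoff Instruction

- Goal: Turn the project context into a host AI instruction to use before a conversation starts.
- Expected output: A pre-work instruction with clear boundaries and clear evidence citations, ready to paste into the host AI.

```text
Based on the AI Context Pack for neo4j-graphrag-python, generate a pre-work instruction that I can paste into the host AI. The instruction must honor not_runtime=true and must not claim that the project has been installed, run, or produced real results.
```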

## Role / Skill Index

- Indexed 9 role / Skill / project-document entries in total.

- **Sphinx Documentation** (project_doc): Building the docs requires Python 3.9+. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `docs/README.md`
- **Neo4j GraphRAG Package for Python** (project_doc): The official Neo4j GraphRAG package for Python enables developers to build graph retrieval augmented generation ([GraphRAG](https://neo4j.com/blog/graphrag-manifesto/)) applications using the power of Neo4j and Python. As a first-party library, it offers a robust, feature-rich, and high-performance solution, with the added assurance of long-term support and maintenance directly from Neo4j. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `README.md`
- **Examples Index** (project_doc): This folder contains examples usage for the different features supported by the neo4j-graphrag package: Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `examples/README.md`
- **Usage Instructions** (project_doc): You will need both a Pinecone vector database and a Neo4j database to use this retriever. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `examples/customize/retrievers/external/pinecone/README.md`
- **Start services locally** (project_doc): Run the following command to spin up Neo4j and Qdrant containers. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `examples/customize/retrievers/external/qdrant/README.md`
- **Start services locally** (project_doc): This is a manual task you need to do in the terminal. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `examples/customize/retrievers/external/weaviate/README.md`
- **AGENTS.md** (project_doc): Learnings and patterns for future agents working on this project. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `AGENTS.md`
- **Contributing to the Neo4j Ecosystem** (project_doc): Contributing to the Neo4j Ecosystem. Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `CONTRIBUTING.md`
- **@neo4j/neo4j-graphrag-python** (project_doc): - Experimental: GraphSchema validation now rejects KEY and EXISTENCE constraints on the same node or relationship property (including composite KEY members), since KEY already implies mandatory presence. Legacy PropertyType.required migration no longer adds redundant EXISTENCE constraints for KEY-covered properties. The schema-from-text extraction prompt includes the same rule. - Experimental: the schema-from-text ex… Activation hint: Refer to this when the user needs to understand the project structure, installation method, or boundaries. Evidence: `CHANGELOG.md`

## Evidence Index

- Indexed 77 evidence entries in total.

- **Sphinx Documentation** (documentation): Building the docs requires Python 3.9+. Evidence: `docs/README.md`
- **Neo4j GraphRAG Package for Python** (documentation): The official Neo4j GraphRAG package for Python enables developers to build graph retrieval augmented generation ([GraphRAG](https://neo4j.com/blog/graphrag-manifesto/)) applications using the power of Neo4j and Python. As a first-party library, it offers a robust, feature-rich, and high-performance solution, with the added assurance of long-term support and maintenance directly from Neo4j. Evidence: `README.md`
- **Examples Index** (documentation): This folder contains examples usage for the different features supported by the neo4j-graphrag package: Evidence: `examples/README.md`
- **Usage Instructions** (documentation): You will need both a Pinecone vector database and a Neo4j database to use this retriever. Evidence: `examples/customize/retrievers/external/pinecone/README.md`
- **Start services locally** (documentation): Run the following command to spin up Neo4j and Qdrant containers. Evidence: `examples/customize/retrievers/external/qdrant/README.md`
- **Start services locally** (documentation): This is a manual task you need to do in the terminal. Evidence: `examples/customize/retrievers/external/weaviate/README.md`
- **AGENTS.md** (documentation): Learnings and patterns for future agents working on this project. Evidence: `AGENTS.md`
- **Contributing to the Neo4j Ecosystem** (documentation): Contributing to the Neo4j Ecosystem. Evidence: `CONTRIBUTING.md`
- **Kg Builder** (source_file): node types = relationship types = patterns = pipe = Pipeline ⋮---- pipe inputs = { ⋮---- async def main - PipelineResult ⋮---- llm = OpenAILLM driver = neo4j.GraphDatabase.driver res = await define and run pipeline driver, llm ⋮---- res = asyncio.run main Evidence: `examples/kg_builder.py`
- **Llm Entity Relation Extractor** (source_file): async def main llm: LLMInterface - Neo4jGraph ⋮---- extractor = LLMEntityRelationExtractor graph = await extractor.run Evidence: `examples/customize/build_graph/components/extractors/llm_entity_relation_extractor.py`
- **Schema** (source_file): async def main - None ⋮---- schema builder = SchemaBuilder result = await schema builder.run Evidence: `examples/customize/build_graph/components/schema_builders/schema.py`
- **Anthropic Llm** (source_file): api key = None ⋮---- res: LLMResponse = llm.invoke "say something" Evidence: `examples/customize/llms/anthropic_llm.py`
- **Bedrock Llm** (source_file): llm = BedrockLLM ⋮---- res = llm.invoke "say something" Evidence: `examples/customize/llms/bedrock_llm.py`
- **Cohere Llm** (source_file): api key = None ⋮---- res: LLMResponse = llm.invoke "say something" Evidence: `examples/customize/llms/cohere_llm.py`
- **Google Genai Llm** (source_file): api key = os.getenv "GOOGLE API KEY" ⋮---- llm = GeminiLLM res = llm.invoke "say something" Evidence: `examples/customize/llms/google_genai_llm.py`
- **Ollama Llm** (source_file): res: LLMResponse = llm.invoke "What is the additive color model?" Evidence: `examples/customize/llms/ollama_llm.py`
- **Openai Llm** (source_file): api key = None ⋮---- res: LLMResponse = llm.invoke "say something" Evidence: `examples/customize/llms/openai_llm.py`
- **Vertexai Llm** (source_file): generation config = GenerationConfig temperature=1.0 llm = VertexAILLM res: LLMResponse = llm.invoke Evidence: `examples/customize/llms/vertexai_llm.py`
- **Graphrag** (source_file): URI = "neo4j+s://demo.neo4jlabs.com" AUTH = "recommendations", "recommendations" DATABASE = "recommendations" INDEX = "moviePlotsEmbedding" logger = logging.getLogger "neo4j graphrag" ⋮---- def formatter record: neo4j.Record - RetrieverResultItem driver = neo4j.GraphDatabase.driver embedder = OpenAIEmbeddings retriever = VectorCypherRetriever llm = OpenAILLM model name="gpt-5", model params={"temperature": 0} rag = GraphRAG retriever=retriever, llm=llm result = rag.search Evidence: `examples/question_answering/graphrag.py`
- **Graphrag With Neo4J Message History** (source_file): URI = "neo4j+s://demo.neo4jlabs.com" AUTH = "recommendations", "recommendations" DATABASE = "recommendations" INDEX = "moviePlotsEmbedding" driver = neo4j.GraphDatabase.driver embedder = OpenAIEmbeddings retriever = VectorCypherRetriever llm = OpenAILLM model name="gpt-5", model params={"temperature": 0} rag = GraphRAG history = Neo4jMessageHistory session id="123", driver=driver, window=10 questions = ⋮---- result = rag.search answer = result.answer Evidence: `examples/question_answering/graphrag_with_neo4j_message_history.py`
- **Similarity Search For Vector** (source_file): URI = "neo4j+s://demo.neo4jlabs.com" AUTH = "recommendations", "recommendations" DATABASE = "recommendations" INDEX NAME = "moviePlotsEmbedding" ⋮---- retriever = VectorRetriever query vector: list float = EMBEDDINGS AVATAR Evidence: `examples/retrieve/similarity_search_for_vector.py`
- **Base** (source_file): class Embedder ABC ⋮---- def init self, rate limit handler: Optional RateLimitHandler = None ⋮---- @abstractmethod def embed query self, text: str - list float async def async embed query self, text: str - list float Evidence: `src/neo4j_graphrag/embeddings/base.py`
- **Openai** (source_file): class BaseOpenAIEmbeddings Embedder, abc.ABC ⋮---- client: openai.OpenAI ⋮---- @abc.abstractmethod def initialize client self, kwargs: Any - Any ⋮---- @rate limit handler def embed query self, text: str, kwargs: Any - list float ⋮---- response = self.client.embeddings.create embedding: list float = response.data 0 .embedding ⋮---- class OpenAIEmbeddings BaseOpenAIEmbeddings Evidence: `src/neo4j_graphrag/embeddings/openai.py`
- **If stack is not empty, add missing closing braces** (source_file): logger = logging.getLogger name class OnError enum.Enum ⋮---- RAISE = "RAISE" IGNORE = "IGNORE" ⋮---- @classmethod def possible values cls - List str def balance curly braces json string: str - str ⋮---- stack = fixed json = in string = False escape = False ⋮---- in string = not in string ⋮---- escape = not escape ⋮---- If stack is not empty, add missing closing braces ⋮---- def fix invalid json raw json: str - str ⋮---- repaired json = json repair.repair json raw json repaired json = repaired json.strip ⋮---- class EntityRelationExtractor Component ⋮---- """Abstract class for entity relation extraction components. Args: on error OnError : What to do when an error occurs during extraction.… Evidence: `src/neo4j_graphrag/experimental/components/entity_relation_extractor.py`
- **Graph Schema Extraction** (source_file): class ExtractedPropertyType BaseModel ⋮---- name: str type: Neo4jPropertyTypeName description: str = "" model config = ConfigDict frozen=True, extra="forbid" class ExtractedNodeType BaseModel ⋮---- label: str ⋮---- properties: list ExtractedPropertyType = Field min length=1 model config = ConfigDict extra="forbid" class ExtractedRelationshipType BaseModel ⋮---- properties: list ExtractedPropertyType = Field default factory=list ⋮---- class ExtractedConstraintType BaseModel ⋮---- type: Literal "UNIQUENESS", "EXISTENCE", "KEY" property names: list str = Field default factory=list node type: str = "" relationship type: str = "" ⋮---- out: list dict str, Any = ⋮---- d = dict c rt = d.get "relat… Evidence: `src/neo4j_graphrag/experimental/components/graph_schema_extraction.py`
- **Kg Writer** (source_file): logger = logging.getLogger name ⋮---- columns: list dict str, Any = ⋮---- field = schema.field i type info = Neo4jGraphParquetFormatter.pyarrow type to type info field.type name = field.name ⋮---- def batched rows: list Any , batch size: int - Generator list Any , None, None ⋮---- index = 0 ⋮---- start = i end = min start + batch size, len rows batch = rows start:end ⋮---- nodes per label = {} ⋮---- rel per type = {} ⋮---- class KGWriterModel DataModel ⋮---- status: Literal "SUCCESS", "FAILURE" metadata: Optional dict str, Any = None class KGWriter Component class Neo4jWriter KGWriter ⋮---- def db setup self - None ⋮---- rows = ⋮---- labels = node.label ⋮---- row = node.model dump ⋮---- par… Evidence: `src/neo4j_graphrag/experimental/components/kg_writer.py`
- **Lexical Graph** (source_file): logger = logging.getLogger name class LexicalGraphBuilder Component ⋮---- graph = Neo4jGraph ⋮---- document node = self.create document node document info ⋮---- tasks = ⋮---- chunk node = self.create chunk node chunk ⋮---- chunk to doc rel = self.create chunk to document rel ⋮---- next chunk rel = self.create next chunk relationship chunk, next chunk ⋮---- def create document node self, document info: DocumentInfo - Neo4jNode ⋮---- document metadata = document info.metadata or {} properties: dict str, PropertyValue = { ⋮---- chunk id = chunk.chunk id chunk properties: Dict str, Any = { embedding properties = {} ⋮---- node to chunk rel = self.create node to chunk rel node, chunk.chunk id Evidence: `src/neo4j_graphrag/experimental/components/lexical_graph.py`
- **See https://neo4j.com/docs/cypher-manual/current/values-and-types/property-structural-constructed/ property-types** (source_file): logger = logging.getLogger name Neo4jPropertyTypeName: TypeAlias = Literal DUNDER RE = re.compile r"^ $" class GraphConstraintType str, enum.Enum ⋮---- """Constraint kinds for :class: ConstraintType . UNIQUENESS supports both node and relationship scope; composite multi-property constraints are allowed. EXISTENCE marks a mandatory non-null node or relationship property single property only; Neo4j does not support composite existence . KEY matches Neo4j NODE KEY / RELATIONSHIP KEY: mandatory and unique; supports composite multi-property constraints. KEY subsumes EXISTENCE for its properties; do not combine KEY and EXISTENCE on the same property. """ UNIQUENESS = "UNIQUENESS" EXISTENCE = "E… Evidence: `src/neo4j_graphrag/experimental/components/schema.py`
- **Base** (source_file): class TextSplitter Component ⋮---- @abstractmethod async def run self, text: str - TextChunks Evidence: `src/neo4j_graphrag/experimental/components/text_splitters/base.py`
- **Types** (source_file): logger = logging.getLogger name class GeoPoint BaseModel ⋮---- model config = ConfigDict extra="forbid" latitude: float longitude: float height: float PrimitiveValue = Union bool, int, float, str TemporalValue = Union date, time, datetime Duration = str PropertyValue = Union class DocumentType str, Enum ⋮---- PDF = "pdf" MARKDOWN = "markdown" INLINE TEXT = "inline text" class DocumentInfo DataModel ⋮---- path: str metadata: Optional Dict str, str = None uid: str = Field default factory=lambda: str uuid.uuid4 document type: Optional DocumentType = None ⋮---- @property def document id self - str class LoadedDocument DataModel ⋮---- text: str document info: DocumentInfo ⋮---- PdfDocument = Loa… Evidence: `src/neo4j_graphrag/experimental/components/types.py`
- **Base** (source_file): logger = logging.getLogger name class AbstractConfig BaseModel ⋮---- global data: dict str, Any = PrivateAttr {} def resolve param self, param: ParamConfig - Any def resolve params self, params: dict str, ParamConfig - dict str, Any def parse self, resolved data: Optional dict str, Any = None - Any Evidence: `src/neo4j_graphrag/experimental/pipeline/config/base.py`
- **Object Config** (source_file): logger = logging.getLogger name T = TypeVar "T" class ObjectConfig AbstractConfig, Generic T ⋮---- class : Optional str = Field default=None, validate default=True params : dict str, ParamConfig = {} DEFAULT MODULE: ClassVar str = "." INTERFACE: ClassVar type = object REQUIRED PARAMS: ClassVar list str = ⋮---- @field validator "params " @classmethod def validate params cls, params : dict str, Any - dict str, Any def get module self - str def get interface self - type ⋮---- @classmethod def get class cls, class path: str, optional module: Optional str = None - type ⋮---- """Get class from string and an optional module Will first try to import the class from class path alone. If it results in… Evidence: `src/neo4j_graphrag/experimental/pipeline/config/object_config.py`
- **Base** (source_file): logger = logging.getLogger name class TemplatePipelineConfig AbstractPipelineConfig ⋮---- COMPONENTS: ClassVar list str = def get component self, component name: str - Optional ComponentDefinition ⋮---- method = getattr self, f" get {component name}" component = method ⋮---- method = getattr self, f" get run params for {component name}", None run params = method if method else {} component definition = ComponentDefinition ⋮---- def get components self - list ComponentDefinition ⋮---- components = ⋮---- comp = self. get component component name ⋮---- def get run params self, user input: dict str, Any - dict str, Any Evidence: `src/neo4j_graphrag/experimental/pipeline/config/template_pipeline/base.py`
- **Simple Kg Builder** (source_file): class DefaultPathDataLoader DataLoader ⋮---- path str = str filepath suffix = Path path str .suffix.lower ⋮---- class SimpleKGPipelineConfig TemplatePipelineConfig ⋮---- COMPONENTS: ClassVar list str = template : Literal PipelineType.SIMPLE KG PIPELINE = from file: bool = False from pdf: Optional bool = Field entities: Sequence EntityInputType = relations: Sequence RelationInputType = potential schema: Optional list tuple str, str, str = None schema : Optional GraphSchema = Field default=None, alias="schema" on error: OnError = OnError.IGNORE prompt template: Union ERExtractionTemplate, str = ERExtractionTemplate perform entity resolution: bool = True lexical graph config: Optional LexicalG… Evidence: `src/neo4j_graphrag/experimental/pipeline/config/template_pipeline/simple_kg_builder.py`
- **Types** (source_file): class PipelineType str, enum.Enum ⋮---- NONE = "none" SIMPLE KG PIPELINE = "SimpleKGPipeline" Evidence: `src/neo4j_graphrag/experimental/pipeline/config/types.py`
- **Kg Builder** (source_file): logger = logging.getLogger name class SimpleKGPipeline ⋮---- from file = from pdf ⋮---- file loader = pdf loader ⋮---- config = SimpleKGPipelineConfig.model validate Evidence: `src/neo4j_graphrag/experimental/pipeline/kg_builder.py`
- **Schema** (source_file): EntityInputType = Union str, dict str, Union str, list dict str, str RelationInputType = Union str, dict str, Union str, list dict str, str Evidence: `src/neo4j_graphrag/experimental/pipeline/types/schema.py`
- **Schema** (source_file): VisualizationGraph = Node = Relationship = None ⋮---- schema object = GraphSchema.model validate schema def format property name p: PropertyType, existence names: set str - str def relationship properties rel type: str - dict str, str ⋮---- """Returns a dict {prop name: prop type} for all relationship properties. Args: rel type str : the relationship type Returns: dict str, str : the relationship properties {name: type} mapping for display """ existence = schema object.existence property names for relationship rel type ⋮---- def node properties node type: NodeType - dict str, str ⋮---- """Returns a dict {prop name: prop type} for all node properties. Args: node type NodeType : the node type… Evidence: `src/neo4j_graphrag/experimental/utils/schema.py`
- **langchain-core is an optional dependency**（source_file）：logger = logging.getLogger name class GraphRAG ⋮---- validated data = RagInitModel ⋮---- """ .. warning:: The default value of 'return context' will change from 'False' to 'True' in a future version. This method performs a full RAG search: 1. Retrieval: context retrieval 2. Augmentation: prompt formatting 3. Generation: answer generation with LLM Args: query text str : The user question. message history Optional Union List LLMMessage , MessageHistory : A collection of previous messages, with each message having a specific role assigned. examples str : Examples added to the LLM prompt. retriever config Optional dict : Parameters passed to the retriever. search method; e.g.: top k return cont… 证据：`src/neo4j_graphrag/generation/graphrag.py`
- **Result:**（source_file）：class PromptTemplate ⋮---- DEFAULT SYSTEM INSTRUCTIONS: str = "" DEFAULT TEMPLATE: str = "" EXPECTED INPUTS: list str = list ⋮---- def format self, kwargs: Any - str def format self, args: Any, kwargs: Any - str ⋮---- """This method is used to replace parameters with the provided values. Parameters must be provided: - as kwargs - as args if using the same order as in the expected inputs Example: .. code-block:: python prompt template = PromptTemplate template='''Explain the following concept to {target audience}: Concept: {concept} Answer: ''', expected inputs= 'target audience', 'concept' prompt = prompt template.format '12 yo children', concept='graph database' print prompt Result: '''Exp… 证据：`src/neo4j_graphrag/generation/prompts.py`
- **Types**（source_file）：class RagInitModel BaseModel ⋮---- retriever: Retriever llm: Any prompt template: RagTemplate model config = ConfigDict arbitrary types allowed=True ⋮---- @field validator "llm" def check llm cls, value: Any - Any ⋮---- invoke = getattr value, "invoke", None ⋮---- class RagSearchModel BaseModel ⋮---- query text: str examples: str = "" retriever config: dict str, Any = {} return context: bool = False response fallback: Optional str = None class RagResultModel BaseModel ⋮---- answer: str retriever result: Optional RetrieverResult = None 证据：`src/neo4j_graphrag/generation/types.py`
- **implementaions**（source_file）：class AnthropicLLM LLMBase ⋮---- implementaions ⋮---- """Sends text to the LLM and returns a response. Args: input str : The text to send to the LLM. message history Optional Union List LLMMessage , MessageHistory : A collection previous messages, with each message having a specific role assigned. system instruction Optional str : An option to override the llm system message for this invocation. Returns: LLMResponse: The response from the LLM. """ ⋮---- message history = message history.messages messages = self.get messages input, message history response = self.client.messages.create response content = response.content ⋮---- text = response content 0 .text ⋮---- usage = LLMUsage ⋮---- resp… 证据：`src/neo4j_graphrag/llm/anthropic_llm.py`
- **Base**（source_file）：logger = logging.getLogger name class LLMInterface ABC ⋮---- supports structured output: bool = False ⋮---- class LLMInterfaceV2 ABC class LLMBase LLMInterface, LLMInterfaceV2, ABC ⋮---- def close self - None ⋮---- loop = asyncio.new event loop ⋮---- async def aclose self - None def enter self - "LLMBase" def exit self, exc type: Any, exc val: Any, exc tb: Any - None async def aenter self - "LLMBase" async def aexit self, exc type: Any, exc val: Any, exc tb: Any - None 证据：`src/neo4j_graphrag/llm/base.py`
- **subsidiary methods**（source_file）：boto3 = None DEFAULT BEDROCK LLM MODEL = os.getenv class BedrockLLM LLMInterface, LLMInterfaceV2 ⋮---- client kwargs: dict str, Any = { kwargs} ⋮---- messages = self.get messages input, message history converse kwargs = self. build converse kwargs response = self.client.converse converse kwargs ⋮---- loop = asyncio.get event loop ⋮---- tool config = self. get tool config tools ⋮---- subsidiary methods ⋮---- """Constructs the message list for the Bedrock Converse API.""" messages: list dict str, Any = ⋮---- message history = message history.messages ⋮---- role = message.get "role" content = message.get "content", "" ⋮---- system instruction: Optional str = None ⋮---- system instruction = con… 证据：`src/neo4j_graphrag/llm/bedrock_llm.py`
- **implementations**（source_file）：class CohereLLM LLMBase ⋮---- def extract text content self, content items: Any - str ⋮---- text = getattr content items 0 , "text", None ⋮---- implementations ⋮---- """Sends text to the LLM and returns a response. Args: input str : The text to send to the LLM. message history Optional Union List LLMMessage , MessageHistory : A collection previous messages, with each message having a specific role assigned. system instruction Optional str : An option to override the llm system message for this invocation. Returns: LLMResponse: The response from the LLM. """ ⋮---- message history = message history.messages messages = self.get messages input, message history, system instruction res = self.cli… 证据：`src/neo4j_graphrag/llm/cohere_llm.py`
- **Google Genai Llm**（source_file）：genai = None types = None class GeminiLLM LLMInterface, LLMInterfaceV2 ⋮---- contents = self.get messages input, message history config = self. build config system instruction=system instruction response = self.client.models.generate content ⋮---- response = await self.client.aio.models.generate content ⋮---- contents=contents, type: ignore arg-type ⋮---- config = self. build config ⋮---- messages: list types.Content = ⋮---- message history = message history.messages ⋮---- role = message.get "role" content = message.get "content", "" ⋮---- system instruction = None ⋮---- system instruction = content ⋮---- config kwargs: dict str, Any = {} ⋮---- tool calls = 证据：`src/neo4j_graphrag/llm/google_genai_llm.py`
- **subsdiary methods**（source_file）：class OllamaLLM LLMBase ⋮---- """Sends text to the LLM and returns a response. Args: input str : The text to send to the LLM. message history Optional Union List LLMMessage , MessageHistory : A collection previous messages, with each message having a specific role assigned. system instruction Optional str : An option to override the llm system message for this invocation. Returns: LLMResponse: The response from the LLM. """ ⋮---- message history = message history.messages response = self.client.chat content = response.message.content or "" usage = None ⋮---- request tokens = response.prompt eval count response tokens = response.eval count usage = LLMUsage ⋮---- """Sends text to the LLM and… 证据：`src/neo4j_graphrag/llm/ollama_llm.py`
- **subsidiary methods**（source_file）：httpx = None ⋮---- ChatCompletionMessageParam = Any ChatCompletionToolParam = Any OpenAI = Any AsyncOpenAI = Any logger = logging.getLogger name class BaseOpenAILLM LLMBase, abc.ABC ⋮---- client: OpenAI async client: AsyncOpenAI ⋮---- tools: Sequence Tool , Tools definition as a sequence of Tool objects ⋮---- async def aclose self - None subsidiary methods ⋮---- """Constructs the message list for OpenAI chat completion for legacy LLMInterface.""" messages = ⋮---- message history = message history.messages ⋮---- return messages type: ignore ⋮---- """Constructs the message list for OpenAI chat completion for LLMInterfaceV2.""" chat messages = ⋮---- message type: Type ChatCompletionMessagePara… 证据：`src/neo4j_graphrag/llm/openai_llm.py`
- **Types**（source_file）：def getattr name: str - Any class LLMUsage BaseModel ⋮---- """Token usage statistics returned by an LLM call. Attributes: request tokens Optional int : Number of tokens in the prompt/request. None when not reported by the provider. response tokens Optional int : Number of tokens in the completion/response. None when not reported by the provider. total tokens Optional int : Total tokens consumed by the call. None when not reported by the provider. """ request tokens: Optional int = None response tokens: Optional int = None total tokens: Optional int = None class LLMResponse BaseModel ⋮---- """Response returned by an LLM invocation. Attributes: content str : The text content of the LLM respon… 证据：`src/neo4j_graphrag/llm/types.py`
- **legacy and brand new implementations**（source_file）：logger = logging.getLogger name GENERATION CONFIG SCHEMA PARAMS = {"response schema", "response mime type"} ⋮---- raw = config. raw generation config sig = inspect.signature GenerationConfig. init valid params = { preserved = {} ⋮---- val = getattr raw, param, None ⋮---- val = list val ⋮---- class VertexAILLM LLMBase ⋮---- supports structured output: bool = True ⋮---- tools: Sequence Tool , Tools definition as a sequence of Tool objects ⋮---- legacy and brand new implementations ⋮---- """Sends text to the LLM and returns a response. Args: input str : The text to send to the LLM. message history Optional Union List LLMMessage , MessageHistory : A collection previous messages, with each messa… 证据：`src/neo4j_graphrag/llm/vertexai_llm.py`
- **Get the origin and args for generic types** (source_file): T = ParamSpec "T" P = TypeVar "P" def copy function f: Callable T, P - Callable T, P ⋮---- g = types.FunctionType ⋮---- class RetrieverMetaclass ABCMeta ⋮---- get search results method = attrs.get "get search results" search method = None ⋮---- search method = getattr b, "search", None ⋮---- new search method = copy function search method ⋮---- class Retriever ABC, metaclass=RetrieverMetaclass ⋮---- index name: str VERIFY NEO4J VERSION = True def init self, driver: neo4j.Driver, neo4j database: Optional str = None def fetch index infos self, vector index name: str - None ⋮---- query = query result = self.driver.execute query ⋮---- result = query result.records 0 ⋮---- def search self, args:… Evidence: `src/neo4j_graphrag/retrievers/base.py`
- **Types** (source_file): class PineconeSearchModel VectorSearchModel ⋮---- pinecone filter: Optional class PineconeClientModel BaseModel ⋮---- client: Pinecone model config = ConfigDict arbitrary types allowed=True ⋮---- @field validator "client" def check client cls, value: Pinecone - Pinecone class PineconeNeo4jRetrieverModel BaseModel ⋮---- driver model: Neo4jDriverModel client model: PineconeClientModel index name: str id property neo4j: str embedder model: Optional EmbedderModel = None return properties: Optional list str = None retrieval query: Optional str = None result formatter: Optional Callable neo4j.Record , RetrieverResultItem = None neo4j database: Optional str = None node label neo4j: Optional str =… Evidence: `src/neo4j_graphrag/retrievers/external/pinecone/types.py`
- **Types** (source_file): class QdrantClientModel BaseModel ⋮---- client: QdrantClient model config = ConfigDict arbitrary types allowed=True ⋮---- @field validator "client" def check client cls, value: QdrantClient - QdrantClient class QdrantNeo4jRetrieverModel BaseModel ⋮---- driver model: Neo4jDriverModel client model: QdrantClientModel collection name: str id property external: str id property neo4j: str using: Optional str = None embedder model: Optional EmbedderModel = None return properties: Optional list str = None retrieval query: Optional str = None result formatter: Optional Callable neo4j.Record , RetrieverResultItem = None neo4j database: Optional str = None node label neo4j: Optional str = None Evidence: `src/neo4j_graphrag/retrievers/external/qdrant/types.py`
- **Types** (source_file): class WeaviateModel BaseModel ⋮---- client: WeaviateClient model config = ConfigDict arbitrary types allowed=True ⋮---- @field validator "client" def check client cls, value: WeaviateClient - WeaviateClient class WeaviateNeo4jRetrieverModel BaseModel ⋮---- driver model: Neo4jDriverModel client model: WeaviateModel collection: str id property external: str id property neo4j: str embedder model: Optional EmbedderModel return properties: Optional list str = None retrieval query: Optional str = None result formatter: Optional Callable neo4j.Record , RetrieverResultItem = None neo4j database: Optional str = None node label neo4j: Optional str = None class WeaviateNeo4jSearchModel VectorSearchMod… Evidence: `src/neo4j_graphrag/retrievers/external/weaviate/types.py`
- **Hybrid** (source_file): logger = logging.getLogger name class HybridRetriever Retriever ⋮---- driver model = Neo4jDriverModel driver=driver embedder model = EmbedderModel embedder=embedder if embedder else None validated data = HybridRetrieverModel ⋮---- def default record formatter self, record: neo4j.Record - RetrieverResultItem ⋮---- metadata = { node = record.get "node" ⋮---- validated data = HybridSearchModel ⋮---- parameters = validated data.model dump exclude none=True ⋮---- query vector = self.embedder.embed query query text ⋮---- use search clause = False ⋮---- use search clause = True ⋮---- search query base = build hybrid search clause query linear ⋮---- search query base = build hybrid search clause qu… Evidence: `src/neo4j_graphrag/retrievers/hybrid.py`
- **Quote node labels in backticks if they contain spaces and are not already quoted** (source_file): logger = logging.getLogger name READ ONLY QUERY TYPE = "r" def extract cypher text: str - str ⋮---- pattern = r" " matches = re.findall pattern, text, re.DOTALL cypher query = matches 0 if matches else text Quote node labels in backticks if they contain spaces and are not already quoted Anchored to node pattern ... to avoid matching map literal values cypher query = re.sub Quote property keys in backticks if they contain spaces and are not already quoted ⋮---- class Text2CypherRetriever Retriever ⋮---- driver model = Neo4jDriverModel driver=driver llm model = LLMModel llm=llm neo4j schema model = validated data = Text2CypherRetrieverModel ⋮---- neo4j schema = validated data.neo4j schema mod… Evidence: `src/neo4j_graphrag/retrievers/text2cypher.py`
- **No tools available, return empty result** (source_file): class ToolsRetriever Retriever ⋮---- VERIFY NEO4J VERSION = False ⋮---- def validate tool names self - None ⋮---- tool names = tool.get name for tool in self. tools duplicate names = ⋮---- def get default system instruction self - str ⋮---- """Get the default system instruction for the LLM.""" ⋮---- """Use the LLM to select and execute appropriate tools based on the query. Args: query text str : The user's query text. message history Optional Union List LLMMessage , MessageHistory , optional : Previous conversation history. Defaults to None. kwargs Any : Additional arguments passed to the tool execution. Returns: RawSearchResult: The combined results from the executed tools. """ ⋮---- No to… Evidence: `src/neo4j_graphrag/retrievers/tools_retriever.py`
- **Vector** (source_file): logger = logging.getLogger name class VectorRetriever Retriever ⋮---- driver model = Neo4jDriverModel driver=driver embedder model = EmbedderModel embedder=embedder if embedder else None validated data = VectorRetrieverModel ⋮---- def default record formatter self, record: neo4j.Record - RetrieverResultItem ⋮---- metadata = { node = record.get "node" ⋮---- validated data = VectorSearchModel ⋮---- parameters = validated data.model dump exclude none=True ⋮---- query vector = self.embedder.embed query query text ⋮---- use search clause = False filter cls: Optional FilterClassification = None ⋮---- filter cls = classify filter for search filters, node alias="node" missing = extract filter field… Evidence: `src/neo4j_graphrag/retrievers/vector.py`
- **Sample random nodes if not exhaustive** (source_file): BASE KG BUILDER LABEL = " KGBuilder " BASE ENTITY LABEL = " Entity " EXCLUDED LABELS = " Bloom Perspective ", " Bloom Scene " EXCLUDED RELS = " Bloom HAS SCENE " EXHAUSTIVE SEARCH LIMIT = 10000 LIST LIMIT = 128 DISTINCT VALUE LIMIT = 10 NODE PROPERTIES QUERY = REL PROPERTIES QUERY = REL QUERY = INDEX QUERY = SCHEMA COUNTS QUERY = def clean string values text: str - str def value sanitize d: Any - Any ⋮---- new dict = {} ⋮---- sanitized value = value sanitize value ⋮---- data = driver.execute query json data = r.data for r in data.records ⋮---- json data = value sanitize el for el in json data ⋮---- result = session.run Query text=query, timeout=timeout , params json data = r.data for r in r… Evidence: `src/neo4j_graphrag/schema.py`
- **Types** (source_file): class RawSearchResult BaseModel ⋮---- records: list neo4j.Record metadata: Optional dict str, Any = None model config = ConfigDict arbitrary types allowed=True ⋮---- @field validator "records" def check records cls, value: neo4j.Record - neo4j.Record class RetrieverResultItem BaseModel ⋮---- content: Any ⋮---- class RetrieverResult BaseModel ⋮---- items: list RetrieverResultItem ⋮---- class Neo4jSchemaModel BaseModel ⋮---- neo4j schema: str class IndexModel BaseModel ⋮---- driver: Any ⋮---- @field validator "driver" def check driver is valid cls, v: neo4j.Driver - neo4j.Driver class VectorIndexModel IndexModel ⋮---- name: str label: str embedding property: str dimensions: PositiveInt simila… Evidence: `src/neo4j_graphrag/types.py`
- The remaining 17 evidence entries are in `AI_CONTEXT_PACK.json` or `EVIDENCE_INDEX.json`.
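
The `src/neo4j_graphrag/retrievers/text2cypher.py` excerpt in the list above mentions quoting node labels in backticks when they contain spaces and are not already quoted, anchored to a node pattern so that map literal values are left alone. The block below is a minimal illustrative sketch of that idea only. It is not the library's implementation: the function name `quote_labels_with_spaces` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical helper, not part of neo4j-graphrag: it only illustrates the idea
# of backtick-quoting node labels that contain spaces.
_LABEL_WITH_SPACES = re.compile(
    r"(\(\s*\w*\s*:)"              # "(" plus optional variable plus ":" (node pattern anchor)
    r"([A-Za-z_]\w*(?: \w+)+)"     # a label made of two or more space-separated words
    r"(?=\s*[){])"                 # must be followed by ")" or the start of a property map
)


def quote_labels_with_spaces(cypher: str) -> str:
    """Wrap unquoted node labels that contain spaces in backticks."""
    return _LABEL_WITH_SPACES.sub(lambda m: f"{m.group(1)}`{m.group(2)}`", cypher)


if __name__ == "__main__":
    print(quote_labels_with_spaces("MATCH (p:Movie Star) RETURN p"))
```

Because the pattern starts at an opening parenthesis and rejects a label that already begins with a backtick, string values inside `{...}` maps and labels that are already quoted pass through unchanged in this sketch. Real Cypher has more cases (multiple labels, relationship types) that it does not attempt to cover.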

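## Rules the host AI must follow

- **Treat this asset as pre-work context, not as a runtime environment.** The AI Context Pack contains only evidence-backed understanding of the project, not the executable state of the target project. Evidence: `docs/README.md`, `README.md`, `examples/README.md`
- **When answering the user, distinguish content that can be previewed from content that can only be verified after installation.** The value of the pre-install experience comes from reducing mistaken installs and misjudgments, not from passing itself off as a real run. Evidence: `docs/README.md`, `README.md`, `examples/README.md`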

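## Questions the user should answer before starting

- In which host AI or local environment do you plan to use it?
- Do you only want to try the workflow first, or are you preparing a real installation?
- What matters most to you: installation cost, output quality, or conflicts with existing rules?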

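## Acceptance criteria

- Every capability claim can be traced back to a file path in evidence_refs.
- AI_CONTEXT_PACK.md does not present a preview as a real run.
- The user can understand within 3 minutes who it suits, what it can do, how to start, and where the risk boundaries are.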
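
The first criterion above can be checked mechanically. The sketch below is one hedged way to do it and assumes a hypothetical record shape: each claim is a dict with a `claim_id` and an `evidence_refs` list of file paths. The actual schema of `AI_CONTEXT_PACK.json` is not shown in this document, so the field layout and the `clm_example` entry are illustrative assumptions.

```python
from typing import Any


def claims_missing_evidence(claims: list[dict[str, Any]]) -> list[str]:
    """Return the ids of claims whose evidence_refs holds no usable file path."""
    missing = []
    for claim in claims:
        refs = claim.get("evidence_refs") or []
        # A claim passes only if at least one reference is a non-empty string.
        if not any(isinstance(ref, str) and ref.strip() for ref in refs):
            missing.append(claim.get("claim_id", "<unknown>"))
    return missing


# Hypothetical sample data; only the shape matters for this sketch.
sample_claims = [
    {"claim_id": "clm_0001", "evidence_refs": ["README.md"]},
    {"claim_id": "clm_example", "evidence_refs": []},
]

if __name__ == "__main__":
    print(claims_missing_evidence(sample_claims))
```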

---

## Doramagic Context Augmentation

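The content below reinforces the main Repomix/AI Context Pack body. The Human Manual only provides a reading skeleton; the pitfall log is converted into working constraints that the host AI must follow.

## Human Manual skeleton

Usage rule: this is only a reading route and salience signal for the project, not a factual authority. Specific facts must still go back to repo evidence / the Claim Graph.

Hard rules for the host AI:
- Do not treat page titles, section order, summaries, or importance as evidence of project facts.
- When explaining the Human Manual skeleton, state explicitly that it is only a reading route / salience signal.
- Judgments about capability, installation, compatibility, runtime status, and risk must cite repo evidence, a source path, or the Claim Graph.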

- **Overview, architecture, and quick start**: importance `high`
  - source_paths: README.md, src/neo4j_graphrag/__init__.py, pyproject.toml, src/neo4j_graphrag/exceptions.py, src/neo4j_graphrag/types.py
- **Retrievers and GraphRAG generation**: importance `high`
  - source_paths: src/neo4j_graphrag/retrievers/base.py, src/neo4j_graphrag/retrievers/vector.py, src/neo4j_graphrag/retrievers/text2cypher.py, src/neo4j_graphrag/retrievers/hybrid.py, src/neo4j_graphrag/retrievers/tools_retriever.py
- **Knowledge graph construction pipeline (experimental)**: importance `high`
  - source_paths: src/neo4j_graphrag/experimental/pipeline/kg_builder.py, src/neo4j_graphrag/experimental/pipeline/pipeline.py, src/neo4j_graphrag/experimental/pipeline/component.py, src/neo4j_graphrag/experimental/pipeline/orchestrator.py, src/neo4j_graphrag/experimental/components/pdf_loader.py
- **LLM and embedding provider integrations**: importance `high`
  - source_paths: src/neo4j_graphrag/llm/base.py, src/neo4j_graphrag/llm/openai_llm.py, src/neo4j_graphrag/llm/anthropic_llm.py, src/neo4j_graphrag/llm/cohere_llm.py, src/neo4j_graphrag/llm/bedrock_llm.py

## Repo Inspection Evidence

- repo_clone_verified: true
- repo_inspection_verified: true
- repo_commit: `49b2694ad276e1f037cf7c27db3a659ea9c892cb`
- inspected_files: `README.md`, `pyproject.toml`, `uv.lock`, `docs/README.md`, `docs/source/conf.py`, `docs/source/themes/neo4j/static/js/12-fragment-jumper.js`, `examples/README.md`, `examples/build_graph/automatic_schema_extraction/simple_kg_builder_schema_from_pdf.py`, `examples/build_graph/automatic_schema_extraction/simple_kg_builder_schema_from_text.py`, `examples/build_graph/from_config_files/simple_kg_pipeline_config.json`, `examples/build_graph/from_config_files/simple_kg_pipeline_config.yaml`, `examples/build_graph/from_config_files/simple_kg_pipeline_config_url.json`, `examples/build_graph/from_config_files/simple_kg_pipeline_from_config_file.py`, `examples/build_graph/from_config_files/simple_kg_pipeline_from_config_file_with_url.py`, `examples/build_graph/simple_kg_builder_from_pdf.py`, `examples/build_graph/simple_kg_builder_from_text.py`, `examples/customize/answer/custom_prompt.py`, `examples/customize/answer/langchain_compatiblity.py`, `examples/customize/build_graph/components/chunk_reader/neo4j_chunk_reader.py`, `examples/customize/build_graph/components/custom_component.py`

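Hard rules for the host AI:
- Without repo_clone_verified=true, do not claim to have read the source code.
- Without repo_inspection_verified=true, do not present judgments based on README/docs/package files as facts.
- Without quick_start_verified=true, do not claim that the Quick Start has been run successfully.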
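
The three flag rules above amount to a small gate. The sketch below shows one way a host integration might encode them. The flag names come from this section; the function name `permitted_statements` and the statement labels are assumptions chosen for illustration.

```python
def permitted_statements(flags: dict[str, bool]) -> dict[str, bool]:
    """Map each verification flag to whether the matching statement is allowed."""
    rules = {
        "repo_clone_verified": "claim_source_was_read",
        "repo_inspection_verified": "state_file_based_judgments_as_fact",
        "quick_start_verified": "claim_quick_start_ran",
    }
    # A missing flag is treated the same as false: the statement stays forbidden.
    return {statement: flags.get(flag) is True for flag, statement in rules.items()}


# Flags as recorded in this section: the two repo flags are true and no
# quick_start_verified flag is recorded.
pack_flags = {"repo_clone_verified": True, "repo_inspection_verified": True}

if __name__ == "__main__":
    print(permitted_statements(pack_flags))
```

Treating an absent flag as false keeps the gate conservative, which matches the intent of the rules: a statement is allowed only when its flag is explicitly true.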

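## Doramagic Pitfall Constraints

These rules come from project-specific pitfalls found during Doramagic's discovery, validation, or compilation process. The host AI must treat them as working constraints, not as ordinary explanatory text.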

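### Constraint 1: Source evidence: Allow async driver in retrievers

- Trigger: GitHub community evidence shows an installation-related issue pending verification for this project: Allow async driver in retrievers
- Why it matters: It may increase the cost of trial and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/406 | The source type github_issue exposes a usage condition pending verification.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.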

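### Constraint 2: Source evidence: [FEATURE]: Add Anthropic's Structured Output feature

- Trigger: GitHub community evidence shows a security/permission-related issue pending verification for this project: [FEATURE]: Add Anthropic's Structured Output feature
- Why it matters: It may affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/493 | The source discussion mentions python-related conditions, which should be re-checked before installation or trial.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.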

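### Constraint 3: Source evidence: [FEATURE]: Add MistralAI Structured Output feature

- Trigger: GitHub community evidence shows a security/permission-related issue pending verification for this project: [FEATURE]: Add MistralAI Structured Output feature
- Why it matters: It may affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/542 | The source discussion mentions python-related conditions, which should be re-checked before installation or trial.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.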

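### Constraint 4: Repository name and install name differ

- Trigger: The repository name `neo4j-graphrag-python` does not exactly match the install entry `neo4j-graphrag`.
- Host AI rule: Confirm the package-name mapping and the official README instructions on npm/PyPI/GitHub.
- Why it matters: Users who search for the package by the repository name, or look for the repository by the package name, can easily end up at the wrong entry point.
- Evidence: identity.distribution | https://github.com/neo4j/neo4j-graphrag-python | repo=neo4j-graphrag-python; install=neo4j-graphrag
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.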
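
Once a user has installed the package in their own environment, the name mapping from Constraint 4 can be inspected with the standard library alone. The sketch below is a post-install check, not evidence that anything has been installed here. The helper name `describe_install` is an assumption for this example; the distribution name `neo4j-graphrag` comes from the install command, and the import name `neo4j_graphrag` is inferred from the `src/neo4j_graphrag/` source paths cited in this document.

```python
from importlib import metadata
from importlib.util import find_spec
from typing import Optional


def describe_install(dist_name: str = "neo4j-graphrag",
                     import_name: str = "neo4j_graphrag") -> dict:
    """Report whether a distribution is installed and its module is importable."""
    version: Optional[str]
    try:
        # Looks up installed distribution metadata by its PyPI-style name.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "distribution": dist_name,
        "version": version,
        # find_spec returns None for a top-level module that cannot be found.
        "importable": find_spec(import_name) is not None,
    }


if __name__ == "__main__":
    print(describe_install())
```

A result with `version` set but `importable` false (or the reverse) would suggest the wrong name was used for one of the two lookups, which is the confusion this constraint warns about.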

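### Constraint 5: Source evidence: Migrate VertexAIEmbeddings to use google-genai SDK

- Trigger: GitHub community evidence shows an installation-related issue pending verification for this project: Migrate VertexAIEmbeddings to use google-genai SDK
- Host AI rule: The source suggests that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Why it matters: It may affect upgrade, migration, or version choice.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/430 | The source discussion mentions python-related conditions, which should be re-checked before installation or trial.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.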

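### Constraint 6: Source evidence: [QUESTION]: How can i customise the entity/node extracted from SimpleKGPipeline

- Trigger: GitHub community evidence shows an installation-related issue pending verification for this project: [QUESTION]: How can i customise the entity/node extracted from SimpleKGPipeline
- Host AI rule: The source suggests that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Why it matters: It may increase the cost of trial and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/439 | The source discussion mentions node-related conditions, which should be re-checked before installation or trial.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.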

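### Constraint 7: Capability judgment depends on an assumption

- Trigger: README/documentation is current enough for a first validation pass.
- Host AI rule: Turn the assumption into a downstream verification checklist.
- Why it matters: If the assumption does not hold, users will not get the promised capability.
- Evidence: capability.assumptions | https://github.com/neo4j/neo4j-graphrag-python | README/documentation is current enough for a first validation pass.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.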

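### Constraint 8: Source evidence: [FEATURE]: Add possibility to truncate retrieved context

- Trigger: GitHub community evidence shows a runtime-related issue pending verification for this project: [FEATURE]: Add possibility to truncate retrieved context
- Host AI rule: The source suggests that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Why it matters: It may increase the cost of trial and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/neo4j/neo4j-graphrag-python/issues/446 | The source type github_issue exposes a usage condition pending verification.
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.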

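### Constraint 9: Maintenance activity unknown

- Trigger: last_activity_observed is not recorded.
- Host AI rule: Add signals from recent GitHub commits, releases, and issue/PR responsiveness.
- Why it matters: New, abandoned, and active projects get lumped together, which lowers confidence in the recommendation.
- Evidence: evidence.maintainer_signals | https://github.com/neo4j/neo4j-graphrag-python | last_activity_observed missing
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.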

- Trigger: no_demo
- Evidence: downstream_validation.risk_items | https://github.com/neo4j/neo4j-graphrag-python | no_demo; severity=medium
- Hard boundary: Do not present this pitfall as resolved, verified, or negligible unless later verification evidence clearly shows it has been closed.