LlamaIndex RAG 框架

LlamaIndex: a Python framework that turns arbitrary documents into LLM-queryable knowledge. Four pillars (Index / Retriever / QueryEngine / Synthesizer) + 52 anti-pattern constraints (5 fatal).


Crystal Overview

LlamaIndex is a Python framework that turns arbitrary documents into LLM-queryable knowledge (github.com/run-llama/llama_index). Its four pillars (Index / Retriever / QueryEngine / ResponseSynthesizer) form a configurable retrieve-and-synthesize loop. The ingestion pipeline handles the Document → Node → Embedding → Index transformation, with content-hash caching. The workflow / agent submodules (FunctionAgent / ReActAgent / CodeActAgent / multi-agent) layer tool-calling on top of the external 'workflows' primitives. The Settings singleton replaces the v0.9 ServiceContext as the global configuration surface (ServiceContext is hard-deleted, not deprecated: three entry points now raise ValueError directly). This skill ships 52 constraints (including 5 fatal ones) covering the typical pitfalls: the ServiceContext hard removal (not a deprecation); SentenceSplitter's default chunk_overlap of 200 (not constants.DEFAULT_CHUNK_OVERLAP = 20); embedding-model identity not being persisted into index_struct / storage_context; and CJK / multilingual corpora hitting punkt's English sentence tokenization under the default SentenceSplitter. The host AI applies these constraints automatically.

Blueprint Source

finance-bp-135

run-llama/llama_index @ 0a6c90b1 · source file

Constraints

5 total · 5 fatal · 5 must-not-violate

Evidence Quality

Confidence: 90%

High confidence: strong evidence base

The 5 must-not-violate constraints

FATAL · domain_rule

WHEN: Porting code from a llama-index v0.9-era tutorial, blog post, or Stack Overflow answer that constructs a ServiceContext object

ACTION: Delete every ServiceContext.from_defaults / ServiceContext(...) / set_global_service_context(...) call. Replace them with attribute assignments on the module-level Settings singleton (e.g. Settings.llm = OpenAI(...), Settings.embed_model = OpenAIEmbedding(...), Settings.node_parser = SentenceSplitter(chunk_overlap=20)) BEFORE any index/query construction. Do not pass a service_context kwarg to BaseIndex.from_documents.

CONSEQUENCE: undefined behavior

FATAL · domain_rule

WHEN: Designing a workflow where the index is persisted to storage today and re-loaded later (possibly by a different process or a different developer) for querying

ACTION: Do not rely on storage_context to remember which embedder built the index. Treat the embed-model identity as caller-managed state: always reconstruct the index with the same explicit embed_model that was used at index time, or fail loudly when re-loading. Read llamaindex-C-004 for the remedy.

CONSEQUENCE: undefined behavior

FATAL · domain_rule

WHEN: Persisting an index to disk or a vector store today for later re-load and query

ACTION: At index time, write a sidecar file (e.g. {storage_dir}/embed_model.json) with {'provider_class': type(embed_model).__module__ + '.' + type(embed_model).__name__, 'model_name': getattr(embed_model, 'model_name', None), 'embed_dim': getattr(embed_model, 'embed_dim', None) or len(embed_model.get_text_embedding('probe'))}. At re-load, read the sidecar, compare it against Settings.embed_model or the embed_model passed to load_index_from_storage, and raise EmbedModelMismatchError on any drift. Do not fall back to the new embedder.
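The sidecar pattern above can be sketched in plain Python (EmbedModelMismatchError and FakeEmbedModel are illustrative names, not llama-index APIs; in real code you would pass your actual embed-model object at index time and again at re-load, and this sketch omits the get_text_embedding('probe') fallback for embed_dim):

```python
import json
from pathlib import Path


class EmbedModelMismatchError(RuntimeError):
    """Raised when the embedder at re-load differs from the one at index time."""


def _fingerprint(embed_model) -> dict:
    # Record enough identity to detect drift: class, model name, dimension.
    return {
        "provider_class": type(embed_model).__module__ + "." + type(embed_model).__name__,
        "model_name": getattr(embed_model, "model_name", None),
        "embed_dim": getattr(embed_model, "embed_dim", None),
    }


def write_sidecar(storage_dir: str, embed_model) -> None:
    """Call at index time, right after persisting the index to storage_dir."""
    Path(storage_dir, "embed_model.json").write_text(json.dumps(_fingerprint(embed_model)))


def check_sidecar(storage_dir: str, embed_model) -> None:
    """Call at re-load time, before load_index_from_storage. Fails loudly on drift."""
    recorded = json.loads(Path(storage_dir, "embed_model.json").read_text())
    current = _fingerprint(embed_model)
    if recorded != current:
        raise EmbedModelMismatchError(f"index built with {recorded}, loading with {current}")


# Illustrative stand-in for a real embedding-model object:
class FakeEmbedModel:
    model_name = "text-embedding-3-small"
    embed_dim = 1536
```

Same fingerprint passes silently; any drift (different class, model name, or dimension) raises instead of silently querying with mismatched vectors.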

CONSEQUENCE: undefined behavior

FAQ


Changelog

v0.1.0 · 2026-04-25 · Contributor: tangweigang-jpg

v0.1.0: Initial release on Doramagic.ai. RAG-framework skill based on run-llama/llama_index, bilingual (Chinese/English), with 52 anti-pattern constraints (5 fatal) and 3 FAQ entries.
