LlamaIndex (RAG / Agent / Workflow)

LlamaIndex: a Python framework that turns arbitrary documents into queryable, LLM-grounded knowledge. The four-pillar core (Index / Retriever / QueryEngine / ResponseSynthesizer) wires a configurable retrieve-then-synthesize loop.


Overview

LlamaIndex is a Python framework that turns arbitrary documents into queryable, LLM-grounded knowledge (github.com/run-llama/llama_index). The four-pillar core (Index / Retriever / QueryEngine / ResponseSynthesizer) wires a configurable retrieve-then-synthesize loop, and an ingestion pipeline handles the Document → Node → Embedding → Index transformations with content-hash caching. The workflow/agent submodule (FunctionAgent / ReActAgent / CodeActAgent / multi-agent) layers tool-calling on top of an external 'workflows' primitive. The Settings singleton replaces the v0.9 ServiceContext as the global configuration surface; ServiceContext is hard-removed (not deprecated), and three entry points raise ValueError when they receive one.

This skill embeds 52 constraints (5 fatal) covering typical pitfalls: ServiceContext is hard-removed; the SentenceSplitter chunk_overlap default is 200 (not constants.DEFAULT_CHUNK_OVERLAP = 20); embedding-model identity is NOT persisted in index_struct/storage_context; and CJK / multilingual corpora trip over the English punkt tokenizer used by the default splitter. The host AI applies these constraints automatically.
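A minimal sketch of that retrieve-then-synthesize loop, wiring the four pillars by hand. It assumes a current llama-index install and an OpenAI API key in the environment for the default LLM and embedder; the sample document text and query are placeholders, not part of the skill:

```python
from llama_index.core import Document, VectorStoreIndex, get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# Index pillar: documents -> nodes -> embeddings -> vector index
index = VectorStoreIndex.from_documents(
    [Document(text="LlamaIndex wires a retrieve-then-synthesize loop.")]
)

retriever = index.as_retriever(similarity_top_k=3)               # Retriever pillar
synthesizer = get_response_synthesizer(response_mode="compact")  # ResponseSynthesizer pillar
query_engine = RetrieverQueryEngine(                             # QueryEngine pillar
    retriever=retriever,
    response_synthesizer=synthesizer,
)
print(query_engine.query("What loop does LlamaIndex wire?"))
```

`index.as_query_engine()` builds the same retriever-plus-synthesizer pair in one call; the explicit wiring above just makes the pillars visible.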

Blueprint Source

finance-bp-135

run-llama/llama_index @ 0a6c90b1 (source file)

Constraints

5 total · 5 fatal · 5 must-not-violate

Evidence Quality

Confidence: 90%

High confidence: strong evidence base

5 non-negotiable constraints

FATAL · domain_rule

WHEN: Porting code from a llama-index v0.9-era tutorial, blog post, or Stack Overflow answer that constructs a ServiceContext object.

ACTION: Delete every ServiceContext.from_defaults / ServiceContext(...) / set_global_service_context(...) call. Replace them with attribute assignments on the module-level Settings singleton (e.g. Settings.llm = OpenAI(...), Settings.embed_model = OpenAIEmbedding(...), Settings.node_parser = SentenceSplitter(chunk_overlap=20)) BEFORE any index/query construction. Do not pass a service_context kwarg to BaseIndex.from_documents.

CONSEQUENCE: Undefined behavior.
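A sketch of that migration, with the removed v0.9 pattern shown inert in comments. The model names and sample document are illustrative, not prescribed by this constraint:

```python
# v0.9 (hard-removed; these entry points now raise ValueError):
#   from llama_index import ServiceContext
#   ctx = ServiceContext.from_defaults(llm=..., embed_model=...)
#   index = VectorStoreIndex.from_documents(docs, service_context=ctx)

# Current: assign onto the module-level Settings singleton first.
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.node_parser = SentenceSplitter(chunk_overlap=20)  # the default is 200, not 20

docs = [Document(text="sample")]
index = VectorStoreIndex.from_documents(docs)  # no service_context kwarg
```

The ordering matters: Settings assignments made after an index is built do not retroactively change the embedder that produced its vectors.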

FATAL · domain_rule

WHEN: Designing a workflow where the index is persisted to storage today and re-loaded later (possibly by a different process or a different developer) for query.

ACTION: Do not rely on storage_context to remember which embedder built the index. Treat the embed-model identity as caller-managed state: always reconstruct the index with the same explicit embed_model that was used at index time, or fail loudly when re-loading. Read llamaindex-C-004 for the remedy.

CONSEQUENCE: Undefined behavior.

FATAL · domain_rule

WHEN: Persisting an index to disk or a vector store today for later re-load and query.

ACTION: At index time, write a sidecar file (e.g. {storage_dir}/embed_model.json) with {'provider_class': type(embed_model).__module__ + '.' + type(embed_model).__name__, 'model_name': getattr(embed_model, 'model_name', None), 'embed_dim': getattr(embed_model, 'embed_dim', None) or len(embed_model.get_text_embedding('probe'))}. At re-load, read the sidecar, compare it against Settings.embed_model (or the embed_model passed to load_index_from_storage), and raise EmbedModelMismatchError on any drift. Do not fall back to the new embedder.

CONSEQUENCE: Undefined behavior.
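A plain-Python sketch of that recipe. EmbedModelMismatchError and the helper names are illustrative; the functions only assume the embedder exposes the model_name / embed_dim / get_text_embedding attributes named in the ACTION above:

```python
import json
import os


class EmbedModelMismatchError(RuntimeError):
    """The embedder present at load time differs from the one that built the index."""


def _identity(embed_model):
    # The three fields named in the ACTION: class path, model name, vector dimension.
    return {
        "provider_class": type(embed_model).__module__ + "." + type(embed_model).__name__,
        "model_name": getattr(embed_model, "model_name", None),
        "embed_dim": getattr(embed_model, "embed_dim", None)
        or len(embed_model.get_text_embedding("probe")),
    }


def write_embed_sidecar(storage_dir, embed_model):
    # At index time: record the embedder identity next to the persisted index.
    with open(os.path.join(storage_dir, "embed_model.json"), "w") as f:
        json.dump(_identity(embed_model), f)


def check_embed_sidecar(storage_dir, embed_model):
    # At re-load time: fail loudly on any drift; never fall back to the new embedder.
    with open(os.path.join(storage_dir, "embed_model.json")) as f:
        expected = json.load(f)
    actual = _identity(embed_model)
    if actual != expected:
        raise EmbedModelMismatchError(f"expected {expected}, got {actual}")
```

In a real pipeline, write_embed_sidecar(persist_dir, Settings.embed_model) would run right after index.storage_context.persist(persist_dir), and check_embed_sidecar would run before load_index_from_storage.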



Changelog

v0.1.0 · 2026-04-25 · Contributors: tangweigang-jpg

v0.1.0: Initial release on Doramagic.ai. RAG framework on run-llama/llama_index with bilingual metadata, 52 anti-pattern constraints (5 fatal), and 3 FAQs.
