# https://github.com/microsoft/graphrag Project Manual

Generated at: 2026-06-22 18:05:16 UTC

## Table of Contents

- [GraphRAG Overview and Architecture](#page-1)
- [Indexing Pipeline, Data Flow & Incremental Updates](#page-2)
- [Query Engine and Search Methods](#page-3)
- [Configuration, LLM Integration, Storage & Extensibility](#page-4)

<a id='page-1'></a>

## GraphRAG Overview and Architecture

### Related Pages

Related topics: [Indexing Pipeline, Data Flow & Incremental Updates](#page-2), [Query Engine and Search Methods](#page-3), [Configuration, LLM Integration, Storage & Extensibility](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/microsoft/graphrag/blob/main/README.md)
- [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md)
- [packages/graphrag/graphrag/api/__init__.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/__init__.py)
- [packages/graphrag/graphrag/api/index.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/index.py)
- [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)
- [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py)
- [packages/graphrag/graphrag/cli/__init__.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/__init__.py)
- [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md)
- [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md)
- [packages/graphrag-llm/graphrag_llm/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-llm/graphrag_llm/README.md)
- [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md)
- [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md)
- [packages/graphrag-cache/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-cache/README.md)
- [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py)
</details>

# GraphRAG Overview and Architecture

## Purpose and Scope

GraphRAG is a data pipeline and transformation suite designed to extract meaningful, structured information from unstructured text using large language models (LLMs). The project implements a knowledge-graph–based memory layer that augments LLM reasoning over private datasets, as described in the upstream [Microsoft Research blog post](https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/) and the [GraphRAG arXiv paper](https://arxiv.org/pdf/2404.16130). Source: [README.md:1-9]().

The repository is a methodology demonstration, not an officially supported Microsoft product, and indexing is intentionally treated as an expensive operation that should be started on small data first. Source: [README.md:17-19](). The codebase is published as a monorepo of several Python packages (each with its own README) plus a `unified-search-app` demo that consumes the resulting index.

## Repository and Package Architecture

The monorepo separates concerns into narrowly-scoped libraries that can be composed at runtime. The top-level `graphrag` package exposes the user-facing API and CLI; the remaining packages provide pluggable, factory-based building blocks.

```mermaid
graph TB
    subgraph "graphrag (main package)"
        API[api: index, query, prompt_tune]
        CLI[cli: graphrag init/index/query]
        CFG[config: GraphRagConfig + load_config]
        IDX[index: run_pipeline, workflows]
    end

    subgraph "Supporting packages"
        CHUNK[graphrag-chunking]
        LLM[graphrag-llm: completion]
        STORE[graphrag-storage]
        CACHE[graphrag-cache]
        IN[graphrag-input]
        COMMON[graphrag-common: factory + config]
    end

    subgraph "Reference consumers"
        APP[unified-search-app]
    end

    API --> IDX
    API --> CFG
    CLI --> API
    IDX --> CHUNK
    IDX --> LLM
    IDX --> STORE
    IDX --> CACHE
    IDX --> IN
    API --> LLM
    APP --> STORE
    APP --> API
    COMMON -. provides .-> CHUNK
    COMMON -. provides .-> LLM
    COMMON -. provides .-> STORE
    COMMON -. provides .-> CACHE
```

Key architectural conventions:

- **Factory + DI pattern.** A shared `Factory` class in `graphrag-common` registers and resolves implementations by string strategy, with transient and singleton scopes. Source: [packages/graphrag-common/README.md:5-9]().
- **Config-driven setup.** `load_config` in `graphrag-common` auto-discovers YAML/JSON, performs environment variable substitution, and supports `.env` loading. Source: [packages/graphrag-common/README.md:13-15]().
- **Pluggable storage backends.** `graphrag-storage` registers `FileStorage`, `AzureBlobStorage`, `AzureCosmosStorage`, and `MemoryStorage`, with dynamic preregistration so unused providers are not imported. Source: [packages/graphrag-storage/README.md:43-55]().
- **Pluggable cache backends.** `graphrag-cache` ships `JsonCache`, `MemoryCache`, and `NoopCache` under the same factory model. Source: [packages/graphrag-cache/README.md:23-32]().
- **Pluggable input formats.** `graphrag-input` supports CSV, JSON, JSON Lines, plain text, and a `MarkItDown` converter that can ingest PDFs, Office files, HTML, etc. Source: [packages/graphrag-input/README.md:5-23]().
- **Pluggable chunking.** `graphrag-chunking` exposes `SentenceChunker`, `TokenChunker`, and a `create_chunker` factory keyed off a `ChunkingConfig` object. Source: [packages/graphrag-chunking/README.md:1-9]().
- **Pluggable LLM completion.** `graphrag-llm` provides a `create_completion(model_config)` function and a `ModelConfig` that abstracts over providers. Source: [packages/graphrag-llm/graphrag_llm/README.md:3-15]().
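
The factory convention in the first bullet can be illustrated with a small, self-contained sketch. This is not the `Factory` class from `graphrag-common`: the class body, the `register`/`create` method names, and the `ToyMemoryCache`/`ToyNoopCache` stand-ins are all assumptions made for illustration. It only shows what resolving by strategy string with transient and singleton scopes might look like.

```python
from typing import Any, Callable, Dict, Tuple


class ToyFactory:
    """Hypothetical registry that resolves implementations by strategy name."""

    def __init__(self) -> None:
        # strategy name -> (constructor, scope)
        self._registry: Dict[str, Tuple[Callable[..., Any], str]] = {}
        # instances already built for singleton-scoped strategies
        self._singletons: Dict[str, Any] = {}

    def register(
        self, strategy: str, ctor: Callable[..., Any], scope: str = "transient"
    ) -> None:
        if scope not in ("transient", "singleton"):
            raise ValueError(f"unknown scope: {scope}")
        self._registry[strategy] = (ctor, scope)

    def create(self, strategy: str, **kwargs: Any) -> Any:
        if strategy not in self._registry:
            raise KeyError(f"strategy not registered: {strategy}")
        ctor, scope = self._registry[strategy]
        if scope == "singleton":
            # Build once, then return the same object on every later call.
            if strategy not in self._singletons:
                self._singletons[strategy] = ctor(**kwargs)
            return self._singletons[strategy]
        # Transient scope: a fresh object per call.
        return ctor(**kwargs)


class ToyMemoryCache:
    """Stand-in for an in-memory cache backend (illustrative only)."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}


class ToyNoopCache:
    """Stand-in for a cache backend that stores nothing (illustrative only)."""


cache_factory = ToyFactory()
cache_factory.register("memory", ToyMemoryCache, scope="singleton")
cache_factory.register("none", ToyNoopCache)  # transient by default
```

With this sketch, call sites only ever mention the strategy string, so swapping a backend is a configuration change rather than a code change.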

## Core APIs and Pipelines

The public surface of the main package is concentrated in `graphrag.api`, with three entry points: indexing, query, and prompt tuning. Source: [packages/graphrag/graphrag/api/__init__.py:11-33]().

### Indexing API

`build_index(config, method, is_update_run, callbacks, input_documents, ...)` runs a pipeline under a `GraphRagConfig`, choosing an `IndexingMethod` (e.g., `Standard`) and a `PipelineFactory`-resolved workflow. Source: [packages/graphrag/graphrag/api/index.py:21-49](). The function returns a list of `PipelineRunResult` records, allowing callers to inspect per-stage output. The `is_update_run` flag is the existing hook for the highly requested incremental-indexing workflow tracked in community discussion #741 ("Incremental indexing (adding new content)"), which has 35 comments and is currently in the design stage. Source: [packages/graphrag/graphrag/api/index.py:30-33]().

### Query API

`graphrag.api.query` exposes six search entry points: `global_search`, `global_search_streaming`, `local_search`, `local_search_streaming`, `drift_search`, `drift_search_streaming`, plus `basic_search` variants re-exported from `__init__`. Source: [packages/graphrag/graphrag/api/query.py:1-19]() and [packages/graphrag/graphrag/api/__init__.py:23-29](). Internally these functions call `get_global_search_engine`, `get_local_search_engine`, `get_drift_search_engine`, and `get_basic_search_engine` from the `query.factory` module, then rehydrate the persisted index through `read_indexer_*` adapter helpers. Source: [packages/graphrag/graphrag/api/query.py:33-44](). The expected table names for those artifacts are codified in `unified-search-app/app/data_config.py` (`output/communities`, `output/community_reports`, `output/entities`, `output/relationships`, `output/covariates`, `output/text_units`). Source: [unified-search-app/app/data_config.py:6-21]().

### Prompt Tuning API

`generate_indexing_prompts` (in `graphrag.api.prompt_tune`) drives auto-templating: it loads sample documents, detects language and domain, infers entity types, and synthesizes extraction, summarization, community-report, and reporter-role prompts. Source: [packages/graphrag/graphrag/api/prompt_tune.py:11-39](). This API is explicitly marked as under development and not yet stable. Source: [packages/graphrag/graphrag/api/prompt_tune.py:9-11]().

### CLI Surface

The `graphrag` CLI is exported from `graphrag.cli` and is the recommended starting point. To initialize a project, run `graphrag init --root [path] --force`; rerun it between minor version bumps to pick up the latest config format. Source: [packages/graphrag/README.md:51-55]().

## Supporting Subsystems and Community-Driven Roadmap

Beyond the core APIs, several subsystems implement the storage, caching, and language-model abstractions that the indexer and query engines rely on:

- The `LLMCompletion` returned by `create_completion` is the abstraction used throughout the codebase; it returns either an `LLMCompletionResponse` or an `Iterator[LLMCompletionChunk]` for streaming, and `gather_completion_response` collapses both into a single string. Source: [packages/graphrag-llm/graphrag_llm/README.md:5-43]().
- The unified search app's data config defines reasonable defaults for downstream LLM use, including suggested follow-up questions and a 7-day Streamlit cache TTL, and notes that context-window settings should be tuned per model. Source: [unified-search-app/app/data_config.py:23-30]().
- The most recent release (v3.1.0) introduced a native `CosmosTableProvider` with namespace partitioning, transactional batch writes, and a simplified `AzureCosmosStorage`, plus a `litellm` dependency update that broadens indirect model-provider support. Source: [community release notes for v3.1.0]().
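
The first bullet describes collapsing a streamed or non-streamed completion into one string. A minimal sketch of that idea follows. The `ToyResponse` and `ToyChunk` types, their field names, and `gather_text` are hypothetical stand-ins; the real `LLMCompletionResponse`, `LLMCompletionChunk`, and `gather_completion_response` may be shaped differently.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class ToyResponse:
    """Stand-in for a complete, non-streamed completion."""

    content: str


@dataclass
class ToyChunk:
    """Stand-in for one piece of a streamed completion."""

    delta: str


def gather_text(result: Union[ToyResponse, Iterable[ToyChunk]]) -> str:
    """Return the full text whether the result was streamed or not."""
    if isinstance(result, ToyResponse):
        return result.content
    # Streamed case: concatenate the pieces in arrival order.
    return "".join(chunk.delta for chunk in result)


def fake_stream() -> Iterator[ToyChunk]:
    """Yield a fixed sequence of chunks, standing in for a model stream."""
    for piece in ("Graph", "RAG"):
        yield ToyChunk(delta=piece)
```

A helper of this shape lets calling code ignore whether streaming was enabled when it only needs the final text.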

Open community threads shape the near-term roadmap and are useful to know when planning an adoption:

- **Incremental indexing (#741, 35 comments).** Add new documents to an existing index without a full re-run; design is in progress.
- **Additional model providers (#657, 15 comments; #345, 29 comments for Ollama).** Native support beyond OpenAI/Azure is not planned by the core team; the `litellm` upgrade in v3.1.0 and community workarounds for Ollama remain the primary paths.
- **Cheaper triplet extraction (#632, 2 comments).** Interest in integrating Triplex for cost reduction relative to `gpt-4o`.
- **LazyGraphRAG (#1512, 44 comments).** The most-upvoted open question, awaiting a release announcement.

## See Also

- [Indexing Pipeline, Data Flow & Incremental Updates](#page-2)
- [Query Engine and Search Methods](#page-3)
- [Configuration, LLM Integration, Storage & Extensibility](#page-4)
- [Prompt Tuning Guide](https://microsoft.github.io/graphrag/prompt_tuning/overview/)
- GraphRAG Storage Backends (wiki)

---

<a id='page-2'></a>

## Indexing Pipeline, Data Flow & Incremental Updates

### Related Pages

Related topics: [GraphRAG Overview and Architecture](#page-1), [Query Engine and Search Methods](#page-3), [Configuration, LLM Integration, Storage & Extensibility](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/microsoft/graphrag/blob/main/README.md)
- [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md)
- [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md)
- [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md)
- [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md)
- [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py)
- [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py)
- [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)
- [packages/graphrag/graphrag/index/utils/__init__.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/index/utils/__init__.py)
- [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py)
- [unified-search-app/app/knowledge_loader/data_sources/loader.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/data_sources/loader.py)
- [unified-search-app/README.md](https://github.com/microsoft/graphrag/blob/main/unified-search-app/README.md)
</details>

# Indexing Pipeline, Data Flow & Incremental Updates

## Overview and Purpose

GraphRAG's indexing pipeline is the data transformation suite that converts unstructured text into a structured knowledge graph plus derived artifacts (entities, relationships, communities, community reports, embeddings, and covariates). The repository positions this suite as "a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs" [README.md](https://github.com/microsoft/graphrag/blob/main/README.md). The system warns users that "GraphRAG indexing can be an expensive operation, please read all of the documentation to understand the process and costs involved, and start small" [README.md](https://github.com/microsoft/graphrag/blob/main/README.md).

The pipeline is composed of modular Python packages:

- `packages/graphrag-input` — loaders that ingest source documents from disk or blob storage, with an optional `markitdown` converter for formats such as PDF [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md).
- `packages/graphrag-chunking` — text splitters (sentence, token, factory-based) that produce text units [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md).
- `packages/graphrag` — the core library, including `graphrag.api.prompt_tune`, `graphrag.api.query`, the CLI (`graphrag.cli.prompt_tune`, `graphrag.cli.index`), and the runnable pipeline runner [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md).
- `packages/graphrag-common` — shared infrastructure providing the `Factory` dependency-injection pattern and the `load_config` system that parses YAML/JSON with Pydantic, environment-variable substitution, and `.env` loading [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md).
- `unified-search-app` — a Streamlit reference application that consumes the produced parquet outputs to expose search and community exploration [unified-search-app/README.md](https://github.com/microsoft/graphrag/blob/main/unified-search-app/README.md).

The pipeline writes its final results as parquet tables under well-known paths consumed downstream by the query engine and the search app: `output/communities`, `output/community_reports`, `output/entities`, `output/relationships`, `output/covariates`, and `output/text_units` [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py).
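
Before querying, it can be useful to check that a run produced every expected table. The sketch below does this with the standard library only. The `.parquet` suffix, the `root/output/<name>` layout, and the helper names `locate_tables` and `missing_tables` are assumptions for illustration; only the six table names come from the text above.

```python
from pathlib import Path
from typing import Dict, List

# Table names as listed in the paragraph above.
EXPECTED_TABLES: List[str] = [
    "communities",
    "community_reports",
    "entities",
    "relationships",
    "covariates",
    "text_units",
]


def locate_tables(root: Path, suffix: str = ".parquet") -> Dict[str, Path]:
    """Map each expected table that exists under root/output to its path."""
    found: Dict[str, Path] = {}
    for name in EXPECTED_TABLES:
        candidate = root / "output" / f"{name}{suffix}"
        if candidate.is_file():
            found[name] = candidate
    return found


def missing_tables(root: Path, suffix: str = ".parquet") -> List[str]:
    """List the expected tables that are absent under root/output."""
    found = locate_tables(root, suffix)
    return [name for name in EXPECTED_TABLES if name not in found]
```

Since covariates are described elsewhere in this manual as optional, a caller would probably treat a missing `covariates` table as a warning rather than an error.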

## Data Flow Stages

The runtime flow from raw documents to queryable index follows a five-stage pipeline, each stage producing artifacts that the next stage consumes.

```mermaid
flowchart LR
    A[Input Loader<br/>graphrag-input] --> B[Chunking<br/>graphrag-chunking]
    B --> C[Graph Extraction<br/>LLM: entities/relationships]
    C --> D[Community Detection<br/>+ Report Generation]
    D --> E[Embeddings & Covariates]
    E --> F[(Parquet Outputs<br/>text_units, entities,<br/>relationships, communities,<br/>community_reports, covariates)]
    F --> G[Query / Search App<br/>graphrag.api.query]
```

Key behaviors observed in the source:

- **Input** — loaders read raw files according to a configured `input.type` (for example, `markitdown` with a file pattern such as `".*\\.pdf$$"`) and an `input_storage` block describing where the input lives (e.g. local `type: file`, `base_dir: input`) [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md). The unified-search-app's `create_datasource` switches between `BlobDatasource` and `LocalDatasource` based on whether `blob_account_name` is set, demonstrating the same pluggable strategy used inside the indexing CLI [unified-search-app/app/knowledge_loader/data_sources/loader.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/data_sources/loader.py).
- **Chunking** — the `ChunkingConfig` selects a strategy via `create_chunker`, with `SentenceChunker` for boundary detection and `TokenChunker` for fixed-size windows with overlap [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md). During prompt tuning, the chunking overrides are read from the loaded graph config: `if chunk_size != graph_config.chunking.size: graph_config.chunking.size = chunk_size` and the same pattern is used for overlap [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py).
- **Prompt Tuning (optional pre-pass)** — `generate_indexing_prompts` chunks a sample of documents, derives a domain and persona from the LLM if not supplied, and returns the entity-extraction, entity-summarization, and community-summarization prompts that downstream index stages will use [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py). The CLI mirrors this API, writing logs to `prompt-tuning.log` and honoring overrides for `chunk_size`, `overlap`, `limit`, `selection_method`, `domain`, `language`, `max_tokens`, `discover_entity_types`, and `min_examples_required` [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py).
- **Graph Extraction** — text units are sent to the LLM with the tuned prompts to produce entities, relationships, claims/covariates, and descriptions. This stage is the dominant cost driver and is what makes GraphRAG "an expensive operation" [README.md](https://github.com/microsoft/graphrag/blob/main/README.md).
- **Community Detection and Reporting** — the Leiden algorithm produces a hierarchy of communities; an LLM-driven reporter then generates per-community summaries that the global search engine consumes [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py).
- **Outputs** — the canonical tables listed above are persisted and re-read by `local_search` and `global_search` via DataFrame parameters (`entities`, `relationships`, `text_units`, `community_reports`, `covariates`, `communities`) [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py). The unified-search-app's UI then renders these as citations, hyperlinking entity/relationship IDs back to source text units [unified-search-app/app/ui/search.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/ui/search.py).
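
The "fixed-size windows with overlap" behavior mentioned in the chunking bullet can be sketched as follows. This is not the `TokenChunker` implementation: `chunk_tokens` is a hypothetical helper, and whitespace-separated words stand in for real tokenizer output.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            # This window already reaches the end; stop to avoid a
            # trailing chunk that only repeats overlapped tokens.
            break
    return chunks


words = "a b c d e f g".split()
windows = chunk_tokens(words, size=4, overlap=2)
```

A larger overlap preserves more context across chunk boundaries at the cost of more chunks, and therefore more LLM calls in the extraction stage.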

## Incremental Updates: Current State

Incremental indexing (the ability to add new documents to an existing index without rebuilding from scratch) is one of the most engaged community topics, tracked in issue [#741 "Incremental indexing (adding new content)"](https://github.com/microsoft/graphrag/issues/741). As of v3.1.0, the maintainers state that the feature is "in the design stages" and provide a manual workaround. `build_index` accepts an `is_update_run` flag as a hook for this workflow (see the Overview page), but the current repository architecture still assumes a full re-run for the parquet outputs consumed by `graphrag.api.query` [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py).

What users can do today (this still ends in a full re-run):

1. Append new files to the input directory configured via `input_storage` (local or blob) [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md).
2. Re-run the full pipeline; the loaders will pick up the new files based on the configured file pattern (e.g. `".*\\.pdf$$"`) [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md).
3. Swap in a different output storage backend. The v3.1.0 release notes call out a "Native CosmosTableProvider with namespace partitioning, transactional batch writes, and simplified AzureCosmosStorage", which makes it easier to persist index artifacts in Azure Cosmos and treat each pipeline run as a partitioned namespace; this could serve as a foundation for future incremental runs.

What is not yet first-class:

- There is no `incremental` flag or delta-detection step in the chunking or graph extraction APIs visible in `graphrag-chunking` or `graphrag.api.prompt_tune` [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py).
- Community detection and report generation re-derive the full hierarchy on each run; merging communities across runs is not implemented in the `graphrag.api.query` interface [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py).
- The unified-search-app reads outputs via `create_datasource`, which selects either `BlobDatasource` or `LocalDatasource` based on environment, but does not perform any merge or diff of prior and new parquet outputs [unified-search-app/app/knowledge_loader/data_sources/loader.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/data_sources/loader.py).

Until incremental indexing ships, the recommended operational pattern is to version output directories per run and treat each run as immutable.
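
One way to follow this pattern is to give every run its own timestamped directory and never write into an existing one. The sketch below assumes a local filesystem; `new_run_dir`, `latest_run_dir`, and the timestamp naming scheme are hypothetical and are not part of GraphRAG.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def new_run_dir(base: Path, now: Optional[datetime] = None) -> Path:
    """Create a fresh, timestamp-named directory for one indexing run.

    A caller-supplied `now` is expected to be in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    run_dir = base / now.strftime("%Y%m%dT%H%M%SZ")
    # exist_ok=False makes reuse of an existing run directory an error,
    # which keeps earlier runs immutable.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def latest_run_dir(base: Path) -> Optional[Path]:
    """Return the most recent run directory, or None if there is none."""
    if not base.is_dir():
        return None
    # The timestamp format sorts lexicographically in chronological order.
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    return runs[-1] if runs else None
```

The query side can then be pointed at `latest_run_dir(...)`, and rolling back to an earlier index is a matter of pointing it at an older directory.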

## Configuration, CLI, and Extensibility

All pipeline behavior is driven by `settings.yaml`, parsed through `load_config` which "automatically discovers and parses YAML/JSON config files into Pydantic models with support for environment variable substitution and .env file loading" [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md). Strategies (chunkers, model providers, storage backends) are registered through the `Factory` class with transient or singleton scope, allowing new implementations to be plugged in without changing call sites [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md).
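
Environment-variable substitution of the kind described above can be sketched with the standard library. This is an assumption-laden illustration, not the `load_config` implementation: it uses `string.Template` with `${VAR}` placeholders, parses JSON instead of YAML to stay dependency-free, and skips Pydantic validation. `load_config_text` and `TOY_API_KEY` are hypothetical names.

```python
import json
from string import Template
from typing import Any, Dict, Mapping


def load_config_text(raw: str, env: Mapping[str, str]) -> Dict[str, Any]:
    """Substitute ${VAR} placeholders from `env`, then parse the JSON."""
    # substitute() raises KeyError for an undefined variable rather than
    # silently leaving the placeholder in place.
    substituted = Template(raw).substitute(env)
    return json.loads(substituted)


# "$$" is Template's escape for a literal "$".
RAW_CONFIG = r'{"api_key": "${TOY_API_KEY}", "file_pattern": ".*\\.pdf$$"}'
config = load_config_text(RAW_CONFIG, {"TOY_API_KEY": "secret"})
```

Under this assumption, a doubled `$$` in a value (as in the file pattern `".*\\.pdf$$"` quoted elsewhere in this manual) comes out as a single literal `$`. Whether the real loader behaves the same way is not confirmed by the sources cited here.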

The prompt-tuning CLI explicitly honors per-invocation overrides for chunking parameters before delegating to the API, illustrating how users can experiment without rewriting the config file [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py). Community demand for non-OpenAI/Azure providers (issue [#657](https://github.com/microsoft/graphrag/issues/657), with [#345](https://github.com/microsoft/graphrag/issues/345) focused on Ollama) maps onto this same factory mechanism: in principle, a new model provider is added by registering a strategy string with the `Factory` from `graphrag-common` rather than by patching core code [packages/graphrag-common/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-common/README.md).
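
The per-invocation override pattern can be sketched as below. `ToyChunking`, `ToyGraphConfig`, and `apply_overrides` are hypothetical stand-ins for the real config classes and CLI code, and the default values are arbitrary placeholders rather than library defaults.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyChunking:
    size: int = 200     # placeholder value, not a GraphRAG default
    overlap: int = 20   # placeholder value, not a GraphRAG default


@dataclass
class ToyGraphConfig:
    chunking: ToyChunking = field(default_factory=ToyChunking)


def apply_overrides(
    config: ToyGraphConfig,
    chunk_size: Optional[int] = None,
    overlap: Optional[int] = None,
) -> ToyGraphConfig:
    """Apply command-line values on top of the loaded config, in memory only."""
    if chunk_size is not None and chunk_size != config.chunking.size:
        config.chunking.size = chunk_size
    if overlap is not None and overlap != config.chunking.overlap:
        config.chunking.overlap = overlap
    return config


tuned = apply_overrides(ToyGraphConfig(), chunk_size=300)
```

Because the override mutates only the in-memory object, the settings file on disk stays unchanged between experiments.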

Downstream, the Streamlit unified-search-app renders citation tables for each context type (`sources`, `reports`, `entities`, `relationships`, `covariates`) by reading the parquet outputs the indexing pipeline produces, making the pipeline's contract with the query layer explicit and stable [unified-search-app/app/ui/search.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/ui/search.py).

## See Also

- [Prompt Tuning Guide](https://microsoft.github.io/graphrag/prompt_tuning/overview/)
- Related wiki pages: Query APIs & Search Modes, Configuration Reference, Chunking Strategies
- Community discussions: [Incremental indexing #741](https://github.com/microsoft/graphrag/issues/741), [Non-OpenAI providers #657](https://github.com/microsoft/graphrag/issues/657), [Ollama support #345](https://github.com/microsoft/graphrag/issues/345), [LazyGraphRAG #1512](https://github.com/microsoft/graphrag/issues/1512)

---

<a id='page-3'></a>

## Query Engine and Search Methods

### Related Pages

Related topics: [GraphRAG Overview and Architecture](#page-1), [Indexing Pipeline, Data Flow & Incremental Updates](#page-2), [Configuration, LLM Integration, Storage & Extensibility](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)
- [packages/graphrag/graphrag/cli/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/query.py)
- [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py)
- [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py)
- [packages/graphrag/graphrag/query/input/loaders/dfs.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/query/input/loaders/dfs.py)
- [unified-search-app/app/home_page.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/home_page.py)
- [unified-search-app/app/ui/search.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/ui/search.py)
- [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py)
- [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md)
- [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md)
</details>

# Query Engine and Search Methods

## Overview

The Query Engine is the retrieval layer of Microsoft GraphRAG. After the indexer produces a knowledge graph (entities, relationships, communities, community reports, text units, and optional covariates), the query engine consumes those parquet outputs and returns natural-language answers grounded in the graph. The module exposes a public API and a CLI, and is also embedded inside the Streamlit-based `unified-search-app` for interactive exploration.

The module's docstring states it "provides access to the query engine of graphrag, allowing external applications to hook into graphrag and run queries over a knowledge graph" and warns that "this API is under development and may undergo changes in future releases. Backwards compatibility is not guaranteed at this time" ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)). Because backwards compatibility is not guaranteed, treat the signatures described here as subject to change between releases.

## Search Methods

The query engine implements multiple search strategies, each suited to a different question type. They are assembled through a factory in `graphrag.query.factory` (referenced as `get_basic_search_engine`, `get_drift_search_engine`, `get_global_search_engine`, and `get_local_search_engine` in the API module) and selected via the CLI's `--method` argument.

- **Local Search** — entity-centric retrieval. Uses entities, their text-unit neighborhoods, relationships, and covariates to answer questions about specific people, places, or concepts. The API signature requires `entities`, `communities`, `community_reports`, `text_units`, `relationships`, `community_level`, and `response_type` ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)).
- **Global Search** — map-reduce over community reports. The engine distributes the query across many community summaries and consolidates partial answers into a single response. A `dynamic_community_selection` flag enables runtime selection of communities, capped by `community_level` ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)).
- **DRIFT Search** — a dynamic variant that combines local and global reasoning by introducing exploratory sub-queries; useful for comparative or "why/how" questions. The CLI exposes it as a distinct `--method` value ([packages/graphrag/graphrag/cli/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/query.py)).
- **Basic Search** — text-unit-only retrieval, lightweight, with no graph traversal ([packages/graphrag/graphrag/cli/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/query.py)).
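
The map-reduce shape of Global Search can be sketched without an LLM. In the sketch below, `map_reduce_search`, `PartialAnswer`, and the keyword-counting `keyword_map` are hypothetical stand-ins: in the real engine the map and reduce steps are LLM calls over community reports, and scoring, filtering, and prompt construction are more involved.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PartialAnswer:
    text: str
    score: int  # higher means more relevant to the query


def map_reduce_search(
    query: str,
    reports: List[str],
    map_fn: Callable[[str, str], PartialAnswer],
    reduce_fn: Callable[[str, List[PartialAnswer]], str],
    top_k: int = 2,
) -> str:
    # Map: ask the same question of every community report independently.
    partials = [map_fn(query, report) for report in reports]
    # Drop reports that contributed nothing, then keep the best few.
    useful = [p for p in partials if p.score > 0]
    ranked = sorted(useful, key=lambda p: p.score, reverse=True)[:top_k]
    # Reduce: consolidate the surviving partial answers into one response.
    return reduce_fn(query, ranked)


def keyword_map(query: str, report: str) -> PartialAnswer:
    """Toy map step: score a report by how many query words it contains."""
    terms = set(query.lower().split())
    hits = sum(1 for word in report.lower().split() if word in terms)
    return PartialAnswer(text=report, score=hits)


def join_reduce(query: str, partials: List[PartialAnswer]) -> str:
    """Toy reduce step: concatenate the partial answers."""
    return " | ".join(p.text for p in partials)
```

Because each map call is independent, the map step parallelizes naturally, which is part of why this strategy suits questions that span the whole corpus.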

Every method has both a blocking variant (for example `global_search`, `local_search`) and a streaming variant (for example `global_search_streaming`, `local_search_streaming`) that yields chunks via an `AsyncGenerator` ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)).
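
The relationship between the two variants can be sketched with plain `asyncio`. The functions `toy_search_streaming` and `toy_search` are hypothetical and do not reproduce the real signatures or return types; the sketch only shows a blocking call built on top of an async generator.

```python
import asyncio
from typing import AsyncGenerator


async def toy_search_streaming(query: str) -> AsyncGenerator[str, None]:
    """Yield the answer piece by piece, as a streaming search would."""
    for piece in ("Answer to ", repr(query), "."):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield piece


async def toy_search(query: str) -> str:
    """Blocking-style variant: drain the stream and return the full text."""
    parts = [chunk async for chunk in toy_search_streaming(query)]
    return "".join(parts)


answer = asyncio.run(toy_search("who"))
```

A UI would typically consume the streaming variant with `async for` to render text as it arrives, while batch scripts can use the blocking variant.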

```mermaid
flowchart LR
    A[User Query] --> B{Method}
    B -->|local| C[Local Search Engine]
    B -->|global| D[Global Search Engine]
    B -->|drift| E[DRIFT Search Engine]
    B -->|basic| F[Basic Search Engine]
    C --> G[Index Artifacts]
    D --> G
    E --> G
    F --> G
    G --> H[Response + Context]
```

## API and CLI Usage

The API functions take a `GraphRagConfig` (loaded from `settings.yaml`) plus the relevant pandas `DataFrame`s. `_resolve_output_files` in the CLI is responsible for loading the parquet outputs required by a given method ([packages/graphrag/graphrag/cli/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/query.py)). Records are normalized into typed objects via `read_indexer_entities`, `read_indexer_relationships`, `read_indexer_text_units`, `read_indexer_reports`, `read_indexer_report_embeddings`, `read_indexer_communities`, and `read_indexer_covariates` ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)). `Entity` and `Relationship` builders accept configurable column names, so custom indexers can be plugged in by remapping columns ([packages/graphrag/graphrag/query/input/loaders/dfs.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/query/input/loaders/dfs.py)).
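
Column remapping of this kind can be sketched as follows. `ToyEntity` and `read_entities` are hypothetical: the sketch uses plain dictionaries in place of DataFrame rows, and the parameter names (`id_col`, `title_col`, `description_col`) are assumptions rather than the real `read_indexer_*` signatures.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ToyEntity:
    id: str
    title: str
    description: Optional[str] = None


def read_entities(
    rows: Iterable[Dict[str, object]],
    id_col: str = "id",
    title_col: str = "title",
    description_col: Optional[str] = "description",
) -> List[ToyEntity]:
    """Build typed entities from rows, using caller-chosen column names."""
    entities: List[ToyEntity] = []
    for row in rows:
        # A missing or disabled description column yields None.
        description = row.get(description_col) if description_col else None
        entities.append(
            ToyEntity(
                id=str(row[id_col]),
                title=str(row[title_col]),
                description=None if description is None else str(description),
            )
        )
    return entities


custom_rows = [{"node_id": 7, "name": "Alice", "summary": "A person"}]
entities = read_entities(
    custom_rows, id_col="node_id", title_col="name", description_col="summary"
)
```

An indexer that emits differently named columns can then feed the same query code by passing a different mapping instead of renaming its outputs.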

On the CLI side, `graphrag query --method <local|global|drift|basic>` is the entry point. Each method has a dedicated runner (`run_global_search`, `run_local_search`, `run_drift_search`, `run_basic_search`) and the streaming path is triggered with `--streaming`. Response style is controlled by `--response-type` (e.g., `multiple_paragraphs`, `single_paragraph`, `prioritized_list`) ([packages/graphrag/graphrag/cli/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/query.py)).
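
The dispatch from `--method` to a dedicated runner can be sketched with `argparse`. This is a stand-in, not the real CLI: the positional `query` argument, the `RUNNERS` table, and the string-returning runners are assumptions, and only the `--method` values and the `--streaming` flag come from the description above.

```python
import argparse
from typing import Callable, Dict, List, Optional

Runner = Callable[[str, bool], str]


def _make_runner(name: str) -> Runner:
    """Build a toy runner that just reports what it was asked to do."""
    def runner(query: str, streaming: bool) -> str:
        return f"{name}:{query}:{streaming}"
    return runner


# One runner per search method, keyed by the --method value.
RUNNERS: Dict[str, Runner] = {
    name: _make_runner(name) for name in ("local", "global", "drift", "basic")
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="toy-query")
    parser.add_argument("query")
    parser.add_argument("--method", choices=sorted(RUNNERS), required=True)
    parser.add_argument("--streaming", action="store_true")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    # Look up the runner for the chosen method and delegate to it.
    return RUNNERS[args.method](args.query, args.streaming)
```

Keeping the method-to-runner mapping in one table means a new search method needs one new entry rather than another branch in the command handler.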

## Prompt Tuning and the Unified Search App

Because tuned prompts feed the indexing stages, users typically run `graphrag prompt-tune` before building the index that the query engine later reads. It uses the same `GraphRagConfig` to load chunks and produce entity-extraction, entity-summarization, and community-summarization prompts ([packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py)). The CLI override pattern lets the tuning run inject chunk-size and overlap adjustments into the loaded config ([packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py)).

The `unified-search-app` is a Streamlit reference consumer of the query engine. `home_page.py` wires search buttons to `run_all_searches` and `run_generate_questions` ([unified-search-app/app/home_page.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/home_page.py)). `ui/search.py` renders per-method responses, token usage, LLM call counts, and a Citations panel that lists the entities, relationships, reports, and source chunks the engine consumed ([unified-search-app/app/ui/search.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/ui/search.py)). The expected parquet paths for the app live in `app/data_config.py` (e.g., `output/communities`, `output/community_reports`, `output/entities`, `output/relationships`, `output/covariates`, `output/text_units`) ([unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py)).

## Configuration and Known Constraints

Several practical constraints surface from the source and from community discussion:

- **Model providers.** The API and CLI instantiate completion and embedding models through the `GraphRagConfig`, which natively targets OpenAI and Azure. Community requests for additional providers (Ollama, other SLMs, custom endpoints) are tracked but not yet supported in-tree ([issue #657](https://github.com/microsoft/graphrag/issues/657), [issue #345](https://github.com/microsoft/graphrag/issues/345)).
- **Cheaper extraction.** Triplex-style models have been proposed to lower extraction cost during indexing; this affects the indexer rather than the query engine, but the engine consumes the result ([issue #632](https://github.com/microsoft/graphrag/issues/632)).
- **LazyGraphRAG.** A deferred-evaluation variant has been requested; once shipped it would likely plug in alongside the existing factory methods ([issue #1512](https://github.com/microsoft/graphrag/issues/1512)).
- **Incremental indexing.** Adding new documents to an existing index currently requires a re-run; the engine is unaffected, but the artifacts it loads would need to be regenerated or extended ([issue #741](https://github.com/microsoft/graphrag/issues/741)).
- **Data layout.** Because the engine loads from parquet, downstream tools must respect the column conventions expected by `read_indexer_*` helpers; mismatches will fail at load time ([packages/graphrag/graphrag/query/input/loaders/dfs.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/query/input/loaders/dfs.py)).
- **API stability.** The module docstring explicitly warns that "backwards compatibility is not guaranteed at this time", so pin versions and avoid coupling external code to internal helper signatures ([packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)).
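For the data-layout constraint, a downstream tool can check column names before handing tables to the loaders, so that a mismatch produces a readable message rather than a failure deep inside loading. The sketch below is a generic pre-flight check; the `REQUIRED` column lists are hypothetical examples and are not taken from the `read_indexer_*` helpers.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in sorted order."""
    return sorted(set(required) - set(available))


def check_table(name, available, required):
    """Raise ValueError naming every required column the table lacks."""
    missing = missing_columns(available, required)
    if missing:
        raise ValueError(f"table '{name}' is missing columns: {', '.join(missing)}")


# Hypothetical requirements, for illustration only.
REQUIRED = {
    "entities": ["id", "title", "description"],
    "relationships": ["id", "source", "target"],
}

# Extra columns are fine; only missing ones are reported.
check_table("entities", ["id", "title", "description", "degree"], REQUIRED["entities"])

try:
    check_table("relationships", ["id", "source"], REQUIRED["relationships"])
except ValueError as err:
    error_message = str(err)
```

With a pandas `DataFrame`, the `available` argument would simply be `df.columns`.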

## See Also

- Indexing and Prompt Tuning pipeline
- Configuration reference (`settings.yaml` and `GraphRagConfig`)
- `unified-search-app` user guide
- Chunking strategies (`packages/graphrag-chunking`)
- Input loaders (`packages/graphrag-input`)

---

<a id='page-4'></a>

## Configuration, LLM Integration, Storage & Extensibility

### Related Pages

Related topics: [GraphRAG Overview and Architecture](#page-1), [Indexing Pipeline, Data Flow & Incremental Updates](#page-2), [Query Engine and Search Methods](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md)
- [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md)
- [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md)
- [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md)
- [packages/graphrag/graphrag/config/models/__init__.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/config/models/__init__.py)
- [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py)
- [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)
- [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py)
- [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py)
- [unified-search-app/app/knowledge_loader/__init__.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/__init__.py)
</details>

## Overview

GraphRAG is a data pipeline and transformation suite designed to extract meaningful, structured data from unstructured text using LLMs. Source: [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md). Underneath the indexing and query APIs sit four foundational subsystems that determine how the project is configured, how it talks to language models, where it persists intermediate artifacts, and how third parties can plug in new behavior. These subsystems (configuration, LLM integration, storage, and extensibility) are the primary surfaces users customize when adapting GraphRAG to their own data and infrastructure.

The configuration layer is built on Pydantic-style typed config models exposed under `graphrag.config.models`. Source: [packages/graphrag/graphrag/config/models/__init__.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/config/models/__init__.py). Configuration is loaded via `load_config`, with `settings.yaml` as the canonical user-facing configuration file.

## Configuration System

The `GraphRagConfig` model is the central object that drives indexing, prompt tuning, and query workflows. Every public API accepts a `GraphRagConfig` instance and reads model, storage, chunking, and input settings from it. Source: [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py).

The `prompt_tune` API shows the typical usage pattern: the configuration is loaded, an LLM is instantiated via `create_completion(default_llm_settings)`, and downstream operations are configured against the typed model. Source: [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py). The CLI counterpart in `graphrag.cli.prompt_tune` calls `load_config(root_dir=root)` and allows runtime overrides such as `chunk_size` and `chunking.overlap` before invoking the prompt-tuning pipeline. Source: [packages/graphrag/graphrag/cli/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/cli/prompt_tune.py).
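The load-then-override pattern can be sketched without GraphRAG itself. In the example below, `Settings`, `ChunkingSettings`, and `apply_overrides` are hypothetical stand-ins for the typed config and the CLI override step, and the default sizes are placeholders rather than GraphRAG defaults.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ChunkingSettings:
    size: int = 1000    # placeholder default, not a GraphRAG value
    overlap: int = 50   # placeholder default, not a GraphRAG value


@dataclass(frozen=True)
class Settings:
    """Minimal stand-in for a typed, loaded configuration object."""

    chunking: ChunkingSettings = field(default_factory=ChunkingSettings)


def apply_overrides(config, chunk_size=None, overlap=None):
    """Return a copy of config with only the explicitly supplied overrides applied."""
    changes = {}
    if chunk_size is not None:
        changes["size"] = chunk_size
    if overlap is not None:
        changes["overlap"] = overlap
    if not changes:
        return config
    return replace(config, chunking=replace(config.chunking, **changes))


loaded = Settings()  # stands in for the result of loading settings.yaml
tuned = apply_overrides(loaded, chunk_size=512)
```

Using `None` as "not supplied" means a flag the user omits never overwrites a value from `settings.yaml`, and returning a copy leaves the loaded configuration untouched for other callers.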

The unified search application uses a parallel `data_config.py` module that defines table names for downstream artifacts (`output/communities`, `output/community_reports`, `output/entities`, `output/relationships`, `output/covariates`, `output/text_units`). Source: [unified-search-app/app/data_config.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/data_config.py). This reflects how a built index is consumed at query time and how output artifacts are addressed independently of storage backend.
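As a rough illustration of addressing artifacts by table name, the sketch below resolves the table names listed above to file paths under a dataset root. The `TABLES` dictionary, the `table_path` helper, the dataset root, and the `.parquet` suffix handling are assumptions for this example; the actual constants and resolution logic in `data_config.py` may differ.

```python
from pathlib import PurePosixPath

# Table names as listed above; the dictionary keys are illustrative.
TABLES = {
    "communities": "output/communities",
    "community_reports": "output/community_reports",
    "entities": "output/entities",
    "relationships": "output/relationships",
    "covariates": "output/covariates",
    "text_units": "output/text_units",
}


def table_path(dataset_root, table, extension=".parquet"):
    """Resolve a logical table name to a path under the dataset root."""
    if table not in TABLES:
        raise KeyError(f"unknown table: {table}")
    return str(PurePosixPath(dataset_root) / f"{TABLES[table]}{extension}")


entities_path = table_path("datasets/my-index", "entities")
```

Because callers only ever name the logical table, the same lookup can sit in front of a local directory or a blob container prefix.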

## LLM Integration

Native LLM support in GraphRAG is implemented through completion-model configuration objects and a `create_completion` factory. The prompt-tuning API explicitly retrieves the model via `config.get_completion_model_config(PROMPT_TUNING_MODEL_ID)` and instantiates the model with `create_completion(...)`. Source: [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py).

A second completion model is retrieved for graph extraction when `discover_entity_types` is enabled: `config.get_completion_model_config(config.extract_graph.completion_model_id)`. Source: [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py). This separation lets users run a cheaper model for prompt tuning while keeping a stronger model for entity/relationship extraction.

At query time, both the global (`global_search`) and local (`local_search`) APIs in `graphrag.api.query` accept the same `GraphRagConfig`, ensuring a consistent model-selection surface across indexing and retrieval. Source: [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py).
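The idea of selecting models by ID from one shared configuration can be sketched as below. This is a simplified model, not GraphRAG code: `SketchConfig`, `ModelConfig`, the ID strings, and the model names are all hypothetical, and only the method name `get_completion_model_config` mirrors the call quoted above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model: str


@dataclass
class SketchConfig:
    """Stand-in config that keeps completion models in a dictionary keyed by ID."""

    completion_models: dict = field(default_factory=dict)
    extract_graph_model_id: str = "extraction_model"  # hypothetical ID

    def get_completion_model_config(self, model_id):
        try:
            return self.completion_models[model_id]
        except KeyError:
            raise KeyError(f"no completion model configured with id '{model_id}'") from None


PROMPT_TUNING_MODEL_ID = "prompt_tuning_model"  # hypothetical value

config = SketchConfig(
    completion_models={
        "prompt_tuning_model": ModelConfig(provider="example", model="small-model"),
        "extraction_model": ModelConfig(provider="example", model="large-model"),
    }
)

# Two workflows read different models from the same configuration object.
tuning_model = config.get_completion_model_config(PROMPT_TUNING_MODEL_ID)
extraction_model = config.get_completion_model_config(config.extract_graph_model_id)
```

Referring to models by ID rather than embedding their settings in each workflow is what allows a cheaper model for one step and a stronger model for another without touching workflow code.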

## Storage Architecture

The `graphrag-storage` package provides a unified storage abstraction. By default the `create_storage` factory ships with four preregistered providers corresponding to a `StorageType` enum. Source: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md).

```mermaid
graph LR
    Config["GraphRagConfig"] --> Factory["create_storage / storage_factory"]
    Factory --> FS["FileStorage"]
    Factory --> ABS["AzureBlobStorage"]
    Factory --> ACS["AzureCosmosStorage"]
    Factory --> MS["MemoryStorage"]
    User["User-defined Storage subclass"] -.register.-> Factory
```

Registration is dynamic: `FileStorage`, for example, is only imported when requested. Users can bypass preregistration by importing `storage_factory` directly for a clean factory. Source: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md). The v3.1.0 release notes describe a native `CosmosTableProvider` with namespace partitioning, transactional batch writes, and a simplified `AzureCosmosStorage`, indicating that the storage layer is actively evolving toward richer table semantics. Source: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md).

| Provider | Typical Use | Notes from source |
| --- | --- | --- |
| `FileStorage` | Local development | Default; lazily imported |
| `AzureBlobStorage` | Cloud blob persistence | Pre-registered |
| `AzureCosmosStorage` | Cosmos DB-backed storage | Simplified in v3.1.0; adds table-provider semantics |
| `MemoryStorage` | Tests / ephemeral pipelines | Pre-registered |
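The factory-with-registration pattern behind these providers can be sketched in a few lines. The code below is a generic illustration and does not reproduce the `graphrag-storage` API: the class and function names echo those above, but the signatures, the string keys, and the `PrefixedStorage` subclass are assumptions.

```python
class Storage:
    """Minimal base interface for this sketch."""

    def get(self, key):
        raise NotImplementedError

    def set(self, key, value):
        raise NotImplementedError


class MemoryStorage(Storage):
    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value


class StorageFactory:
    """Registry that maps a storage type name to a builder callable."""

    def __init__(self):
        self._builders = {}

    def is_registered(self, storage_type):
        return storage_type in self._builders

    def register(self, storage_type, builder):
        self._builders[storage_type] = builder

    def create(self, storage_type, **kwargs):
        if storage_type not in self._builders:
            raise ValueError(f"unknown storage type: {storage_type}")
        return self._builders[storage_type](**kwargs)


storage_factory = StorageFactory()  # a "clean" factory: nothing preregistered


def create_storage(storage_type, **kwargs):
    """Convenience entry point that registers built-ins on first request."""
    if storage_type == "memory" and not storage_factory.is_registered("memory"):
        storage_factory.register("memory", MemoryStorage)
    return storage_factory.create(storage_type, **kwargs)


class PrefixedStorage(MemoryStorage):
    """Hypothetical user-defined provider that namespaces every key."""

    def __init__(self, prefix):
        super().__init__()
        self._prefix = prefix

    def get(self, key):
        return super().get(f"{self._prefix}/{key}")

    def set(self, key, value):
        super().set(f"{self._prefix}/{key}", value)


storage_factory.register("prefixed", PrefixedStorage)
custom = create_storage("prefixed", prefix="tenant-a")
custom.set("doc", "hello")
builtin = create_storage("memory")
```

Deferring registration until a provider is requested mirrors the lazy-import behavior described above: optional dependencies are only touched by the backends that need them.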

## Extensibility

GraphRAG is designed for extension at every layer. Four concrete extension points are documented:

- **Storage**: Subclass the base `Storage` class and register the new provider with the factory; it can then be instantiated via `create_storage` or `storage_factory`. Source: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md).
- **Chunking**: A `create_chunker` factory accepts a `ChunkingConfig` and instantiates strategies such as `SentenceChunker` (NLTK-based) or `TokenChunker` (tokenizer-based with configurable size and overlap). Source: [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md).
- **Input**: Input loaders are configured via a YAML block that combines `input.type` (for example, `markitdown`) with a `file_pattern` regex; PDF processing additionally requires the `markitdown[pdf]` extra. Source: [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md).
- **Knowledge loading (apps)**: The unified search app organizes custom loaders under `app.knowledge_loader`, with pluggable `data_sources` submodules. Source: [unified-search-app/app/knowledge_loader/__init__.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/__init__.py) and [unified-search-app/app/knowledge_loader/data_sources/__init__.py](https://github.com/microsoft/graphrag/blob/main/unified-search-app/app/knowledge_loader/data_sources/__init__.py).
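To make "configurable size and overlap" concrete, here is a minimal sliding-window chunker. It is a sketch of the general technique only: it splits on whitespace instead of using a real tokenizer, and `chunk_tokens` is a hypothetical helper, not the `TokenChunker` implementation.

```python
def chunk_tokens(tokens, size, overlap):
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Whitespace splitting stands in for a real tokenizer in this sketch.
words = "one two three four five six seven".split()
chunks = chunk_tokens(words, size=3, overlap=1)
```

With `size=3` and `overlap=1` the window advances two tokens at a time, so each chunk repeats the last token of its predecessor. Larger overlaps preserve more context across chunk boundaries at the cost of more chunks to process.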

## Common Failure Modes and Community Notes

Several recurring community discussions intersect with the topics on this page. Users have repeatedly asked for **non-OpenAI/Azure model providers** such as Ollama (#657, #345) and for cheaper extractors like Triplex (#632); because native support is limited to OpenAI and Azure, these integrations typically rely on OpenAI-compatible endpoints wired through `create_completion`. Source: [packages/graphrag/graphrag/api/prompt_tune.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/prompt_tune.py).

Incremental indexing (#741) is itself an extensibility concern: users wishing to add content today must re-run the full pipeline because the storage and configuration layers do not yet expose a partial-update API. Source: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md).

## See Also

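- GraphRAG main package: [packages/graphrag/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/README.md)
- Storage package details: [packages/graphrag-storage/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-storage/README.md)
- Chunking strategies: [packages/graphrag-chunking/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-chunking/README.md)
- Input loaders: [packages/graphrag-input/README.md](https://github.com/microsoft/graphrag/blob/main/packages/graphrag-input/README.md)
- Query API: [packages/graphrag/graphrag/api/query.py](https://github.com/microsoft/graphrag/blob/main/packages/graphrag/graphrag/api/query.py)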

---


## Pitfall Log

Project: microsoft/graphrag

Summary: Found 6 structured pitfall items, none of them high or blocking. Top priority: Capability evidence risk (requires verification).

## 1. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/microsoft/graphrag

## 2. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/graphrag

## 3. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/microsoft/graphrag

## 4. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/microsoft/graphrag

## 5. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/graphrag

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/graphrag
