Doramagic Project Pack · Human Manual
graphiti
Introduction to Graphiti
Related topics: Temporal Context Graphs, Installation Guide
Graphiti is an open-source temporal context graph engine designed specifically for AI agents operating in dynamic, evolving environments. Unlike traditional knowledge graphs or RAG systems that rely on static data summarization, Graphiti continuously integrates user interactions, structured and unstructured enterprise data, and external information into a coherent, queryable graph that tracks how facts change over time. Source: README.md:1
Purpose and Scope
Graphiti serves as the foundational technology for building and querying temporal context graphs—graphs where every entity, relationship, and fact carries temporal validity windows that record when information became true and when it was potentially superseded.
The framework addresses fundamental limitations of traditional RAG approaches:
- Static data handling: Traditional RAG processes data in batches, making it inefficient for frequently changing information
- Lack of temporal tracking: Conventional systems cannot answer questions about what was true at a specific point in time
- Limited provenance: Most systems cannot trace facts back to their source data
- No contradiction handling: Static summaries cannot automatically invalidate outdated information
Graphiti is particularly suitable for developing interactive, context-aware AI applications that require real-time interaction and precise historical queries. Source: README.md:1
Core Concepts
Context Graph Architecture
A context graph is a temporal graph of entities, relationships, and facts with built-in versioning. Every piece of information in Graphiti has a validity window and full lineage back to source data.
graph TD
subgraph "Context Graph Components"
E[Entities<br/>Nodes]
F[Facts/Relationships<br/>Edges with Validity Windows]
EP[Episodes<br/>Provenance]
T[Custom Types<br/>Ontology]
end
E --> F
F --> EP
T --> E
T --> F
style E fill:#e1f5fe
style F fill:#fff3e0
style EP fill:#e8f5e9
style T fill:#f3e5f5
Key Components
| Component | Description |
|---|---|
| Entities (Nodes) | People, products, policies, concepts—with summaries that evolve over time |
| Facts/Relationships (Edges) | Triplets (Entity → Relationship → Entity) with temporal validity windows |
| Episodes (Provenance) | Raw data as ingested—the ground truth stream. Every derived fact traces back here |
| Custom Types (Ontology) | Developer-defined entity and edge types via Pydantic models |
Source: README.md:1
Temporal Fact Management
Graphiti's core innovation is its bi-temporal tracking system:
- Valid Time: When a fact became true in the real world
- System Time: When the fact was recorded in the graph
- Invalidation: When a fact is superseded, it is marked as invalid—not deleted—preserving full history
This approach enables queries like "What did the user prefer in January 2024?" while maintaining accurate current state. Source: README.md:1
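The bi-temporal model described above can be illustrated with a small standalone sketch. It does not use Graphiti's API: the `Fact` class, its `valid_at`, `created_at`, and `invalid_at` fields, and the helper functions are hypothetical names chosen for illustration, and the rule for when a fact supersedes another is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    """Hypothetical bi-temporal fact record (not Graphiti's own class)."""
    subject: str
    predicate: str
    obj: str
    valid_at: datetime                     # valid time: when the fact became true
    created_at: datetime                   # system time: when it was recorded
    invalid_at: Optional[datetime] = None  # set when superseded; the fact is never deleted


def supersede(facts: list[Fact], new_fact: Fact) -> None:
    """Invalidate still-valid facts with the same subject and predicate, then append."""
    for fact in facts:
        same_slot = (fact.subject, fact.predicate) == (new_fact.subject, new_fact.predicate)
        if same_slot and fact.invalid_at is None:
            fact.invalid_at = new_fact.valid_at
    facts.append(new_fact)


def facts_as_of(facts: list[Fact], when: datetime) -> list[Fact]:
    """Return the facts whose validity window contains `when`."""
    return [
        f for f in facts
        if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
    ]


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


facts: list[Fact] = []
supersede(facts, Fact("user", "prefers_theme", "light mode", utc(2023, 6, 1), utc(2023, 6, 1)))
supersede(facts, Fact("user", "prefers_theme", "dark mode", utc(2024, 3, 1), utc(2024, 3, 2)))

january = [f.obj for f in facts_as_of(facts, utc(2024, 1, 15))]
today = [f.obj for f in facts_as_of(facts, utc(2024, 6, 1))]
print(january, today)
```

Both facts stay in the list after the update. The older one only gains an `invalid_at` timestamp, which is what makes the historical query possible.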
Key Features and Capabilities
Multi-Database Support
Graphiti supports multiple graph database backends, providing flexibility in deployment:
| Database | Status | Full-Text Search Backend |
|---|---|---|
| Neo4j 5.26+ | Primary | Native full-text |
| FalkorDB 1.1.2+ | Supported | RediSearch |
| Kuzu 0.11.2+ | Supported | Native |
| Amazon Neptune | Supported | OpenSearch Serverless |
Source: README.md:1
Multi-Provider LLM and Embedding Support
Graphiti is designed to work with various AI providers beyond OpenAI:
LLM Providers:
- OpenAI (default)
- Azure OpenAI
- Anthropic (Claude)
- Google (Gemini)
- Groq
Embedding Providers:
- OpenAI Embeddings (default)
- Azure OpenAI Embeddings
- Voyage AI
- Sentence Transformers
- Google Gemini Embeddings
Source: mcp_server/README.md:1
Hybrid Search Capabilities
Graphiti combines multiple retrieval strategies:
graph LR
A[Query] --> B[Semantic Search]
A --> C[BM25 Keyword Search]
A --> D[Graph Traversal]
B --> E[Hybrid Results]
C --> E
D --> E
E --> F[Reranked Results]
- Semantic Search: Embedding-based similarity matching
- BM25: Traditional keyword-based retrieval
- Graph Traversal: Leverages entity relationships for context-aware results
- Center Node Reranking: Reranks results based on graph distance from a specific entity
Source: examples/quickstart/README.md:1
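One common way to merge several ranked result lists is reciprocal rank fusion (RRF). The sketch below is a generic, standalone illustration of that technique rather than Graphiti's implementation; the function name, the constant `k = 60`, and the sample IDs are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical result lists from the three retrieval strategies
semantic = ["fact_a", "fact_b", "fact_c"]
keyword = ["fact_b", "fact_d", "fact_a"]
traversal = ["fact_b", "fact_c"]

fused = reciprocal_rank_fusion([semantic, keyword, traversal])
print(fused)
```

An item that appears near the top of several lists, such as `fact_b` here, outranks an item that tops only one list.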
Episode Ingestion
Episodes are the raw data inputs that Graphiti processes to extract entities and relationships. Supported episode formats include:
- Plain text content
- Structured JSON data
- Message-style content with source attribution
Source: mcp_server/README.md:1
Entity Type System
Graphiti supports extensible entity types through Pydantic models. The MCP server includes built-in types:
| Entity Type | Purpose |
|---|---|
| Preference | User preferences, choices, opinions, selections |
| Requirement | Specific needs, features, functionality |
| Procedure | Step-by-step processes, workflows |
| Location | Physical places, geographical entities |
| Event | Occurrences, activities, meetings |
| Object | Physical items, tools, devices, possessions |
| Topic | Subjects of conversation, interest domains |
| Organization | Companies, institutions, groups |
| Document | Files, records, written materials |
Source: mcp_server/src/models/entity_types.py:1
Architecture Overview
Core System Components
graph TD
subgraph "Ingestion Pipeline"
I1[Episode Input] --> EX[Extractor]
EX --> N[Node Extractor]
EX --> E[Edge Extractor]
N --> NC[Node Constructor]
E --> EC[Edge Constructor]
NC --> G[(Graph Store)]
EC --> G
end
subgraph "Query Pipeline"
Q[Query Input] --> SR[Search Recipe]
SR --> SS[Searcher]
SS --> HY[Hybrid Combiner]
HY --> RR[Reranker]
RR --> R[Results]
end
G --> SS
Driver Architecture
Graphiti uses a pluggable driver architecture to support multiple backends:
- Neo4jDriver: Primary driver for Neo4j databases
- FalkorDriver: Driver for FalkorDB with RediSearch integration
- KuzuDriver: Driver for Kuzu embedded graph database
- SearchInterface: Unified search interface across all drivers
Source: README.md:1
Installation and Setup
Requirements
- Python 3.10 or higher
- One of the supported database backends
- API key for your chosen LLM/embedding provider
Source: README.md:1
Quick Installation
pip install graphiti-core
Source: examples/quickstart/README.md:1
Environment Configuration
Required:
export OPENAI_API_KEY=your_openai_api_key
Optional Neo4j:
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=password
Optional FalkorDB:
export FALKORDB_URI=falkor://localhost:6379
Optional Azure OpenAI:
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_DEPLOYMENT=your-deployment
Source: examples/quickstart/README.md:1
Basic Usage Patterns
Initializing Graphiti
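from graphiti_core import Graphiti
from graphiti_core.driver.neo4j_driver import Neo4jDriver

# Connect to Neo4j and hand the driver to the Graphiti client
driver = Neo4jDriver(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
)
graphiti = Graphiti(graph_driver=driver)

# Create indices and constraints (run inside an async function)
await graphiti.build_indices_and_constraints()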
Source: examples/quickstart/README.md:1
Adding Episodes
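from datetime import datetime, timezone
from graphiti_core.nodes import EpisodeType

await graphiti.add_episode(
    name="User preference discussion",
    episode_body="The user prefers dark mode and dislikes notifications",
    source=EpisodeType.text,
    source_description="user message",
    reference_time=datetime.now(timezone.utc),
)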
Source: mcp_server/README.md:1
Searching the Graph
Hybrid Edge Search:
results = await graphiti.search(
    query="user preferences for interface settings",
)
Center Node Search (Graph-aware Reranking):
results = await graphiti.search(
    query="interface preferences",
    center_node_uuid="entity-uuid-here",
)
Source: examples/quickstart/README.md:1
MCP Server Integration
Graphiti includes a Model Context Protocol (MCP) server that exposes its functionality to AI assistants:
graph LR
A[Claude/<br/>Cursor] --> B[MCP Server]
B --> C[Graphiti Core]
C --> D[(Neo4j/<br/>FalkorDB)]
Available MCP Tools
| Tool | Description |
|---|---|
| add_memory | Add episodes to the knowledge graph |
| search_nodes | Search for entity nodes |
| search_facts | Search for relationships/facts |
| get_entity_edge | Retrieve a specific edge by UUID |
| get_episodes | Get recent episodes for a group |
| delete_episode | Remove an episode and its derived facts |
| delete_entity_edge | Remove a specific edge |
| clear_graph | Clear all data and rebuild indices |
| get_status | Check server and database status |
Source: mcp_server/README.md:1
Running the MCP Server
Using Python:
uv run main.py --group-id <your_group_id>
Using Docker:
docker compose up
Source: mcp_server/README.md:1
Comparison with Traditional Approaches
Graphiti vs GraphRAG
| Aspect | GraphRAG | Graphiti |
|---|---|---|
| Primary Use | Static document summarization | Dynamic, evolving context for agents |
| Data Handling | Batch-oriented processing | Continuous, incremental updates |
| Knowledge Structure | Entity clusters & community summaries | Temporal context graph with validity windows |
| Retrieval Method | Sequential LLM summarization | Hybrid semantic, keyword, and graph-based search |
| Temporal Handling | Basic timestamp tracking | Explicit bi-temporal tracking with automatic invalidation |
| Query Latency | Seconds to tens of seconds | Typically sub-second latency |
| Custom Entity Types | No | Yes, customizable via Pydantic models |
Source: README.md:1
When to Choose Graphiti
Graphiti is the right choice when:
- Your data changes frequently and requires real-time updates
- You need to query historical states ("What was true at time X?")
- Provenance tracing is important (linking facts back to source data)
- You need to handle contradictory information with automatic invalidation
- You're building agentic AI systems that maintain conversation context
Source: README.md:1
Relationship to Zep
Graphiti is the open-source core of Zep, a managed context graph infrastructure for AI agents.
| Aspect | Zep | Graphiti |
|---|---|---|
| What they are | Managed platform with SLAs and support | Open-source engine you self-host |
| Context graphs | Multi-tenant, governed infrastructure | Build and query individual graphs |
| User management | Built-in | Build your own |
| Retrieval performance | Sub-200ms at scale | Depends on your implementation |
| Deployment | Fully managed or self-hosted | Self-hosted only |
Source: README.md:1
Known Limitations and Issues
Active Bug Reports
Be aware of these known issues when using specific configurations:
| Issue | Severity | Workaround |
|---|---|---|
| FalkorDB group_id with hyphens causes RediSearch syntax errors (#1483) | Medium | Avoid hyphens in group_id values |
| Neo4j database parameter not honored in execute_query() (#1481) | Medium | Use explicit database selection at connection time |
| search_memory_facts fails with Neo4j DateTime serialization (#1438) | Medium | Convert datetime fields before search operations |
Source: mcp_server/README.md:1
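The datetime workaround in the table can be sketched as a recursive conversion of datetime-like values to ISO 8601 strings before a result is serialized. This is a minimal illustration, not code from the MCP server: the `to_json_safe` helper and the sample payload are hypothetical, and the check for a `to_native()` method assumes the value behaves like the Neo4j Python driver's temporal types.

```python
import json
from datetime import datetime, timezone
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Recursively convert datetime-like values to ISO 8601 strings."""
    if hasattr(value, "to_native"):   # e.g. a driver-specific DateTime object
        value = value.to_native()
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    return value


# Hypothetical fact payload containing a datetime field
fact = {
    "fact": "User prefers dark mode",
    "valid_at": datetime(2024, 1, 15, tzinfo=timezone.utc),
    "invalid_at": None,
}
payload = json.dumps(to_json_safe(fact))
print(payload)
```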
Community Feature Requests
The community has requested several enhancements:
- Amazon Bedrock support (#459): Native integration with AWS Bedrock models
- MemGraph driver (#642): Support for MemGraph as an alternative backend
- RDF support (#933): Integration with RDF knowledge graphs for semantic inference
- PostgreSQL/pgvector (#779): Alternative storage backend using pgvector for embeddings
Source: README.md:1
Telemetry and Privacy
Graphiti collects anonymous telemetry data to help improve the project. The data collected includes:
- Anonymous UUID identifier
- System information (OS, Python version, architecture)
- Graphiti version
- Configuration choices (LLM provider, database backend, embedder type)
What is never collected:
- Personal information or API keys
- Actual data, queries, or graph content
- IP addresses or hostnames
Telemetry is opt-out. To disable:
export GRAPHITI_TELEMETRY_ENABLED=false
Source: README.md:1
Next Steps
- Review the Quickstart Guide for hands-on examples
- Explore the Azure OpenAI Example for cloud deployments
- Configure the MCP Server for AI assistant integration
- Study the Entity Types to customize your ontology
- Consult the Paper for architectural details
Source: https://github.com/getzep/graphiti / Human Manual
Installation Guide
This guide covers the installation, configuration, and setup of Graphiti for building temporal context graphs. Graphiti is a Python library that requires Python 3.10+ and supports multiple graph database backends.
Prerequisites
Before installing Graphiti, ensure your environment meets the following requirements.
System Requirements
| Requirement | Version/Details |
|---|---|
| Python | 3.10 or higher |
| Package Manager | pip or uv |
| Operating System | Linux, macOS, Windows |
Supported Graph Databases
Graphiti supports the following graph database backends:
| Database | Minimum Version | Notes |
|---|---|---|
| Neo4j | 5.26 | Default backend |
| FalkorDB | 1.1.2 | Full-text search via RediSearch |
| Kuzu | 0.11.2 | Embedded database |
| Amazon Neptune | Cluster or Neptune Analytics | Requires OpenSearch for full-text search |
LLM Requirements
Graphiti defaults to OpenAI for LLM inference and embedding. The library works best with LLM services that support Structured Output.
Installation Methods
Using pip
Install graphiti-core from PyPI:
pip install graphiti-core
Using uv (Recommended)
For faster dependency resolution:
uv pip install graphiti-core
MCP Server Installation
To install the MCP server component:
pip install graphiti-mcp
Or with uv:
uv pip install graphiti-mcp
Source: pyproject.toml
Environment Configuration
Graphiti uses a YAML-based configuration system with environment variable support.
Creating the Configuration File
Create a config.yaml file in your project root:
graphiti:
group_id: "main" # Namespace for graph data
neo4j:
uri: "bolt://localhost:7687"
user: "neo4j"
password: "${NEO4J_PASSWORD}"
database: "neo4j"
Environment Variables
Graphiti supports environment variable expansion using ${VAR_NAME} or ${VAR_NAME:default} syntax.
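The placeholder syntax can be pictured with a short standalone sketch. This is not Graphiti's configuration loader: the `expand_env` function is hypothetical, and raising an error for an unset variable without a default is an assumption about reasonable behavior.

```python
import os
import re

# Matches ${VAR_NAME} and ${VAR_NAME:default}
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::(?P<default>[^}]*))?\}"
)


def expand_env(value: str, env=None) -> str:
    """Replace ${VAR} and ${VAR:default} placeholders with environment values."""
    env = os.environ if env is None else env

    def substitute(match: re.Match) -> str:
        name, default = match.group("name"), match.group("default")
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _PLACEHOLDER.sub(substitute, value)


# Pass an explicit mapping instead of os.environ to keep the example deterministic
sample_env = {"NEO4J_PASSWORD": "s3cret"}
print(expand_env("${NEO4J_PASSWORD}", sample_env))
print(expand_env("${NEO4J_URI:bolt://localhost:7687}", sample_env))
```

Only the first colon separates the name from the default, so defaults that contain colons, such as a connection URI, are preserved.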
#### Required Variables
| Variable | Default | Description |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USER | neo4j | Neo4j username |
| NEO4J_PASSWORD | demodemo | Neo4j password |
#### LLM Provider Variables
Depending on your chosen LLM provider:
| Provider | Variables Required |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Google Gemini | GOOGLE_API_KEY |
| Groq | GROQ_API_KEY |
| Azure OpenAI | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT, AZURE_OPENAI_EMBEDDING_DEPLOYMENT |
Source: mcp_server/README.md
Example Environment File
Copy from the provided example:
cp .env.example .env
Then edit .env with your specific values:
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_secure_password
OPENAI_API_KEY=sk-...
Source: .env.example
Docker Deployment
For containerized deployments, use the provided Docker configuration.
Using Docker Compose
The repository includes a docker-compose.yml for quick setup:
docker compose up
This starts the Graphiti MCP server with default settings.
Docker Configuration
The Docker setup includes:
- Neo4j database container
- Graphiti MCP server
- Proper network configuration
- Volume mounts for data persistence
Source: docker-compose.yml
Database Setup
Neo4j Setup
- Download and install Neo4j Desktop
- Create a new database
- Start the database
- Configure credentials in your environment or config file
For production deployments:
neo4j:
uri: "${NEO4J_URI}"
user: "${NEO4J_USER}"
password: "${NEO4J_PASSWORD}"
database: "${NEO4J_DATABASE}"
FalkorDB Setup
When using FalkorDB, start a FalkorDB server, for example with Docker:
docker run -p 6379:6379 -p 3000:3000 -it --rm falkordb/falkordb:latest
Source: README.md
Quick Start Verification
After installation, verify your setup by running the quickstart example:
cd examples/quickstart
uv run python quickstart_neo4j.py
This script will:
- Initialize the Graphiti client
- Add sample episodes
- Perform hybrid searches
- Demonstrate graph-aware search with reranking
Source: examples/quickstart/README.md
Common Installation Issues
Neo4j Connection Errors
Error: Neo.ClientError.Database.DatabaseNotFound: Graph not found: default_db
Solution: The Neo4j driver defaults to using neo4j as the database name. If using a different database, specify it explicitly:
driver = Neo4jDriver(
uri=neo4j_uri,
user=neo4j_user,
password=neo4j_password,
database="your_database_name"
)
Source: examples/quickstart/README.md
API Key Configuration
Ensure your LLM API key is properly set in the environment:
export OPENAI_API_KEY=sk-your-key-here
Or in your config file:
llm:
provider: "openai"
api_key: "${OPENAI_API_KEY}"
Advanced Configuration
Using Azure OpenAI
For Azure OpenAI deployments:
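import os
from openai import AsyncOpenAI

# Point the standard OpenAI client at the Azure endpoint
azure_client = AsyncOpenAI(
    base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT']}/openai/v1/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)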
Required environment variables:
| Variable | Description |
|---|---|
| AZURE_OPENAI_ENDPOINT | Azure endpoint URL |
| AZURE_OPENAI_API_KEY | Azure API key |
| AZURE_OPENAI_DEPLOYMENT | LLM deployment name |
| AZURE_OPENAI_EMBEDDING_DEPLOYMENT | Embedding deployment name |
Source: examples/azure-openai/README.md
Custom Entity Types
Graphiti allows defining custom entity types via Pydantic models in your configuration:
graphiti:
entity_types:
- name: "Preference"
description: "User preferences, choices, opinions, or selections"
- name: "Requirement"
description: "Specific needs, features, or functionality"
Source: mcp_server/README.md
MCP Server Installation
The Graphiti MCP Server provides a Model Context Protocol interface for AI agent integrations.
Installation
pip install graphiti-mcp
Running the Server
With default HTTP transport:
uv run main.py --group-id <your_group_id>
With Docker:
docker compose up
Available MCP Tools
| Tool | Description |
|---|---|
| add_memory | Add episodes to the knowledge graph |
| search_nodes | Search for entities using natural language |
| search_facts | Find relationships between entities |
| delete_entity_edge | Remove an entity edge |
| delete_episode | Remove an episode |
| get_episodes | Retrieve recent episodes |
| clear_graph | Clear all graph data |
| get_status | Check server and database status |
Source: mcp_server/README.md
Dependency Management
Graphiti uses the following key dependencies:
| Package | Purpose |
|---|---|
| pydantic | Data validation and configuration |
| neo4j | Neo4j database driver |
| falkordb | FalkorDB driver |
| openai | OpenAI API client |
| anthropic | Anthropic API client |
| fastapi | MCP server framework |
| mcp | Model Context Protocol |
Full dependency list is available in pyproject.toml.
Source: pyproject.toml
Next Steps
After installation:
- Review the Quickstart Guide for basic usage patterns
- Explore the Examples Directory for domain-specific implementations
- Configure your preferred LLM provider
- Set up the MCP Server for agent integrations
Source: https://github.com/getzep/graphiti / Human Manual
Quick Start Guide
Related topics: Introduction to Graphiti, Neo4j Driver
This guide walks through getting started with Graphiti, an open-source temporal context graph engine for building knowledge graphs from structured and unstructured data. Follow these steps to install dependencies, configure your database, and run your first Graphiti application.
Overview
Graphiti enables you to build temporal context graphs from conversations, documents, and structured data. The quick start examples demonstrate how to:
- Connect to a graph database (Neo4j, FalkorDB, or Amazon Neptune)
- Initialize Graphiti indices and constraints
- Add episodes (raw data) to the knowledge graph
- Search the graph using hybrid retrieval (semantic + keyword + graph traversal)
- Perform graph-aware reranking based on entity relationships
Source: examples/quickstart/README.md
Prerequisites
Before running Graphiti, ensure your environment meets the following requirements.
System Requirements
| Requirement | Details |
|---|---|
| Python Version | Python 3.10 or higher |
| API Keys | OpenAI API key (for LLM inference and embeddings) |
| Database | One of: Neo4j 5.26+, FalkorDB 1.1.2+, Kuzu 0.11.2+, or Amazon Neptune |
Database Setup
Neo4j (Primary supported database):
- Download and install Neo4j Desktop
- Create a new database
- Start the database
- Note the connection URI, username, and password
FalkorDB:
- FalkorDB server running on falkor://localhost:6379 (default)
- See FalkorDB documentation for setup instructions
Amazon Neptune:
- Amazon Neptune Database Cluster or Neptune Analytics Graph
- Amazon OpenSearch Serverless collection (serves as the full text search backend)
- See Amazon Neptune documentation for setup
Source: examples/quickstart/README.md
Installation
Install the required dependencies using pip or uv:
pip install graphiti-core
Or using uv:
uv pip install graphiti-core
Source: examples/quickstart/README.md
Environment Configuration
Configure your environment variables before running Graphiti. Create a .env file in your project root with the following variables:
Required Variables
# OpenAI API Key (required for LLM and embeddings)
export OPENAI_API_KEY=your_openai_api_key
Optional Neo4j Variables
# Neo4j connection parameters (defaults shown)
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=password
Optional FalkorDB Variables
# FalkorDB connection parameters (defaults shown)
export FALKORDB_URI=falkor://localhost:6379
Optional Amazon Neptune Variables
# Amazon Neptune connection parameters
NEPTUNE_HOST=your-neptune-endpoint
NEPTUNE_PORT=8182
OPENSEARCH_HOST=your-opensearch-endpoint
OPENSEARCH_PORT=443
Azure OpenAI Variables (Alternative LLM Provider)
# Azure OpenAI configuration
AZURE_OPENAI_API_KEY=your_azure_api_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=your_deployment_name
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=your_embedding_deployment
Source: examples/quickstart/README.md, examples/azure-openai/README.md
Basic Usage with Neo4j
The following example demonstrates the core Graphiti workflow with Neo4j:
import os
from datetime import datetime, timezone

from graphiti_core import Graphiti
from graphiti_core.driver.neo4j_driver import Neo4jDriver
from graphiti_core.nodes import EpisodeType
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Database configuration
neo4j_uri = os.getenv("NEO4J_URI", "bolt://localhost:7687")
neo4j_user = os.getenv("NEO4J_USER", "neo4j")
neo4j_password = os.getenv("NEO4J_PASSWORD", "password")

# Initialize the driver
driver = Neo4jDriver(
    uri=neo4j_uri,
    user=neo4j_user,
    password=neo4j_password,
)

# Initialize Graphiti with the driver
graphiti = Graphiti(graph_driver=driver)

# Create indices and constraints (the awaits below must run inside an async function)
await graphiti.build_indices_and_constraints()

# Add an episode to the knowledge graph
episode_result = await graphiti.add_episode(
    name="California Politics Update",
    episode_body="Governor Newsom signed the new housing bill in Sacramento today.",
    source=EpisodeType.text,
    source_description="news article",
    reference_time=datetime.now(timezone.utc),
)

# Search for edges (relationships/facts) with the default hybrid search
edge_results = await graphiti.search("housing bill Sacramento")

# Search for nodes (entities) using a predefined search recipe
node_results = await graphiti.search_(
    query="housing legislation California",
    config=NODE_HYBRID_SEARCH_RRF,
)

# Close connections
await graphiti.close()
Source: examples/quickstart/quickstart_neo4j.py
Graphiti Architecture
Understanding the core components helps you make the most of Graphiti's capabilities.
graph TD
A[Episode Input] --> B[Entity Extraction]
A --> C[Edge Extraction]
B --> D[Knowledge Graph]
C --> D
D --> E[Hybrid Search]
E --> F[Node Search]
E --> G[Edge Search]
F --> H[Graph-Aware Reranking]
G --> H
H --> I[Search Results]
J[(Neo4j<br/>FalkorDB<br/>Neptune)] --> D
J --> E
Core Components
| Component | Description |
|---|---|
| Graphiti | Main client class that orchestrates graph operations |
| Driver | Database-specific adapter (Neo4jDriver, FalkorDriver, NeptuneDriver) |
| Episode | Raw input data (text, JSON, or messages) |
| Node | Extracted entity from an episode |
| Edge | Extracted relationship between entities |
Source: README.md
Episode Types
Graphiti supports multiple episode types for different input formats:
Text Episodes
Plain text content such as articles, documents, or notes:
from datetime import datetime, timezone
from graphiti_core.nodes import EpisodeType

await graphiti.add_episode(
    name="Company News",
    episode_body="Acme Corp announced a new product line today.",
    source=EpisodeType.text,
    source_description="news article",
    reference_time=datetime.now(timezone.utc),
    group_id="company_updates",
)
JSON Episodes
Structured data with key-value pairs:
await graphiti.add_episode(
    name="Customer Profile",
    episode_body='{"company": {"name": "Acme Technologies"}, "products": [{"id": "P001", "name": "CloudSync"}]}',
    source=EpisodeType.json,
    source_description="CRM data",
    reference_time=datetime.now(timezone.utc),
    group_id="customer_data",
)
Source: mcp_server/README.md
Search Capabilities
Graphiti provides multiple search strategies that can be combined for optimal results.
Hybrid Search
Combines semantic embeddings with keyword-based BM25 retrieval:
# search() returns edges (facts) ranked by hybrid retrieval
edge_results = await graphiti.search("housing legislation Sacramento")
Graph-Aware Search
Reranks results based on graph distance to a specific entity:
# Get initial search results
results = await graphiti.search("California government")

# Use the top result's source node UUID for graph-distance reranking
if results:
    center_node_uuid = results[0].source_node_uuid
    reranked = await graphiti.search(
        "housing policy",
        center_node_uuid=center_node_uuid,
    )
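Graph-distance reranking can be pictured as a breadth-first search outward from the center node, followed by a sort on hop count. The sketch below is a standalone illustration on an in-memory adjacency map, not Graphiti's reranker; the function names, the sample graph, and the choice to place unreachable nodes last are assumptions.

```python
from collections import deque


def graph_distances(adjacency: dict[str, set[str]], center: str) -> dict[str, int]:
    """Breadth-first search: hop count from `center` to every reachable node."""
    distances = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, set()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def rerank_by_distance(
    results: list[tuple[str, str]],
    adjacency: dict[str, set[str]],
    center: str,
) -> list[tuple[str, str]]:
    """Stable sort of (fact, source_node) pairs by hop distance; unreachable nodes go last."""
    distances = graph_distances(adjacency, center)
    return sorted(results, key=lambda result: distances.get(result[1], float("inf")))


# Hypothetical graph and search results
adjacency = {
    "newsom": {"california", "housing_bill"},
    "california": {"newsom", "sacramento"},
    "housing_bill": {"newsom"},
    "sacramento": {"california"},
    "texas": set(),
}
results = [
    ("Texas passed a zoning law", "texas"),
    ("Sacramento is the capital of California", "sacramento"),
    ("Newsom signed the housing bill", "housing_bill"),
]

reranked = rerank_by_distance(results, adjacency, center="newsom")
print([source for _, source in reranked])
```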
Search Using Recipes
Graphiti provides predefined search configurations:
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Search for nodes directly using a recipe
node_results = await graphiti.search_(
    query="Governor policy",
    config=NODE_HYBRID_SEARCH_RRF,
)
Source: examples/quickstart/README.md, examples/quickstart/quickstart_neo4j.py
Using FalkorDB
For FalkorDB, use the FalkorDriver instead of Neo4jDriver:
import os
from urllib.parse import urlparse

from graphiti_core import Graphiti
from graphiti_core.driver.falkordb_driver import FalkorDriver

# FalkorDB configuration
falkordb_uri = urlparse(os.getenv("FALKORDB_URI", "falkor://localhost:6379"))

# Initialize the driver from the URI's host and port
driver = FalkorDriver(host=falkordb_uri.hostname, port=falkordb_uri.port)

# Initialize Graphiti with the driver
graphiti = Graphiti(graph_driver=driver)

# Continue with the same episode and search operations
await graphiti.build_indices_and_constraints()
Note: When using FalkorDB, ensure your group_id values do not contain hyphens due to RediSearch syntax requirements. This is a known limitation (see GitHub Issue #1483).
Source: examples/quickstart/quickstart_falkordb.py
Using Amazon Neptune
For Amazon Neptune, configure both Neptune and OpenSearch connections:
import os

from graphiti_core import Graphiti
from graphiti_core.driver.neptune_driver import NeptuneDriver

# Neptune configuration
neptune_host = os.getenv("NEPTUNE_HOST", "your-neptune-endpoint")
neptune_port = int(os.getenv("NEPTUNE_PORT", "8182"))

# OpenSearch configuration for full-text search
opensearch_host = os.getenv("OPENSEARCH_HOST", "your-opensearch-endpoint")
opensearch_port = int(os.getenv("OPENSEARCH_PORT", "443"))

# Initialize the driver
driver = NeptuneDriver(
    host=neptune_host,
    aoss_host=opensearch_host,
    port=neptune_port,
    aoss_port=opensearch_port,
)

# Initialize Graphiti with the driver
graphiti = Graphiti(graph_driver=driver)
await graphiti.build_indices_and_constraints()
Source: examples/quickstart/quickstart_neptune.py
Group ID Usage
The group_id parameter is used to namespace graph data, allowing you to maintain separate knowledge domains:
# Add episode to a specific group
await graphiti.add_episode(
    name="Project Alpha Update",
    episode_body="The new feature is scheduled for release next week.",
    source=EpisodeType.text,
    source_description="project update",
    reference_time=datetime.now(timezone.utc),
    group_id="project_alpha",  # Namespace for this project
)

# Search within a specific group
results = await graphiti.search(
    "release schedule",
    group_ids=["project_alpha"],
)
If you do not specify a group_id, the system uses "default" as the group_id.
Source: mcp_server/README.md, examples/quickstart/README.md
MCP Server Quick Start
The Graphiti MCP Server provides tools for AI agent integrations via the Model Context Protocol.
Starting the MCP Server
# Using uv
uv run main.py --group-id <your_group_id>
# Using Docker
docker compose up
Available Tools
| Tool | Description |
|---|---|
| add_memory | Add episode to the knowledge graph |
| search_nodes | Search for relevant entity nodes |
| search_facts | Search for relevant facts (edges) |
| get_entity_edge | Get a specific entity edge by UUID |
| get_episodes | Get recent episodes for a group |
| delete_episode | Delete an episode from the graph |
| delete_entity_edge | Delete an entity edge |
| clear_graph | Clear all data and rebuild indices |
| get_status | Check server and connection status |
Source: mcp_server/README.md
Common Issues and Solutions
"Graph not found: default_db" Error
This error can occur with Neo4j when the driver tries to connect to a database that does not exist.
Solution: Neo4j defaults to using neo4j as the database name. If you need a different database, specify it in the driver constructor:
driver = Neo4jDriver(
uri=neo4j_uri,
user=neo4j_user,
password=neo4j_password,
database="your_database_name" # Add this parameter
)
Source: examples/quickstart/README.md
Database Parameter Not Honored
If query operations don't respect the database parameter, this may be a known issue with Neo4jDriver. Ensure you are using the latest version of graphiti-core.
Source: GitHub Issue #1481
FalkorDB Hyphen Handling
group_id values containing hyphens cause RediSearch syntax errors. Use underscores or camelCase instead:
# Avoid this:
group_id="my-group-id" # Will fail
# Use this instead:
group_id="my_group_id" # Works correctly
Source: GitHub Issue #1483
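Where group IDs come from user input or other systems, the workaround can be applied in one place before the ID reaches FalkorDB. The helper below is a hypothetical convenience function, not part of Graphiti.

```python
def falkordb_safe_group_id(group_id: str) -> str:
    """Replace hyphens with underscores to sidestep the RediSearch syntax error."""
    return group_id.replace("-", "_")


print(falkordb_safe_group_id("my-group-id"))
```

Note that this mapping is not reversible: `my-group` and `my_group` end up in the same namespace, so it suits cases where hyphenated and underscored IDs are never meant to be distinct.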
Next Steps
After completing the quick start:
- Explore Search Recipes: Try different predefined search configurations in graphiti_core.search.search_config_recipes
- Custom Entity Types: Define custom entity types via Pydantic models for domain-specific extraction
- Batch Ingestion: Process larger datasets using the batch ingestion APIs
- Advanced Examples: Explore other example directories:
  - examples/podcast/ - Processing longer content
  - examples/ecommerce/ - Domain-specific knowledge graphs
  - examples/azure-openai/ - Azure OpenAI integration
Additional Resources
| Resource | Description |
|---|---|
| Graphiti Documentation | Main repository |
| arXiv Paper | Technical paper on Graphiti architecture |
| Zep Blog | State of the art in agent memory |
| Discord Community | Community support and discussions |
Source: https://github.com/getzep/graphiti / Human Manual
Temporal Context Graphs
Related topics: Data Models, Ingestion Pipeline
Overview
A Temporal Context Graph is a specialized knowledge graph that captures entities, relationships, and facts with explicit temporal validity windows. Unlike traditional knowledge graphs that store static information, temporal context graphs track when facts became true and when they were superseded, enabling precise historical queries.
Graphiti is the open-source implementation of a temporal context graph engine. It builds and queries temporal context graphs for AI agents operating in dynamic environments. Source: README.md
Core Characteristics
| Characteristic | Description |
|---|---|
| Temporal Validity | Facts have validity windows - when they became true and when (if ever) they were superseded |
| Episodes & Provenance | Every derived entity and relationship traces back to the raw data (episodes) that produced it |
| Incremental Updates | New data integrates immediately without batch recomputation |
| Hybrid Retrieval | Combines semantic embeddings, keyword (BM25), and graph traversal |
Source: README.md
Architecture
Graph Components
A temporal context graph contains four primary components:
| Component | What it stores |
|---|---|
| Entities (nodes) | People, products, policies, concepts — with summaries that evolve over time |
| Facts / Relationships (edges) | Triplets (Entity → Relationship → Entity) with temporal validity windows |
| Episodes (provenance) | Raw data as ingested — the ground truth stream. Every derived fact traces back here |
| Custom Types (ontology) | Developer-defined entity and edge types via Pydantic models |
Source: README.md
Data Flow Architecture
graph TD
A[Raw Data Input] --> B[Episodes]
B --> C[Entity Extraction]
B --> D[Relationship Extraction]
C --> E[Entities with Temporal Validity]
D --> F[Facts with Temporal Validity]
E --> G[Temporal Context Graph]
F --> G
G --> H[Query Processing]
G --> I[Node Summaries]
I --> E
The architecture follows a pipeline where episodes (raw data) flow through extraction processes that produce entities and relationships. These are stored in the graph with temporal metadata, and node summaries are generated from the extracted information. Source: mcp_server/README.md
Temporal Fact Management
Validity Windows
Every fact in a temporal context graph carries a validity window that defines when the fact is true:
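- valid_from: Timestamp when the fact became true (the quickstart example reads this from search results as valid_at)
- valid_to: Timestamp when the fact was superseded, or null if still valid (read as invalid_at in the quickstart example)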
When information changes, old facts are invalidated — not deleted. This preserves full historical context while enabling queries for what is true at any point in time. Source: README.md
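A validity window can be pictured as a pair of timestamps with a membership test. The sketch below is illustrative only: Fact and is_valid_at are hypothetical names, the field names follow this manual rather than the library's storage schema, and treating the window as half-open (the end timestamp is excluded) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for an edge that carries a validity window."""
    text: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid"

    def is_valid_at(self, t: datetime) -> bool:
        # Half-open window (assumed): valid_from <= t < valid_to
        return self.valid_from <= t and (self.valid_to is None or t < self.valid_to)


fact = Fact("Kendra loves Adidas", datetime(2025, 1, 1), datetime(2026, 3, 1))
print(fact.is_valid_at(datetime(2025, 6, 1)))  # True
print(fact.is_valid_at(datetime(2026, 6, 1)))  # False
```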
Bi-Temporal Tracking
Graphiti implements bi-temporal tracking to support both:
- State Time: When a fact was true in reality
- System Time: When the fact was recorded in the graph
This enables queries such as:
- "What did the system believe was true at time X?"
- "What was actually true at time Y?"
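The two questions above filter on different timestamps. The following sketch separates them under simplifying assumptions: Record, true_at, and known_at are hypothetical names, and known_at only checks when a record was first written, ignoring later edits to its validity window.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical bi-temporal record; field names are illustrative."""
    text: str
    valid_from: datetime          # state time: when it became true
    valid_to: Optional[datetime]  # state time: when it stopped being true
    recorded_at: datetime         # system time: when the graph learned it


def true_at(records: List[Record], t: datetime) -> List[str]:
    """State-time query: what was actually true at t?"""
    return [r.text for r in records
            if r.valid_from <= t and (r.valid_to is None or t < r.valid_to)]


def known_at(records: List[Record], t: datetime) -> List[str]:
    """System-time query: what had the graph recorded by t?"""
    return [r.text for r in records if r.recorded_at <= t]


records = [
    # True since January, but only ingested in April.
    Record("Kendra works at Acme", datetime(2025, 1, 1), None, datetime(2025, 4, 1)),
]
print(true_at(records, datetime(2025, 2, 1)))   # ['Kendra works at Acme']
print(known_at(records, datetime(2025, 2, 1)))  # []
```

In February the fact was already true in the world, yet the graph could not have returned it, which is the gap that bi-temporal tracking makes visible.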
Fact Invalidation
graph LR
A[Fact: Kendra loves Adidas<br/>valid_from: 2025-01, valid_to: null] --> B[new_data]
B --> C[New Fact: Kendra loves Nike<br/>valid_from: 2026-03, valid_to: null]
C --> D[Old Fact Updated<br/>valid_to: 2026-03]
D --> E[Historical Query Available]
When new information contradicts existing facts, the system automatically invalidates the old fact by setting its valid_to timestamp, while creating a new fact with the updated information. Source: README.md
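The bookkeeping side of invalidation can be sketched in a few lines. This is not Graphiti's implementation: Fact and supersede are hypothetical names, and the sketch does not detect contradictions itself. It assumes the caller has already decided which existing fact the new one replaces.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    text: str
    valid_from: datetime
    valid_to: Optional[datetime] = None


def supersede(history: List[Fact], old: Fact, new: Fact) -> None:
    """Close the old fact's window at the new fact's start, then keep both."""
    old.valid_to = new.valid_from
    history.append(new)


old = Fact("Kendra loves Adidas", datetime(2025, 1, 1))
history = [old]
supersede(history, old, Fact("Kendra loves Nike", datetime(2026, 3, 1)))

print(len(history))         # 2
print(history[0].valid_to)  # 2026-03-01 00:00:00
print(history[1].valid_to)  # None
```

Nothing is deleted: the old fact stays in the history with a closed window, so a query for early 2025 can still find it.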
Entity Management
Entity Types
Graphiti supports a rich set of predefined entity types, each with specific extraction instructions:
| Entity Type | Purpose | Example |
|---|---|---|
| Preference | User preferences, choices, opinions | "Kendra prefers running shoes" |
| Requirement | Specific needs, features, functionality | "The system must support authentication" |
| Procedure | Step-by-step processes | "How to reset the password" |
| Location | Physical or virtual places | "Meeting in Conference Room A" |
| Event | Time-bound activities | "Product launch on March 15" |
| Topic | Subjects of conversation or interest | "Machine learning techniques" |
| Organization | Companies, institutions, groups | "Acme Technologies" |
| Document | Written materials | "Q4 financial report" |
| Object | Physical items, tools, devices | "Company laptop" |
Source: mcp_server/src/models/entity_types.py
Entity Extraction Instructions
Each entity type has detailed extraction instructions to guide the LLM:
class Preference(BaseModel):
"""
IMPORTANT: Prioritize this classification over ALL other classifications.
Represents entities mentioned in contexts expressing user preferences,
choices, opinions, or selections. Use LOW THRESHOLD for sensitivity.
"""
name: str = Field(..., description='The name or identifier of the preference')
description: str = Field(..., description='Brief description of the preference')
Source: mcp_server/src/models/entity_types.py
Custom Entity Types
Developers can define custom entity types via Pydantic models. This allows extending the ontology to domain-specific concepts:
class CustomEntityType(BaseModel):
name: str = Field(..., description='Entity identifier')
description: str = Field(..., description='Entity description')
Source: README.md
Episodes and Provenance
What Are Episodes?
Episodes are the raw data ingested into the graph. They serve as the ground truth stream — every derived entity and relationship traces back to the episodes that produced it.
Episode Sources
Episodes can come from various sources:
| Source Type | Description | Example |
|---|---|---|
| text | Plain text content | Conversation transcripts, documents |
| json | Structured JSON data | CRM records, API responses |
| message | Chat or message data | Slack threads, email conversations |
Source: mcp_server/README.md
Adding Episodes
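from datetime import datetime, timezone
from graphiti_core.nodes import EpisodeType

await graphiti.add_episode(
    name="Customer Conversation",
    episode_body="Customer mentioned they prefer cloud storage over local backups",
    source=EpisodeType.text,
    source_description="Support call transcript",
    reference_time=datetime.now(timezone.utc),  # when the episode occurred
)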
JSON Episode Processing
The MCP server can process structured JSON data:
add_memory(
name="Customer Profile",
episode_body='{"company": {"name": "Acme Technologies"}, "products": [...]}',
source="json",
source_description="CRM data"
)
Source: mcp_server/README.md
Retrieval and Search
Hybrid Search
Graphiti combines multiple search strategies for optimal retrieval:
| Strategy | Description |
|---|---|
| Semantic Search | Uses embeddings to find semantically similar content |
| BM25 | Keyword-based text retrieval |
| Graph Traversal | Leverages relationships between entities |
Source: examples/azure-openai/README.md
Search Recipes
Graphiti provides predefined search configurations:
- EDGE_HYBRID_SEARCH_RRF: Hybrid edge search with Reciprocal Rank Fusion
- NODE_HYBRID_SEARCH_RRF: Hybrid node search with Reciprocal Rank Fusion
- CENTER_NODE_SEARCH: Reranks results based on graph distance to a specific node
Source: examples/quickstart/README.md
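Reciprocal Rank Fusion merges several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below shows the general technique, not Graphiti's own reranker: the function name rrf and the constant k = 60 (a commonly used default) are assumptions, and the library may use a different constant.

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


semantic = ["edge_a", "edge_b", "edge_c"]  # e.g. embedding similarity order
keyword = ["edge_b", "edge_d", "edge_a"]   # e.g. BM25 order
print(rrf([semantic, keyword]))
# ['edge_b', 'edge_a', 'edge_d', 'edge_c']
```

Items that appear near the top of both lists win over items that rank first in only one, which is why edge_b beats edge_a here.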
Temporal Query Patterns
graph TD
A[Query] --> B{Time Specification?}
B -->|No filter| C[Current Facts Only]
B -->|At time T| D[Facts Valid at T]
B -->|Range| E[Facts in Time Range]
C --> F[Return Active Facts]
D --> F
E --> F
Query patterns supported:
- Current state: Retrieve only facts where valid_to IS NULL
- Historical point: Retrieve facts valid at a specific timestamp
- Time range: Retrieve facts valid within a date range
- Change tracking: Retrieve the history of fact changes
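The current-state, time-range, and change-tracking patterns can be sketched as plain filters over a list of facts. All names here (Fact, current, in_range, history) are hypothetical, and a real deployment would express these as graph queries rather than in-memory list comprehensions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    text: str
    valid_from: datetime
    valid_to: Optional[datetime] = None


def current(facts: List[Fact]) -> List[Fact]:
    """Current state: the window is still open."""
    return [f for f in facts if f.valid_to is None]


def in_range(facts: List[Fact], start: datetime, end: datetime) -> List[Fact]:
    """Time range: the window overlaps [start, end)."""
    return [f for f in facts
            if f.valid_from < end and (f.valid_to is None or f.valid_to > start)]


def history(facts: List[Fact], subject: str) -> List[Fact]:
    """Change tracking: every fact about a subject, oldest first."""
    return sorted((f for f in facts if f.subject == subject),
                  key=lambda f: f.valid_from)


facts = [
    Fact("Kendra", "loves Nike", datetime(2026, 3, 1)),
    Fact("Kendra", "loves Adidas", datetime(2025, 1, 1), datetime(2026, 3, 1)),
]
print([f.text for f in current(facts)])            # ['loves Nike']
print([f.text for f in in_range(facts, datetime(2025, 6, 1), datetime(2025, 7, 1))])
# ['loves Adidas']
print([f.text for f in history(facts, "Kendra")])  # ['loves Adidas', 'loves Nike']
```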
Comparison with Traditional Approaches
Graphiti vs. GraphRAG
| Aspect | GraphRAG | Graphiti |
|---|---|---|
| Primary Use | Static document summarization | Dynamic, evolving context for agents |
| Data Handling | Batch-oriented processing | Continuous, incremental updates |
| Knowledge Structure | Entity clusters & community summaries | Temporal context graph with validity windows |
| Retrieval Method | Sequential LLM summarization | Hybrid semantic, keyword, and graph-based search |
| Temporal Handling | Basic timestamp tracking | Explicit bi-temporal tracking with automatic fact invalidation |
| Contradiction Handling | LLM-driven summarization judgments | Automatic fact invalidation with history preserved |
| Query Latency | Seconds to tens of seconds | Typically sub-second latency |
| Custom Entity Types | No | Yes, customizable via Pydantic models |
Source: README.md
Known Issues and Limitations
Active Bugs Related to Temporal Features
| Issue | Description | Impact |
|---|---|---|
| #1483 | FalkorDriver.build_fulltext_query fails on hyphens in group_id | RediSearch syntax errors when group_id contains hyphens |
| #1481 | Database parameter not honored in Neo4jDriver.execute_query() | Temporal queries may execute against wrong database |
| #1438 | MCP search_memory_facts fails with 'neo4j.time.DateTime' serialization | Temporal data cannot be returned in MCP responses |
Source: Community Context
Database Compatibility
| Database | Temporal Support | Notes |
|---|---|---|
| Neo4j | Full | Recommended for production with temporal queries |
| FalkorDB | Full | Redis-based, fast for most workloads |
| Kuzu | Full | Embedded graph database |
| Amazon Neptune | Full | Requires OpenSearch for full-text search backend |
Source: README.md
Use Cases
Agent Memory Systems
Temporal context graphs excel as memory systems for AI agents:
- Conversation Memory: Track user preferences and facts across conversations
- Entity Evolution: Understand how entities change over time
- Temporal Reasoning: Query what was true at any historical point
Enterprise Knowledge Management
- Policy Tracking: Track policy changes with full history
- Document Evolution: Understand how documents and understanding changed
- Audit Trails: Maintain complete provenance of derived information
Dynamic Data Integration
- Continuous Updates: Real-time data integration without batch processing
- Contradiction Resolution: Automatic handling of conflicting information
- Provenance Tracking: Full lineage from source to derived facts
Getting Started
Basic Setup
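from datetime import datetime, timezone

from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType

# Connection details are placeholders; substitute your own Neo4j settings.
graphiti = Graphiti("bolt://localhost:7687", "neo4j", "password")

await graphiti.add_episode(
    name="Initial conversation",
    episode_body="User mentioned they work at Acme Technologies",
    source=EpisodeType.text,
    source_description="Chat transcript",
    reference_time=datetime.now(timezone.utc),
)

# Query current state (hybrid search over facts)
results = await graphiti.search("Where does the user work?")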
MCP Server Integration
For AI assistant integration, use the MCP server:
uv run main.py --group-id <your_group_id>
The server exposes tools for:
- add_memory: Add episodes to the knowledge graph
- search_nodes: Search for entities
- search_facts: Find relationships between entities
- get_episodes: Retrieve episodes by group
Source: mcp_server/README.md
Performance Considerations
Latency
| Operation | Typical Latency |
|---|---|
| Episode Ingestion | ~100-500ms (LLM-dependent) |
| Semantic Search | <100ms |
| Hybrid Search | <200ms |
| Temporal Queries | <100ms |
Scalability
Graphiti is optimized for:
- Parallel Processing: Multiple episodes processed concurrently
- Pluggable Graph Backends: Support for various graph databases
- Efficient Indices: Optimized for temporal and semantic queries
Source: README.md
Summary
Temporal Context Graphs represent a significant advancement in knowledge representation for AI agents. By explicitly tracking when facts are true and maintaining full provenance, Graphiti enables:
- Precise historical queries
- Automatic handling of changing information
- Full audit trails from source to derived knowledge
- Real-time context assembly for agent interactions
The framework's combination of temporal validity windows, hybrid retrieval, and incremental updates makes it particularly suited for dynamic environments where information evolves continuously.
Source: https://github.com/getzep/graphiti / Human Manual
Data Models
Related topics: Temporal Context Graphs, Search System
Graphiti implements a comprehensive temporal context graph data model designed for AI agents operating in dynamic environments. The data model structures knowledge into entities (nodes), facts (edges), and episodes (provenance), with built-in support for temporal validity windows and customizable entity types.
Overview
Graphiti's data model differs from traditional knowledge graphs by incorporating bi-temporal tracking — each fact maintains explicit validity windows indicating when it became true and when it was superseded. This enables precise historical queries without data loss.
graph TD
A[Episode] -->|produces| B[Entity Node]
A -->|produces| C[Fact Edge]
B -->|connected via| C
D[Temporal Validity Window] --> C
E[group_id Namespace] --> B
E --> C
F[Entity Summary] --> B
Source: README.md
Core Components
Entities (Nodes)
Entities represent the primary objects in the knowledge graph — people, products, policies, concepts, and other meaningful abstractions. Each entity includes:
- Unique Identifier: UUID for cross-referencing
- Name: Human-readable identifier
- Summary: Evolves over time as new information is ingested
- Entity Type: Classification (Person, Organization, Document, etc.)
- Valid From/To: Temporal boundaries for entity validity
graph LR
A[Entity Node] --> B[UUID]
A --> C[Name]
A --> D[Summary]
A --> E[Entity Type]
A --> F[valid_at]
A --> G[invalid_at]
A --> H[group_id]
Source: README.md
Facts (Edges)
Facts represent relationships between entities as triplets: Entity → Relationship → Entity. Each fact maintains:
| Field | Description |
|---|---|
| uuid | Unique identifier for the fact |
| source_node_uuid | UUID of the source entity |
| target_node_uuid | UUID of the target entity |
| fact | Relationship description (e.g., "works for", "located in") |
| fact_json | Structured JSON representation of the fact |
| valid_from | When this fact became true |
| valid_to | When this fact was superseded (null if current) |
| invalidated_at | Timestamp when the fact was invalidated |
| episode_uuid | Source episode that produced this fact |
| group_id | Namespace for multi-tenant isolation |
Source: README.md
Episodes (Provenance)
Episodes are the raw data inputs that generate entities and facts. Every derived entity and relationship traces back to its source episode, ensuring full lineage and auditability.
| Field | Description |
|---|---|
| uuid | Unique identifier for the episode |
| name | Descriptive name for the episode |
| episode_body | The raw content (text, JSON, or messages) |
| source | Type of content: text, json, or message |
| source_description | Human-readable description of the source |
| created_at | Timestamp when the episode was ingested |
| group_id | Namespace for the episode |
Source: mcp_server/README.md
Entity Type System
Graphiti supports both prescribed ontology (developer-defined types) and emergent types (learned from data). The MCP server defines a comprehensive set of built-in entity types.
Built-in Entity Types
| Type | Purpose | Key Fields |
|---|---|---|
| Requirement | Product/service needs and specifications | project_name, description |
| Preference | User choices, opinions, selections | name, description |
| Procedure | Sequential instructions or actions | name, description |
| Location | Physical or virtual places | name, description |
| Event | Time-bound activities or occurrences | name, description |
| Object | Physical items, tools, devices | name, description |
| Topic | Subjects of conversation or interest | name, description |
| Organization | Companies, institutions, groups | name, description |
| Document | Files, records, written materials | name, description |
Source: mcp_server/src/models/entity_types.py
Entity Type Hierarchy
graph TD
A[Entity Types] --> B[High Priority]
A --> C[Standard Types]
B --> D[Preference]
C --> E[Requirement]
C --> F[Procedure]
C --> G[Location]
C --> H[Event]
C --> I[Topic]
C --> J[Object]
C --> K[Organization]
C --> L[Document]
D -.->|highest priority| M[Classification Priority]
Source: mcp_server/src/models/entity_types.py
Classification Priority
The Preference type has the highest classification priority and should be applied with a low detection threshold (that is, high sensitivity), so that even weak signals of a preference are captured. The extraction system prioritizes classifications in this order:
- Preference — User choices and opinions (highest priority)
- Requirement — Explicit needs and specifications
- Procedure — Sequential instructions
- Standard Types — Location, Event, Topic, Object, Organization, Document
Source: mcp_server/src/models/entity_types.py
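To make the ordering concrete, the sketch below resolves a set of candidate labels to the highest-priority one. It only illustrates the ordering: in the snippets shown in this manual the priority is conveyed through docstring instructions to the LLM, and PRIORITY and pick_type are hypothetical names rather than part of the MCP server.

```python
from typing import Iterable, Optional

# Order mirrors the priority list above; lower index wins.
PRIORITY = ["Preference", "Requirement", "Procedure"]


def pick_type(candidates: Iterable[str]) -> Optional[str]:
    """Return the highest-priority candidate, or None if there are none."""
    def rank(name: str) -> int:
        # Types not named in PRIORITY share the lowest ("standard") tier.
        return PRIORITY.index(name) if name in PRIORITY else len(PRIORITY)

    # min() returns the first minimal element, so ties keep input order.
    return min(candidates, key=rank, default=None)


print(pick_type(["Topic", "Requirement", "Preference"]))  # Preference
print(pick_type(["Location", "Event"]))                   # Location
```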
Extracting Entity Types
Each entity type includes specific instructions for identification:
class Preference(BaseModel):
"""
IMPORTANT: Prioritize this classification over ALL other classifications.
Represents entities mentioned in contexts expressing user preferences.
Trigger patterns: "I want/like/prefer/choose X", "I don't want/dislike/avoid"
"""
class Requirement(BaseModel):
"""A Requirement represents a specific need, feature, or functionality.
Look for: "We need X", "X is required", "X must have Y"
"""
class Procedure(BaseModel):
"""Procedures are composed of several steps.
Look for: Sequential instructions, explicit directives, conditional statements
"""
Source: mcp_server/src/models/entity_types.py
Temporal Model
Bi-temporal Tracking
Graphiti implements bi-temporal tracking for facts, enabling queries across both assertion time (when the fact was recorded) and validity time (when the fact was true in reality).
sequenceDiagram
participant E1 as Episode
participant F as Fact
participant Q as Query
E1->>F: Creates fact at t1
Note over F: valid_from = t1<br/>invalidated_at = null
E1->>F: New episode invalidates at t2
Note over F: valid_to = t2<br/>invalidated_at = t3
Q->>F: Query at t_now: "What is true now?"
F-->>Q: Fact excluded from current results (valid_to = t2)
Q->>F: Query at t1: "What was true at t1?"
F-->>Q: Returns fact (valid_from = t1, valid_to > t1)
Fact Invalidation
When information changes, old facts are invalidated rather than deleted. This preserves full temporal history:
| State | valid_to | invalidated_at | Query Behavior |
|---|---|---|---|
| Active | null | null | Returns as current fact |
| Superseded | Timestamp | Timestamp | Queryable for historical periods |
| Deleted | Preserved | Timestamp | Removed from current queries |
Source: README.md
Group-based Namespacing
All data in Graphiti is organized by group_id, enabling multi-tenant isolation and separate knowledge domains:
| Field | Usage | Example |
|---|---|---|
| group_id | Namespace for all graph operations | "customer-123", "project-alpha" |
Episode Ingestion with group_id
add_memory(
name="Customer Profile",
episode_body='{"company": {"name": "Acme Technologies"}}',
source="json",
group_id="customer-123" # Isolates to this namespace
)
Source: mcp_server/src/graphiti_mcp_server.py
Search Filtering
Search operations support group_id filtering for targeted retrieval:
search_nodes(
query="What are the company preferences?",
group_id="customer-123" # Only search this namespace
)
Source: mcp_server/README.md
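Conceptually, group_id filtering restricts every read to one namespace. The sketch below shows that idea on an in-memory list; in_group and the sample data are hypothetical. The IDs use underscores rather than hyphens because of the FalkorDB issue described in this manual.

```python
from typing import Dict, List

nodes: List[Dict[str, str]] = [
    {"name": "Acme Technologies", "group_id": "customer_123"},
    {"name": "Project Alpha", "group_id": "project_alpha"},
]


def in_group(items: List[Dict[str, str]], group_id: str) -> List[Dict[str, str]]:
    """Keep only items stored under the given namespace."""
    return [item for item in items if item["group_id"] == group_id]


print([n["name"] for n in in_group(nodes, "customer_123")])  # ['Acme Technologies']
```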
Episode Types
Graphiti supports multiple episode types for different input formats:
| Type | Use Case | Input Format |
|---|---|---|
| text | Plain text content | Documents, articles, descriptions |
| json | Structured data | CRM records, API responses |
| message | Conversation-style | Chat history, dialogue |
Source: mcp_server/src/graphiti_mcp_server.py
Search Result Structures
Node Search Results
{
"nodes": [
{
"uuid": "entity-uuid",
"name": "Entity Name",
"entity_type": "Person",
"summary": "Description...",
"created_at": "2024-01-15T10:30:00Z",
"valid_from": "2024-01-15T10:30:00Z"
}
],
"facts": [...] # Edges involving these nodes
}
Source: examples/quickstart/README.md
Fact Search Results
{
"facts": [
{
"uuid": "fact-uuid",
"source_node_uuid": "entity-1",
"target_node_uuid": "entity-2",
"fact": "relationship description",
"valid_from": "2024-01-15T10:30:00Z",
"valid_to": null,
"episode_uuid": "source-episode"
}
]
}
Known Issues and Considerations
Pydantic v2 Compatibility
The SearchInterface class uses a deprecated Pydantic v1-style class Config: declaration. This will cause issues in Pydantic v3.
Location: graphiti_core/driver/search_interface/search_interface.py:350-351
Source: Community Issue #1477
Neo4j DateTime Serialization
When using the MCP server with Neo4j, neo4j.time.DateTime objects may fail to serialize. This affects the search_memory_facts tool with the error:
Unable to serialize unknown type: <class 'neo4j.time.DateTime'>
Source: Community Issue #1438
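A common way to handle this class of error is to convert temporal objects to ISO 8601 strings before serializing. The sketch below is a generic workaround, not the upstream fix: to_jsonable and FakeNeo4jDateTime are hypothetical names, the neo4j type is detected by its iso_format() method so the example runs without the neo4j package, and whether the MCP server's response path can accept such a hook is not confirmed here.

```python
import json
from datetime import date, datetime
from typing import Any


def to_jsonable(value: Any) -> Any:
    """json.dumps fallback: render temporal objects as ISO 8601 strings."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    # neo4j.time types expose iso_format(); checked by duck typing so this
    # sketch does not need the neo4j package installed.
    iso_format = getattr(value, "iso_format", None)
    if callable(iso_format):
        return iso_format()
    raise TypeError(f"Unable to serialize {type(value).__name__}")


class FakeNeo4jDateTime:
    """Stand-in for neo4j.time.DateTime, used only for this demonstration."""
    def iso_format(self) -> str:
        return "2024-01-15T10:30:00.000000000+00:00"


payload = {"valid_from": FakeNeo4jDateTime(), "created_at": datetime(2024, 1, 15, 10, 30)}
print(json.dumps(payload, default=to_jsonable))
```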
Database Parameter in Neo4jDriver
The Neo4jDriver.execute_query() method incorrectly places the database_ parameter into parameters_, causing the database selection to not be honored during query operations.
Location: Neo4jDriver.execute_query()
Source: Community Issue #1481
FalkorDB group_id with Hyphens
The FalkorDriver.build_fulltext_query method fails to properly escape hyphens in group_id values, causing RediSearch syntax errors.
Location: FalkorDriver.build_fulltext_query
Source: Community Issue #1483
Configuration
Entity Types Configuration
Entity types are defined in config.yaml and can be customized:
graphiti:
entity_types:
- name: "Preference"
description: "User preferences, choices, opinions, or selections"
- name: "Requirement"
description: "Specific needs, features, or functionality"
Source: mcp_server/README.md
Environment Variables
| Variable | Default | Description |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USER | neo4j | Neo4j username |
| NEO4J_PASSWORD | demodemo | Neo4j password |
| OPENAI_API_KEY | — | Required for OpenAI LLM/embedder |
Source: mcp_server/README.md
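The table above amounts to "read from the environment, fall back to a default, and fail if a required key is missing". A minimal sketch of that behavior follows; load_settings and DEFAULTS are hypothetical names, and treating OPENAI_API_KEY as always required assumes the OpenAI provider is in use.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the table above; OPENAI_API_KEY has none.
DEFAULTS = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USER": "neo4j",
    "NEO4J_PASSWORD": "demodemo",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Resolve settings from the environment, falling back to DEFAULTS."""
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    if "OPENAI_API_KEY" not in env:
        raise KeyError("OPENAI_API_KEY is required for the OpenAI LLM/embedder")
    settings["OPENAI_API_KEY"] = env["OPENAI_API_KEY"]
    return settings


demo = load_settings({"OPENAI_API_KEY": "sk-example", "NEO4J_USER": "alice"})
print(demo["NEO4J_USER"])  # alice
print(demo["NEO4J_URI"])   # bolt://localhost:7687
```

The default password is suitable for local experiments only; set NEO4J_PASSWORD explicitly anywhere else.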
Source: https://github.com/getzep/graphiti / Human Manual
System Architecture
Related topics: Neo4j Driver, FalkorDB Driver, Ingestion Pipeline
Graphiti is a temporal context graph engine designed for building and querying knowledge graphs that capture entities, relationships, and facts with temporal validity windows. The architecture follows a modular design pattern with pluggable drivers, enabling support for multiple graph databases while maintaining a consistent API surface.
Overview
Graphiti's architecture consists of three primary layers:
- Core Library (graphiti-core) - The foundational engine for building and querying temporal context graphs
- Driver Layer - Pluggable graph database connectors supporting Neo4j, FalkorDB, and other backends
- MCP Server - Model Context Protocol server providing tool-based access to graph operations
graph TB
subgraph "Client Layer"
MCP[MCP Server]
Direct[Direct Python API]
end
subgraph "Core Layer"
Graphiti[Graphiti Core]
Search[Search Interface]
Extraction[Extraction Pipeline]
end
subgraph "Driver Layer"
Neo4j[Neo4j Driver]
Falkor[FalkorDB Driver]
Kuzu[Kuzu Driver]
end
subgraph "Storage Layer"
Neo4jDB[(Neo4j)]
FalkorDB[(FalkorDB)]
KuzuDB[(Kuzu)]
SearchIndex[(Search Index)]
end
MCP --> Graphiti
Direct --> Graphiti
Graphiti --> Search
Graphiti --> Extraction
Graphiti --> Neo4j
Graphiti --> Falkor
Graphiti --> Kuzu
Neo4j --> Neo4jDB
Falkor --> FalkorDB
Kuzu --> KuzuDB
Search --> SearchIndex
Core Components
Graphiti Service
The Graphiti class serves as the main entry point for the library, coordinating all operations from episode ingestion to knowledge retrieval. Source: README.md
Key responsibilities:
- Episode management (add, retrieve, delete)
- Entity and relationship extraction
- Hybrid search execution
- Graph construction and maintenance
Configuration:
The service accepts a GraphitiConfig object that defines:
| Parameter | Description | Default |
|---|---|---|
| group_id | Namespace for graph data isolation | "main" |
| llm_client | LLM client for extraction | Required |
| embedder_client | Embedding client for semantic search | Required |
| graph_driver | Database driver instance | Required |
Driver Architecture
Graphiti uses a driver-based abstraction for graph database operations, enabling support for multiple backends through a common interface. Source: mcp_server/README.md
classDiagram
class GraphDriver {
<<interface>>
+build_fulltext_query()
+execute_query()
+node_create()
+edge_create()
+node_read()
+edge_read()
+node_delete()
+edge_delete()
+create_indices()
}
class Neo4jDriver {
+execute_query()
+build_fulltext_query()
}
class FalkorDriver {
+execute_query()
+build_fulltext_query()
}
GraphDriver <|-- Neo4jDriver
GraphDriver <|-- FalkorDriver
Driver Selection:
| Driver | Database | Features |
|---|---|---|
| Neo4jDriver | Neo4j 5.26+ | Full Cypher support, ACID transactions |
| FalkorDriver | FalkorDB 1.1.2+ | Redis-based, in-memory options |
| KuzuDriver | KuzuDB 0.11.2+ | Embedded graph database |
Database Parameter Handling:
[!IMPORTANT]
When using the Neo4jDriver, the database parameter must be correctly passed to the underlying Neo4j connector. The driver implementation handles database routing for both write and query operations. Source: examples/quickstart/README.md
Search Interface
The SearchInterface class implements hybrid retrieval combining multiple search strategies:
graph LR
A[Query] --> B[Semantic Search]
A --> C[BM25 Keyword]
A --> D[Graph Traversal]
B --> E[Results]
C --> E
D --> E
E --> F[Reranking]
F --> G[Final Results]
Search Modes:
| Mode | Description | Use Case |
|---|---|---|
| HYBRID | Combines semantic + BM25 + graph | General purpose |
| SEMANTIC | Embedding-based similarity only | Conceptual queries |
| BM25 | Keyword-based retrieval | Exact term matching |
| GRAPH | Relationship-based traversal | Graph-aware results |
Source: examples/azure-openai/README.md
Data Flow Architecture
Episode Ingestion Pipeline
graph TD
A[Episode Input] --> B{Source Type}
B -->|text| C[Text Parser]
B -->|json| D[JSON Parser]
B -->|message| E[Message Parser]
C --> F[LLM Extraction]
D --> F
E --> F
F --> G[Entity Extraction]
F --> H[Relationship Extraction]
G --> I[Graph Construction]
H --> I
I --> J[Search Index Update]
I --> K[Fact Validity Windows]
J --> L[(Graph + Index)]
Processing Steps:
- Input Parsing: Handle text, JSON, or message formats
- LLM Extraction: Extract entities and relationships using configured LLM
- Graph Construction: Create nodes and edges with temporal metadata
- Index Update: Update search indices for retrieval operations
Source: mcp_server/src/graphiti_mcp_server.py
Search Pipeline
graph TD
A[Search Query] --> B[Query Processing]
B --> C[Hybrid Search Executor]
C --> D[Semantic Results]
C --> E[BM25 Results]
C --> F[Graph Results]
D --> G[Result Merging]
E --> G
F --> G
G --> H[Reranking]
H --> I[Final Results]
Entity Type System
Graphiti supports customizable entity types defined through Pydantic models. The system includes built-in types that can be extended or overridden. Source: mcp_server/src/models/entity_types.py
Built-in Entity Types:
| Type | Purpose | Priority |
|---|---|---|
| User | Human users | High |
| Assistant | AI assistants | High |
| Preference | User preferences, choices | Highest |
| Requirement | Specific needs, features | Medium |
| Organization | Companies, institutions | Medium |
| Document | Textual documents | Medium |
| Event | Time-bound activities | Medium |
| Location | Physical/virtual places | Medium |
| Object | Physical items | Low |
| Topic | Subjects of discussion | Low |
[!NOTE]
The Preference type has the highest extraction priority and should be used for capturing user likes, dislikes, and choices. Source: mcp_server/src/models/entity_types.py
MCP Server Architecture
The Graphiti MCP Server exposes graph operations as tools for AI agents and IDE integrations. Source: mcp_server/src/graphiti_mcp_server.py
Available Tools
| Tool | Purpose | Parameters |
|---|---|---|
| add_memory | Add episode to graph | name, episode_body, source, group_id |
| search_nodes | Search for entities | query, group_id, limit |
| search_facts | Search relationships | query, group_id, limit |
| get_episodes | Retrieve episodes | group_id, limit |
| delete_episode | Remove episode | uuid |
| delete_entity_edge | Remove relationship | uuid |
| clear_graph | Reset graph | group_id |
| get_status | Check server status | - |
Service Architecture
graph TB
subgraph "MCP Server"
MCP[MCP Server Instance]
Tools[MCP Tools]
Service[GraphitiService]
end
subgraph "Backend Services"
Graphiti[Graphiti Core]
Queue[Queue Service]
Semaphore[Rate Limiter]
end
MCP --> Tools
Tools --> Service
Service --> Graphiti
Service --> Semaphore
Queue --> Graphiti
Semaphore -->|Limits| Graphiti
Concurrency Control:
The MCP server implements rate limiting via a semaphore to prevent overwhelming LLM providers:
# Configuration
SEMAPHORE_LIMIT=10 # Default concurrent operations
This is particularly important when processing multiple episodes or search queries simultaneously. Source: mcp_server/README.md
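The pattern behind a semaphore-based limit can be sketched with asyncio alone. This is an illustration of the technique, not the MCP server's code: process and the sleep that stands in for an LLM call are hypothetical, and the limit of 10 simply mirrors the default shown above.

```python
import asyncio

SEMAPHORE_LIMIT = 10  # mirrors the default shown above


async def main() -> int:
    semaphore = asyncio.Semaphore(SEMAPHORE_LIMIT)
    running = 0
    peak = 0

    async def process(episode_id: int) -> None:
        nonlocal running, peak
        async with semaphore:          # at most SEMAPHORE_LIMIT bodies run at once
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for an LLM call
            running -= 1

    await asyncio.gather(*(process(i) for i in range(50)))
    return peak


peak = asyncio.run(main())
print(peak <= SEMAPHORE_LIMIT)  # True
```

All 50 tasks are scheduled immediately, but the semaphore holds the number of in-flight calls at or below the limit. Lowering the limit trades throughput for fewer provider rate-limit errors.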
Configuration System
Environment Variables
Graphiti supports configuration via environment variables with optional defaults:
| Variable | Description | Required |
|---|---|---|
| NEO4J_URI | Neo4j connection URI | For Neo4j |
| NEO4J_USER | Neo4j username | For Neo4j |
| NEO4J_PASSWORD | Neo4j password | For Neo4j |
| OPENAI_API_KEY | OpenAI API key | For OpenAI |
| ANTHROPIC_API_KEY | Anthropic API key | For Claude |
| AZURE_OPENAI_* | Azure OpenAI config | For Azure |
| SEMAPHORE_LIMIT | Concurrency limit | No |
Source: mcp_server/README.md
Configuration File
The config.yaml file supports environment variable expansion:
graphiti:
entity_types:
- name: "Preference"
description: "User preferences, choices, opinions, or selections"
- name: "Requirement"
description: "Specific needs, features, or functionality"
Syntax: ${VAR_NAME} or ${VAR_NAME:default} for fallback values.
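The expansion syntax can be illustrated with a small regular-expression substitution. This sketch is an assumption about behavior, not the server's loader: expand is a hypothetical name, and replacing an unset variable that has no default with an empty string is a choice made here for simplicity.

```python
import re
from typing import Mapping

# Matches ${NAME} and ${NAME:default}; the default may itself contain colons.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def expand(text: str, env: Mapping[str, str]) -> str:
    """Replace ${VAR} and ${VAR:default} using env; unset vars without a default become ''."""
    def substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        return default if default is not None else ""

    return _PATTERN.sub(substitute, text)


env = {"NEO4J_USER": "alice"}
print(expand("user=${NEO4J_USER} uri=${NEO4J_URI:bolt://localhost:7687}", env))
# user=alice uri=bolt://localhost:7687
```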
Deployment Patterns
Docker Compose (Default - FalkorDB)
docker compose up
This starts:
- MCP server on http://localhost:8000/mcp/
- FalkorDB on localhost:6379
- FalkorDB web UI on http://localhost:3000
Source: mcp_server/README.md
Neo4j Deployment
# With Docker
docker compose -f docker-compose.neo4j.yml up
# Direct Python usage
driver = Neo4jDriver(uri=uri, user=user, password=password)
Azure OpenAI Integration
azure_client = AsyncOpenAI(
base_url=f"{azure_endpoint}/openai/v1/",
api_key=azure_api_key,
)
llm_client = AzureOpenAILLMClient(
client=azure_client,
deployment=azure_deployment,
)
Source: examples/azure-openai/README.md
Temporal Fact Management
A distinguishing feature of Graphiti is its bi-temporal data model:
graph TB
A[Fact Created] --> B[Valid From]
A --> C[Valid Until]
B --> D[Fact Validity Window]
C --> D
E[Data Received] --> F[Created At]
E --> G[Updated At]
F --> H[Record Validity Window]
G --> H
| Time Dimension | Description |
|---|---|
| Valid From | When the fact became true |
| Valid Until | When the fact was superseded (or null if current) |
| Created At | When the fact was recorded |
| Updated At | When the record was modified |
This model enables queries like "what was true at time X?" and "what is true now?" with automatic invalidation when information changes. Source: README.md
Search Configuration Recipes
Graphiti provides predefined search configurations for common patterns:
| Recipe | Description |
|---|---|
| NODE_HYBRID_SEARCH_RRF | Direct node search with Reciprocal Rank Fusion |
| HYBRID_SEARCH | Combined semantic + BM25 search |
| CENTRALITY_RERANK | Results reranked by graph distance |
Source: examples/azure-openai/README.md
Architecture Limitations and Considerations
Known Issues
The following architectural considerations should be noted:
| Issue | Impact | Workaround |
|---|---|---|
| group_id with hyphens in FalkorDB | RediSearch syntax errors | Use underscores instead |
| Neo4j database parameter routing | Query operations may use wrong database | Verify driver configuration |
| Neo4j DateTime serialization | MCP search failures | Ensure compatible Neo4j driver version |
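The underscore workaround for hyphenated group IDs can be applied before any identifier reaches the driver. The helper below is a hypothetical sketch: `sanitize_group_id` is not part of Graphiti, and the allowed character set is an assumption rather than a documented rule.

```python
import re


def sanitize_group_id(group_id: str) -> str:
    """Replace hyphens with underscores and reject other unexpected characters."""
    cleaned = group_id.replace("-", "_")
    # Assumed whitelist: ASCII letters, digits, and underscores only.
    if not re.fullmatch(r"[A-Za-z0-9_]+", cleaned):
        raise ValueError(f"unsupported characters in group_id: {group_id!r}")
    return cleaned
```

Note that this mapping is lossy: `a-b` and `a_b` end up in the same namespace, so it is only safe when the original IDs cannot collide after substitution.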
[!WARNING]
The SearchInterface class uses a Pydantic v1-style class Config: block, which is deprecated in Pydantic v2 and will fail in v3. Consider this when planning upgrades.
Source: community_context
Extension Points
Custom Entity Types
Define new entity types in config.yaml:
graphiti:
  entity_types:
    - name: "CustomType"
      description: "Description of your custom type"
Custom Drivers
Implement the GraphDriver interface to add support for additional graph databases:
class CustomDriver(GraphDriver):
    async def execute_query(self, query: str, params: dict) -> list[dict]:
        # Implementation
        pass
Search Recipes
Create custom search configurations by combining semantic, BM25, and graph strategies:
search_config = SearchConfig(
    mode=SearchMode.HYBRID,
    semantic_weight=0.6,
    bm25_weight=0.3,
    graph_weight=0.1,
)
Source: https://github.com/getzep/graphiti / Human Manual
Ingestion Pipeline
Related topics: System Architecture
The Ingestion Pipeline is the core mechanism in Graphiti that transforms raw, unstructured content into a structured temporal knowledge graph. It handles continuous, incremental updates by extracting entities, relationships, and facts from episodes, then integrating them into the graph while maintaining temporal validity windows for each fact.
Overview
Graphiti is designed specifically for dynamic and frequently updated datasets, making it particularly suitable for applications requiring real-time interaction and precise historical queries. The ingestion pipeline enables this by processing data as it arrives and immediately integrating new knowledge without batch recomputation.
The pipeline achieves efficiency through a combined node + edge extraction path that allows a single LLM call to cover what previously required multiple calls. Source: v0.29.0 Release Notes
Architecture Overview
graph TD
A[Raw Content Input] --> B[Episode Creation]
B --> C{Content Size Check}
C -->|Large Entity-Dense| D[Content Chunking]
C -->|Small or Low Density| E[Direct Extraction]
D --> F[Parallel Chunk Processing]
E --> G[Combined Node + Edge Extraction]
F --> G
G --> H[Entity Validation & Deduplication]
H --> I[Edge Creation with Validity Windows]
I --> J[Graph Integration]
J --> K[Community Detection]
K --> L[Entity Summary Updates]
L --> M[Temporal Knowledge Graph]
Key Concepts
Episodes
Episodes are the raw data as ingested — the ground truth stream. Every derived fact traces back to an episode. Source: README.md
| Property | Description |
|---|---|
| uuid | Unique identifier for the episode |
| name | Descriptive name for the episode |
| episode_body | The raw content (text, JSON, or messages) |
| source | Source type: text, json, or message |
| source_description | Description of the data source |
| group_id | Namespace for organizing graph data |
| created_at | Timestamp of ingestion |
Episodes can be added through the MCP server using the add_memory tool:
# Adding plain text content
add_memory(
    name="Company News",
    episode_body="Acme Corp announced a new product line today.",
    source="text",
    source_description="news article",
    group_id="some_arbitrary_string"
)
# Adding structured JSON data
add_memory(
    name="Customer Profile",
    episode_body='{"company": {"name": "Acme Technologies"}, "products": [{"id": "P001", "name": "CloudSync"}]}',
    source="json",
    source_description="CRM data"
)
Source: mcp_server/src/graphiti_mcp_server.py:1-100
Entity Types
Graphiti supports customizable entity types defined via Pydantic models. The default entity types include:
| Entity Type | Purpose | Priority |
|---|---|---|
| Preference | User preferences, choices, opinions, or selections | HIGHEST - Always checked first |
| Requirement | Specific needs, features, or functionality | High |
| Procedure | Actions to take or how to perform in scenarios | High |
| Organization | Companies, institutions, groups, or formal entities | Medium |
| Document | Written materials (books, articles, reports, videos) | Medium |
| Event | Time-bound activities, occurrences, or experiences | Medium |
| Location | Physical or virtual places | Medium |
| Topic | Subject of conversation, interest, or knowledge domain | Low |
| Object | Physical items, tools, devices, or possessions | LOWEST - Last resort |
Source: mcp_server/src/models/entity_types.py
The Preference entity type has special handling — it uses a LOW THRESHOLD for sensitivity and should be used whenever user preferences, choices, opinions, or selections are expressed. Trigger patterns include:
- "I want/like/prefer/choose X"
- "I don't want/dislike/avoid/reject Y"
- "X is better/worse"
- "rather have X than Y"
- "no X please", "skip X", "go with X instead"
Source: mcp_server/src/models/entity_types.py
Facts and Relationships
Facts (edges) represent triplets: Entity → Relationship → Entity. Each fact has:
| Property | Description |
|---|---|
| subject | Source entity |
| object | Target entity |
| fact | The relationship description |
| valid_from | When the fact became true |
| valid_to | When the fact was superseded (null if current) |
| episodes | Source episodes that produced this fact |
Source: README.md
Content Chunking
The ingestion pipeline includes intelligent content chunking to handle large inputs that could cause LLM issues. Source: graphiti_core/utils/content_chunking.py:1-50
Chunking Strategy
The should_chunk() function determines whether content should be split based on:
- Minimum token threshold: Content must be ≥ CHUNK_MIN_TOKENS to be considered
- Entity density estimation: High-density content (many entities per token) benefits from chunking
def should_chunk(content: str, episode_type: EpisodeType) -> bool:
    """Determine whether content should be chunked based on size and entity density."""
    tokens = estimate_tokens(content)
    # Short content always processes fine
    if tokens < CHUNK_MIN_TOKENS:
        return False
    return _estimate_high_density(content, episode_type, tokens)
Source: graphiti_core/utils/content_chunking.py:1-50
Density Estimation
The system estimates entity density differently based on content type:
| Content Type | Density Indicator | Threshold |
|---|---|---|
| JSON | Array elements or object keys per 1000 tokens | High if many elements |
| Text | Sentence count and structure patterns | Lower for prose/narratives |
def _json_likely_dense(content: str, tokens: int) -> bool:
    """JSON is considered dense if it has many array elements or object keys."""
    data = json.loads(content)
    if isinstance(data, list):
        element_count = len(data)
        density = (element_count / tokens) * 1000 if tokens > 0 else 0
        return density > DENSITY_THRESHOLD
    # The object-key branch is not shown in this excerpt.
Source: graphiti_core/utils/content_chunking.py:50-100
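To make the size gate and the density check concrete, here is a self-contained sketch of the same decision for JSON input. It is not the library's implementation: the constant values, the four-characters-per-token heuristic, and the names `json_density` and `should_chunk_json` are assumptions chosen for illustration, and invalid JSON is simply treated as low density.

```python
import json

CHUNK_MIN_TOKENS = 1000    # hypothetical value, not taken from the source
DENSITY_THRESHOLD = 5.0    # hypothetical: elements or keys per 1000 tokens
CHARS_PER_TOKEN = 4        # rough heuristic assumed for this sketch


def estimate_tokens(content: str) -> int:
    """Approximate the token count from the character count."""
    return len(content) // CHARS_PER_TOKEN


def json_density(content: str, tokens: int) -> float:
    """Top-level list elements or object keys per 1000 tokens (0.0 if not countable)."""
    if tokens <= 0:
        return 0.0
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return 0.0
    count = len(data) if isinstance(data, (list, dict)) else 0
    return count / tokens * 1000


def should_chunk_json(content: str) -> bool:
    """Chunk only when content is both large enough and dense enough."""
    tokens = estimate_tokens(content)
    if tokens < CHUNK_MIN_TOKENS:
        return False
    return json_density(content, tokens) > DENSITY_THRESHOLD
```

A long array of small records passes both checks, while a single very long string is large but has a density close to zero and is left unchunked.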
Chunking Process
When chunking is required, the system preserves message boundaries and never splits mid-message:
graph LR
A[Raw Content] --> B[Token Estimation]
B --> C{Exceeds Threshold?}
C -->|No| D[Single Chunk]
C -->|Yes| E{JSON Format?}
E -->|Yes| F[_chunk_message_array]
E -->|No| G{Speaker Pattern?}
G -->|Yes| H[_chunk_speaker_messages]
G -->|No| I[General Chunking]
The chunking preserves:
- JSON message arrays (never splits within an array element)
- Speaker patterns (e.g., "Alice: Hello, Bob: Hi there")
- Newline-separated messages
Source: graphiti_core/utils/content_chunking.py:20-40
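The boundary-preserving behaviour for message arrays can be sketched as a greedy grouping that only ever moves whole messages. The function below is illustrative and is not `_chunk_message_array` itself; its name, the `max_tokens` budget, and the character-based size estimate are assumptions.

```python
import json
from typing import Any, List


def chunk_message_array(
    messages: List[Any], max_tokens: int, chars_per_token: int = 4
) -> List[List[Any]]:
    """Group whole messages into chunks whose estimated size stays within max_tokens."""
    chunks: List[List[Any]] = []
    current: List[Any] = []
    current_tokens = 0
    for message in messages:
        size = max(1, len(json.dumps(message)) // chars_per_token)
        # Start a new chunk rather than splitting the message itself.
        if current and current_tokens + size > max_tokens:
            chunks.append(current)
            current, current_tokens = [], 0
        current.append(message)
        current_tokens += size
    if current:
        chunks.append(current)
    return chunks
```

A message that is larger than the budget on its own still lands in a single chunk, which matches the rule that an array element is never split.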
Extraction Pipeline
As of v0.29.0, Graphiti uses a simplified and optimized extraction pipeline that combines node and edge extraction into a single LLM call, significantly reducing ingestion costs. Source: v0.29.0 Release Notes
Combined Extraction Flow
graph TD
A[Episode Content] --> B[Combined Extraction Prompt]
B --> C[Single LLM Call]
C --> D[Extracted Entities & Relationships]
D --> E{Validation}
E -->|Valid| F[Entity Deduplication]
E -->|Invalid| G[Retry or Skip]
F --> H[Graph Integration]
H --> I[Community Detection]
I --> J[Summary Updates]
Batch Entity Summarization
The v0.28.0 release introduced batch entity summarization to improve efficiency when processing multiple entities. Source: v0.28.0 Release Notes
Temporal Fact Management
One of Graphiti's unique features is bi-temporal tracking for facts. Each fact maintains two time dimensions:
- Valid time: When the fact is true in the real world
- Transaction time: When the fact was recorded in the system
When information changes, old facts are invalidated (not deleted), preserving full temporal history:
graph TD
A[Episode 1] -->|Fact A-B| B[Fact v1: valid_from=T1, valid_to=null]
A -->|Episode 2| C[Fact A-B Updated]
C -->|Fact A-B| D[Fact v2: valid_from=T2, valid_to=null]
D -->|Invalidate| E[Fact v1: valid_from=T1, valid_to=T2]
This enables queries like:
- "What's true now?" → Filter for valid_to = null
- "What was true at time T?" → Filter for valid_from ≤ T and (valid_to = null or T ≤ valid_to)
- "What changed?" → Compare validity windows across versions
Source: README.md
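The invalidate-instead-of-delete behaviour and the point-in-time filter can be modelled in a few lines. This is a conceptual sketch only: the `Fact` dataclass and the helpers `supersede`, `valid_at`, and `current` are hypothetical and do not mirror Graphiti's edge model. The sketch also treats the end of a window as exclusive, an assumption made so that exactly one version matches at the instant of change.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    text: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the fact is still current


def supersede(old: Fact, new: Fact) -> None:
    """Close the old fact's window at the moment the new fact becomes valid."""
    old.valid_to = new.valid_from


def valid_at(facts: List[Fact], t: datetime) -> List[Fact]:
    """Facts whose window contains t; an open window has no upper bound."""
    return [
        f for f in facts
        if f.valid_from <= t and (f.valid_to is None or t < f.valid_to)
    ]


def current(facts: List[Fact]) -> List[Fact]:
    """Facts that have not been invalidated."""
    return [f for f in facts if f.valid_to is None]
```

After `supersede(v1, v2)`, both versions remain in the list, so a historical query still returns `v1` while `current` returns only `v2`.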
Graph Integration
Community Detection
After entities and edges are extracted, the pipeline runs community detection to cluster related entities. This enables:
- Community summaries: Aggregate information about groups of related entities
- Graph traversal search: Find entities by exploring nearby communities
- Hierarchical understanding: See relationships at different granularity levels
The v0.27.0 release added extracted edge facts to entity summaries, improving the richness of community understanding. Source: v0.27.0 Release Notes
Duplicate Handling
Release v0.27.1 fixed a known issue in which duplicate information appeared in summaries. The fix excludes duplicate edges from node summary generation, producing cleaner, non-redundant summaries. Source: v0.27.1 Release Notes
MCP Server Integration
The MCP server provides a convenient interface for the ingestion pipeline:
@mcp.tool()
async def add_memory(
    name: str,
    episode_body: str,
    source: str = "text",
    source_description: str = "",
    uuid: str | None = None,
    group_id: str | None = None,
) -> AddMemoryResponse | ErrorResponse:
Source: mcp_server/src/graphiti_mcp_server.py:100-150
Environment Configuration
The pipeline reads the following environment variables, which config.yaml can reference:
| Variable | Default | Purpose |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Neo4j database connection |
| NEO4J_USER | neo4j | Database username |
| NEO4J_PASSWORD | demodemo | Database password |
| OPENAI_API_KEY | (required) | OpenAI LLM/embedder |
| ANTHROPIC_API_KEY | (optional) | Claude models |
Source: mcp_server/README.md
Performance Considerations
Chunking Trade-offs
| Scenario | Recommendation |
|---|---|
| Large entity-dense content (JSON arrays) | Enable chunking for reliable extraction |
| Prose/narrative content | Avoid chunking to preserve context |
| Short content (< CHUNK_MIN_TOKENS) | Always process as single unit |
Parallel Processing
Content chunks are processed in parallel when possible, limited by a semaphore to prevent overwhelming the LLM:
class GraphitiService:
    def __init__(self, config: GraphitiConfig, semaphore_limit: int = 10):
        self.semaphore = asyncio.Semaphore(semaphore_limit)
Source: mcp_server/src/graphiti_mcp_server.py:40-60
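The semaphore pattern above can be exercised in isolation. The sketch below shows how a bounded number of chunk extractions might run concurrently; `process_chunks` and the `extract` callback are hypothetical names standing in for whatever coroutine performs the LLM call.

```python
import asyncio
from typing import Awaitable, Callable, List


async def process_chunks(
    chunks: List[str],
    extract: Callable[[str], Awaitable[dict]],
    semaphore_limit: int = 10,
) -> List[dict]:
    """Run extract() over all chunks with at most semaphore_limit calls in flight."""
    semaphore = asyncio.Semaphore(semaphore_limit)

    async def _bounded(chunk: str) -> dict:
        async with semaphore:  # wait here while the limit is reached
            return await extract(chunk)

    # gather() returns results in input order regardless of completion order.
    return await asyncio.gather(*(_bounded(chunk) for chunk in chunks))
```

Lowering `semaphore_limit` trades throughput for fewer simultaneous requests, which is the usual lever when an LLM provider starts returning rate-limit errors.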
Known Limitations and Issues
Active Bugs
| Issue | Description | Impact |
|---|---|---|
| #1483 | FalkorDriver.build_fulltext_query fails on hyphens in group_id | RediSearch syntax errors with hyphenated group IDs |
| #1438 | MCP search_memory_facts fails with neo4j.time.DateTime serialization | Facts search unavailable with dense graphs |
Version History
| Version | Key Changes |
|---|---|
| v0.29.0 | Combined node + edge extraction, major efficiency gains |
| v0.28.2 | Security hardening against Cypher injection |
| v0.28.0 | Simplified extraction pipeline, batch entity summarization |
| v0.27.1 | Fixed duplicate info in summaries |
| v0.27.0 | Added edge facts to entity summaries, efficiency gains |
Related Documentation
- Search Pipeline - Querying the knowledge graph
- Graph Drivers - Supported database backends
- MCP Server - MCP server configuration and tools
Search System
Related topics: Neo4j Driver, FalkorDB Driver, Data Models
Graphiti provides a sophisticated hybrid search system that combines multiple retrieval strategies—semantic embeddings, keyword-based BM25 retrieval, and graph traversal—to enable low-latency, high-precision queries over temporal knowledge graphs.
Overview
The Search System retrieves entities and relationships from the context graph using a multi-stage pipeline:
- Query Parsing: Parse user queries and extract search parameters
- Multi-Strategy Retrieval: Execute semantic, keyword, and graph-based searches in parallel
- Result Fusion: Combine results using Reciprocal Rank Fusion (RRF)
- Reranking: Apply cross-encoder reranking for improved relevance
- Filtering: Apply temporal, group, and type filters to refine results
Source: graphiti_core/search/search.py:1-50
Architecture
graph TD
A[User Query] --> B[SearchConfig]
B --> C[SearchFilters]
C --> D[Hybrid Search Executor]
D --> E1[Semantic Search]
D --> E2[BM25 Keyword Search]
D --> E3[Graph Traversal]
E1 --> F[Result Fusion RRF]
E2 --> F
E3 --> F
F --> G[CrossEncoder Reranker]
G --> H[Final Results]
Core Components
| Component | File | Responsibility |
|---|---|---|
| SearchConfig | graphiti_core/search/search_config.py | Encapsulates search parameters including retrieval strategies, reranking options, and search mode |
| SearchFilters | graphiti_core/search/search_filters.py | Builds Cypher queries with temporal and group-based filtering |
| SearchExecutor | graphiti_core/search/search.py | Orchestrates multi-strategy retrieval and result fusion |
| SearchInterface | graphiti_core/driver/search_interface/search_interface.py | High-level API for performing fact and node searches |
| OpenAIRerankerClient | graphiti_core/cross_encoder/openai_reranker_client.py | Cross-encoder reranking using OpenAI models |
Source: graphiti_core/search/search_config.py:1-30
Search Configuration
The SearchConfig class defines all parameters for a search operation.
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| mode | SearchMode | HYBRID | Search mode: SEMANTIC, KEYWORD, HYBRID, or CENTER_NODE |
| use_reranker | bool | False | Whether to apply cross-encoder reranking |
| reranker_top_n | int | 10 | Number of results to return after reranking |
| reranker_batch_size | int | 10 | Batch size for reranking API calls |
| center_node_uuid | Optional[str] | None | UUID of center node for graph-distance-based reranking |
| valid_at | Optional[datetime] | None | Return results valid at this point in time |
| group_id | Optional[str] | None | Filter by group identifier |
| entity_types | Optional[list[str]] | None | Filter by entity type names |
| limit | int | 10 | Maximum number of results |
| node_limit | int | 10 | Limit for node search operations |
| fulltext_index_name | str | "entity_summary_vector" | Name of the fulltext index |
| vector_index_name | str | "entity_summary" | Name of the vector index |
Source: graphiti_core/search/search_config.py:35-80
Search Modes
class SearchMode(str, Enum):
    SEMANTIC = "semantic"        # Embedding-based similarity search
    KEYWORD = "keyword"          # BM25 keyword matching
    HYBRID = "hybrid"            # Combine semantic + keyword
    CENTER_NODE = "center_node"  # Graph-distance-based reranking
Source: graphiti_core/search/search_config.py:10-25
Search Recipes
Predefined search configurations are available in search_config_recipes.py:
| Recipe | Use Case |
|---|---|
| EDGE_HYBRID_SEARCH | Default hybrid search for edges/facts |
| EDGE_HYBRID_SEARCH_RRF | Hybrid search with Reciprocal Rank Fusion |
| NODE_HYBRID_SEARCH | Hybrid search for entity nodes |
| NODE_HYBRID_SEARCH_RRF | Node search with RRF |
| EDGE_SEMANTIC_SEARCH | Pure embedding-based edge search |
| NODE_SEMANTIC_SEARCH | Pure embedding-based node search |
| CENTER_NODE_GRAPH_SEARCH | Search reranked by graph distance from center node |
Source: graphiti_core/search/search_config_recipes.py:1-60
Using Recipes
from graphiti_core.search.search_config_recipes import (
    EDGE_HYBRID_SEARCH_RRF,
    NODE_HYBRID_SEARCH,
)
# Search for edges using hybrid RRF
edge_results = await search_interface.search_facts(
    query="customer preferences for product X",
    config=EDGE_HYBRID_SEARCH_RRF,
    group_id="customer_123"
)
# Search for nodes using hybrid approach
node_results = await search_interface.search_nodes(
    query="preferences opinions",
    config=NODE_HYBRID_SEARCH,
    group_id="customer_123"
)
Source: graphiti_core/search/search_config_recipes.py:25-45
Search Filters
The SearchFilters class constructs Cypher query clauses for filtering search results.
Filter Types
| Filter | Description |
|---|---|
| group_id | Filter by group identifier |
| valid_at | Temporal validity filter |
| entity_types | Filter by entity type names |
| uuid | Filter by specific node/edge UUID |
Source: graphiti_core/search/search_filters.py:1-40
Building Filters
from datetime import datetime
from graphiti_core.search.search_filters import SearchFilters
from graphiti_core.search.search_config import SearchMode
filters = SearchFilters(
    mode=SearchMode.HYBRID,
    group_id="customer_123",  # underscores avoid the FalkorDB hyphen issue (#1483)
    valid_at=datetime.now(),
    entity_types=["Preference", "Requirement"]
)
# Generates Cypher WHERE clause
filter_clause = filters.build_group_id_filter()
# Results in: "WHERE edge.group_id = $group_id"
Source: graphiti_core/search/search_filters.py:40-70
Security: Cypher Injection Prevention
[!IMPORTANT]
Version 0.28.2 introduced security hardening against Cypher injection in search filters. The SearchFilters class properly parameterizes all user inputs to prevent injection attacks.
Source: graphiti_core/search/search_filters.py:50-65
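The idea behind parameterization is that user-supplied values travel separately from the query text. The sketch below builds a WHERE clause that way; `build_edge_filter` is a hypothetical helper rather than the SearchFilters implementation, and the property names follow the edge properties listed elsewhere in this manual.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_edge_filter(
    group_id: Optional[str] = None,
    valid_at: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return a WHERE clause plus its parameters; values never enter the query text."""
    conditions: List[str] = []
    params: Dict[str, Any] = {}
    if group_id is not None:
        conditions.append("edge.group_id = $group_id")
        params["group_id"] = group_id
    if valid_at is not None:
        conditions.append("edge.valid_at <= $valid_at")
        conditions.append("(edge.invalid_at IS NULL OR edge.invalid_at > $valid_at)")
        params["valid_at"] = valid_at
    clause = "WHERE " + " AND ".join(conditions) if conditions else ""
    return clause, params
```

Even a hostile value such as `x' OR 1=1 //` leaves the clause unchanged, because it only ever appears in the parameter dictionary that is handed to the database driver alongside the query.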
Cross-Encoder Reranking
The OpenAIRerankerClient applies cross-encoder reranking to improve search result relevance after initial retrieval.
Reranking Configuration
| Parameter | Default | Description |
|---|---|---|
| model | "cross-encoder/ms-marco-MiniLM-L-6-v2" | Cross-encoder model identifier |
| top_n | 10 | Number of results to return |
| batch_size | 10 | Batch size for API calls |
Source: graphiti_core/cross_encoder/openai_reranker_client.py:1-35
Reranking Process
graph LR
A[Initial Results<br/>N results] --> B[Reranker Client]
B --> C[Batch to CrossEncoder API]
C --> D[Relevance Scores]
D --> E[Reorder Results]
E --> F[Top N Results]
reranker = OpenAIRerankerClient()
# Rerank search results
reranked = await reranker.rerank(
    query="customer preference for shoes",
    documents=["Adidas shoes are preferred...", "Nike shoes are popular..."],
    top_n=5
)
Source: graphiti_core/cross_encoder/openai_reranker_client.py:35-80
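The batch, score, reorder, and truncate steps in the diagram can be sketched independently of any model. In the example below the scoring function is passed in as a callback; `rerank` and `score_batch` are hypothetical names, and a real deployment would replace the callback with a call to a cross-encoder.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    documents: Sequence[str],
    score_batch: Callable[[str, Sequence[str]], List[float]],
    top_n: int = 10,
    batch_size: int = 10,
) -> List[Tuple[str, float]]:
    """Score documents in batches, then return the top_n by descending score."""
    scored: List[Tuple[str, float]] = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        scores = score_batch(query, batch)  # one call per batch
        scored.extend(zip(batch, scores))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Because Python's sort is stable, documents with equal scores keep their original retrieval order.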
Search Interface API
The SearchInterface provides the primary API for performing searches against the knowledge graph.
Source: graphiti_core/driver/search_interface/search_interface.py:1-100
Core Methods
#### search_facts
Search for facts (edges/relationships) matching a query.
async def search_facts(
    self,
    query: str,
    config: Optional[SearchConfig] = None,
    group_id: Optional[str] = None,
    valid_at: Optional[datetime] = None,
    entity_types: Optional[list[str]] = None,
    limit: int = 10,
    center_node_uuid: Optional[str] = None,
) -> List[FactResult]
| Parameter | Type | Description |
|---|---|---|
| query | str | Natural language search query |
| config | SearchConfig | Search configuration (uses default if None) |
| group_id | str | Filter by group |
| valid_at | datetime | Temporal validity filter |
| entity_types | list[str] | Filter by entity types |
| limit | int | Maximum results |
| center_node_uuid | str | Center node for graph-distance reranking |
Source: graphiti_core/driver/search_interface/search_interface.py:100-150
#### search_nodes
Search for entity nodes matching a query.
async def search_nodes(
    self,
    query: str,
    config: Optional[SearchConfig] = None,
    group_id: Optional[str] = None,
    entity_types: Optional[list[str]] = None,
    limit: int = 10,
) -> List[NodeResult]
Source: graphiti_core/driver/search_interface/search_interface.py:150-200
Search Result Types
#### FactResult
| Field | Type | Description |
|---|---|---|
| uuid | str | Unique identifier of the edge |
| name | str | Name/label of the edge |
| fact | str | Human-readable fact description |
| source_node | NodeSummary | Source entity node |
| target_node | NodeSummary | Target entity node |
| valid_from | datetime | Start of validity window |
| valid_to | datetime | End of validity window (None if current) |
| created_at | datetime | When the fact was created |
Source: graphiti_core/driver/search_interface/search_interface.py:200-250
#### NodeResult
| Field | Type | Description |
|---|---|---|
| uuid | str | Unique identifier of the node |
| name | str | Name of the entity |
| entity_type | str | Type of entity |
| summary | str | Current entity summary |
| created_at | datetime | When the node was created |
| updated_at | datetime | When the summary was last updated |
Source: graphiti_core/driver/search_interface/search_interface.py:250-300
Hybrid Search Pipeline
The hybrid search combines three retrieval strategies:
graph TD
A[Query] --> B[Semantic Search]
A --> C[BM25 Search]
A --> D[Graph Search]
B --> E[Scores 0-1]
C --> F[BM25 Scores]
D --> G[Graph Distance]
E --> H[Reciprocal Rank Fusion]
F --> H
G --> H
H --> I[Fused Ranking]
I --> J[CrossEncoder Reranker]
J --> K[Final Results]
Reciprocal Rank Fusion (RRF)
Results from multiple retrieval strategies are fused using RRF:
RRF_score(d) = Σ_i 1 / (k + rank_i(d))
Where:
- d = document
- k = constant (default: 60)
- rank_i(d) = rank of document d in strategy i
Source: graphiti_core/search/search.py:150-200
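The RRF formula translates directly into code. The function below is a generic sketch of the technique, not the code in search.py: the name `reciprocal_rank_fusion` and the alphabetical tie-break are choices made for this example, and ranks are 1-based.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs by summing 1 / (k + rank) per list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

A document missing from one strategy's list simply contributes nothing for that strategy, so items found by several strategies rise above items found by only one.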
Temporal Search
Graphiti supports temporal queries to retrieve facts valid at specific points in time.
Temporal Parameters
from datetime import datetime
# Find facts valid now
results = await search_interface.search_facts(
    query="customer preferences",
    valid_at=datetime.now()
)
# Find facts valid at a specific historical date
historical_date = datetime(2024, 1, 15)
results = await search_interface.search_facts(
    query="customer preferences",
    valid_at=historical_date
)
Validity Window
Each fact (edge) has a validity window:
- valid_from: When the fact became true
- valid_to: When the fact was superseded (None = still current)
Facts are automatically invalidated when new contradictory information is ingested.
Source: graphiti_core/search/search_config.py:60-70
Known Issues and Limitations
Active Bugs
| Issue | Description | Workaround |
|---|---|---|
| #1483 | FalkorDriver.build_fulltext_query fails on hyphens in group_id | Avoid hyphens in group_id, or use underscores |
| #1481 | Neo4jDriver.execute_query() doesn't honor database parameter | Specify database in connection URI |
| #1477 | SearchInterface uses deprecated Pydantic v1 class Config | Suppress the deprecation warning under Pydantic v2; the class will fail under Pydantic v3 until it is migrated |
Pydantic v2 Deprecation Warning
The SearchInterface class uses a deprecated Pydantic v1-style class Config: block. This will cause a hard failure in Pydantic v3. To suppress warnings in v2:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
Source: graphiti_core/driver/search_interface/search_interface.py:350-360
Usage Examples
Basic Hybrid Search
from graphiti_core.driver.search_interface.search_interface import SearchInterface
from graphiti_core.search.search_config import SearchConfig, SearchMode
# Initialize search interface
search_interface = SearchInterface(graphiti)
# Basic hybrid search for facts
results = await search_interface.search_facts(
    query="What are the customer's preferences?",
    group_id="customer_123",
    limit=5
)
for fact in results:
    print(f"{fact.source_node.name} -> {fact.name} -> {fact.target_node.name}")
Node Search with Type Filtering
# Search for specific entity types
results = await search_interface.search_nodes(
    query="preferences and requirements",
    group_id="customer_123",
    entity_types=["Preference", "Requirement"],
    limit=10
)
Graph-Distance Reranking
from graphiti_core.search.search_config import SearchConfig, SearchMode
# Search reranked by graph distance from a specific node
config = SearchConfig(
    mode=SearchMode.CENTER_NODE,
    center_node_uuid="uuid-of-center-node",
    limit=20
)
results = await search_interface.search_facts(
    query="related preferences",
    config=config,
    group_id="customer_123"
)
Using Search Recipes
from graphiti_core.search.search_config_recipes import (
    EDGE_HYBRID_SEARCH_RRF,
    NODE_HYBRID_SEARCH_RRF,
)
# Edge search with RRF
edge_results = await search_interface.search_facts(
    query="customer interactions",
    config=EDGE_HYBRID_SEARCH_RRF,
    group_id="customer_123"
)
# Node search with RRF
node_results = await search_interface.search_nodes(
    query="products mentioned",
    config=NODE_HYBRID_SEARCH_RRF,
    group_id="customer_123"
)
Performance Considerations
| Aspect | Recommendation |
|---|---|
| Batch size | Use reranker_batch_size=10-20 for optimal throughput |
| Result limits | Set limit to expected use case (default 10) |
| Group filtering | Always specify group_id to reduce search scope |
| Entity type filtering | Use entity_types to narrow results when types are known |
| Temporal queries | Specify valid_at to avoid full temporal graph scan |
See Also
- Context Graph Overview - Understanding entities, facts, and episodes
- Episode Ingestion - How data enters the knowledge graph
- MCP Server - Search via Model Context Protocol
- Drivers - Neo4j, FalkorDB, and Kuzu integration
Neo4j Driver
Related topics: FalkorDB Driver, System Architecture
Overview
The Neo4j Driver is the primary database adapter for Graphiti, enabling temporal knowledge graph storage and retrieval using Neo4j as the underlying graph database. It provides a unified interface for graph operations including entity node management, relationship (edge) handling, search functionality, and index maintenance.
Neo4j serves as the default and most fully-featured driver in Graphiti, supporting the complete extraction, storage, and retrieval pipeline.
Architecture
Driver Hierarchy
Graphiti
└── Neo4jDriver (Primary implementation)
    ├── GraphOperations
    ├── EntityNodeOperations
    ├── EntityEdgeOperations
    └── SearchOperations
Module Structure
The Neo4j driver is organized into several operation modules located in graphiti_core/driver/neo4j/operations/:
| Module | Purpose |
|---|---|
| graph_ops.py | Index creation, constraint management, graph cleanup |
| entity_node_ops.py | Node creation, retrieval, and updates |
| entity_edge_ops.py | Edge/relationship management between nodes |
| search_ops.py | Hybrid search combining semantic and BM25 retrieval |
Query Execution Flow
graph TD
A[Graphiti API Call] --> B[Neo4jDriver]
B --> C[Operation Module]
C --> D[Cypher Query Builder]
D --> E[execute_query]
E --> F[Neo4j Driver Session]
F --> G[Neo4j Database]
G --> H[Result Processing]
H --> I[Graphiti Data Models]
Configuration
Environment Variables
The Neo4j Driver is configured through environment variables:
| Variable | Default | Description |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Bolt protocol connection URI |
| NEO4J_USER | neo4j | Database username |
| NEO4J_PASSWORD | password | Database password |
| NEO4J_DATABASE | neo4j | Target database name |
Basic Initialization
from graphiti_core import Graphiti
from graphiti_core.driver.neo4j_driver import Neo4jDriver
# Using default connection
driver = Neo4jDriver()
# Or with explicit parameters
driver = Neo4jDriver(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="your-password",
    database="neo4j"
)
graphiti = Graphiti(graph_driver=driver)
Source: README.md
MCP Server Configuration
In the MCP server context, Neo4j configuration is loaded through config.yaml or environment variable overrides:
database:
  provider: "neo4j"
  providers:
    neo4j:
      uri: "bolt://localhost:7687"
      username: "neo4j"
      password: "your_password"
      database: "neo4j"
The configuration system supports environment variable overrides for CI/CD pipelines:
import os
uri = os.environ.get('NEO4J_URI', neo4j_config.uri)
username = os.environ.get('NEO4J_USER', neo4j_config.username)
password = os.environ.get('NEO4J_PASSWORD', neo4j_config.password)
Source: mcp_server/src/services/factories.py
Core Operations
Graph Operations (`graph_ops.py`)
Handles index and constraint management essential for Graphiti's operation.
#### Key Operations
| Method | Description |
|---|---|
| build_indices_and_constraints | Creates full-text indexes and uniqueness constraints |
| drop_indices_and_constraints | Removes all graph indices and constraints |
| clear_graph | Removes all nodes and edges, preserving indices |
| close | Closes the database connection |
#### Index Requirements
The driver creates the following indexes:
- Full-text indexes on EntityNode and EntityEdge for hybrid search
- Uniqueness constraints on uuid properties for all node types
- Composite indexes on (uuid, group_id) for efficient grouped queries
Entity Node Operations (`entity_node_ops.py`)
Manages the creation and retrieval of entity nodes in the knowledge graph.
#### Data Model
graph LR
A[Episodic Node] --> B[EntityNode]
A --> C[Fact Node]
B --> D[uuid]
B --> E[name]
B --> E2[group_id]
B --> F[node_labels]
B --> G[content_summary]
B --> H[created_at]
B --> I[valid_at]
B --> J[invalid_at]
B --> K[attributes]
#### Key Methods
| Method | Purpose |
|---|---|
| create_episode_node | Stores episodic memory nodes |
| create_entity_nodes | Batch creates entity nodes from extraction |
| get_entity_node_by_uuid | Retrieves entity by unique identifier |
| get_entity_nodes_by_uuid | Batch retrieval by UUIDs |
| update_entity_node | Updates node properties and metadata |
#### Cypher Query Pattern
The driver uses MERGE for idempotent node creation:
MERGE (n:Episodic {uuid: $uuid})
SET n = {uuid: $uuid, name: $name, group_id: $group_id,
source_description: $source_description, source: $source,
content: $content, entity_edges: $entity_edges,
created_at: $created_at, valid_at: $valid_at}
RETURN n.uuid AS uuid
Source: graphiti_core/models/nodes/node_db_queries.py
Entity Edge Operations (`entity_edge_ops.py`)
Handles relationships (edges) between entity nodes, representing facts and temporal associations.
#### Key Methods
| Method | Description |
|---|---|
| create_entity_edge | Creates a relationship between two nodes |
| get_entity_edge_by_uuid | Retrieves edge by unique identifier |
| get_entity_edges_by_uuid | Batch retrieval of edges |
| get_entity_edges | Retrieves edges with filtering options |
| expire_entity_edge | Marks an edge as invalid (temporal expiration) |
| delete_entity_edge | Removes an edge from the graph |
#### Edge Properties
Each edge maintains temporal metadata:
- uuid: Unique identifier
- name: Fact name/identifier
- fact: The extracted fact statement
- source_node_uuid: Origin node
- target_node_uuid: Destination node
- created_at: Creation timestamp
- valid_at: Start of validity period
- invalid_at: End of validity period
Search Operations (`search_ops.py`)
Implements hybrid search combining multiple retrieval strategies.
#### Search Strategies
| Strategy | Description |
|---|---|
| Hybrid Search | Combines semantic vector search with BM25 keyword matching |
| Center Node Search | Graph-distance-based reranking from a reference node |
| Full-text Search | Direct keyword matching using full-text indexes |
| Entity Search | Vector similarity search on entity embeddings |
#### Search Recipe System
Predefined search configurations are available in search_config_recipes.py:
EDGE_HYBRID_SEARCH_RRF # Hybrid search with RRF reranking
NODE_HYBRID_SEARCH_RRF # Node-focused hybrid search
CENTER_NODE_SEARCH_RRF # Graph-aware reranking
FULL_TEXT_SEARCH # Keyword-only retrieval
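The RRF suffix in these recipe names refers to Reciprocal Rank Fusion, which merges several ranked result lists by summing `1 / (k + rank)` for each item. The sketch below illustrates the general technique only; the function name `rrf_fuse` and the constant `k = 60` are assumptions and may not match Graphiti's internal reranker.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per item, with ranks starting at 1.
    Ties are broken by ID so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Usage: fuse a vector-similarity ranking with a keyword (BM25) ranking.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Items that appear near the top of both lists rise to the front, without requiring the two retrievers' raw scores to be comparable.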
Known Issues and Limitations
Bug: Database Parameter Not Honored in execute_query()
Issue #1481: The Neo4jDriver.execute_query() method places the database_ argument inside parameters_ instead of passing it to the Neo4j connector's session configuration. As a result, write operations still work, but query operations may target the wrong database.
Impact: When using a non-default database configuration, query operations may fail or return incorrect results.
Status: Open as of v0.29.1
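The Neo4j Python driver's `execute_query` takes `database_` as its own keyword argument, separate from `parameters_`. One way to keep the two apart is sketched below; `split_execute_kwargs` is a hypothetical helper written for illustration and is not the project's fix for this issue.

```python
from typing import Any, Dict, Tuple


def split_execute_kwargs(
    kwargs: Dict[str, Any], default_database: str
) -> Tuple[Dict[str, Any], str]:
    """Separate Cypher parameters from the target database name.

    Returns (params, database). The caller's dict is not mutated.
    """
    params = dict(kwargs)
    database = params.pop("database_", None) or default_database
    return params, database


# Usage (assumes `client` is a neo4j AsyncDriver; not executed here):
#   params, database = split_execute_kwargs(kwargs, default_database="neo4j")
#   await client.execute_query(cypher, parameters_=params, database_=database)
```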
Bug: DateTime Serialization in MCP Server
Issue #1438: The MCP server's search_memory_facts tool fails with "Unable to serialize unknown type: <class 'neo4j.time.DateTime'>" when querying a Neo4j-backed graph.
Impact: Temporal queries through the MCP interface fail on graphs with non-trivial fact density.
Workaround: search_nodes works correctly; direct API usage may be preferred.
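When calling the API directly, temporal values can be converted to ISO 8601 strings before the result is serialized. The sketch below is one possible approach, not the MCP server's code: `to_jsonable` is a hypothetical helper that relies on the `to_native()` method of `neo4j.time.DateTime`, detected by duck typing so the example runs without the Neo4j package installed.

```python
import json
from datetime import date, datetime, time, timezone
from typing import Any


def to_jsonable(value: Any) -> Any:
    """Recursively replace temporal objects with ISO 8601 strings."""
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    if isinstance(value, (datetime, date, time)):
        return value.isoformat()
    # Duck-typed branch: neo4j.time.DateTime exposes to_native(), which
    # returns a standard-library datetime.
    to_native = getattr(value, "to_native", None)
    if callable(to_native):
        return to_jsonable(to_native())
    return value


# Usage: clean a fact record before handing it to json.dumps.
record = {"fact": "Alice works at Acme",
          "valid_at": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)}
payload = json.dumps(to_jsonable(record))
```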
Cypher Injection Hardening
v0.28.2: The driver was updated to harden search filters against Cypher injection attacks. Always use parameterized queries rather than string interpolation when extending the driver.
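The difference between parameterized queries and string interpolation can be shown with a small query builder. This is a sketch under stated assumptions: `build_node_filter_query` and `ALLOWED_LABELS` are hypothetical names, and the labels are taken from the queries shown in this section. Cypher does not accept parameters for labels, so the label is checked against an allow-list while every value travels as a `$parameter`.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical allow-list; labels cannot be passed as Cypher parameters,
# so they are validated instead of interpolated blindly.
ALLOWED_LABELS = {"EntityNode", "Episodic"}


def build_node_filter_query(
    label: str, group_id: str, name: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) with all user values kept out of the query text."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unsupported label: {label!r}")
    clauses = ["n.group_id = $group_id"]
    params: Dict[str, Any] = {"group_id": group_id}
    if name is not None:
        clauses.append("n.name = $name")
        params["name"] = name
    query = f"MATCH (n:{label}) WHERE {' AND '.join(clauses)} RETURN n"
    return query, params


# Usage: a hostile value stays inside params and never reaches the query text.
query, params = build_node_filter_query(
    "EntityNode", "user_123", name="x' OR 1=1 //"
)
```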
Group ID Management
Graphiti uses group_id to namespace graph data, allowing multiple independent knowledge graphs within a single database instance.
Query Filtering Pattern
All graph operations filter by group_id:
MATCH (e:EntityNode)
WHERE e.group_id = $group_id
RETURN e
Best Practices
- Use descriptive group IDs (e.g., user_123, project_acme)
- Avoid special characters that may require escaping in Cypher
- Consider group isolation requirements when designing multi-tenant applications
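A conservative pre-check can enforce these practices before a group ID reaches the database. The sketch below is an illustration only: `is_safe_group_id` is a hypothetical helper, not Graphiti's own `validate_group_id`, and its character set (letters, digits, underscores) is an assumption that is likely stricter than what the library accepts.

```python
import re

# Assumed policy: ASCII letters, digits, and underscores only.
_SAFE_GROUP_ID = re.compile(r"[A-Za-z0-9_]+")


def is_safe_group_id(group_id: str) -> bool:
    """Return True if the group ID needs no escaping under the assumed policy."""
    return _SAFE_GROUP_ID.fullmatch(group_id) is not None


# Usage: reject questionable IDs at the application boundary.
candidates = ["user_123", "project_acme", "bad id", "x'}) DETACH DELETE n //"]
accepted = [gid for gid in candidates if is_safe_group_id(gid)]
```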
Docker Deployment
Neo4j Docker Container
version: '3.8'
services:
  graph:
    image: zepai/graphiti:latest
    ports:
      - "8000:8000"
    environment:
      - NEO4J_URI=bolt://neo4j:7687
      - NEO4J_USER=${NEO4J_USER}
      - NEO4J_PASSWORD=${NEO4J_PASSWORD}
  neo4j:
    image: neo4j:5.22.0
    ports:
      - "7474:7474"
      - "7687:7687"
    volumes:
      - neo4j_data:/data
    environment:
      - NEO4J_AUTH=${NEO4J_USER}/${NEO4J_PASSWORD}
# Named volumes must be declared at the top level
volumes:
  neo4j_data:
MCP Server with Neo4j
docker compose -f mcp_server/docker/docker-compose-neo4j.yml up
Default Neo4j credentials: neo4j / demodemo
Access points:
- Web Interface: http://localhost:7474
- Bolt Protocol: bolt://localhost:7687
Comparison with Other Drivers
| Feature | Neo4j | FalkorDB | Kuzu | Neptune |
|---|---|---|---|---|
| Default Database | neo4j | default_db | In-memory/file | AWS-managed |
| Full-text Search | Native indexes | Redis-based | Limited | CloudSearch |
| Temporal Queries | Native | Via properties | Via properties | Via properties |
| Connection Protocol | Bolt | Redis | Custom | IAM/VPC |
| Cloud Native | Self-hosted | Self-hosted | Embedded | Fully managed |
Source: README.md
Extending the Driver
To implement a custom driver for a different graph database, follow the GraphDriver interface pattern:
from graphiti_core.driver.graph_driver import GraphDriver
class CustomGraphDriver(GraphDriver):
    async def create_entity_node(self, node_data):
        # Implement custom entity node creation
        pass
    async def create_entity_edge(self, edge_data):
        # Implement custom edge creation
        pass
    # Implement all required abstract methods
The Neo4j Driver serves as the reference implementation for all driver operations in Graphiti.
Source: https://github.com/getzep/graphiti / Human Manual
FalkorDB Driver
Related topics: Neo4j Driver
Overview
The FalkorDB Driver is a database adapter that enables Graphiti to use FalkorDB as the underlying graph database for storing and querying temporal context graphs. FalkorDB is a graph database built on Redis that provides high-performance graph operations with native RediSearch integration for full-text search capabilities.
The driver implements the abstract GraphDriver interface, providing FalkorDB-specific implementations for graph operations, entity management, and hybrid search functionality. This allows Graphiti to work seamlessly with FalkorDB while maintaining compatibility with other graph databases like Neo4j.
graph TB
subgraph "Graphiti Core"
A[Graphiti] --> B[Graph Driver Interface]
end
subgraph "Driver Implementations"
B --> C[FalkorDB Driver]
B --> D[Neo4j Driver]
B --> E[Kuzu Driver]
B --> F[Neptune Driver]
end
subgraph "FalkorDB Stack"
C --> G[FalkorDB Client]
G --> H[Redis Protocol]
H --> I[FalkorDB Server]
I --> J[RediSearch Module]
end
Architecture
Module Structure
The FalkorDB driver is organized into a package under graphiti_core/driver/falkordb/:
| Module | Purpose |
|---|---|
| `falkordb_driver.py` | Main driver class implementing GraphDriver interface |
| `operations/search_ops.py` | Full-text search and hybrid search operations |
| `operations/graph_ops.py` | Core graph operations (nodes, edges, episodes) |
| `operations/entity_node_ops.py` | Entity node specific operations |
Core Components
#### FalkorDriver Class
The FalkorDriver class extends GraphDriver and provides FalkorDB-specific implementations:
class FalkorDriver(GraphDriver):
    def __init__(
        self,
        uri: str = "falkor://localhost:6379",
        db: int = 0,
        password: str | None = None,
    ):
        ...  # constructor body omitted in this excerpt
Key responsibilities include:
- Establishing connections to FalkorDB via Redis protocol
- Managing indices for entities, edges, and episodes
- Executing Cypher-like graph queries
- Building full-text search queries with RediSearch
#### Search Operations
The search_ops.py module handles:
- Full-text index creation: Creates RediSearch indices on graph entity and edge properties
- Hybrid search: Combines semantic embeddings with keyword matching using Reciprocal Rank Fusion (RRF)
- Query building: Constructs RediSearch queries via the `build_fulltext_query` method
graph LR
A[Search Query] --> B[build_fulltext_query]
B --> C{group_id check}
C -->|Contains hyphens| D[Escape special chars]
C -->|No hyphens| E[Standard quoting]
D --> F[RediSearch Query String]
E --> F
F --> G[RediSearch FT.SEARCH]
#### Graph Operations
The graph_ops.py module provides:
- Node creation and management for entities and episodes
- Edge creation and relationship management
- Temporal validity window handling (valid_at, invalid_at)
- Group-based data isolation
#### Entity Node Operations
The entity_node_ops.py module handles:
- Entity extraction and summarization
- Node property management
- Cross-referencing between entities
Configuration
Connection Parameters
| Parameter | Default | Description |
|---|---|---|
| `uri` | `falkor://localhost:6379` | FalkorDB connection URI |
| `db` | `0` | Database number |
| `password` | `None` | Authentication password |
Environment Variables
FalkorDB configuration can be set via environment variables:
export FALKORDB_URI=falkor://localhost:6379
export FALKORDB_DB=0
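Application code can read these variables and fall back to the defaults from the table above. The sketch below makes no claim about whether the driver reads them itself; `FalkorSettings` and `load_falkor_settings` are hypothetical names introduced for illustration.

```python
import os
from typing import Mapping, NamedTuple


class FalkorSettings(NamedTuple):
    """Hypothetical container for the connection settings."""
    uri: str
    db: int


def load_falkor_settings(env: Mapping[str, str] = os.environ) -> FalkorSettings:
    """Read FALKORDB_URI and FALKORDB_DB, using the documented defaults."""
    uri = env.get("FALKORDB_URI", "falkor://localhost:6379")
    db = int(env.get("FALKORDB_DB", "0"))
    return FalkorSettings(uri=uri, db=db)


# Usage: pass an explicit mapping in tests, or rely on os.environ in production.
settings = load_falkor_settings({"FALKORDB_URI": "falkor://db:6380", "FALKORDB_DB": "2"})
```

Accepting the environment as a parameter keeps the function easy to test without mutating the real process environment.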
MCP Server Configuration
In the MCP server context, FalkorDB is configured through config.yaml:
database:
  type: "falkordb"
  uri: "falkor://localhost:6379"
  db: 0
Search Functionality
Full-Text Query Building
The build_fulltext_query method constructs RediSearch queries for FalkorDB. It handles:
- Group ID filtering using `@group_id:{group_id}` syntax
- Text search with optional field specification
- Special character escaping
Important: There is a known issue with hyphens in group_id values. The current implementation wraps group IDs in quotes ("...") but may not properly escape all RediSearch special characters, leading to syntax errors.
# Known issue: https://github.com/getzep/graphiti/issues/1483
# Hyphens in group_id can cause RediSearch parse failures
def build_fulltext_query(
    self,
    query: str,
    group_id: str | None = None,
    index_name: str | None = None,
    limit: int | None = None,
) -> str:
    ...  # method body omitted in this excerpt
Hybrid Search
The driver supports hybrid search combining:
- Semantic search: Uses embedding vectors for similarity matching
- BM25 keyword search: Traditional text ranking algorithm
- RRF fusion: Reciprocal Rank Fusion to combine results
Index Management
Index Types
FalkorDB creates and manages several RediSearch indices:
| Index | Purpose | Key Fields |
|---|---|---|
| `entity_index` | Entity node search | name, summary, entity_type, group_id |
| `edge_index` | Edge/fact search | fact, source_name, target_name, group_id |
| `episode_index` | Episode search | name, content, source, group_id |
Index Initialization
On driver initialization, indices are created if they don't exist:
# From graphiti_core/driver/falkordb_driver.py
self.create_fulltext_indices()
Usage Examples
Basic Driver Setup
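import asyncio
from graphiti_core.driver.falkordb_driver import FalkorDriver
async def main() -> None:
    # Create driver instance
    driver = FalkorDriver(
        uri="falkor://localhost:6379",
        db=0,
    )
    try:
        # Initialize indices before issuing any queries
        await driver.setup_indices()
    finally:
        # Close the connection only after all work is finished
        await driver.close()
asyncio.run(main())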
MCP Server with FalkorDB
# Start MCP server with FalkorDB
uv run main.py --group-id my-project --db-type falkordb
Docker Compose
services:
  falkordb:
    image: falkordb/falkordb:latest
    ports:
      - "6379:6379"
  mcp-server:
    depends_on:
      - falkordb
    # ... configuration
Known Issues
Issue #1483: Hyphen Handling in group_id
Severity: Bug
Summary: The FalkorDriver.build_fulltext_query method fails with RediSearch syntax errors when group_id contains hyphens.
Root Cause: The implementation wraps the group_id in quotes ("...") but does not properly escape hyphens or other special characters that RediSearch interprets as syntax elements.
Affected Code:
- Location: `graphiti_core/driver/falkordb_driver.py`
- Method: `build_fulltext_query`
Workaround: Avoid using hyphens in group_id values when using FalkorDB as the backend.
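Where hyphenated group IDs cannot be avoided, backslash-escaping punctuation before it reaches the query string is one approach to explore. The sketch below is an untested assumption about how issue #1483 might be mitigated, not the project's fix; `escape_redisearch_value` is a hypothetical helper that escapes every character other than ASCII letters, digits, and underscores.

```python
import re

# Conservative assumption: anything outside [A-Za-z0-9_] gets a backslash.
_NEEDS_ESCAPE = re.compile(r"([^A-Za-z0-9_])")


def escape_redisearch_value(value: str) -> str:
    """Prefix each non-word character with a backslash."""
    return _NEEDS_ESCAPE.sub(r"\\\1", value)


# Usage: the hyphen is escaped; plain identifiers pass through unchanged.
escaped = escape_redisearch_value("my-project")
```

Verify the result against the RediSearch version in use, since escaping rules differ between tag fields and text fields.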
Comparison with Neo4j Driver
| Feature | FalkorDB Driver | Neo4j Driver |
|---|---|---|
| Protocol | Redis | Bolt |
| Full-text Search | Native RediSearch | Native Lucene-based indexes |
| Query Language | Cypher (RedisGraph dialect) | Cypher |
| Graph Algorithms | Limited | Rich library |
| Cloud Support | Self-hosted | Neo4j Aura |
| Performance | High for read-heavy workloads | Balanced |
Extending the Driver
To implement support for additional graph databases, follow the pattern established by FalkorDriver:
- Create a new driver class extending `GraphDriver`
- Implement required abstract methods for CRUD operations
- Implement search operations with database-specific syntax
- Add index management methods
from graphiti_core.driver.graph_driver import GraphDriver
class CustomDriver(GraphDriver):
    def __init__(self, connection_params):
        super().__init__()
        # Initialize database connection
    async def close(self):
        # Close connections
        pass
    async def setup_indices(self):
        # Create necessary indices
        pass
Requirements
- FalkorDB 1.1.2 or higher
- Redis protocol compatibility
- Python 3.10+
Source: https://github.com/getzep/graphiti / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 14 structured pitfall item(s), including 2 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.
1. Configuration risk: Configuration risk requires verification
- Severity: high
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_2328466b52aa45579865c00657e318dd | https://github.com/getzep/graphiti/issues/1505
2. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_a31b49c7d71646a48dec968318e0ba8b | https://github.com/getzep/graphiti/issues/1483
3. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_b128cabe853b4492945c209b962d8457 | https://github.com/getzep/graphiti/issues/1513
4. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_a62b375e05dd49d18f6807c575566949 | https://github.com/getzep/graphiti/issues/1515
5. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_c1d4750319b947bd96a2308a6fe6d3b7 | https://github.com/getzep/graphiti/issues/1518
6. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_752fa6d991e844c599853b3d9f69b47e | https://github.com/getzep/graphiti/issues/1516
7. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:840056306 | https://github.com/getzep/graphiti
8. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_2126d3011d414eaf8bd9ce77b0d0ce6f | https://github.com/getzep/graphiti/issues/1517
9. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:840056306 | https://github.com/getzep/graphiti
10. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:840056306 | https://github.com/getzep/graphiti
11. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:840056306 | https://github.com/getzep/graphiti
12. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_b6a9696a7ec842d48a220154476c4abe | https://github.com/getzep/graphiti/issues/1509
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using graphiti with real data or production workflows.
- Your project is ranked #16 on HVTracker — embed a trust badge? - github / github_issue
- FalkorDriver.default_group_id ('\_') is rejected by validate_group_id - github / github_issue
- add_episode is impractically slow for >5KB content — proposal: skip_extr - github / github_issue
- Suggestion: Standardized retrieval quality benchmarks for temporal knowl - github / github_issue
- NaN/Inf values from embedder silently break entity deduplication and pro - github / github_issue
- Neo4jDriver.__init__:57 schedules orphan create_task without cancel/awai - github / github_issue
- Security or permission risk requires verification - GitHub / issue
- Capability evidence risk requires verification - GitHub / issue
- Security or permission risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence