# https://github.com/getzep/zep Project Manual

Generated at: 2026-06-19 10:54:29 UTC

## Table of Contents

- [Repository Overview & Examples Catalog](#page-1)
- [Agent Framework Integrations](#page-2)
- [Zep MCP Server & Claude/Cursor Plugin](#page-3)
- [Evaluation Harness & Ontology](#page-4)

<a id='page-1'></a>

## Repository Overview & Examples Catalog

### Related Pages

Related topics: [Agent Framework Integrations](#page-2), [Zep MCP Server & Claude/Cursor Plugin](#page-3), [Evaluation Harness & Ontology](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [examples/python/user-summary-instructions-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md)
- [integrations/livekit/python/README.md](https://github.com/getzep/zep/blob/main/integrations/livekit/python/README.md)
- [examples/typescript/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/typescript/chunking-example/README.md)
- [examples/python/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/chunking-example/README.md)
- [examples/go/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/go/chunking-example/README.md)
- [integrations/crewai/python/README.md](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md)
- [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)
- [integrations/autogen/python/README.md](https://github.com/getzep/zep/blob/main/integrations/autogen/python/README.md)
- [integrations/adk/typescript/src/graph-search-tool.ts](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/src/graph-search-tool.ts)
- [integrations/vercel-ai/typescript/package.json](https://github.com/getzep/zep/blob/main/integrations/vercel-ai/typescript/package.json)
- [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)
- [integrations/adk/python/src/zep_adk/graph_search_tool.py](https://github.com/getzep/zep/blob/main/integrations/adk/python/src/zep_adk/graph_search_tool.py)
- [integrations/adk/python/README.md](https://github.com/getzep/zep/blob/main/integrations/adk/python/README.md)
</details>

# Repository Overview & Examples Catalog

The `getzep/zep` repository is a multi-package monorepo centered on Zep's long-term memory service for AI agents. Beyond the core server, it distributes reference examples, framework integrations, an evaluation harness, and an MCP server. The `examples/` and `integrations/` directories form a curated catalog that demonstrates how to attach Zep's temporal knowledge graph, user-summary generation, and contextual retrieval to real applications. Source: [examples/python/user-summary-instructions-example/README.md:1-40](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md).

## Examples by Language

The `examples/` tree provides runnable, end-to-end patterns in three languages. The **Python** directory covers the broadest surface: a `user-summary-instructions-example` (a real estate sales agent chatbot driven by `thread.get_user_context()`), a `chunking-example` that demonstrates Anthropic-style contextualized retrieval, plus memory, search, and assistant demos. Source: [examples/python/user-summary-instructions-example/README.md:21-39](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md). The **TypeScript** directory mirrors the contextualized retrieval pattern via `examples/typescript/chunking-example`, useful for Node.js-based agents. Source: [examples/typescript/chunking-example/README.md:1-15](https://github.com/getzep/zep/blob/main/examples/typescript/chunking-example/README.md). The **Go** directory offers a `chunking-example` for Go services, illustrating that the same pipeline (chunk → contextualize → ingest) is reproducible outside Python. Source: [examples/go/chunking-example/README.md:1-15](https://github.com/getzep/zep/blob/main/examples/go/chunking-example/README.md).

Each example expects a `.env` containing `ZEP_API_KEY` and an LLM provider key (e.g., `OPENAI_API_KEY`), and uses Zep's `AsyncZep` client for asynchronous ingestion. The chunking examples share a common design: they prepend a brief document-level context to each chunk before it is ingested, so that a chunk retrieved in isolation still carries the context needed to match and interpret it, with the aim of improving retrieval recall. Source: [examples/python/chunking-example/README.md:9-37](https://github.com/getzep/zep/blob/main/examples/python/chunking-example/README.md).
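The chunk, contextualize, ingest design can be sketched without any Zep or LLM dependency. This is a minimal illustration under stated assumptions, not the code from the examples: `split_into_chunks`, `contextualize`, and `stub_describe` are hypothetical names, the chunker works on characters, and the LLM call is replaced by a fixed stub.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


def contextualize(chunks: List[str], describe: Callable[[str], str]) -> List[str]:
    """Prepend a short context line to each chunk before it is ingested."""
    return [f"{describe(chunk)}\n\n{chunk}" for chunk in chunks]


def stub_describe(chunk: str) -> str:
    # Stand-in for the LLM call; a real pipeline would also pass the full document.
    return "Context: excerpt from the 2024 leave policy."


document = "Employees accrue leave monthly. The policy caps carry-over at ten days."
chunks = split_into_chunks(document, chunk_size=40, overlap=8)
ready_for_ingest = contextualize(chunks, stub_describe)
```

Each entry of `ready_for_ingest` is what would be handed to the ingestion step; a phrase such as "The policy" in the second chunk now travels with a line saying which policy is meant.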

## Framework Integrations

The `integrations/` directory packages Zep as a drop-in component for popular agent frameworks. The table below also lists the MCP server, which lives under `mcp/` rather than `integrations/`.

| Integration | Language | Primary Surface |
|-------------|----------|-----------------|
| CrewAI | Python | `ZepUserStorage`, `ZepGraphStorage`, `ZepStorage`, search/add tools, ontology support |
| LiveKit | Python | `ZepUserAgent`, `ZepGraphAgent` for real-time voice agents |
| AutoGen | Python | `ZepUserMemory` memory layer + graph search/add tools |
| Google ADK | Python & TypeScript | `ZepContextTool` (auto-injection) and `ZepGraphSearchTool` (model-callable) |
| Vercel AI SDK | TypeScript | Lightweight adapter, peer-dep on `ai ^6.0.0` |
| MCP server | Go | Standards-compliant model context protocol tool server |

CrewAI's integration exposes per-user memory and a generic knowledge graph, with tool factories `create_search_tool` and `create_add_data_tool` bound to either a `user_id` or `graph_id` at construction time. Source: [integrations/crewai/python/README.md:13-58](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md). The `zep-crewai` v1.1.1 release changed the key under which external memory search results are returned from `"memory"` to `"context"`, which is the key CrewAI v0.186.0+ expects. Source: community release notes (zep-crewai v1.1.1).

LiveKit's adapter supports two memory models—`ZepUserAgent` for thread-based per-user memory, and `ZepGraphAgent` for direct graph manipulation—and both work against the same underlying temporal knowledge graph. Source: [integrations/livekit/python/README.md:24-66](https://github.com/getzep/zep/blob/main/integrations/livekit/python/README.md). AutoGen uses `ZepUserMemory` registered via the framework's `memory=[...]` slot on an `AssistantAgent`. Source: [integrations/autogen/python/README.md:7-39](https://github.com/getzep/zep/blob/main/integrations/autogen/python/README.md).

The Google ADK integration (Python and TypeScript) provides two complementary tools. `ZepContextTool` injects context on every turn, while `ZepGraphSearchTool` is model-callable: search parameters can be *pinned* at construction time and hidden from the schema, locking the tool to a fixed `graph_id` or `user_id`. Source: [integrations/adk/python/src/zep_adk/graph_search_tool.py:1-31](https://github.com/getzep/zep/blob/main/integrations/adk/python/src/zep_adk/graph_search_tool.py) and [integrations/adk/typescript/src/graph-search-tool.ts:11-37](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/src/graph-search-tool.ts). The Python package requires `zep-cloud>=3.23.0` and `google-adk>=1.0.0`. Source: [integrations/adk/python/README.md:55-58](https://github.com/getzep/zep/blob/main/integrations/adk/python/README.md).
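The pinning idea can be illustrated with a small, library-free sketch. It is not the `ZepGraphSearchTool` implementation: `PinnedSearchTool`, `SEARCHABLE_PARAMS`, and the parameter names `scope` and `limit` are hypothetical stand-ins chosen for the example.

```python
from typing import Any, Dict

# Hypothetical parameter catalogue; the real tool's parameter set may differ.
SEARCHABLE_PARAMS: Dict[str, Dict[str, Any]] = {
    "query": {"type": "string", "description": "Natural-language search query"},
    "scope": {"type": "string", "description": "What to search, e.g. edges or nodes"},
    "limit": {"type": "integer", "description": "Maximum number of results"},
}


class PinnedSearchTool:
    """Illustrative model-callable tool whose pinned parameters stay out of the schema."""

    def __init__(self, **pinned: Any) -> None:
        unknown = set(pinned) - set(SEARCHABLE_PARAMS)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        self._pinned = dict(pinned)

    def schema(self) -> Dict[str, Any]:
        """Expose only the parameters the model is still allowed to set."""
        visible = {k: v for k, v in SEARCHABLE_PARAMS.items() if k not in self._pinned}
        required = [name for name in ("query",) if name in visible]
        return {"type": "object", "properties": visible, "required": required}

    def build_request(self, **model_args: Any) -> Dict[str, Any]:
        """Merge model-supplied arguments with pinned values; pinned values win."""
        allowed = {
            k: v for k, v in model_args.items()
            if k in SEARCHABLE_PARAMS and k not in self._pinned
        }
        return {**allowed, **self._pinned}


tool = PinnedSearchTool(scope="edges", limit=5)
visible_params = sorted(tool.schema()["properties"])
request = tool.build_request(query="lease renewal terms", limit=50)
```

Here the model only sees `query` in the schema, and its attempt to raise `limit` is ignored because the pinned value takes precedence.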

The Vercel AI SDK adapter is a thin TypeScript package that depends on `@getzep/zep-cloud ^3.23.0` and peers on `ai ^6.0.0`, targeting Node 20+. Source: [integrations/vercel-ai/typescript/package.json:1-43](https://github.com/getzep/zep/blob/main/integrations/vercel-ai/typescript/package.json). Finally, `mcp/zep-mcp-server` is a Go implementation of the Model Context Protocol, built on `zep-go` v3, exposing Zep's tools to any MCP-compatible client. Source: [mcp/zep-mcp-server/README.md:1-15](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md).

## Evaluation Harness

`zep-eval-harness/` is a reproducible pipeline for benchmarking memory quality. It separates configuration into `document_ingestion_config/`, `user_ingestion_config/`, `document_chunking_config/`, and `evaluation_config/`, and writes each run to a timestamped directory under `runs/` with a snapshot of the active config, ensuring reproducibility even if configs change later. Source: [zep-eval-harness/README.md:1-35](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md). The pipeline executes: user ingestion → document chunking → document ingestion → evaluation, with optional custom ontologies, custom instructions, and `user-summary-instructions`. The evaluation step is concurrency-tunable (default 15) and emits `results.json` plus an `evaluation_config_snapshot/`. Source: [zep-eval-harness/README.md:35-95](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md).

## Repository Topology

```mermaid
graph TD
  Repo[getzep/zep monorepo]
  Repo --> Ex[examples/]
  Repo --> Int[integrations/]
  Repo --> Eval[zep-eval-harness/]
  Repo --> MCP[mcp/zep-mcp-server/]
  Ex --> PyEx[python/<br/>user-summary, chunking]
  Ex --> TsEx[typescript/<br/>chunking]
  Ex --> GoEx[go/<br/>chunking]
  Int --> CrewAI[CrewAI]
  Int --> LK[LiveKit]
  Int --> AG[AutoGen]
  Int --> ADK[Google ADK<br/>py + ts]
  Int --> Vercel[Vercel AI SDK]
```

## See Also

- CrewAI integration release notes (v1.1.1) for `context` vs. `memory` key changes.
- ADK Python and TypeScript source for `ZepGraphSearchTool` parameter pinning.
- Evaluation harness config snapshots under `runs/` for reproducibility.

---

<a id='page-2'></a>

## Agent Framework Integrations

### Related Pages

Related topics: [Repository Overview & Examples Catalog](#page-1), [Zep MCP Server & Claude/Cursor Plugin](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [integrations/README.md](https://github.com/getzep/zep/blob/main/integrations/README.md)
- [integrations/CLAUDE.md](https://github.com/getzep/zep/blob/main/integrations/CLAUDE.md)
- [integrations/adk/python/src/zep_adk/graph_search_tool.py](https://github.com/getzep/zep/blob/main/integrations/adk/python/src/zep_adk/graph_search_tool.py)
- [integrations/adk/typescript/src/graph-search-tool.ts](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/src/graph-search-tool.ts)
- [integrations/adk/typescript/package.json](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/package.json)
- [integrations/crewai/python/README.md](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md)
- [integrations/autogen/python/README.md](https://github.com/getzep/zep/blob/main/integrations/autogen/python/README.md)
- [integrations/livekit/python/README.md](https://github.com/getzep/zep/blob/main/integrations/livekit/python/README.md)
- [examples/typescript/langgraph/README.md](https://github.com/getzep/zep/blob/main/examples/typescript/langgraph/README.md)
- [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)
</details>

# Agent Framework Integrations

Zep ships a dedicated `integrations/` directory of framework adapters that wrap its agent-memory platform for popular agent runtimes. Each adapter is a separate, installable package, so consumers add only what their stack requires. The adapters expose two capabilities, automatic context injection and on-demand tool calls, over Zep's high-level SDK.

## Repository Layout and Supported Frameworks

Integration packages are organized **framework-first, then language** at `integrations/<framework>/<language>/`, as documented in [integrations/README.md](https://github.com/getzep/zep/blob/main/integrations/README.md). Each package ships a README, a `SETUP.md`, a runnable example, a test suite, and a changelog.

| Framework | Language | Package | Notes |
|-----------|----------|---------|-------|
| Google ADK | Python | `zep-adk` | First-party tools: `ZepContextTool`, `ZepGraphSearchTool` |
| Google ADK | TypeScript | `@getzep/zep-adk` (peer `@google/adk` 1.2.0) | Mirror of the Python API |
| Microsoft AutoGen | Python | `zep-autogen` | `ZepUserMemory`, search/add graph tools |
| CrewAI | Python | `zep-crewai` | `ZepUserStorage`, `ZepGraphStorage`, `create_search_tool`, `create_add_data_tool` |
| LiveKit | Python | `zep-livekit` | Real-time/voice-agent memory layer |

Source: [integrations/README.md](https://github.com/getzep/zep/blob/main/integrations/README.md)

Per the same file, planned adapters include Microsoft Agent Framework (Python), Pydantic AI (Python), LangGraph (Python), Mastra (TypeScript), and Google ADK (Go). Verified extension points for each planned target are catalogued in `integrations/SPIKE_FINDINGS.md`. The LangGraph CLI example at [`examples/typescript/langgraph/`](https://github.com/getzep/zep/blob/main/examples/typescript/langgraph/README.md) demonstrates the same Zep memory pattern outside the official integrations namespace.

## Common Integration Patterns

Across frameworks, adapters converge on two primitives.

**1. Memory-backed context injection.** The framework receives a Zep `Memory`/storage object that is consulted on every turn. In CrewAI, `ZepUserStorage` returns Zep's auto-assembled context block via `thread.get_user_context`, while `ZepGraphStorage` composes context for shared knowledge graphs ([integrations/crewai/python/README.md](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md)). AutoGen follows the same shape: `ZepUserMemory(client, user_id, thread_id)` is added to an `AssistantAgent`'s `memory=` list ([integrations/autogen/python/README.md](https://github.com/getzep/zep/blob/main/integrations/autogen/python/README.md)).

**2. On-demand search and write tools.** Agents that need explicit recall receive a `search` tool, and optionally an `add` tool, that wraps `client.graph.search()` scoped to either a `graph_id` or a `user_id`. The ADK Python implementation pins selected parameters to keep the model schema lean while exposing a configurable query ([integrations/adk/python/src/zep_adk/graph_search_tool.py](https://github.com/getzep/zep/blob/main/integrations/adk/python/src/zep_adk/graph_search_tool.py)). The TypeScript counterpart `ZepGraphSearchTool` follows an identical contract, resolving the search target to either a pinned `graphId` or the current user's graph ([integrations/adk/typescript/src/graph-search-tool.ts](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/src/graph-search-tool.ts)).

## Reference: Google ADK Integration

The ADK adapter is the most complete cross-language example. The Python package exposes `ZepContextTool` (auto-injected on every turn) and `ZepGraphSearchTool` (model-callable on demand); the TypeScript package — built with `tsup` and tested with `vitest` against `@google/adk` 1.2.0 ([integrations/adk/typescript/package.json](https://github.com/getzep/zep/blob/main/integrations/adk/typescript/package.json)) — provides the same surface.

```mermaid
flowchart LR
    User[User message] --> Agent[ADK Agent]
    Agent -->|every turn| CT[ZepContextTool]
    CT -->|inject context block| Agent
    Agent -->|on demand| ST[ZepGraphSearchTool]
    ST -->|graph.search| Zep[(Zep Graph)]
    Agent -->|graph.add| Zep
```

Search targets resolve as follows: a pinned `graph_id`/`graphId` always wins, otherwise the adapter looks up `zep_user_id` from ADK session state (Python) or the ADK `userId` (TypeScript). This dual mode lets the same agent operate over a shared documentation graph or a per-user personal graph without code changes ([integrations/adk/python/src/zep_adk/graph_search_tool.py](https://github.com/getzep/zep/blob/main/integrations/adk/python/src/zep_adk/graph_search_tool.py)).
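The precedence rule for the Python side can be written down as a short function. This is a sketch of the rule as described here, not the adapter's code: `resolve_search_target` is a hypothetical name, session state is modelled as a plain mapping, and raising `LookupError` when neither identifier is available is an assumption.

```python
from typing import Mapping, Optional, Tuple


def resolve_search_target(
    pinned_graph_id: Optional[str],
    session_state: Mapping[str, str],
) -> Tuple[str, str]:
    """Return ("graph_id", id) or ("user_id", id); a pinned graph always wins."""
    if pinned_graph_id:
        return ("graph_id", pinned_graph_id)
    user_id = session_state.get("zep_user_id")
    if user_id:
        return ("user_id", user_id)
    # Assumed behaviour: fail loudly rather than search an unspecified graph.
    raise LookupError("no pinned graph_id and no zep_user_id in session state")


shared = resolve_search_target("product-docs", {"zep_user_id": "user-42"})
personal = resolve_search_target(None, {"zep_user_id": "user-42"})
```

The same agent code yields a shared-graph search in the first call and a per-user search in the second; only the construction-time pin differs.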

## Storage Classes and Ontology

CrewAI exposes the broadest storage API and is the canonical reference for patterns other adapters adopt. `ZepUserStorage` saves messages to a thread and parallel-searches thread + user graph; `ZepGraphStorage` adds multi-scope search (edges, nodes, episodes) with label and attribute filters, and supports structured entity ontologies through `client.graph.set_ontology(entities=..., edges=...)` ([integrations/crewai/python/README.md](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md)). The latest `zep-crewai` release (`v1.1.1`) renamed the external memory search key from `"memory"` to `"context"` to stay compatible with CrewAI v0.186.0+, a change that affects every `ZepUserStorage`, `ZepGraphStorage`, and `ZepStorage` consumer.
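For consumer code that reads search results directly, a small compatibility helper can absorb the key rename. The sketch below assumes each result is a dict carrying its text under one of the two keys; `result_text` is a hypothetical helper, not part of `zep-crewai`.

```python
from typing import Any, Mapping


def result_text(result: Mapping[str, Any]) -> str:
    """Read a search result's text, preferring the newer "context" key."""
    if "context" in result:
        return str(result["context"])
    if "memory" in result:  # key used before the rename
        return str(result["memory"])
    raise KeyError('result has neither "context" nor "memory"')


new_style = result_text({"context": "Prefers two-bedroom listings"})
old_style = result_text({"memory": "Prefers two-bedroom listings"})
```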

## Development and Release Workflow

Each integration builds and tests in isolation. Python packages use `uv sync --extra dev && uv run pytest && uv build`; TypeScript uses `npm ci && npm test`; Go uses `go test ./...`. CI paths are wired via `.github/workflows/test-integrations.yml` using `paths-filter`, and releases are cut per-package so a CrewAI patch does not ship through ADK. To add a new adapter, create `integrations/<framework>/<language>/`, implement the framework's memory/context extension point listed in `SPIKE_FINDINGS.md`, add tests plus a runnable example, and wire CI ([integrations/README.md](https://github.com/getzep/zep/blob/main/integrations/README.md)).

## See Also

- [integrations/CLAUDE.md](https://github.com/getzep/zep/blob/main/integrations/CLAUDE.md) — per-language conventions for contributors
- [integrations/SPIKE_FINDINGS.md](https://github.com/getzep/zep/blob/main/integrations/SPIKE_FINDINGS.md) — verified framework extension points
- [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md) — Claude Code skill that guides Zep integration work

---

<a id='page-3'></a>

## Zep MCP Server & Claude/Cursor Plugin

### Related Pages

Related topics: [Repository Overview & Examples Catalog](#page-1), [Agent Framework Integrations](#page-2)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)
- [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)
</details>

# Zep MCP Server & Claude/Cursor Plugin

## Overview

The Zep repository ships two complementary developer-experience surfaces that expose Zep Cloud to AI coding assistants and external agent runtimes: a **read-only MCP (Model Context Protocol) server** written in Go, and a **Claude Code plugin** named `building-with-zep`. Both surfaces are designed to give AI agents safe, structured access to Zep's temporal knowledge graph and conversation memory, but they serve different audiences — the MCP server targets any MCP-compatible client (Claude Desktop, Cline, custom Go services, etc.), while the Claude Code plugin is purpose-built for authoring Zep integration code from inside Claude Code.

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md) ; [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

---

## Zep MCP Server

### Purpose and Scope

The `zep-mcp-server` is a Model Context Protocol server for [Zep Cloud](https://www.getzep.com/) that provides read-only access to Zep's temporal knowledge graph and memory features. It is intentionally read-only: the README describes its tools as "safe, non-destructive operations for AI assistants" and calls the server "fast" because it is built with Go. The server speaks the standard MCP wire format, so it works with Claude Desktop, Cline, and any other MCP client.

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)

Under the hood the server is built with the **Zep Go SDK v3**, the **MCP Go SDK**, and `godotenv` for environment loading, all released under Apache License 2.0.

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)

### Tool Surface

The server exposes 13 read-only tools organised into three functional groups. The table below covers the ten that the source excerpt enumerates:

| Group | Tool | Purpose |
|-------|------|---------|
| Core Search & Retrieval | `search_graph` | Search the knowledge graph with filters, reranking, and scoped search |
| Core Search & Retrieval | `get_user_context` | Retrieve formatted context for a thread (supports custom templates) |
| Core Search & Retrieval | `get_user` | Get user information and metadata |
| Core Search & Retrieval | `list_threads` | List conversation threads for a user |
| Graph Query | `get_user_nodes` | Retrieve entity nodes from a user's knowledge graph |
| Graph Query | `get_user_edges` | Retrieve relationship edges from a user's knowledge graph |
| Graph Query | `get_episodes` | Get episode nodes (temporal data ingestion events) |
| Detail Retrieval | `get_thread_messages` | Retrieve messages from a conversation thread |
| Detail Retrieval | `get_node` | Get a specific node by UUID |
| Detail Retrieval | `get_edge` | Get a specific edge by UUID (description truncated in source) |

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)

Detailed parameters, return shapes, and example invocations live in [docs/TOOLS.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/docs/TOOLS.md), and Docker / docker-compose deployment is covered in [docs/DOCKER.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/docs/DOCKER.md).

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)

### Architecture & Data Flow

```mermaid
flowchart LR
    Client["MCP Client<br/>(Claude Desktop, Cline, custom)"] -- "MCP protocol<br/>(tools/list, tools/call)" --> Server["zep-mcp-server<br/>(Go binary)"]
    Server -- "Zep Go SDK v3<br/>(HTTP)" --> Cloud["Zep Cloud<br/>Temporal Knowledge Graph"]
    Cloud -- "search, nodes,<br/>edges, messages" --> Server
    Server -- "structured<br/>JSON response" --> Client
```

The server holds no persistent state of its own: every call is forwarded to Zep Cloud through the Go SDK, and the response is shaped into an MCP tool result. Because all tools are read-only, there are no writes to coordinate; the only configuration the operator must provide is a Zep API key, typically loaded from a `.env` file via `godotenv`.

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md)
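On the wire, the `tools/call` step in the diagram is a JSON-RPC 2.0 request. The sketch below builds such a message in Python to show its shape; `tools_call_request` is a hypothetical helper, and the `query` argument name is a placeholder, since the actual parameters of `search_graph` are defined in `docs/TOOLS.md`.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC requests carry a unique id per connection


def tools_call_request(tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Placeholder argument name; consult docs/TOOLS.md for the real schema.
request = tools_call_request("search_graph", {"query": "preferred neighborhoods"})
```

An MCP client library normally builds and sends this message itself; the sketch only makes the request shape visible.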

---

## Claude Code Plugin: `building-with-zep`

### Purpose

The plugin is a Claude Code plugin "for building applications that use Zep — agent memory built on temporal Context Graphs." It bundles two distinct things: a model-invoked **skill** and an MCP server that proxies Zep's hosted documentation.

Source: [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

### Components

```
plugins/building-with-zep/
├── .claude-plugin/
│   └── plugin.json          # manifest; declares the zep-docs MCP server
├── skills/
│   └── building-with-zep/
│       ├── SKILL.md
│       └── references/      # concepts, apis, customization, getting-started, evaluation, governance
└── README.md
```

Source: [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

- **`building-with-zep` skill** — installed at `skills/building-with-zep/`. It teaches Claude what Zep is, its core concepts, the high-level vs low-level APIs, customization surfaces, and how to "start simple and benchmark." It is model-invoked, meaning Claude automatically consults it when working on Zep integration code.
- **`zep-docs` MCP server** — declared in the manifest `plugin.json`. It performs real-time search over Zep's public docs at `https://docs-mcp.getzep.com/mcp` using HTTP transport and requires **no API key** from the user.

Source: [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

### Installation

The repository's root `.claude-plugin/marketplace.json` defines a plugin marketplace named `zep`, so the install flow is:

```bash
# Add the marketplace (from this repo)
/plugin marketplace add getzep/zep

# Install the plugin
/plugin install building-with-zep@zep
```

For local development without going through the marketplace, the plugin can be loaded by pointing Claude Code directly at the directory:

```bash
claude --plugin-dir plugins/building-with-zep
```

Source: [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

### When to Use the Plugin vs the MCP Server

The two surfaces are complementary rather than overlapping. The `zep-docs` MCP server is best when the model needs fresh documentation lookups while writing code, while the `building-with-zep` skill is best for design decisions and high-level guidance. The standalone `zep-mcp-server` in `mcp/zep-mcp-server/` is the right choice when you need direct read-only access to **your** Zep Cloud data (users, threads, graph state) from any MCP-compatible client, not just Claude Code.

Source: [mcp/zep-mcp-server/README.md](https://github.com/getzep/zep/blob/main/mcp/zep-mcp-server/README.md) ; [plugins/building-with-zep/README.md](https://github.com/getzep/zep/blob/main/plugins/building-with-zep/README.md)

---

## See Also

- [Agent Framework Integrations](#page-2): CrewAI (including the v1.1.1 change for CrewAI 0.186.0+ compatibility), AutoGen agent memory and graph tools, and per-package releases tagged `zep-<framework>-<language>-v<version>`
- [Evaluation Harness & Ontology](#page-4): end-to-end evaluation framework
- External: [Zep Cloud Documentation](https://help.getzep.com/), [MCP Specification](https://modelcontextprotocol.io/)

---

<a id='page-4'></a>

## Evaluation Harness & Ontology

### Related Pages

Related topics: [Repository Overview & Examples Catalog](#page-1), [Zep MCP Server & Claude/Cursor Plugin](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)
- [zep-eval-harness/zep_evaluate.py](https://github.com/getzep/zep/blob/main/zep-eval-harness/zep_evaluate.py)
- [zep-eval-harness/zep_chunk_documents.py](https://github.com/getzep/zep/blob/main/zep-eval-harness/zep_chunk_documents.py)
- [zep-eval-harness/zep_ingest_documents.py](https://github.com/getzep/zep/blob/main/zep-eval-harness/zep_ingest_documents.py)
- [zep-eval-harness/zep_ingest_users.py](https://github.com/getzep/zep/blob/main/zep-eval-harness/zep_ingest_users.py)
- [zep-eval-harness/zep_graph_inspect.py](https://github.com/getzep/zep/blob/main/zep-eval-harness/zep_graph_inspect.py)
- [examples/typescript/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/typescript/chunking-example/README.md)
- [examples/python/user-summary-instructions-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md)
</details>

# Evaluation Harness & Ontology

## Purpose and Scope

The **Evaluation Harness** is a Python-based pipeline for benchmarking Zep's long-term memory layer against user-defined question/answer test cases. It ingests synthetic user profiles and conversations into Zep's user graphs, optionally builds a separate **document knowledge graph** with a configurable ontology, and then grades the answers an LLM generates from Zep-retrieved context, using a separate judge model. The result is a reproducible, LLM-judged quality score for memory-augmented assistants.

The harness ships alongside the rest of the repository, in [`zep-eval-harness/`](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md). It exposes a sequence of CLI scripts (`uv run zep_*.py`) and a structured `config/` tree that lets users tune ontology, custom instructions, chunking parameters, and evaluation prompts independently.

## High-Level Architecture

The harness is organized as five discrete pipeline stages, described below. The ingestion, chunking, and evaluation stages each write a timestamped run directory under `runs/` that includes a snapshot of the active config, so an evaluation remains reproducible even after the live config files change.

```mermaid
flowchart LR
  A[Discover Test Cases] --> B[Ingest Users]
  C[Chunk Documents] --> D[Ingest Documents]
  B --> E[Evaluate]
  D --> E
  E --> F[Inspect Graph]
  E --> G[results.json]
```

Each stage is independently invokable, supports `--resume` from checkpoints, and writes a copy of its active config (`*_config_snapshot/`) into the run directory (Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)).
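The run-directory convention can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the harness's code: `create_run_dir` is a hypothetical name, and both the numbering rule (highest existing `N` plus one) and the timestamp format are guesses.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def create_run_dir(runs_root: Path, config_dir: Path) -> Path:
    """Create runs_root/{N}_{timestamp}/ and copy the active config into it."""
    runs_root.mkdir(parents=True, exist_ok=True)
    # Assumed numbering: one more than the highest existing run number.
    numbers = [
        int(p.name.split("_", 1)[0])
        for p in runs_root.iterdir()
        if p.is_dir() and p.name.split("_", 1)[0].isdigit()
    ]
    next_number = max(numbers, default=0) + 1
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")  # format assumed
    run_dir = runs_root / f"{next_number}_{timestamp}"
    run_dir.mkdir()
    shutil.copytree(config_dir, run_dir / f"{config_dir.name}_snapshot")
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "evaluation_config"
    config.mkdir()
    (config / "constants.py").write_text("LLM_JUDGE_MODEL = 'placeholder'\n")
    run = create_run_dir(Path(tmp) / "runs" / "evaluations", config)
    # Editing the live config afterwards leaves the snapshot untouched.
    (config / "constants.py").write_text("LLM_JUDGE_MODEL = 'changed'\n")
    snapshot = (run / "evaluation_config_snapshot" / "constants.py").read_text()
```

Because the snapshot is a copy rather than a reference, the values a run was executed with can be read back from the run directory later.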

## Pipeline Stages

### 1. User Ingestion

`zep_ingest_users.py` reads `data/users.json`, generates random suffixes so that repeated runs create fresh users instead of colliding with earlier ones, and creates Zep users with full names and metadata. It can also enable a **custom ontology** (entity/edge types), **custom instructions** (extraction directives), and **user summary instructions** that bias the long-form user summary toward specific questions:

```bash
uv run zep_ingest_users.py --custom-ontology --custom-instructions --user-summary-instructions
```

A run is written to `runs/users/{N}_{timestamp}/manifest.json`, including the full `user_ingestion_config_snapshot/` (Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)).

### 2. Document Chunking

`zep_chunk_documents.py` splits source documents into chunks and (optionally) calls an LLM to generate a context prefix for each chunk, following Anthropic's **Contextual Retrieval** technique. Chunks are streamed into `chunks.jsonl` so a parallel ingestion script can tail the file. The stage is resumable via `--resume runs/chunk_sets/{N}_{timestamp}` (Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)).

The contextualized-retrieval pattern is also documented as a standalone example: each chunk is prepended with a brief description so the embedding model can disambiguate references such as "the policy" (Source: [examples/typescript/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/typescript/chunking-example/README.md)).
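Streaming chunks to a JSONL file that another process tails can be sketched as follows. The helper names `append_chunk` and `read_new_chunks` and the record fields are hypothetical; the harness's own file layout and follow logic may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


def append_chunk(path: Path, record: Dict[str, str]) -> None:
    """Append one chunk as one JSON line."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_new_chunks(path: Path, offset: int) -> Tuple[List[Dict[str, str]], int]:
    """Read complete lines written since `offset`; return them and the new offset."""
    records: List[Dict[str, str]] = []
    with path.open("rb") as handle:
        handle.seek(offset)
        while True:
            line = handle.readline()
            if not line.endswith(b"\n"):
                break  # nothing new, or a line that is still being written
            records.append(json.loads(line))
            offset = handle.tell()
    return records, offset


with tempfile.TemporaryDirectory() as tmp:
    chunks_file = Path(tmp) / "chunks.jsonl"
    append_chunk(chunks_file, {"id": "c1", "text": "first"})
    first_batch, position = read_new_chunks(chunks_file, 0)
    append_chunk(chunks_file, {"id": "c2", "text": "second"})
    second_batch, position = read_new_chunks(chunks_file, position)
```

The reader keeps a byte offset and only advances it past complete lines, so a record that is half-written when the reader polls is picked up on the next poll instead of being parsed early.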

### 3. Document Ingestion

`zep_ingest_documents.py` ingests chunks into a standalone Zep graph. It supports three modes:

| Mode | Trigger | Behavior |
|------|---------|----------|
| Reuse | `--chunk-set N` | Ingests chunks from a prior run |
| Follow | auto | Tails an in-progress chunk-set's JSONL |
| Inline | `--chunk-size N` | Chunks and ingests in one pass |

Both `--custom-ontology` and `--custom-instructions` flags are accepted, and ingestion writes to `runs/documents/{N}_{timestamp}/manifest.json` with a snapshot of the active ontology/instructions (Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)).
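How the three modes in the table might be selected from the command line is sketched below. Only the flag names come from the table; `select_ingestion_mode` is a hypothetical function, and treating `--chunk-set` and `--chunk-size` as mutually exclusive is an assumption.

```python
import argparse
from typing import List


def select_ingestion_mode(argv: List[str]) -> str:
    """Map CLI options to "reuse", "inline", or "follow"."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--chunk-set", type=int, help="ingest chunks from a prior run")
    parser.add_argument("--chunk-size", type=int, help="chunk and ingest in one pass")
    args = parser.parse_args(argv)

    if args.chunk_set is not None and args.chunk_size is not None:
        # Assumption: the two options are not meant to be combined.
        raise ValueError("--chunk-set and --chunk-size cannot be combined")
    if args.chunk_set is not None:
        return "reuse"
    if args.chunk_size is not None:
        return "inline"
    return "follow"  # no option given: tail an in-progress chunk set


reuse_mode = select_ingestion_mode(["--chunk-set", "3"])
inline_mode = select_ingestion_mode(["--chunk-size", "500"])
follow_mode = select_ingestion_mode([])
```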

### 4. Evaluation

`zep_evaluate.py` ties the pipeline together. For each test case it:

1. **Searches** the user graph and (optionally) the document graph.
2. **Generates** an answer with a configurable `LLM_RESPONSE_MODEL` using `get_response_system_prompt()`.
3. **Grades** the answer with a separate `LLM_JUDGE_MODEL`.

Concurrency is bounded by an asyncio semaphore (default `--concurrency 15`). The script accepts `--user-run N` and `--doc-run N` to combine prior user and document ingestion runs, and saves results to `runs/evaluations/{N}_{timestamp}/results.json` (Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)).
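The search, generate, grade loop with a semaphore bound can be sketched with stubbed steps. None of the function names below are taken from `zep_evaluate.py`; the three coroutines stand in for the Zep search, the response model, and the judge model, and the test-case fields are invented for the example.

```python
import asyncio
from typing import Dict, List


async def search(question: str) -> str:
    await asyncio.sleep(0)  # stand-in for the Zep graph search
    return f"context for: {question}"


async def generate(question: str, context: str) -> str:
    await asyncio.sleep(0)  # stand-in for the response model
    return f"answer using [{context}]"


async def grade(answer: str, expected: str) -> bool:
    await asyncio.sleep(0)  # stand-in for the judge model
    return expected in answer


async def evaluate_case(case: Dict[str, str], limiter: asyncio.Semaphore) -> bool:
    async with limiter:  # at most `concurrency` cases are in flight at once
        context = await search(case["question"])
        answer = await generate(case["question"], context)
        return await grade(answer, case["expected"])


async def evaluate_all(cases: List[Dict[str, str]], concurrency: int = 15) -> List[bool]:
    limiter = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(evaluate_case(c, limiter) for c in cases))


cases = [
    {"question": "budget?", "expected": "budget"},
    {"question": "move-in date?", "expected": "school district"},
]
verdicts = asyncio.run(evaluate_all(cases))
```

All cases are scheduled at once, but the semaphore limits how many run the three steps concurrently, which keeps request rates against the search and LLM backends bounded.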

### 5. Graph Inspection

`zep_graph_inspect.py` is an operator's tool — it accepts either a user ID or a `graph_id` (from a manifest) and prints the contents of the resulting graph for debugging ([Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)]).

## Configuration Layout

The `config/` tree is split by concern so a single change does not bleed across pipeline stages:

```
config/
├── user_ingestion_config/
│   ├── ontology.py                 # Document graph entity/edge types + set_document_custom_ontology()
│   └── custom_instructions.py      # Document graph custom instructions
├── document_chunking_config/
│   └── constants.py                # CHUNK_SIZE, CHUNK_OVERLAP, LLM_CONTEXTUALIZATION_MODEL
└── evaluation_config/
    ├── constants.py                # Search limits, LLM_RESPONSE_MODEL, LLM_JUDGE_MODEL
    └── response_prompt.py          # get_response_system_prompt() — system prompt for AI responses
```

The response prompt used during evaluation is defined in `config/evaluation_config/response_prompt.py` and can be customized independently from the evaluation logic itself ([Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)]).

### Ontology & Custom Instructions

The harness exposes two complementary levers for shaping the document graph:

- **Custom ontology** — declares entity types (e.g. `Concept`, `Topic`, `Process`) and edge types (e.g. `DESCRIBES`, `DEPENDS_ON`, `PART_OF`, `REFERENCES`, `IMPLEMENTS`) so the extraction LLM produces structured, typed nodes/edges.
- **Custom instructions** — named instruction sets (e.g. `real_estate_reference_domain`, `home_buying_process`, `financial_concepts`) that bias the extractor toward domain-specific vocabulary and relationships.

Both are stored as part of the run manifest so a later reproduction can replay extraction verbatim ([Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)]).

### User Summary Instructions

A related, but distinct, Zep feature is **User Summary Instructions**, which causes the long-form user summary to always answer a curated set of questions (budget, must-haves, location priorities in the real-estate example). The `examples/python/user-summary-instructions-example/` app demonstrates this with a Streamlit dashboard that compares agent behavior with and without the instructions ([Source: [examples/python/user-summary-instructions-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md)]).

## Resilience, Concurrency, and Run Tracking

All four pipeline scripts include retry logic with exponential backoff (up to **8 retries**, max **5-minute delay**) for handling rate limits and transient API errors. The concurrency knobs are:

| Script | Default | Flag |
|--------|---------|------|
| `zep_chunk_documents.py` | 5 | `--concurrency N` |
| `zep_ingest_users.py` / `zep_ingest_documents.py` | API-bound | (per-call) |
| `zep_evaluate.py` | 15 | `--concurrency N` |
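
A capped exponential backoff of the kind described above can be sketched like this. Only the limits (8 retries, 300-second ceiling) come from the text; the 1-second base delay, the doubling factor, and the names `backoff_delay` and `call_with_retries` are assumptions, and the harness may also add jitter or inspect specific error types.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2**attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 8,
    base: float = 1.0,
    cap: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `func`, retrying on failure up to `max_retries` times with capped backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt, base, cap))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; a production version would usually narrow `retry_on` to rate-limit and transient network errors.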

Every pipeline step writes a numbered, timestamped subdirectory under `runs/`. Each subdirectory carries a **config snapshot** — a copy of the active config files — guaranteeing that even if config files change later, the original run remains reproducible. Manifest files reference their parent user and document ingestion runs, forming a complete provenance chain from raw data to graded results ([Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)]).
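
The numbered, timestamped run directory with a config snapshot could be produced along these lines. The `{N}_{timestamp}` naming follows the text; the timestamp format, the `config_snapshot` directory name, the manifest keys, and the function names are assumptions made for this sketch.

```python
import json
import re
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Iterable, Optional


def next_run_number(stage_dir: Path) -> int:
    """Return one more than the highest `{N}_...` directory number under `stage_dir`."""
    numbers = []
    if stage_dir.is_dir():
        for child in stage_dir.iterdir():
            match = re.match(r"^(\d+)_", child.name)
            if child.is_dir() and match:
                numbers.append(int(match.group(1)))
    return max(numbers, default=0) + 1


def create_run(
    runs_root: Path,
    stage: str,
    config_files: Iterable[Path],
    parents: Optional[Dict[str, str]] = None,
) -> Path:
    """Create runs/<stage>/<N>_<timestamp>/ with copied config files and a manifest."""
    stage_dir = Path(runs_root) / stage
    number = next_run_number(stage_dir)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")  # assumed format
    run_dir = stage_dir / f"{number}_{stamp}"
    snapshot_dir = run_dir / "config_snapshot"  # assumed directory name
    snapshot_dir.mkdir(parents=True)
    for source in config_files:
        shutil.copy2(source, snapshot_dir / Path(source).name)
    manifest = {
        "run_number": number,
        "created_at": stamp,
        "parents": parents or {},  # e.g. the runs this one was built from
        "config_snapshot": sorted(p.name for p in snapshot_dir.iterdir()),
    }
    (run_dir / "manifest.json").write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return run_dir
```

Copying the files, rather than recording only their paths, is what keeps an old run reproducible after the live config has been edited.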

## Data Contracts

The harness auto-discovers data files by naming convention:

- **Users** — `data/users.json`, a JSON array of `{user_id, first_name, last_name, email, metadata}`.
- **Conversations** — `data/conversations/{user_id}_{conversation_id}.json`, with a `messages` array suitable for `memory.add()`.
- **Documents** — arbitrary paths resolved by the chunking script.
- **Test cases** — discovered by `zep_evaluate.py`, each producing a single graded result entry.

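User ingestion appends a random suffix to each user ID, so every run creates a fresh set of users that cannot collide with users left over from earlier runs. Re-running the script therefore adds new users instead of overwriting or duplicating existing ones ([Source: [zep-eval-harness/README.md](https://github.com/getzep/zep/blob/main/zep-eval-harness/README.md)]).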
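
Discovery by naming convention can be sketched with `pathlib` alone. The file layout follows the contracts listed above; the function names are hypothetical, and matching the longest known user ID first is a design choice of this sketch to handle user IDs that themselves contain underscores.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_users(data_dir: Path) -> List[dict]:
    """Read the JSON array of user records from data/users.json."""
    return json.loads((Path(data_dir) / "users.json").read_text(encoding="utf-8"))


def discover_conversations(data_dir: Path, user_ids: List[str]) -> Dict[str, List[Path]]:
    """Group conversations/{user_id}_{conversation_id}.json files by user ID."""
    found: Dict[str, List[Path]] = {user_id: [] for user_id in user_ids}
    # Try longer IDs first so that "alice_b" is preferred over "alice".
    by_length = sorted(user_ids, key=len, reverse=True)
    for path in sorted((Path(data_dir) / "conversations").glob("*.json")):
        for user_id in by_length:
            if path.stem.startswith(user_id + "_"):
                found[user_id].append(path)
                break  # files that match no known user are ignored
    return found
```

A typical call would be `discover_conversations(data_dir, [u["user_id"] for u in load_users(data_dir)])`.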

## Common Failure Modes

1. **Stale snapshots** — If you edit `config/ontology.py` after creating users, the snapshot in the user run will not match the new file. New user runs will use the updated config; old ones remain frozen.
2. **Rate limits during evaluation** — Default concurrency of 15 may exceed model quotas on large test sets. Lower `--concurrency` or rely on the built-in exponential backoff.
3. **Resume mismatch** — `--resume` paths must point at an existing run directory whose chunk set status matches the requested mode (`completed` for reuse, `in_progress` for follow).
4. **Ontology drift** — Changing entity/edge types between user and document runs can produce graphs the judge cannot compare fairly; keep ontology stable across the runs you intend to evaluate together.
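
Stale snapshots and ontology drift can both be detected by comparing a run's snapshot against the live config before combining runs. The helper below is a hypothetical check, not part of the harness; it assumes the snapshot is a flat directory of copied files whose names match the files in the live config directory.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def stale_snapshot_files(snapshot_dir: Path, config_dir: Path) -> List[str]:
    """List snapshot files whose live counterpart is missing or has different content."""
    stale = []
    for snap in sorted(Path(snapshot_dir).iterdir()):
        if not snap.is_file():
            continue  # this sketch only compares a flat snapshot layout
        live = Path(config_dir) / snap.name
        if not live.is_file() or file_digest(live) != file_digest(snap):
            stale.append(snap.name)
    return stale
```

An empty result means the run was produced with the config currently on disk; any listed file is a reason to re-ingest or to evaluate against the snapshot instead.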

## See Also

- [Zep Documentation — Retrieving Memory](https://help.getzep.com/retrieving-memory)
- [Zep Documentation — Customizing Context Blocks](https://help.getzep.com/cookbook/customize-your-context-block)
- [examples/python/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/chunking-example/README.md)
- [examples/go/chunking-example/README.md](https://github.com/getzep/zep/blob/main/examples/go/chunking-example/README.md)
- [examples/python/user-summary-instructions-example/README.md](https://github.com/getzep/zep/blob/main/examples/python/user-summary-instructions-example/README.md)
- [integrations/crewai/python/README.md](https://github.com/getzep/zep/blob/main/integrations/crewai/python/README.md)

---

## Pitfall Log

Project: getzep/zep

Summary: Found 6 structured pitfall items, none of them high-severity or blocking. Top priority: capability evidence risk, which requires verification.

## 1. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/getzep/zep

## 2. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/getzep/zep

## 3. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/getzep/zep

## 4. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/getzep/zep

## 5. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/getzep/zep

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/getzep/zep
