# https://github.com/neo4j-labs/agent-memory Project Manual

Generated at: 2026-06-17 05:20:52 UTC

## Table of Contents

- [Overview and System Architecture](#page-1)
- [SDK Usage and Memory Operations](#page-2)
- [Extraction, Enrichment, and Entity Resolution](#page-3)
- [MCP Server, Framework Integrations, CLI, and Deployment](#page-4)

<a id='page-1'></a>

## Overview and System Architecture

### Related Pages

Related topics: [SDK Usage and Memory Operations](#page-2), [Extraction, Enrichment, and Entity Resolution](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md)
- [examples/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md)
- [src/neo4j_agent_memory/nams/ontology.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py)
- [src/neo4j_agent_memory/integrations/openai_agents/memory.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/openai_agents/memory.py)
- [typescript/src/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/index.ts)
- [typescript/src/ontology/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/ontology/index.ts)
- [typescript/src/reasoning/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/reasoning/index.ts)
- [typescript/src/types.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/types.ts)
- [typescript/src/middleware/vercel-ai.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/middleware/vercel-ai.ts)
- [examples/full-stack-chat-agent/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/full-stack-chat-agent/README.md)
- [examples/financial-services-advisor/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/financial-services-advisor/README.md)
- [examples/microsoft_agent_retail_assistant/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/microsoft_agent_retail_assistant/README.md)
- [examples/domain-schemas/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/domain-schemas/README.md)

</details>

# Overview and System Architecture

## Purpose and Scope

`agent-memory` is a Neo4j Labs library that turns a Neo4j graph into long-term memory for LLM-based agents. It exposes a single async `MemoryClient` that abstracts over either a direct bolt connection to a Neo4j instance or the hosted **NAMS (Neo4j Agent Memory Service)** REST backend, so application code does not need to change when the storage tier is swapped. The project targets the v0.4.0 "hosted backend" release line and ships in both Python (`neo4j-agent-memory`) and TypeScript (`@neo4j-labs/agent-memory`) Source: [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

The library groups functionality into three memory tiers (short-term conversation history, long-term entities and preferences, and reasoning traces) and a small number of production-grade primitives (buffered writes, consolidation, evaluation). It also publishes a [Model Context Protocol (MCP) server](#) and pre-built integrations with the major agent frameworks.

## High-Level Architecture

```mermaid
flowchart TB
    subgraph Clients["Client SDK"]
        Py["Python MemoryClient<br/>src/neo4j_agent_memory"]
        Ts["TypeScript MemoryClient<br/>typescript/src"]
    end
    subgraph Integrations["Framework Integrations"]
        OpenAI["OpenAI Agents"]
        Pydantic["PydanticAI"]
        LangChain["LangChain"]
        Vercel["Vercel AI SDK Middleware"]
        Strands["Strands / Google ADK / CrewAI"]
    end
    subgraph Backends["Storage Backends"]
        Bolt["bolt:// Neo4j<br/>(self-hosted)"]
        NAMS["NAMS REST<br/>(hosted)"]
    end
    Py -->|"backend=bolt"| Bolt
    Py -->|"backend=nams"| NAMS
    Ts -->|"RestTransport"| NAMS
    OpenAI --> Py
    Pydantic --> Py
    LangChain --> Py
    Vercel --> Ts
    Strands --> Py
    NAMS -->|persists| Graph[("Neo4j<br/>knowledge graph")]
    Bolt --> Graph
```

The TypeScript `MemoryClient` is composed of typed sub-clients (`ShortTermMemory`, `LongTermMemory`, `ReasoningMemory`, `QueryConsole`, `AuthClient`, `OntologyClient`) that share a single `Transport` interface Source: [typescript/src/index.ts:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/index.ts). On the Python side the equivalent accessors hang off the same client, e.g. `client.ontology` for ontology management Source: [src/neo4j_agent_memory/nams/ontology.py:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py).

## Memory Tiers and Data Model

The library organizes persisted state into three tiers, surfaced through matching sub-clients:

| Tier | Sub-client | Purpose | Key operations |
|------|------------|---------|----------------|
| Short-term | `client.short_term` | Per-session/conversation message history | `add_message`, `get_conversation`, `search_messages`, `list_sessions` |
| Long-term | `client.long_term` | Entities, preferences, relationships | `add_entity` (returns `(entity, dedup_result)`), `search_entities`, `add_preference`, `merge_entities` |
| Reasoning | `client.reasoning` | Multi-step agent traces with tool calls | `record_step`, `record_tool_call`, `complete_trace`, `get_similar_traces` |

All long-term entities are typed against the **POLE+O** model (`PERSON`, `ORGANIZATION`, `LOCATION`, `EVENT`, `OBJECT`) plus extension entity types, and the TypeScript and Python clients both expose the ontology as POLE+O strings (e.g. `"PERSON"`) rather than an enum Source: [typescript/src/types.ts:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/types.ts) and Source: [examples/README.md:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md). Reasoning traces bridge the older "Silver tier" wrapper shape and the newer hosted-native flat shape, with steps owned directly by a conversation Source: [typescript/src/reasoning/index.ts:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/reasoning/index.ts).

Reasoning steps emit explicit `:TOUCHED` audit edges that point back to the entities they referenced, which is the basis for the `provenance` and `explain` views used to reconstruct why an entity was created.

## Backend Selection and the NAMS Release

The v0.4.0 release is the "hosted backend" release: applications write once against `MemoryClient` and pick a backend via configuration. `MemorySettings(backend="nams", ...)` routes traffic through the NAMS REST service, while the bolt default preserves the v0.3.x code path Source: [README.md:1-40](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

NAMS exposes several advanced surfaces that the bolt backend does not, including a typed, versioned, validated **ontology service**. The ontology surface is documented in the source as a snake_case sub-API with immutable revisions, workspace-owned ontologies, per-version `validation_mode` (`permissive` records non-conforming writes; `strict` rejects them), and ~28 system templates Source: [src/neo4j_agent_memory/nams/ontology.py:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py). The TypeScript client mirrors this API exactly, exposing typed models for `OntologyDocument`, `PropertyDef`, `EntityTypeDef`, and `RelationshipDef` Source: [typescript/src/ontology/index.ts:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/ontology/index.ts).

## Integration and Production Surfaces

The library is intentionally framework-agnostic. Framework integrations typically wrap the same three accessors and add a session strategy plus automatic persistence:

- **OpenAI Agents** — `memory.py` integrates as a session-scoped wrapper that exposes `get_context(...)`, `save_message(...)`, and exposes `extract_entities` / `generate_embedding` flags per save Source: [src/neo4j_agent_memory/integrations/openai_agents/memory.py:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/openai_agents/memory.py).
- **Vercel AI SDK** — `AgentMemoryMiddleware` implements the `LanguageModelV1Middleware` shape; it lazily creates a conversation from `conversationId`/`userId`, hydrates the system prompt with three-tier context, and persists both the user input and the assistant response by default Source: [typescript/src/middleware/vercel-ai.ts:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/middleware/vercel-ai.ts).
- **PydanticAI / Strands / Google ADK / CrewAI / Microsoft Agent Framework** — full reference apps ship under `examples/`, including a 299-episode Lenny podcast explorer Source: [examples/full-stack-chat-agent/README.md:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/examples/full-stack-chat-agent/README.md), a multi-agent KYC/AML financial advisor on AWS Strands and Google ADK Source: [examples/financial-services-advisor/README.md:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/examples/financial-services-advisor/README.md), and a Microsoft retail assistant with GDS algorithms and entity deduplication Source: [examples/microsoft_agent_retail_assistant/README.md:1-60](https://github.com/neo4j-labs/agent-memory/blob/main/examples/microsoft_agent_retail_assistant/README.md).

Production primitives are first-class rather than side-channel utilities: `client.schema.adopt_existing_graph(...)` layers the library over an existing graph, `user_identifier=` scopes writes to a tenant, `client.buffered.submit(...)` provides fire-and-forget writes, `client.consolidation.dedupe_entities(...)` exposes consolidation, and `client.eval.run(suite)` runs an eval harness Source: [README.md:1-40](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

## Domain Schemas and Extraction

For non-trivial corpora the project ships factory-style domain schemas (podcast transcripts, news, scientific papers, business reports, entertainment, medical, legal) that map onto POLE+O and add domain-specific labels, each tuned to a particular extraction pipeline (e.g. GLiREL for news, streaming extraction for scientific papers) Source: [examples/domain-schemas/README.md:1-90](https://github.com/neo4j-labs/agent-memory/blob/main/examples/domain-schemas/README.md). Models and providers are configured via `MemorySettings.embedding` and `MemorySettings.llm`, which accept either a provider-string shorthand (`"anthropic/claude-3-5-sonnet-latest"`) or a `Provider` instance, with a LiteLLM universal fallback for 100+ providers Source: [README.md:1-40](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

## Community Direction

Three recurring community themes map directly onto open or recently-closed work in the architecture: a **LangGraph middleware** that automates store/retrieve around LLM calls (issue #49) is conceptually adjacent to the Vercel AI SDK middleware that already exists; **memory decay in retrievers** via a temporal re-ranker (issue #42) is a candidate addition to `client.long_term.search_*` rather than the store path; and **public vs. private memory scoping** (issue #13) lines up with the existing `user_identifier=` tenancy knob plus the NAMS workspace model. The "Add CLI" request (issue #11) for batch ingestion, benchmark runs, and sample loading is still open.

## See Also

- [Quickstart and Installation](quickstart.md)
- [Memory Tiers Reference](memory-tiers.md)
- [NAMS Backend Configuration](nams-backend.md)
- [Ontology Service Guide](ontology.md)
- [Framework Integrations](framework-integrations.md)
- [Production Primitives](production-primitives.md)

---

<a id='page-2'></a>

## SDK Usage and Memory Operations

### Related Pages

Related topics: [Overview and System Architecture](#page-1), [MCP Server, Framework Integrations, CLI, and Deployment](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md)
- [examples/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md)
- [typescript/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/README.md)
- [typescript/src/middleware/vercel-ai.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/middleware/vercel-ai.ts)
- [typescript/src/integrations/langchain.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/integrations/langchain.ts)
- [typescript/src/reasoning/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/reasoning/index.ts)
- [typescript/src/ontology/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/ontology/index.ts)
- [src/neo4j_agent_memory/integrations/openai_agents/memory.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/openai_agents/memory.py)
- [src/neo4j_agent_memory/integrations/microsoft_agent/memory.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/microsoft_agent/memory.py)
- [src/neo4j_agent_memory/integrations/strands/__init__.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/strands/__init__.py)
- [src/neo4j_agent_memory/nams/ontology.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py)
- [examples/full-stack-chat-agent/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/full-stack-chat-agent/README.md)
- [examples/microsoft_agent_retail_assistant/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/microsoft_agent_retail_assistant/README.md)
</details>

# SDK Usage and Memory Operations

## Overview

The `agent-memory` project ships a polyglot SDK (Python and TypeScript) that turns Neo4j into the long-term store for AI agents. The headline surface is a single `MemoryClient` that exposes three memory tiers — short-term conversation, long-term entity/preference memory, and reasoning traces — plus an ontology layer, consolidation primitives, and a hosted backend (NAMS) selectable by configuration. Source: [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

The "v0.4.0 hosted backend" release, highlighted in the community context, makes the backend selection additive: callers instantiate `MemorySettings(backend="nams", ...)` for the REST service or leave the default to keep talking to a local Neo4j via Bolt — existing v0.3.x code keeps working unchanged.

## Memory Tiers and Their Operations

The SDK is organized around three accessors on `MemoryClient`:

| Tier | Purpose | Typical operations |
|------|---------|--------------------|
| `client.short_term` | Conversation history scoped to a session | append messages, fetch flat history, three-tier context |
| `client.long_term` | POLE+O entities, preferences, relationships | `add_entity`, search, dedup, adopt existing graphs |
| `client.reasoning` | Reusable reasoning traces and tool-call provenance | record steps, complete traces, get similar traces |

Short-term context is exposed as a helper that bundles reflections, observations, and recent messages into a system prompt. For example, the OpenAI Agents integration calls `client.get_context(query=..., session_id=..., include_short_term=..., include_long_term=..., include_reasoning=..., max_items=...)` so the agent can pull all three tiers in a single call. Source: [src/neo4j_agent_memory/integrations/openai_agents/memory.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/openai_agents/memory.py).

The reasoning tier also includes explain/provenance views that surface the trail behind any entity; bridge methods wrap traces while hosted-native methods flatten the model so steps belong directly to a conversation. Source: [typescript/src/reasoning/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/reasoning/index.ts).

A typical lifecycle — illustrated by the Full-Stack Chat Agent example — runs in three steps:

```mermaid
flowchart LR
    A[User message] --> B[short_term.add_message]
    B --> C[Entity extraction + add_entity]
    C --> D[long_term.search / dedup]
    D --> E[Agent response + tool calls]
    E --> F[reasoning.record_step]
    F --> G[Three-tier context on next turn]
```

Source: [examples/full-stack-chat-agent/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/full-stack-chat-agent/README.md).

## Client Lifecycle and Conventions

The Python examples enforce a small set of conventions that are worth memorising:

- **Async-only.** Every memory operation is a coroutine; scripts use `asyncio.run(...)`, notebooks prefix calls with `await`. Source: [examples/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md).
- **Context-managed client.** The recommended pattern is `async with MemoryClient(settings) as client:`, with `client.connect()` / `client.close()` as the manual alternative. There is no `initialize()` method.
- **`add_entity` returns a tuple.** Since v0.1.1, `await client.long_term.add_entity(...)` returns `(entity, dedup_result)`; callers can unpack to inspect dedup outcomes or discard with `_, _ = await ...`.
- **POLE+O types are strings.** Use `"PERSON"`, `"ORGANIZATION"`, `"LOCATION"`, `"EVENT"`, `"OBJECT"` rather than the legacy enum.

Configuration is centralised in `MemorySettings`, with a provider-string shorthand for embeddings and LLMs (`"anthropic/claude-3-5-sonnet-latest"`, `"BAAI/bge-small-en-v1.5"`) backed by native adapters for OpenAI, Anthropic, Bedrock, Vertex AI, and sentence-transformers, with LiteLLM as a universal fallback for 100+ providers. Source: [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).

## Framework Integrations and Middleware

Beyond direct `MemoryClient` use, the SDK ships adapters that hide the client behind a framework's idioms:

- **OpenAI Agents / Microsoft Agent / Google ADK** — expose a unified "memory" object combining a context provider and a chat history store. The Microsoft Agent variant (`Neo4jMicrosoftMemory.from_memory_client(...)`) hands back a `context_provider` ready to plug into `chat_client.as_agent(...)`. Source: [src/neo4j_agent_memory/integrations/microsoft_agent/memory.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/microsoft_agent/memory.py).
- **AWS Strands** — exposes `context_graph_tools(...)` that return `@tool`-decorated functions usable from a Strands `Agent`; `llm_provider_from_strands(model)` maps Bedrock-style identifiers (e.g. `anthropic.claude-sonnet-4-20250514-v1:0`) onto the `bedrock/` provider. Source: [src/neo4j_agent_memory/integrations/strands/__init__.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/integrations/strands/__init__.py).
- **Vercel AI SDK (TS)** — `agentMemoryMiddleware(client, options)` conforms to `LanguageModelV1Middleware`; it injects three-tier context, persists user input before generation, and writes back assistant responses and tool calls after generation. On the REST transport it lazily creates a conversation if `conversationId` is unset. Source: [typescript/src/middleware/vercel-ai.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/middleware/vercel-ai.ts).
- **LangChain JS (TS)** — `Neo4jChatMessageHistory` and `Neo4jEntityRetriever` are duck-typed against LangChain's interfaces so the module has no LangChain dependency at compile time. Source: [typescript/src/integrations/langchain.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/integrations/langchain.ts).

The community request to "Add LangGraph Middleware" (issue #49) and the "memory decay" re-ranker (issue #42) both push in the same direction as these built-in middlewares: automate storage and retrieval so the agent does not have to remember to call memory, and weight recall by recency.

## Ontologies, Production Features, and Failure Modes

The ontology surface (`client.ontology`) is a typed, versioned schema layer that extends POLE+O. NAMS exposes a snake-case sub-API (`GET /ontologies`, `POST /ontologies/{name}/clone`, `PUT /ontologies/{id}`, `POST /ontologies/active`, `DELETE /ontologies/{id}`) for cloning, revising, and activating domain schemas. Source: [src/neo4j_agent_memory/nams/ontology.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py), [typescript/src/ontology/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/ontology/index.ts).

Production features worth knowing:

- **Adopt existing graphs.** `client.schema.adopt_existing_graph(...)` layers the library over a graph you already operate in production.
- **Multi-tenancy.** Pass `user_identifier=` to scope entities and traces to a tenant.
- **Buffered (fire-and-forget) writes.** `client.buffered.submit(...)` decouples write latency from the request path. Source: [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md).
- **Consolidation.** `client.consolidation.dedupe_entities(...)` and an `:TOUCHED` audit edge let you reconstruct which reasoning step created or last visited an entity.
- **Eval harness.** `client.eval.run(suite)` runs benchmarks across schemas and models — also surfaced as a CLI use case in community issue #11.

Common failure modes:

1. Forgetting that operations are async and calling them synchronously.
2. Treating the old `EntityType` enum as authoritative — pass the string label instead.
3. On Vercel AI middleware, providing a `conversationId` function that throws — the middleware will not fall back to lazy creation on the Bolt/bridge transport (it raises `NotSupportedError`).
4. Not setting `OPENAI_API_KEY` (or whichever provider key the configured model requires) before running demos such as the Lenny Podcast import — the community issue #7 requests fail-fast behaviour, which the examples still need to be paired with explicit env validation.
5. Assuming the ontology endpoints behave the same across backends — the snake_case sub-API is empirically verified for NAMS and may not be present on Bolt.

## See Also

- [Architecture overview](https://neo4j.com/labs/agent-memory/explanation/graph-architecture)
- [Memory types concept guide](https://neo4j.com/labs/agent-memory/explanation/memory-types)
- [Provider migration guide](https://neo4j.com/labs/agent-memory/how-to/migrate-to-providers.html)
- [MCP tools reference](https://neo4j.com/labs/agent-memory/reference/mcp-tools)
- [TypeScript SDK landing page](https://neo4j.com/labs/agent-memory/sdks/typescript)

---

<a id='page-3'></a>

## Extraction, Enrichment, and Entity Resolution

### Related Pages

Related topics: [Overview and System Architecture](#page-1), [SDK Usage and Memory Operations](#page-2)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/neo4j-labs/agent-memory/blob/main/README.md)
- [examples/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md)
- [examples/domain-schemas/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/domain-schemas/README.md)
- [src/neo4j_agent_memory/llm/structured.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/llm/structured.py)
- [src/neo4j_agent_memory/nams/ontology.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/ontology.py)
- [src/neo4j_agent_memory/nams/__init__.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/__init__.py)
- [typescript/src/types.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/types.ts)
- [typescript/src/ontology/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/ontology/index.ts)
- [examples/microsoft_agent_retail_assistant/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/microsoft_agent_retail_assistant/README.md)
- [examples/full-stack-chat-agent/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/full-stack-chat-agent/README.md)
</details>

# Extraction, Enrichment, and Entity Resolution

## Overview

The `neo4j-agent-memory` library treats unstructured text (chat turns, transcripts, documents) as the raw input to a multi-stage pipeline that produces a deduplicated, type-validated knowledge graph. Three concerns drive that pipeline:

1. **Extraction** — turning prose into typed entity and relationship records.
2. **Enrichment** — augmenting those records with background knowledge from external sources.
3. **Entity Resolution** — collapsing duplicate mentions and harmonising them against a domain schema.

The README frames the library as supporting "multi-stage entity extraction (spaCy / GLiNER / LLM), relationship extraction (GLiREL), background enrichment (Wikipedia / Diffbot)" alongside production features such as `client.schema.adopt_existing_graph(...)`, multi-tenant `user_identifier=` scoping, and consolidation primitives like `client.consolidation.dedupe_entities(...)` (Source: [README.md]()). These concerns are deliberately orthogonal so that users can mix and match an extractor, enricher, and resolver that fit their cost/quality budget.

## Extraction Strategies

The library exposes a pluggable extractor hierarchy rooted at `src/neo4j_agent_memory/extraction/base.py` and instantiated by `factory.py`. Three first-party backends are referenced in the project documentation and examples:

| Backend | Strength | When to use |
|---|---|---|
| **LLM extractor** (`llm_extractor.py`) | High recall on long, ambiguous prose; honours custom ontologies | Default for chat agents and document corpora |
| **GLiNER extractor** (`gliner_extractor.py`) | Fast zero-shot span labelling; domain-schema driven | Batch ingestion where a typed label set is known up front |
| **spaCy extractor** (`spacy_extractor.py`) | Lightweight, dependency-free NER | Smoke tests, offline / CPU-only environments |

The LLM path goes through a shared "schema-aligned structured extraction with retry-on-validation-error" routine that converts a Pydantic `response_model` into a system prompt, calls the provider, tolerant-parses the response (stripping markdown fences, smart-quotes, trailing commas; finding the first balanced `{...}` block), validates it, and retries with the previous attempt plus its validation error appended as feedback. After `max_retries + 1` total attempts it raises `StructuredExtractionError` carrying every attempt for diagnosability (Source: [src/neo4j_agent_memory/llm/structured.py]()). Adapters with native structured output (OpenAI strict mode, Anthropic forced tool use) override `StructuredExtractor.complete_structured` and only fall back to this routine when the model lacks the native mode, so the retry path is explicitly positioned as a "safety net" (Source: [src/neo4j_agent_memory/llm/structured.py]()).

The `pipeline.py` module composes these extractors into a single ingest flow, and the LLM, GLiNER, and spaCy backends each implement the same interface so a deployment can swap providers via `MemorySettings(llm="anthropic/claude-3-5-sonnet-latest", embedding="BAAI/bge-small-en-v1.5")` without changing pipeline code (Source: [README.md]()).

### Domain Schemas for GLiNER

The `examples/domain-schemas/` directory ships eight ready-made schemas that demonstrate how to bias a zero-shot model toward a target vertical (Source: [examples/domain-schemas/README.md]()). They include:

- `POLEO` / `podcast_transcripts.py` — POLE+O investigations.
- `news` — journalism (person, organization, location, event, date).
- `scientific` — authors, institutions, methods, datasets, metrics, concepts, tools.
- `business` — companies, executives, products, industries, financial metrics.
- `entertainment` — actors, directors, films, TV shows, characters, awards.
- `medical` — diseases, drugs, symptoms, procedures, body parts, genes, organisms.
- `legal` — cases, contracts, regulatory filings.

Each schema is a `Factory pattern` description set handed to GLiNER2; the project emphasises that "domain-specific schemas significantly improve extraction accuracy compared to generic entity types" (Source: [examples/domain-schemas/README.md]()).

## Enrichment and the Ontology Surface

Extracted entities are validated against a typed, versioned **ontology** that extends POLE+O. The ontology surface is exposed both in the hosted NAMS backend and via the local bolt backend; it is mirrored between the Python client (`src/neo4j_agent_memory/nams/ontology.py`) and the TypeScript client (`typescript/src/ontology/index.ts`). The TypeScript header comment enumerates the wire surface:

```
GET    /ontologies                        → list (summaries)
GET    /ontologies/{id}                   → { record, versions[] }
GET    /ontologies/active                 → { ontology, version }
POST   /ontologies/{name}/clone           → version
POST   /ontologies        { ontology, validation_mode? } → version
PUT    /ontologies/{id}    { ontology, validation_mode? } → new revision
POST   /ontologies/active { version_id }                  → version
DELETE /ontologies/{id}                   → 204
```
(Source: [typescript/src/ontology/index.ts]())

The `OntologyDocument` carries `entity_types[]`, `relationships[]`, and a `domain`. Each `EntityTypeDef` is bound to a `poleType` (e.g. `PERSON`, `ORGANIZATION`, `LOCATION`, `EVENT`, `OBJECT`) and a list of typed `PropertyDef` entries (Source: [typescript/src/ontology/index.ts]()). Background enrichment (Wikipedia and Diffbot) populates these properties after extraction — the Lenny's Podcast demo, for example, uses Wikipedia-enriched entity cards as a first-class UI element (Source: [examples/full-stack-chat-agent/README.md]()).

The hosted service additionally exposes a `import_()` primitive that converts external formats (Arrows, Neo4j Data Importer, RDF, GraphQL, Cypher, LinkML, native JSON/YAML) into a non-persisted draft that can be activated with `create()`; URL fetches are SSRF-guarded and size-capped, and extraction-backed formats (e.g. `rdf`) are rate-limited per workspace (Source: [src/neo4j_agent_memory/nams/ontology.py]()).

## Entity Resolution and Consolidation

Once entities are extracted, the library performs two related tasks:

- **De-duplication of mentions** that refer to the same real-world thing (e.g. "Neo4j", "neo4j", "Neo4j, Inc.").
- **Schema validation** against the active ontology so that stray labels do not pollute the graph.

The Microsoft Retail Assistant example highlights this in production: the agent runs on top of the Microsoft Agent Framework and applies "GDS algorithms, entity deduplication, and context providers" over the ingested product catalog (Source: [examples/microsoft_agent_retail_assistant/README.md]()). The general-purpose API is `client.consolidation.dedupe_entities(...)`, listed among the production features of the library (Source: [README.md]()). Because `add_entity` returns `(entity, dedup_result)` since v0.1.1, callers can inspect the resolution outcome of every write (Source: [examples/README.md]()).

The active ontology is also the lever for multi-tenant scoping: every node carries the `user_identifier=` from the call site, and a single workspace can hold several ontology revisions. The `MigrateOptions` and `MigrationJob` types in the TypeScript client describe an asynchronous label-rename migration that runs in batches with a `dryRun` mode and a per-transaction node cap (Source: [typescript/src/ontology/index.ts]()). `OntologyDiff` provides a `{ added, removed, renamed, modified }` view between two revisions so resolutions and migrations are auditable (Source: [typescript/src/ontology/index.ts]()).

The TypeScript `Entity` type surfaces resolution metadata from the hosted service directly: `canonicalName`, `confidence` (0-1), and `sourceStage` (which extraction stage produced the entity), so downstream consumers can reason about how a given node was resolved (Source: [typescript/src/types.ts]()).

## See Also

- [Long-term Memory and Reasoning Traces](long-term-memory.md)
- [Memory Settings and Provider Configuration](memory-settings.md)
- [NAMS Backend and Hosted Service](nams-backend.md)
- [Framework Integrations](framework-integrations.md)
- Community: Issue #42 ("Support memory decay in memory search retrievers") discusses downstream re-ranking of resolved entities by recency — a natural follow-up to the resolution primitives described above.

---

<a id='page-4'></a>

## MCP Server, Framework Integrations, CLI, and Deployment

### Related Pages

Related topics: [Overview and System Architecture](#page-1), [SDK Usage and Memory Operations](#page-2)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/neo4j_agent_memory/cli/main.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/cli/main.py)
- [src/neo4j_agent_memory/mcp/server.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/mcp/server.py)
- [src/neo4j_agent_memory/mcp/_tools.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/mcp/_tools.py)
- [src/neo4j_agent_memory/mcp/_resources.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/mcp/_resources.py)
- [src/neo4j_agent_memory/mcp/_prompts.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/mcp/_prompts.py)
- [src/neo4j_agent_memory/mcp/_preference_detector.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/mcp/_preference_detector.py)
- [typescript/src/mcp/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/mcp/index.ts)
- [typescript/src/integrations/langchain.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/integrations/langchain.ts)
- [typescript/src/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/index.ts)
- [examples/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/README.md)
- [typescript/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/README.md)
- [examples/domain-schemas/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/domain-schemas/README.md)
- [examples/google_cloud_integration/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/google_cloud_integration/README.md)
- [src/neo4j_agent_memory/nams/__init__.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/__init__.py)
</details>

# MCP Server, Framework Integrations, CLI, and Deployment

## Overview

The `agent-memory` project ships a set of integration surfaces designed to make Neo4j-backed agent memory drop-in compatible with the broader LLM ecosystem. The four primary surfaces are:

1. A **Model Context Protocol (MCP) server** that exposes memory tools, resources, and prompts to MCP-compatible hosts such as Claude Desktop, Claude Code, and Cursor.
2. A **CLI** built on Click for batch ingestion, MCP server management, and operational tasks.
3. **Framework integrations** for LangChain, PydanticAI, Google ADK, AWS Strands, CrewAI, Vercel AI SDK, Mastra, and Microsoft Agent.
4. A **dual-backend deployment model** — local bolt-to-Neo4j or the hosted NAMS REST service — selectable at config time without code changes.

Community issue [#11](https://github.com/neo4j-labs/agent-memory/issues/11) explicitly requested CLI support for batch ingestion, benchmarks, sample loading, and installation instructions, all of which are reflected in the `neo4j-agent-memory` command group. Community issue [#49](https://github.com/neo4j-labs/agent-memory/issues/49) requested LangGraph middleware for automatic memory storage, addressed by the framework integration layer described below.

## MCP Server

The MCP server is defined by the `mcp` Click group in [src/neo4j_agent_memory/cli/main.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/cli/main.py) and implemented across `src/neo4j_agent_memory/mcp/server.py` plus its tool/resource/prompt modules. It exposes 16 tools in the extended profile and 6 in the core profile, including `memory_search`, `memory_get_context`, `memory_store_message`, `memory_add_entity`, `memory_add_preference`, and `memory_add_fact` (see [examples/google_cloud_integration/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/google_cloud_integration/README.md)).

The server is launched via:

```bash
neo4j-agent-memory mcp serve --password mypassword             # stdio (Claude Desktop)
neo4j-agent-memory mcp serve --transport sse --port 8080      # network SSE
neo4j-agent-memory mcp serve --profile core                    # slim toolset
```

Backend resolution is automatic: `--backend nams` (or presence of `MEMORY_API_KEY`) selects the hosted REST service; otherwise the local bolt driver is used. NAMS requires an API key; bolt requires a Neo4j password. The command fails fast with a red error message if the required credentials are missing, addressing the "fail fast when `OPENAI_API_KEY` is not set" feedback from issue [#7](https://github.com/neo4j-labs/agent-memory/issues/7).

The TypeScript counterpart at [typescript/src/mcp/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/mcp/index.ts) exports `createMemoryTools()` and `handleMemoryToolCall()` so that the same 12 standard tools can be registered against any MCP server or dispatched programmatically.

## Framework Integrations

The library provides first-class adapters for the most common agent frameworks. Each integration lives under `src/neo4j_agent_memory/integrations/` (Python) or `typescript/src/integrations/` (TypeScript) and is re-exported from [typescript/src/index.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/index.ts).

| Framework | Adapter shape | Example |
|-----------|---------------|---------|
| LangChain / LangChain JS | `BaseChatMessageHistory`, retriever | `Neo4jChatMessageHistory` ([typescript/src/integrations/langchain.ts](https://github.com/neo4j-labs/agent-memory/blob/main/typescript/src/integrations/langchain.ts)) |
| PydanticAI | Tool provider | `examples/lennys-memory/` |
| Google ADK | Tool provider | `examples/financial-services-advisor/google-cloud-financial-advisor/` |
| AWS Strands | Tool provider | `examples/financial-services-advisor/aws-financial-services-advisor/` |
| CrewAI | Memory backend | Listed in README |
| Vercel AI SDK | Middleware | `@neo4j-labs/agent-memory/middleware/vercel-ai` |
| Mastra | Integration | `examples/mastra` |
| Microsoft Agent | Backend | `examples/microsoft_agent_retail_assistant/` |

The LangChain JS integration is duck-typed against the LangChain interfaces so it carries no compile-time LangChain dependency. The Vercel AI SDK middleware is the spiritual sibling of the LangGraph middleware requested in issue [#49](https://github.com/neo4j-labs/agent-memory/issues/49) — both automate memory reads/writes around LLM calls.

## CLI

The Click-based CLI in [src/neo4j_agent_memory/cli/main.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/cli/main.py) provides operational commands beyond `mcp serve`. It manages extensions (registration, status), loads sample domain schemas, and supports batch ingestion use cases requested in issue [#11](https://github.com/neo4j-labs/agent-memory/issues/11). The `neo4j-agent-memory` entry point auto-resolves `NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, and `NEO4J_DATABASE` from environment variables and falls back to `bolt://localhost:7687` / `neo4j` / `neo4j`.

```bash
neo4j-agent-memory mcp serve --password mypassword
neo4j-agent-memory extensions list
neo4j-agent-memory load-schema examples/domain-schemas/podcast_transcripts.py
```

Domain-schema examples (podcasts, news, scientific papers, business reports, entertainment, medical, legal) are documented in [examples/domain-schemas/README.md](https://github.com/neo4j-labs/agent-memory/blob/main/examples/domain-schemas/README.md) and can be loaded either via the CLI or directly through the `client.schema` API.

## Deployment

v0.4.0 introduced the **NAMS (Neo4j Agent Memory Service) backend** as a purely additive change. A single line — `MemorySettings(backend="nams", ...)` versus `MemorySettings(backend="bolt", ...)` — switches the entire client between the local driver and the hosted REST service. Both the Python and TypeScript clients share the same `MemoryClient` API surface, so application code written against v0.3.x continues to work unchanged on bolt.

The NAMS package surface is exported from [src/neo4j_agent_memory/nams/__init__.py](https://github.com/neo4j-labs/agent-memory/blob/main/src/neo4j_agent_memory/nams/__init__.py) and includes ontology types, migration jobs, and an ontology REST transport. Operational recommendations:

- **Local / on-prem**: use `bolt` with `NEO4J_URI` and `NEO4J_PASSWORD`. Required for adoption of an existing production graph via `client.schema.adopt_existing_graph(...)`.
- **Hosted / managed**: use `nams` with `MEMORY_API_KEY`. Required for the ontology versioning, migration, and import features that are absent from the OpenAPI spec for the bolt path.
- **Hybrid**: keep `backend="bolt"` in development and switch to `nams` in production by overriding the env var or settings object.

For environments where the LLM is consumed via MCP, the `mcp serve` command is the deployment artifact: stdio for desktop hosts, SSE/HTTP for network hosts, with the profile flag tuning the tool surface to the host's context budget.

## See Also

- [TypeScript SDK overview](https://neo4j.com/labs/agent-memory/sdks/typescript)
- [MCP tools reference](https://neo4j.com/labs/agent-memory/reference/mcp-tools)
- [Concept: short-term vs long-term vs reasoning memory](https://neo4j.com/labs/agent-memory/explanation/memory-types)
- [Architecture overview](https://neo4j.com/labs/agent-memory/explanation/graph-architecture)
- [Provider migration guide](https://neo4j.com/labs/agent-memory/how-to/migrate-to-providers.html)

---

<!-- evidence_pipeline_checked: true -->
<!-- evidence_injected: true -->

---

## Pitfall Log

Project: neo4j-labs/agent-memory

Summary: Found 20 structured pitfall item(s), including 3 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/138

## 2. Installation risk - Installation risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/42

## 3. Maintenance risk - Maintenance risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/137

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/131

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/neo4j-labs/agent-memory

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/129

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/141

## 8. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/140

## 9. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/neo4j-labs/agent-memory

## 10. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/128

## 11. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/124

## 12. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/125

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/126

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/neo4j-labs/agent-memory

## 15. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/neo4j-labs/agent-memory

## 16. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/neo4j-labs/agent-memory

## 17. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/130

## 18. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/neo4j-labs/agent-memory/issues/127

## 19. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/neo4j-labs/agent-memory

## 20. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/neo4j-labs/agent-memory

<!-- canonical_name: neo4j-labs/agent-memory; human_manual_source: deepwiki_human_wiki -->