Audrey Manual Preview - Doramagic.ai

Doramagic Project Pack · Human Manual

Audrey

Related topics: System Architecture, Memory Model, Quick Start Guide

Audrey Overview

Audrey is a local-first memory firewall for AI agents. It provides a durable, SQLite-backed memory layer that enables AI agents to remember past mistakes, learned principles, and project-specific rules across sessions. Audrey acts as a continuity layer that sits under any local or sidecar agent loop, preventing agents from repeating the same mistakes and enabling smarter, more context-aware behavior.

Sources: README.md:1-10

What Problem Audrey Solves

AI agents typically suffer from "cold start" problems: they treat every new session as if they had never worked on the project before. They repeat broken commands, lose project-specific rules, miss contradictions, and forget the mistakes they made yesterday.

Audrey addresses this by implementing a closed feedback loop:

Record what happened during agent actions
Remember what mattered from those events
Check before new actions using stored memories
Return decisions (allow, warn, or block) with evidence
Validate whether the memory helped improve outcomes

Sources: README.md:25-40

Architecture Overview

Audrey is built with a layered architecture that separates concerns between memory storage, retrieval, governance, and agent integration.

graph TD
    subgraph Client Layer
        CLI[CLI Tool<br>npx audrey]
        PythonSDK[Python SDK<br>audrey_memory]
        MCPServer[MCP Server]
    end
    
    subgraph Integration Layer
        Hooks[Claude Code Hooks<br>PreToolUse/PostToolUse]
        MCPConfig[MCP Config<br>Codex, VSCode, etc.]
    end
    
    subgraph Core Engine
        Guard[Audrey Guard<br>Memory-before-action]
        Routes[REST API<br>/v1/*]
    end
    
    subgraph Memory Layer
        SQLite[(SQLite<br>WAL Mode)]
        Episodic[Episodic<br>Memory]
        Semantic[Semantic<br>Memory]
        Procedural[Procedural<br>Memory]
    end
    
    subgraph Embedding
        ONNX[ONNX Runtime<br>Local Embeddings]
        Providers[Cloud Providers<br>OpenAI, Anthropic]
    end
    
    CLI --> Routes
    PythonSDK --> Routes
    MCPServer --> Routes
    Hooks --> Guard
    MCPConfig --> MCPServer
    Guard --> Routes
    Routes --> SQLite
    SQLite --> Episodic
    SQLite --> Semantic
    SQLite --> Procedural
    Routes --> ONNX
    Routes --> Providers

Sources: README.md:1-50

Core Components

Memory Types

Audrey manages three distinct types of memory, each serving a different purpose:

Memory Type	Purpose	Examples
Episodic	Records specific events and outcomes	"Deploy failed at 3:42 PM with OOM error"
Semantic	Stores learned facts, principles, and rules	"Stripe rate limits are 100 req/s"
Procedural	Captures how-to knowledge and workflows	"To deploy, run `npm run deploy` after `npm test`"

Each memory type can be tagged, sourced, and validated independently. Memories gain salience through usage: those that repeatedly prove helpful become more prominent, while unused memories decay over time.

Sources: README.md:40-55

Audrey Guard

Audrey Guard is the core decision-making component that checks memories before agent actions execute. It implements a preflight check that returns structured decisions:

graph LR
    Action[Agent Action<br>tool + parameters] --> Guard
    Guard --> Recall[Recall Relevant<br>Memories]
    Recall --> Decision{Decision}
    Decision -->|No issues| ALLOW[allow]
    Decision -->|Potential risk| WARN[warn<br>+ evidence]
    Decision -->|Dangerous| BLOCK[block<br>+ reason]
    Decision -->|Uncertain| QUERY[Query<br>Human]

The Guard returns a decision with supporting evidence, allowing the agent to make informed choices. When set to strict mode, warnings are treated as blocks.

Sources: README.md:25-35
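The strict-mode rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Audrey's implementation; the helper name `effective_decision` and its `strict` parameter are hypothetical.

```python
from typing import Literal

Decision = Literal["allow", "warn", "block"]


def effective_decision(decision: Decision, strict: bool = False) -> Decision:
    """Return the decision an agent should act on.

    Hypothetical helper: in strict mode a warning is escalated to a block;
    every other decision passes through unchanged.
    """
    if strict and decision == "warn":
        return "block"
    return decision


if __name__ == "__main__":
    for raw in ("allow", "warn", "block"):
        print(raw, "->", effective_decision(raw, strict=True))
```

Keeping the escalation in one small function means the caller only ever branches on the final decision, whichever mode is active.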

Memory Capsule

The Memory Capsule is a structured response format that bundles contextual information for agent preflight checks:

Section	Description
`recent_changes`	Memories created/modified within the recent-change window
`must_follow`	Critical rules tagged as mandatory
`procedures`	Step-by-step workflows relevant to the query
`user_preferences`	Explicitly stated user preferences
`risks`	Warnings and risk indicators
`uncertain_or_disputed`	Low-confidence or contested memories

Sources: src/capsule.ts:1-50
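To make the capsule shape concrete, here is a minimal Python sketch that groups entries into the six sections listed above while respecting a character budget. The function `build_capsule`, the `(section, text)` input shape, and the skip-on-overflow policy are assumptions made for illustration; the actual builder in src/capsule.ts may work differently.

```python
# Section names come from the table above; everything else is illustrative.
SECTIONS = (
    "recent_changes",
    "must_follow",
    "procedures",
    "user_preferences",
    "risks",
    "uncertain_or_disputed",
)


def build_capsule(entries, budget_chars=4000):
    """Group (section, text) pairs into a capsule without exceeding the budget.

    `entries` is assumed to be ordered by priority already. An entry that
    would push the running total past `budget_chars` is skipped.
    """
    capsule = {name: [] for name in SECTIONS}
    used = 0
    for section, text in entries:
        if section not in capsule:
            raise KeyError(f"unknown capsule section: {section}")
        if used + len(text) > budget_chars:
            continue  # over budget: skip this entry, later shorter ones may still fit
        capsule[section].append(text)
        used += len(text)
    return capsule
```

Every section key is always present in the result, so a consumer can read `capsule["risks"]` without checking for a missing key.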

Impact Tracking

Audrey tracks the effectiveness of its memories through a closed validation loop:

graph TD
    Action[Action with Memory] --> Outcome{Outcome}
    Outcome -->|helpful| Boost[Boost salience<br>+usage_count]
    Outcome -->|used| Maintain[Maintain salience]
    Outcome -->|wrong| Challenge[Challenge memory<br>Decrease salience]
    Boost --> Consolidation[Consolidation<br>Dream cycle]
    Maintain --> Consolidation
    Challenge --> Consolidation
    Consolidation --> Principles[New Semantic<br>Principles]

Outcome types:

helpful: The memory contributed to a successful outcome
used: The memory was consulted but didn't directly contribute
wrong: The memory led to an incorrect decision

Sources: src/impact.ts:1-60
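The three outcome types can be read as a small state update on each memory. The sketch below assumes a salience value between 0 and 1 and fixed step sizes (`BOOST`, `PENALTY`); those numbers and the `Memory` class are hypothetical and are not taken from src/impact.ts.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    salience: float = 0.5  # assumed to live on a 0..1 scale
    usage_count: int = 0


# Assumed step sizes, chosen only for illustration.
BOOST = 0.1
PENALTY = 0.2


def apply_outcome(memory: Memory, outcome: str) -> Memory:
    """Update a memory in place according to a validation outcome."""
    if outcome == "helpful":
        memory.salience = min(1.0, memory.salience + BOOST)
        memory.usage_count += 1
    elif outcome == "used":
        pass  # consulted but not decisive: salience is maintained
    elif outcome == "wrong":
        memory.salience = max(0.0, memory.salience - PENALTY)
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return memory
```

Clamping to the 0..1 range keeps repeated feedback from pushing salience out of bounds.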

Installation Methods

Audrey supports multiple installation patterns depending on your use case.

CLI Installation

For direct terminal usage:

npx audrey doctor          # Verify setup
npx audrey demo --scenario repeated-failure  # Run demo
npx audrey guard --tool Bash "npm run deploy"  # Check before action

Sources: README.md:55-65

MCP Server Integration

For integration with agents like Codex, Claude Desktop, Cursor, and VS Code:

# Generate MCP configuration
npx audrey mcp-config codex
npx audrey mcp-config generic
npx audrey mcp-config vscode

Sources: mcp-server/index.ts:1-40

Claude Code Hooks

For Claude Code, install directly and configure memory-before-action hooks:

npx audrey install
claude mcp list

# Apply hooks to project or user scope
npx audrey hook-config claude-code --apply --scope project  # Project-local
npx audrey hook-config claude-code --apply --scope user     # User-wide

The generated hooks include:

PreToolUse: Runs audrey guard --hook --fail-on-warn
PostToolUse: Records successful tool executions
PostToolUseFailure: Records failed tool executions

Sources: README.md:67-85

Python SDK

For Python-based agent integrations:

pip install audrey-memory

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

# Encode new memories
memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)

# Close connection
brain.close()

Sources: python/README.md:1-50

REST API Reference

The Audrey REST API exposes core memory operations via HTTP.

Endpoints Overview

Method	Endpoint	Description
`GET`	`/health`	Server health check
`POST`	`/v1/encode`	Store a new memory
`POST`	`/v1/recall`	Retrieve memories by semantic similarity
`POST`	`/v1/preflight`	Memory-before-action check
`POST`	`/v1/validate`	Submit outcome feedback
`POST`	`/v1/impact`	Get impact statistics

Sources: src/routes.ts:1-80

Core API Operations

#### Encode Memory

interface EncodeRequest {
  content: string;           // The memory content
  memory_type: 'episodic' | 'semantic' | 'procedural';
  source: string;            // e.g., "direct-observation", "told-by-user"
  tags?: string[];
  private?: boolean;        // Agent-only memory
  wait_for_consolidation?: boolean;
}

Sources: src/routes.ts:80-120

#### Recall Memories

interface RecallRequest {
  query: string;
  limit?: number;           // Default: 5
  budget_chars?: number;    // Context budget
  retrieval?: 'hybrid' | 'vector';  // Default: hybrid
  mood?: {                  // Optional affect configuration
    min_valence?: number;
    min_arousal?: number;
  };
}

#### Preflight Check

interface PreflightRequest {
  tool: string;             // e.g., "Bash", "Write"
  action: string;           // The specific action/command
  session_id?: string;
  cwd?: string;
  include_capsule?: boolean;
  include_preflight?: boolean;
  record_event?: boolean;
}

#### Validate Outcome

interface ValidateRequest {
  receipt_id: string;       // From preflight response
  outcome: 'helpful' | 'used' | 'wrong';
  evidence_feedback?: Record<string, 'helpful' | 'used' | 'wrong'>;
  metadata?: Record<string, unknown>;
}

Sources: src/routes.ts:120-200
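For callers that do not use an SDK, the request interfaces above map directly onto JSON bodies. The following Python sketch builds a preflight request using only the standard library. It assumes the `Authorization: Bearer` header shown in this manual's curl examples; the helper name `build_preflight_request` is hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_preflight_request(base_url, api_key, tool, action, **options):
    """Build (but do not send) a POST /v1/preflight request.

    `options` may carry the optional PreflightRequest fields such as
    session_id, cwd, or include_capsule.
    """
    payload = {"tool": tool, "action": action, **options}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/preflight",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_preflight_request(
    "http://127.0.0.1:7437", "secret", "Bash", "npm run deploy",
    include_capsule=True,
)
# To send it against a running sidecar:
#   with urllib.request.urlopen(req) as resp:
#       decision = json.load(resp)
```

Separating request construction from sending makes the payload easy to inspect or log before it reaches the server.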

Configuration

Environment Variables

Variable	Default	Purpose
`AUDREY_DATA_DIR`	`~/.audrey`	SQLite data directory (set per tenant/agent)
`AUDREY_EMBEDDING_PROVIDER`	`onnx`	Embedding provider: `onnx`, `openai`, `anthropic`, `google`
`AUDREY_LLM_PROVIDER`	`openai`	LLM provider for consolidation
`AUDREY_MODEL`	varies	Specific model to use
`AUDREY_HOST`	`127.0.0.1`	REST sidecar bind address
`AUDREY_PORT`	`7437`	REST sidecar port
`AUDREY_API_KEY`	unset	Bearer token for non-loopback access
`AUDREY_ALLOW_NO_AUTH`	`0`	Allow non-loopback without API key (not recommended)
`AUDREY_ENABLE_ADMIN_TOOLS`	`0`	Enable export/import/forget routes
`AUDREY_DEBUG`	`0`	Enable debug logging
`AUDREY_PROFILE`	`0`	Emit per-stage diagnostics
`AUDREY_DISABLE_WARMUP`	`0`	Skip embedding warmup at boot
`AUDREY_CONTEXT_BUDGET_CHARS`	`4000`	Default capsule character budget

Sources: README.md:150-180

Data Isolation

SQLite uses WAL mode without an advisory lock, so two processes sharing a directory will contend on writes. Isolation is a hard requirement for multi-agent setups.

Important: Set a distinct AUDREY_DATA_DIR per tenant, agent identity, or concurrent host to avoid write contention.

Sources: README.md:55-60
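One way to enforce this isolation is to derive the data directory from the agent identity before launching each process. The sketch below is an assumption about how a launcher could do that; the subdirectory layout and the helper names `data_dir_for` and `env_for` are hypothetical. Only the `AUDREY_DATA_DIR` variable comes from the documentation.

```python
import os
import re
from pathlib import Path


def data_dir_for(agent: str, root: Path) -> Path:
    """Return a distinct data directory for one agent (hypothetical layout).

    Characters outside [A-Za-z0-9._-] are replaced with "_". Note that two
    different names can collapse to the same directory after sanitizing,
    so real deployments should use identifiers that are already safe.
    """
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", agent)
    if not re.search(r"[A-Za-z0-9]", safe):
        raise ValueError("agent name must contain at least one letter or digit")
    return root / safe


def env_for(agent: str, root: Path) -> dict:
    """Copy the current environment and point AUDREY_DATA_DIR at the agent's directory."""
    env = dict(os.environ)
    env["AUDREY_DATA_DIR"] = str(data_dir_for(agent, root))
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.Popen` so that each process writes to its own SQLite directory.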

Security

Redaction

Audrey automatically redacts sensitive information from stored memories and logs:

Class	Patterns
`api_key`	`api_key`, `apiKey`, `API_KEY` patterns
`password`	`password`, `passwd`, `pwd`
`token`	`token`, `bearer_token`, `access_token`, `jwt`
`secret`	`secret`, `client_secret`, `private_key`

The redaction system walks JSON structures recursively and applies pattern matching to both keys and values.

Sources: src/redact.ts:1-60
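A recursive walk of this kind can be sketched as follows. The key patterns are taken from the table above, but the regular expressions, the `[REDACTED]` placeholder, and the bearer-token value pattern are assumptions for illustration and do not reproduce src/redact.ts.

```python
import re

# Key names from the table above, folded into one case-insensitive pattern.
SENSITIVE_KEY = re.compile(
    r"api[_-]?key|password|passwd|pwd|token|jwt|secret|private[_-]?key",
    re.IGNORECASE,
)
# Illustrative value pattern: strings such as "Bearer abc.def".
BEARER_VALUE = re.compile(r"\bbearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE)


def redact(value, placeholder="[REDACTED]"):
    """Return a redacted copy of a JSON-like structure without mutating the input."""
    if isinstance(value, dict):
        result = {}
        for key, item in value.items():
            if isinstance(key, str) and SENSITIVE_KEY.search(key):
                result[key] = placeholder  # sensitive key: drop the whole value
            else:
                result[key] = redact(item, placeholder)
        return result
    if isinstance(value, list):
        return [redact(item, placeholder) for item in value]
    if isinstance(value, str):
        return BEARER_VALUE.sub(placeholder, value)  # scrub secrets inside free text
    return value
```

Matching on keys catches structured secrets, while the value pattern catches credentials that were pasted into free-form text such as command output.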

Access Control

HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks
audrey serve binds to 127.0.0.1 by default (previously 0.0.0.0)
Non-loopback bind requires AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
Private memories have ACL enforcement at the recall endpoint
sanitizeRecallOptions() allowlists HTTP body parameters to prevent option injection

Sources: CHANGELOG.md:1-30
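The reason for a timing-safe comparison is that an ordinary string comparison can return sooner when the first differing byte appears early, which may leak information about the expected key. Audrey's server uses Node's crypto.timingSafeEqual; the Python sketch below shows the same idea with the standard library's `hmac.compare_digest`. The function name `api_key_matches` is hypothetical.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two API keys in a way that resists timing analysis.

    hmac.compare_digest is the Python counterpart of Node's
    crypto.timingSafeEqual; both avoid short-circuiting on the first
    mismatching byte.
    """
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Encoding both values to bytes first lets the comparison accept keys that contain non-ASCII characters.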

Production Readiness

Release Gates

npm run release:gate           # Full release checklist
npm run python:release:check   # Python package verification
npm run bench:guard:card       # Guard performance benchmarks
npm run bench:guard:validate   # Guard accuracy validation
npx audrey doctor              # Runtime health check
npx audrey status --json --fail-on-unhealthy

Production Checklist

Set one AUDREY_DATA_DIR per tenant, environment, or isolation boundary
Pin AUDREY_EMBEDDING_PROVIDER and AUDREY_LLM_PROVIDER explicitly
Back up the SQLite data directory before provider or dimension changes
Keep API keys and raw credentials out of encoded memory content
Use AUDREY_API_KEY if the REST sidecar is reachable beyond local process boundary
Run audrey dream on a schedule for consolidation and decay
Add application-level encryption, retention, access control, and audit logging for regulated environments

Sources: README.md:100-125

Memory Lifecycle

graph TD
    Event[Agent Event<br>Action/Failure] --> Encode[Encode Memory<br>episodic]
    Encode --> Salience[Initial Salience<br>from confidence]
    Salience --> Usage[Usage Cycle]
    
    Usage --> Preflight[Preflight Check]
    Preflight --> Decision[Guard Decision]
    Decision --> Action[Execute Action]
    Action --> Outcome{Outcome}
    
    Outcome -->|Success| Boost[Boost Salience<br/>usage_count++]
    Outcome -->|Partial| Maintain[Maintain]
    Outcome -->|Failure| Challenge[Challenge<br/>decay confidence]
    
    Boost --> Consolidation{Dream Cycle}
    Maintain --> Consolidation
    Challenge --> Consolidation
    
    Consolidation --> Principle[Extract Principle<br/>semantic memory]
    Consolidation --> Decay[Apply Decay<br/>unused memories]
    
    Principle --> NewMemory[New Semantic<br/>Memory]
    Decay --> Prune[Prune Very Low<br/>salience memories]

Dream Cycle

The memory_dream operation consolidates episodes into principles and applies decay:

Consolidation: Groups related episodic memories into higher-level semantic principles
Decay: Reduces salience of memories that haven't been used recently
Challenge: Flags memories that led to wrong outcomes for review

Sources: README.md:40-50
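The decay and pruning steps can be illustrated with a simple exponential model. The half-life, the pruning threshold, and the `Memory` fields below are assumptions chosen for the example; the documentation does not state which decay function Audrey uses.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    salience: float
    days_since_last_use: float


def decayed_salience(salience: float, days_idle: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: salience halves for every `half_life_days` of disuse."""
    return salience * 0.5 ** (days_idle / half_life_days)


def dream_decay(memories, prune_below=0.05, half_life_days=30.0):
    """Apply decay and drop memories whose salience falls under the threshold."""
    kept = []
    for m in memories:
        s = decayed_salience(m.salience, m.days_since_last_use, half_life_days)
        if s >= prune_below:
            kept.append(Memory(m.content, s, m.days_since_last_use))
    return kept
```

A memory used today keeps its salience, while one that has been idle for three half-lives retains one eighth of it and may fall below the pruning threshold.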

Supported Integrations

Agent/IDE	Integration Method	Features
Claude Code	Hooks + MCP	Full memory-guard loop
Codex	MCP Config	Memory recall
Claude Desktop	MCP	Memory access
Cursor	MCP Config	Memory recall
Windsurf	MCP Config	Memory recall
VS Code	MCP Config	Memory recall
JetBrains	MCP Config	Memory recall
Ollama	Generic MCP	Memory recall
Custom Agents	REST API / Python SDK	Full integration

Sources: README.md:10-20

Key Files Reference

File	Purpose
`src/audrey.ts`	Core Audrey class with memory operations
`src/routes.ts`	REST API route handlers
`src/capsule.ts`	Memory capsule builder
`src/impact.ts`	Impact tracking and validation
`src/redact.ts`	Sensitive data redaction
`src/rules-compiler.ts`	Rule file generation from memories
`mcp-server/index.ts`	MCP server and CLI commands
`python/`	Python SDK implementation

Sources: src/audrey.ts, src/routes.ts, src/capsule.ts, src/impact.ts, src/redact.ts, src/rules-compiler.ts, mcp-server/index.ts, python/README.md

Quick Start Guide

Overview

Audrey is a local-first memory firewall for AI agents that provides a durable memory layer they can check before executing tools. This guide covers installation, configuration, and basic usage patterns across all supported surfaces: CLI, REST API, JavaScript SDK, and Python client.

Prerequisites

Requirement	Version/Details
Node.js	v18+ recommended
npm	v8+
Python	3.9+ (for Python SDK)
SQLite	Built-in (bundled)
Docker	Optional for containerized deployment

Installation

CLI Installation

npm install -g audrey

Verify installation:

audrey --version

Python SDK Installation

pip install audrey-memory

Sources: python/README.md:1-20

Quick Setup

1. Run the Health Check

audrey doctor

This command validates the installation and checks for any configuration issues.

Sources: mcp-server/index.ts:85-120

2. Start the REST API Server

npx audrey serve

By default, the server binds to 127.0.0.1:7437. Configure using environment variables:

Environment Variable	Default	Description
`AUDREY_PORT`	`7437`	REST API port
`AUDREY_HOST`	`127.0.0.1`	REST sidecar bind address
`AUDREY_API_KEY`	unset	Bearer token for non-loopback traffic
`AUDREY_DATA_DIR`	`~/.audrey`	Data directory path

Sources: README.md:1-50

3. Install MCP Configuration

For Claude Code integration:

audrey install --host claude-code

Generate MCP config without applying:

audrey mcp-config --host claude-code --dry-run

Apply project hooks:

audrey hook-config claude-code --apply --scope project

Apply user hooks:

audrey hook-config claude-code --apply --scope user

Sources: mcp-server/index.ts:40-75

Core Usage Patterns

Memory Guard Workflow

The guard workflow is the primary safety loop: it records events, checks memory before each action, and returns a decision:

graph TD
    A[Agent Action] --> B[audrey guard --tool Bash npm run deploy]
    B --> C{Memory Check}
    C -->|allow| D[Execute Action]
    C -->|warn| E[Execute with Warning]
    C -->|block| F[Block Action]
    D --> G[Record Outcome]
    E --> G
    F --> G
    G --> H[Memory Consolidation]

Execute the guard command:

audrey guard --tool Bash "npm run deploy"

Parameters:

Parameter	Description
`--tool`	Tool name (Bash, Write, Read, etc.)
`--session-id`	Session identifier
`--files`	File paths involved
`--strict`	Fail on warning

Sources: mcp-server/index.ts:150-200

Encoding Memories

#### Via REST API

curl -X POST http://127.0.0.1:7437/v1/encode \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Stripe returns HTTP 429 above 100 req/s",
    "source": "direct-observation",
    "tags": ["stripe", "rate-limit"]
  }'

#### Via Python SDK

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

Sources: python/README.md:25-45

Recalling Memories

#### Via REST API

curl -X POST http://127.0.0.1:7437/v1/recall \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "stripe rate limits",
    "limit": 5,
    "retrieval": "hybrid"
  }'

#### Via Python SDK

results = brain.recall("stripe rate limits", limit=5)

#### Available Recall Options

Option	Type	Default	Description
`query`	string	required	Search query
`limit`	number	10	Maximum results
`budget_chars`	number	4000	Context budget in characters
`retrieval`	string	"hybrid"	"hybrid" or "vector" mode
`include_private`	boolean	false	Include private memories
`agent`	string	-	Filter by agent name

Sources: src/routes.ts:1-50

Getting Memory Capsule

The capsule endpoint returns a turn-sized memory packet containing relevant context:

curl -X POST http://127.0.0.1:7437/v1/capsule \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "current deploy status",
    "budget_chars": 4000
  }'

The capsule contains sections:

recent_changes - Memories from recent window
must_follow - Critical rules
procedures - Step-by-step memories
user_preferences - User-stated preferences
risks - Warnings and risks
uncertain_or_disputed - Low-confidence items

Sources: src/capsule.ts:1-60

Check Health

curl http://127.0.0.1:7437/v1/status

The Python SDK also offers an async client:

import asyncio
from audrey_memory import AsyncAudrey

async def main():
    async with AsyncAudrey(base_url="http://127.0.0.1:7437", api_key="secret") as brain:
        health = await brain.health()
        print(health)

asyncio.run(main())

Sources: python/README.md:50-70

Advanced CLI Commands

Dream - Consolidate Memory

audrey dream

Triggers the consolidation and decay cycle.

Reembed - Rebuild Vector Indices

audrey reembed

Re-embeds stored memories and rebuilds the vector indices, for example after a schema, embedding provider, or dimension change.

Observe Tool

Record tool execution results:

audrey observe-tool --tool Bash --input '{"command": "npm test"}' --output '{"exitCode": 0}'

Impact Report

Generate memory impact analysis:

audrey impact --window-days 30 --limit 10

The report includes:

Memory counts by type (episodic, semantic, procedural)
Validated memories count
Outcome breakdown (helpful, wrong, used)
Top used memories
Weakest memories by salience
Recent activity

Sources: src/impact.ts:1-80

Promote - Extract Rules

Promote memory candidates to reviewable Markdown files:

audrey promote --yes

Rules are saved to .claude/rules/ with YAML front matter for traceability.

Check Status

audrey status

Displays:

Current mood (valence, arousal)
Memory counts
Learned principles
Recent memories
Unresolved threads

Sources: mcp-server/index.ts:200-280

MCP Server Tools

Audrey exposes 20+ tools over the MCP stdio protocol, including:

Tool	Purpose
`memory_encode`	Record new memories
`memory_recall`	Retrieve relevant memories
`memory_capsule`	Get turn-sized context packet
`preflight_check`	Validate before action
`record_outcome`	Record action results
`promote_memory`	Convert to persistent rule
`impact_report`	Analyze memory effectiveness

Resources include:

status - System health
recent - Recent memories
principles - Semantic memories

Prompts include:

briefing - Current context
memory-recall - Search memories
memory-reflection - Self-analysis

Sources: README.md:60-90

Configuration Reference

Environment Variables

Variable	Default	Description
`AUDREY_PORT`	`7437`	REST API port
`AUDREY_HOST`	`127.0.0.1`	Bind address
`AUDREY_API_KEY`	unset	Bearer token
`AUDREY_DATA_DIR`	`~/.audrey`	Data directory
`AUDREY_ENABLE_ADMIN_TOOLS`	`0`	Enable export/import routes
`AUDREY_PROMOTE_ROOTS`	unset	Extra write roots
`AUDREY_DEBUG`	`0`	Enable debug logging
`AUDREY_PROFILE`	`0`	Emit per-stage timings
`AUDREY_DISABLE_WARMUP`	`0`	Skip embedding warmup
`AUDREY_CONTEXT_BUDGET_CHARS`	`4000`	Default capsule budget

Sources: README.md:40-55

Privacy Controls

By default, private memories are ACL-protected:

include_private: true is restricted in HTTP API
confidenceConfig overrides are blocked via sanitizeRecallOptions()

For full control, use the SDK directly or enable admin tools:

AUDREY_ENABLE_ADMIN_TOOLS=1 audrey serve

Sources: CHANGELOG.md:1-30
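The allowlisting idea behind sanitizeRecallOptions() can be shown with a short Python sketch: anything not explicitly permitted is dropped before the options reach the recall logic. The allowlist contents below are an assumption based on the RecallRequest fields in this manual, and the function is not a port of the TypeScript original.

```python
# Assumed allowlist, based on the documented RecallRequest fields.
ALLOWED_RECALL_OPTIONS = frozenset({"query", "limit", "budget_chars", "retrieval", "mood"})


def sanitize_recall_options(body: dict) -> dict:
    """Keep only allowlisted keys from an untrusted HTTP body.

    Keys such as include_private or confidenceConfig never reach the
    recall layer because they are simply not on the list.
    """
    return {key: value for key, value in body.items() if key in ALLOWED_RECALL_OPTIONS}
```

An allowlist fails closed: a newly introduced internal option stays unreachable over HTTP until someone deliberately adds it.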

Deployment Options

Docker

docker run -p 7437:7437 \
  -e AUDREY_API_KEY=secret \
  -v audrey-data:/root/.audrey \
  audrey:latest

Docker Compose

Use the provided docker-compose.yml for persistent deployments with volume mounts.

Host-Specific Setup

Generate platform-specific MCP configurations:

audrey mcp-config --host claude-code
audrey mcp-config --host cursor
audrey mcp-config --host windsurf

Sources: README.md:80-100

Next Steps

Review audrey doctor output for any warnings
Configure AUDREY_API_KEY for production deployments
Set up MCP integration for your preferred IDE/agent
Explore memory types: episodic, semantic, procedural
Enable impact tracking to measure memory effectiveness

For detailed API documentation, see the REST API endpoints at /v1/* when the server is running.

Sources: [python/README.md:1-20](https://github.com/Evilander/Audrey/blob/main/python/README.md)

System Architecture

Related topics: Audrey Overview, Memory Model, Data Storage, MCP Server, REST API

Overview

Audrey is a local-first memory runtime designed to give AI agents persistent, queryable memory across sessions. It operates as a stateful infrastructure layer that records observations, consolidates principles, and provides memory-before-action checks through multiple interfaces.

The system is built around a closed-loop safety architecture where every tool action can be validated against stored memory before execution, returning allow, warn, or block decisions with supporting evidence.

Core Design Principles

Principle	Description
Local-first	All data stored in local SQLite; no external database required
Agent-agnostic	Works with Codex, Claude Code, Cursor, Windsurf, VS Code, JetBrains, Ollama, and custom agents
Safety loop	Pre-action validation through Audrey Guard before tool execution
Isolation per tenant	One `AUDREY_DATA_DIR` per tenant/agent/isolation boundary
Privacy-by-default	Audrey Guard redacts tool traces; private memories are protected by ACL enforcement

High-Level Component Architecture

graph TD
    subgraph "Agent Host"
        A[AI Agent<br/>Claude Code / Codex / Cursor]
    end
    
    subgraph "Audrey Runtime"
        CLI[CLI<br/>doctor, demo, guard,<br/>install, status, dream]
        MCP[MCP Server<br/>stdio interface]
        REST[REST API<br/>Hono server :7437]
        SDK[JS SDK<br/>TypeScript/Node]
        PY[Python SDK<br/>audrey-memory]
    end
    
    subgraph "Core Engine"
        AUD[Audrey Core<br/>encode, recall, consolidate,<br/>validate, impact]
        MEM[(Memory Store<br/>SQLite + sqlite-vec)]
        EMB[Embedding Engine<br/>ONNX runtime]
        CAUSAL[Causal Validation<br/>confidence scoring]
    end
    
    A <--> MCP
    A <--> REST
    CLI --> MCP
    CLI --> REST
    SDK --> REST
    PY --> REST
    AUD <--> MEM
    AUD <--> EMB
    AUD <--> CAUSAL

Component Specifications

MCP Server (`mcp-server/index.ts`)

The MCP stdio server provides 20+ tools, three resources (status, recent, principles), and three prompts (briefing, recall, reflection).

Interface Type	Count	Purpose
Tools	20+	encode, recall, capsule, guard, promote, impact, dream, reembed, observe-tool
Resources	3	status, recent, principles
Prompts	3	briefing, recall, reflection

The server processes CLI arguments before entering stdio mode so that it can handle --help, --version, and subcommands such as install, mcp-config, and hook-config.

Sources: mcp-server/index.ts:1-100

REST API (`src/routes.ts`)

Hono-based HTTP server exposing the following endpoints:

Endpoint	Method	Purpose
`/health`	GET	Health check
`/v1/encode`	POST	Store memory with source, tags, salience
`/v1/recall`	POST	Retrieve relevant context
`/v1/capsule`	POST	Get turn-sized memory packet
`/v1/status`	GET	Runtime status
`/v1/observe`	POST	Record tool outcome
`/v1/validate`	POST	Validate memory usefulness

Security: HTTP recall/capsule routes use sanitizeRecallOptions() to prevent private-memory ACL bypass via caller-supplied options. API key comparison uses crypto.timingSafeEqual to prevent timing attacks.

Sources: src/routes.ts:1-80

Audrey Core (`src/audrey.ts`)

The central engine handling memory operations:

graph LR
    ENC[encode] --> VEC[Vector Embedding]
    ENC --> DB[(SQLite)]
    REC[recall] --> VEC
    REC --> DB
    VEC --> EMB[Embedding Engine]
    DB --> CAUSAL[Causal Validation]
    CAUSAL --> CONF[Confidence Scoring]

Key operations:

encode(): Stores episodic, semantic, or procedural memory with vector embedding
recall(): Retrieves memories using hybrid (vector + FTS) search
consolidate(): Extracts principles from repeated evidence
decay(): Reduces authority of stale, low-confidence memories
beforeAction(): Guard check returning allow/warn/block
afterAction(): Records tool execution outcomes
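The documentation says that recall() combines vector and full-text (FTS) search but does not say how the two result lists are merged. One widely used technique for this is reciprocal rank fusion (RRF), sketched below purely as an illustration; it is not confirmed to be what Audrey does, and the function name and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in, with
    ranks starting at 1. Ties are broken by id so the order is stable.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


vector_hits = ["m1", "m2", "m3"]  # ids ordered by vector similarity
fts_hits = ["m2", "m4", "m1"]     # ids ordered by full-text relevance
merged = reciprocal_rank_fusion([vector_hits, fts_hits])
```

RRF needs only ranks, not raw scores, so it sidesteps the problem of comparing cosine similarities with full-text relevance values.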

Storage Layer (`src/db.ts`)

SQLite with sqlite-vec extension for vector search.

Feature	Configuration
Mode	WAL (Write-Ahead Logging)
Concurrency	No advisory lock; single writer per `AUDREY_DATA_DIR`
Indexing	sqlite-vec for vector similarity; FTS for full-text
Isolation	One directory per tenant required

The AUDREY_PRAGMA_DEFAULTS environment variable (default 1) applies custom PRAGMA tuning. Set it to 0 to revert to the better-sqlite3 defaults.

Embedding Engine (`src/embedding.ts`)

By default, embeddings are computed locally with ONNX Runtime, so no external API calls are required.

Feature	Behavior
Warmup	Background embedding warmup at MCP boot (skippable with `AUDREY_DISABLE_WARMUP=1`)
Cold-start	First encode: ~525ms cold, ~28ms warm
Verbosity	`AUDREY_ONNX_VERBOSE=1` restores EP-assignment warnings
Reuse	Validation, interference, affect resonance reuse main content vector

Performance targets (v0.22.0):

Encode p50: 15.2ms (40% faster than prior)
Hybrid recall p50: 14.3ms (2.1x faster)
Embedding reuse eliminated 3 of 4 redundant calls

Memory Model Architecture

graph TD
    subgraph "Memory Types"
        EPI[Episodic<br/>Specific observations,<br/>tool results, facts]
        SEM[Semantic<br/>Consolidated principles<br/>from evidence]
        PROC[Procedural<br/>Remembered ways to act,<br/>avoid, retry, verify]
    end
    
    subgraph "Memory Properties"
        AFF[Affect & Salience<br/>Emotional weight, importance]
        DEC[Interference & Decay<br/>Stale/conflicting lose authority]
        CON[Contradiction Handling<br/>Competing claims tracked]
    end
    
    EPI --> AFF
    SEM --> AFF
    PROC --> AFF
    EPI --> DEC
    SEM --> CON
    CON --> DEC

Memory Types

Type	Description	Example
Episodic	Specific observations, tool results, session facts	"Stripe returns HTTP 429 above 100 req/s"
Semantic	Consolidated principles from repeated evidence	"Always check rate limits before batch operations"
Procedural	Remembered ways to act, avoid, retry, verify	"Retry with exponential backoff on network failures"

Capsule Generation (`src/capsule.ts`)

The capsule endpoint assembles a turn-sized memory packet with sections:

graph TD
    CAP[POST /v1/capsule] --> SEC[Section Assigner]
    SEC --> R[recent_changes<br/>Created/reinforced recently]
    SEC --> M[must_follow<br/>Critical rules]
    SEC --> P[procedures<br/>Procedural memories]
    SEC --> U[user_preferences<br/>Stated or tagged preferences]
    SEC --> RK[risks<br/>Warnings and recent failures]
    SEC --> UN[uncertain_or_disputed<br/>Disputed or low-confidence]

Each section includes a reason field explaining why the entry was included. Recent tool failures (last 7 days) are automatically added to risks when includeRisks is enabled.
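The seven-day rule for tool failures amounts to a time-window filter. The sketch below assumes each event is a dict with `tool`, `ok`, and `at` keys; that shape and the function name `recent_failures` are hypothetical, and only the seven-day window comes from the text above.

```python
from datetime import datetime, timedelta, timezone

RISK_WINDOW = timedelta(days=7)  # window stated in the text above


def recent_failures(events, now, window=RISK_WINDOW):
    """Return failed tool events that happened within `window` before `now`.

    Each event is assumed to be a dict with "tool" (str), "ok" (bool) and
    "at" (timezone-aware datetime) keys.
    """
    cutoff = now - window
    return [e for e in events if not e["ok"] and cutoff <= e["at"] <= now]


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
events = [
    {"tool": "Bash", "ok": False, "at": datetime(2025, 1, 10, tzinfo=timezone.utc)},
    {"tool": "Write", "ok": False, "at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"tool": "Bash", "ok": True, "at": datetime(2025, 1, 14, tzinfo=timezone.utc)},
]
risky = recent_failures(events, now)
```

Passing `now` explicitly instead of reading the clock inside the function keeps the filter deterministic and easy to test.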

Audrey Guard Safety Loop

sequenceDiagram
    participant Agent
    participant Guard as Audrey Guard
    participant Memory as Memory Store
    participant LLM as LLM Provider
    
    Agent->>Guard: tool + action
    Guard->>Memory: recall(relevant)
    Memory-->>Guard: context entries
    Guard->>LLM: preflight check
    LLM-->>Guard: decision + evidence
    Guard-->>Agent: allow/warn/block + reasoning
    Agent->>Guard: outcome (success/failure/wrong)
    Guard->>Memory: record outcome

Guard Modes

Mode	Behavior
`allow`	Action proceeds normally
`warn`	Action allowed but user notified
`block`	Action prevented with evidence
`caution`	Maps to `warn` display

CLI usage:

audrey guard --tool Bash "npm run deploy"
audrey guard --hook --fail-on-warn  # For hook integration

Validation Pipeline

The causal validation system (via src/causal.ts and src/validate.ts) evaluates whether stored memories actually helped:

Confidence scoring uses the reinforcement formula from confidence.ts
Evidence tracking updates usage_count and last_used_at
Outcome classification: used, helpful, wrong
Impact metrics aggregated by memory type
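The last step, aggregating impact by memory type, can be sketched as a simple counting pass. The input shape (pairs of memory type and outcome) and the function name `aggregate_impact` are assumptions for illustration.

```python
from collections import Counter, defaultdict

OUTCOMES = {"helpful", "used", "wrong"}


def aggregate_impact(validations):
    """Count validation outcomes per memory type.

    `validations` is an iterable of (memory_type, outcome) pairs, for
    example ("semantic", "helpful").
    """
    report = defaultdict(Counter)
    for memory_type, outcome in validations:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        report[memory_type][outcome] += 1
    # Convert to plain dicts so the result serializes cleanly to JSON.
    return {memory_type: dict(counts) for memory_type, counts in report.items()}
```

For example, two helpful semantic validations and one wrong episodic validation produce `{"semantic": {"helpful": 2}, "episodic": {"wrong": 1}}`.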

Deployment Architecture

graph LR
    subgraph "Deployment Options"
        NPM[npm package<br/>npx audrey]
        DOCKER[Docker<br/>audrey-runtime]
        COMPOSE[Docker Compose<br/>Full stack]
        HOST[MCP Config<br/>Host-specific]
    end
    
    subgraph "Environment"
        ENV1[AUDREY_DATA_DIR]
        ENV2[AUDREY_LLM_PROVIDER]
        ENV3[AUDREY_EMBEDDING_PROVIDER]
        ENV4[AUDREY_API_KEY]
    end
    
    NPM --> ENV1
    DOCKER --> ENV1
    COMPOSE --> ENV1
    HOST --> ENV2
    HOST --> ENV3

Interface Options by Agent

Agent	Integration
Claude Code	`npx audrey install --host claude-code` + hook-config
Claude Desktop	MCP config via `npx audrey mcp-config generic`
Codex	MCP config via `npx audrey mcp-config codex`
Cursor	MCP config
Windsurf	MCP config
VS Code	MCP config
JetBrains	MCP config
Ollama	MCP config
Custom	REST API or JS/Python SDK

REST Sidecar Security

Configuration	Bind Address	Auth Required
Default	`127.0.0.1:7437`	No (loopback)
Production	`0.0.0.0:7437`	`AUDREY_API_KEY` required
Unsafe override	Any host	`AUDREY_ALLOW_NO_AUTH=1` (not recommended)

AUDREY_HOST env var explicitly opts in to network exposure.
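The table above boils down to one startup rule: a loopback bind is always allowed, and any other bind needs either an API key or the explicit override. The Python sketch below models that rule; the helper names are hypothetical and the real check lives in Audrey's TypeScript server.

```python
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """Return True for loopback addresses such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unrecognized hostnames are treated as non-loopback


def may_start(host: str, api_key: Optional[str], allow_no_auth: bool = False) -> bool:
    """Startup rule: non-loopback binds need an API key or an explicit override."""
    return is_loopback(host) or bool(api_key) or allow_no_auth
```

Treating unknown hostnames as non-loopback errs on the side of requiring authentication.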

CLI Architecture

graph TD
    CLI[audrey CLI] --> PARSE[Argument Parser]
    PARSE --> KNOWN[Known Subcommands]
    KNOWN --> INSTALL[install]
    KNOWN --> UNINSTALL[uninstall]
    KNOWN --> MCP[mcp-config]
    KNOWN --> HOOK[hook-config]
    KNOWN --> DOCTOR[doctor]
    KNOWN --> DEMO[demo]
    KNOWN --> GUARD[guard]
    KNOWN --> DREAM[dream]
    KNOWN --> REEMBED[reembed]
    KNOWN --> PROMOTE[promote]
    KNOWN --> IMPACT[impact]
    KNOWN --> UNKNOWN[Unknown/No subcommand]
    
    UNKNOWN --> TTY{Human TTY?}
    TTY -->|Yes| HELP[Print help]
    TTY -->|No| MCP_SERVER[Start MCP server]
    
    INSTALL --> HOST[Host-specific config]
    HOOK --> APPLY[Apply hooks to settings]
    PROMOTE --> WRITES[Write to project files]

Key CLI Commands

Command	Purpose
`audrey doctor`	Diagnose configuration issues
`audrey status`	Show runtime health
`audrey demo`	Run interactive demonstration
`audrey guard`	Check action against memory
`audrey install`	Register Audrey with host
`audrey mcp-config`	Generate MCP server configuration
`audrey hook-config`	Generate agent hook configuration
`audrey dream`	Trigger consolidation and decay
`audrey reembed`	Re-embed all memories
`audrey promote`	Write memories to project rules
`audrey impact`	Show memory effectiveness report

Configuration Environment Variables

Variable	Default	Purpose
`AUDREY_DATA_DIR`	System temp	Memory storage directory
`AUDREY_HOST`	`127.0.0.1`	REST sidecar bind address
`AUDREY_PORT`	`7437`	REST sidecar port
`AUDREY_API_KEY`	unset	Bearer token for non-loopback
`AUDREY_LLM_PROVIDER`	Configured	LLM for causal/validation
`AUDREY_EMBEDDING_PROVIDER`	Configured	Embedding generation
`AUDREY_EMBEDDING_MODEL`	Configured	Model name for embeddings
`AUDREY_EMBEDDING_DIM`	Configured	Vector dimensions
`AUDREY_CONTEXT_BUDGET_CHARS`	`4000`	Capsule character budget
`AUDREY_DISABLE_WARMUP`	`0`	Skip embedding warmup
`AUDREY_DEBUG`	`0`	Enable MCP debug logs
`AUDREY_PROFILE`	`0`	Emit per-stage timings
`AUDREY_PROMOTE_ROOTS`	unset	Allowed write roots for promote
`AUDREY_ENABLE_ADMIN_TOOLS`	`0`	Enable export/import/forget
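
As a rough illustration of how the defaults in this table might be applied, the following sketch reads a handful of the variables from a plain key/value map. `readSettings`, `SidecarSettings`, and the helper names are hypothetical. The comma-separated format for `AUDREY_PROMOTE_ROOTS` and the rule that `"1"` enables a flag are assumptions, not documented behavior.

```typescript
// Illustrative only: a reader for a few variables from the table above.
type EnvMap = Record<string, string | undefined>;

interface SidecarSettings {
  host: string;
  port: number;
  contextBudgetChars: number;
  disableWarmup: boolean;
  debug: boolean;
  promoteRoots: string[];
}

// Flags in the table default to "0"; this sketch assumes "1" enables them.
function flag(env: EnvMap, name: string): boolean {
  return env[name] === "1";
}

// Parse a positive integer, falling back to the documented default.
function intOr(env: EnvMap, name: string, fallback: number): number {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function readSettings(env: EnvMap): SidecarSettings {
  return {
    host: env["AUDREY_HOST"] ?? "127.0.0.1",
    port: intOr(env, "AUDREY_PORT", 7437),
    contextBudgetChars: intOr(env, "AUDREY_CONTEXT_BUDGET_CHARS", 4000),
    disableWarmup: flag(env, "AUDREY_DISABLE_WARMUP"),
    debug: flag(env, "AUDREY_DEBUG"),
    // Assumption: roots are comma-separated; the real separator may differ.
    promoteRoots: (env["AUDREY_PROMOTE_ROOTS"] ?? "")
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  };
}
```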

SDK Architecture

JavaScript SDK

Import directly from the audrey package in TypeScript or Node:

import Audrey from 'audrey';

const brain = new Audrey({ 
  baseUrl: 'http://127.0.0.1:7437',
  agent: 'support-agent'
});

await brain.encode('Deploy failed due to OOM', { 
  source: 'direct-observation' 
});

const results = await brain.recall('deploy failure', { limit: 5 });

Python SDK (`audrey-memory`)

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

An asynchronous client, AsyncAudrey, is available for asyncio code.

Release Readiness Gates

graph LR
    GATE[release:gate] --> CI[CI Workflow]
    GATE --> PY[python:release:check]
    GATE --> BENCH[bench:guard:*]
    GATE --> DOCTOR[audrey doctor]
    GATE --> DEMO[audrey demo]

Check	Purpose
`npm run release:gate`	Full release readiness checklist
`npm run python:release:check`	Python artifact verification
`npm run bench:guard:card`	Guard benchmark suite
`npm run bench:guard:validate`	Validation benchmarks
`npx audrey doctor`	Runtime diagnostics
`npx audrey demo`	Functional verification

Data Flow: Encode to Recall

sequenceDiagram
    participant Client
    participant API as REST API<br/>/v1/encode
    participant Audrey as Audrey Core
    participant Embed as Embedding Engine
    participant DB as SQLite + vec
    
    Client->>API: POST /v1/encode<br/>content, source, tags
    API->>Audrey: encode(content, options)
    Audrey->>Embed: generateEmbedding(content)
    Embed-->>Audrey: vector[1536]
    Audrey->>DB: INSERT memory + vector
    DB-->>Audrey: memory_id
    Audrey-->>API: { id, confidence, ... }
    API-->>Client: { id, ... }

sequenceDiagram
    participant Client
    participant API as REST API<br/>/v1/recall
    participant Audrey as Audrey Core
    participant Embed as Embedding Engine
    participant DB as SQLite + vec
    participant Causal as Causal Validator
    
    Client->>API: POST /v1/recall<br/>query, limit, scope
    API->>Audrey: recall(query, options)
    Audrey->>Embed: generateEmbedding(query)
    Embed-->>Audrey: query_vector
    Audrey->>DB: hybrid search<br/>vector_similarity + FTS
    DB-->>Audrey: [entries]
    Audrey->>Causal: score(entries)
    Causal-->>Audrey: [scored_entries]
    Audrey-->>API: { results, ... }
    API-->>Client: { results, ... }

Key Architectural Decisions

Decision	Rationale
Local-only storage	Eliminates dependency on external services; ensures data isolation
SQLite + sqlite-vec	Proven reliability, no separate vector DB required
WAL mode without advisory lock	Performance for single-process; isolation required for multi-agent
Separate `AUDREY_DATA_DIR` per tenant	Hard isolation boundary; prevents cross-tenant contamination
REST sidecar defaulting to loopback	Security by default; non-loopback requires explicit opt-in
Embedding warmup at boot	Eliminates cold-start penalty (~18.7x improvement)
Closed-loop validation	Closed feedback loop lifts autopilot ALIVE dimension

Source: https://github.com/Evilander/Audrey / Human Manual

Memory Model

Related topics: System Architecture, Core Memory Operations, Preflight and Reflexes

Audrey's Memory Model is a cognitive-inspired system that provides AI agents with persistent, evolving memory capabilities. Unlike simple vector databases, it implements a multi-layered memory architecture that mirrors human memory structures—episodic, semantic, and procedural—while incorporating affect, salience, and decay mechanisms to ensure memories remain relevant and actionable.

Architecture Overview

The Memory Model consists of several interconnected subsystems that work together to store, retrieve, consolidate, and forget information over time.

graph TD
    A[User/Agent Input] --> B[Episodic Memory]
    B --> C[Consolidation Process]
    C --> D[Semantic Memory]
    C --> E[Procedural Memory]
    D --> F[Confidence Scoring]
    E --> F
    B --> G[Affect Module]
    F --> G
    G --> H[Salience Calculation]
    H --> I[Recall Ranking]
    I --> J[Preflight Check]
    J --> K[Guard Decision]
    L[Interference] -.->F
    L -.->I
    M[Decay Engine] -.->D
    M -.->E

Sources: README.md

Memory Types

Audrey distinguishes between three primary memory types, each serving a distinct role in agent cognition.

Episodic Memory

Episodic memory stores specific observations, tool results, preferences, and session facts. These are the raw recordings of events and interactions that agents experience directly.

Property	Description
`memory_type`	`episodic`
`source`	`direct-observation`, `told-by-user`, `retrieved`
`confidence`	Initial high confidence that decays over time
`retrieval_count`	Number of times this memory was recalled

Sources: src/capsule.ts

Semantic Memory

Semantic memory represents consolidated principles extracted from repeated evidence. These memories encode general knowledge and learned rules that persist beyond specific sessions.

Property	Description
`memory_type`	`semantic`
`confidence`	Derived from supporting episode frequency
`supporting_count`	Number of episodes supporting this principle
`challenge_count`	Number of contradictory episodes

Sources: src/causal.ts

Procedural Memory

Procedural memory stores remembered ways to act, avoid, retry, or verify. These encode action patterns and procedures that agents have learned through experience.

Property	Description
`memory_type`	`procedural`
`tags`	`procedure`, `retry`, `avoid`, `verify`
`confidence`	Reinforced by successful outcomes

Sources: src/capsule.ts

Confidence System

The confidence system is the foundational mechanism that determines memory reliability and recall priority. It incorporates multiple signals including recency, reinforcement, and affect.

Confidence Calculation

graph LR
    A[Base Confidence] --> B[Recency Decay]
    B --> C[Reinforcement Boost]
    C --> D[Affect Adjustment]
    D --> E[Interference Penalty]
    E --> F[Final Confidence]

Sources: src/confidence.ts

Recency Decay

Memory confidence decreases over time through a half-life decay mechanism. Memories become less authoritative unless reinforced through retrieval or validation.

// From src/confidence.ts
recencyDecay(halfLifeDays: number, createdAt: Date): number

Parameter	Type	Description
`halfLifeDays`	`number`	Days until confidence halves
`createdAt`	`Date`	Memory creation timestamp

The decay function throws RangeError when halfLifeDays <= 0 to prevent NaN or Infinity results.
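
A half-life decay of this kind is usually written as an exponential. The sketch below is one plausible form, not the contents of `src/confidence.ts`: `recencyDecaySketch` is a hypothetical name, and the extra `now` parameter is added only so the example is deterministic.

```typescript
// Sketch of half-life decay; the real recencyDecay in src/confidence.ts may differ.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// `now` is an extra parameter (not in the documented signature) for determinism.
function recencyDecaySketch(halfLifeDays: number, createdAt: Date, now: Date): number {
  if (!(halfLifeDays > 0)) {
    // Mirrors the documented guard: a non-positive half-life would yield NaN or Infinity.
    throw new RangeError("halfLifeDays must be > 0");
  }
  // Timestamps in the future are treated as age zero.
  const ageDays = Math.max(0, (now.getTime() - createdAt.getTime()) / MS_PER_DAY);
  // After one half-life the factor is 0.5, after two it is 0.25, and so on.
  return Math.pow(0.5, ageDays / halfLifeDays);
}
```

A memory created 30 days ago with a 30-day half-life gets a factor of 0.5. A brand-new memory gets 1.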

Sources: src/confidence.ts

Reinforcement Formula

Validation outcomes reinforce or diminish memory confidence through the feedback loop:

Outcome	Effect
`helpful`	Increases salience, bumps `retrieval_count` for semantic/procedural
`wrong`	Decreases salience, bumps `challenge_count` for semantic
`used`	Neutral signal with smaller salience delta

The math reuses the existing reinforcement formula from confidence.ts.
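
The outcome table can be sketched as a small update function. The salience deltas below are assumed values chosen for illustration; the actual numbers come from the formula in `confidence.ts`, which is not reproduced on this page. `applyOutcome` and `MemoryStats` are hypothetical names.

```typescript
type Outcome = "helpful" | "wrong" | "used";
type Kind = "episodic" | "semantic" | "procedural";

interface MemoryStats {
  kind: Kind;
  salience: number;
  retrieval_count: number;
  challenge_count: number;
}

// Assumed deltas; "used" is deliberately smaller than "helpful".
const SALIENCE_DELTA: Record<Outcome, number> = { helpful: 0.1, wrong: -0.1, used: 0.02 };

function applyOutcome(m: MemoryStats, outcome: Outcome): MemoryStats {
  const next: MemoryStats = { ...m };
  // Keep salience inside [0, 1] after the adjustment.
  next.salience = Math.min(1, Math.max(0, m.salience + SALIENCE_DELTA[outcome]));
  if (outcome === "helpful" && (m.kind === "semantic" || m.kind === "procedural")) {
    next.retrieval_count += 1;
  }
  if (outcome === "wrong" && m.kind === "semantic") {
    next.challenge_count += 1;
  }
  return next;
}
```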

Sources: CHANGELOG.md

Consolidation System

Consolidation transforms episodic memories into semantic and procedural knowledge through periodic processing, often called "dream" mode.

Consolidation Workflow

graph TD
    A[Nightly Dream Process] --> B[Identify Repeated Episodes]
    B --> C[Extract Common Patterns]
    C --> D[Generate Semantic Principles]
    C --> E[Extract Procedures]
    D --> F[Create New Semantic Memory]
    E --> G[Create/Update Procedural Memory]
    F --> H[Link Supporting Episodes]
    G --> H

Sources: README.md

Consolidation Implementation

Consolidation runs through memory_dream and is meant to be scheduled, so that consolidation and decay stay current.

// From src/consolidate.ts - conceptual interface
async function consolidate(audrey: Audrey, options?: ConsolidateOptions): Promise<ConsolidationResult>

Consolidation performs its SELECTs inside the surrounding transaction, so concurrent writers cannot slip rows in or out between the read and the write.

Sources: CHANGELOG.md

Decay Engine

The decay engine implements forgetting curves that reduce memory authority over time, ensuring stale information doesn't dominate recall.

Decay Mechanism

graph LR
    A[Time Passes] --> B{Still Being Used?}
    B -->|Yes| C[Decay Paused]
    B -->|No| D[Gradual Decay]
    D --> E[Confidence Decreases]
    E --> F[Memory Becomes Less Authoritative]

Sources: src/decay.ts

Decay Parameters

Parameter	Default	Purpose
`halfLifeDays`	Configurable	Base decay rate
`minConfidence`	0.1	Floor value
`decayEnabled`	true	Global on/off

Decay applies to semantic and procedural memories differently, with semantic memories decaying faster unless reinforced.

Sources: src/decay.ts

Affect and Salience

Affect (emotional weight and importance) influences salience, determining which memories demand attention and which fade into background knowledge.

Affect Module

graph TD
    A[Memory Event] --> B[Detect Emotional Signals]
    B --> C[Calculate Valence]
    B --> D[Calculate Arousal]
    C --> E[Determine Mood State]
    D --> E
    E --> F[Affect Boost/Penalty]
    F --> G[Effective Salience]

Sources: src/affect.ts

Salience Calculation

Effective salience is clamped to the range [0, 1] to prevent unbounded values from extreme arousal boosts. The formula considers:

Memory type (episodic, semantic, procedural)
Confidence level
Recency
Emotional valence and arousal

// From src/affect.ts
effectiveSalience(baseSalience: number, arousalBoost: number): number

The timeDeltaDays function no longer propagates NaN from invalid created_at timestamps.
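
The two guards described here, clamping and NaN suppression, might look like the following. This is a sketch under assumptions: the boost is treated as additive, an unparseable timestamp is mapped to a delta of 0, and the `now` parameter is added for determinism. None of these details are confirmed by the source, and the `...Sketch` names are hypothetical.

```typescript
// Clamp to [0, 1]; NaN is mapped to 0 so it cannot leak into ranking.
function clamp01(x: number): number {
  if (Number.isNaN(x)) return 0;
  return Math.min(1, Math.max(0, x));
}

// Assumption: the arousal boost is additive. The source only states the clamp.
function effectiveSalienceSketch(baseSalience: number, arousalBoost: number): number {
  return clamp01(baseSalience + arousalBoost);
}

// Returns 0 instead of NaN when created_at cannot be parsed (assumed fallback).
function timeDeltaDaysSketch(createdAt: string, now: Date): number {
  const created = Date.parse(createdAt);
  if (Number.isNaN(created)) return 0;
  return Math.max(0, (now.getTime() - created) / 86400000);
}
```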

Sources: src/affect.ts

Interference Handling

Interference prevents conflicting or competing memories from silently overwriting each other, maintaining an accurate picture of contradictory knowledge.

Interference Types

graph TD
    A[New Memory] --> B{Conflicting Memory Exists?}
    B -->|Yes| C[Track Contradiction]
    B -->|No| D[Normal Storage]
    C --> E[Disputed State]
    E --> F[Monitor Both]
    F --> G[Resolution Through Validation]

Sources: src/interference.ts

Memory States for Contradictions

State	Description
`active`	Default stable state
`disputed`	Competing claims detected
`context_dependent`	Truth depends on context
`superseded`	Older knowledge replaced

When memories have contradictory content, both are preserved with appropriate states rather than silently overwriting.
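
To make the "preserve both" behavior concrete, here is a minimal sketch. `recordClaim` and `Claim` are hypothetical, and the conflict test is passed in by the caller because this page does not describe how Audrey detects contradictions.

```typescript
type MemoryState = "active" | "disputed" | "context_dependent" | "superseded";

interface Claim {
  id: string;
  content: string;
  state: MemoryState;
}

// Hypothetical: `conflicts` stands in for whatever contradiction check is used.
function recordClaim(
  store: Claim[],
  incoming: Claim,
  conflicts: (a: Claim, b: Claim) => boolean,
): Claim[] {
  const isRival = (c: Claim): boolean => c.state !== "superseded" && conflicts(c, incoming);
  const anyRival = store.some(isRival);
  // Existing rivals are flagged rather than overwritten.
  const updated = store.map((c): Claim => (isRival(c) ? { ...c, state: "disputed" } : c));
  // The incoming claim is kept too, and flagged if it contradicts anything.
  const stored: Claim = anyRival ? { ...incoming, state: "disputed" } : incoming;
  return [...updated, stored];
}
```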

Sources: src/capsule.ts

Causal Inference

The causal module extracts cause-effect relationships from episodic memory patterns, enabling agents to understand why certain actions lead to certain outcomes.

Causal Analysis

// From src/causal.ts - conceptual interface
async function analyzeCausalLinks(episodes: Episode[]): Promise<CausalRelationship[]>

The causal module validates LLM response shapes before reading fields and rejects non-finite confidence values.
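
Shape validation of this sort is commonly done with a narrowing function over `unknown`. The sketch below assumes the LLM returns objects with the `cause_id`, `effect_id`, and `confidence` fields listed under Causal Memory Properties, and that confidence is expected in [0, 1]. `parseCausalLink` is a hypothetical helper, not the module's actual API.

```typescript
interface CausalLink {
  cause_id: string;
  effect_id: string;
  confidence: number;
}

// Narrow an untrusted value (e.g. parsed LLM JSON) to CausalLink, or reject it.
function parseCausalLink(raw: unknown): CausalLink | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const cause = r["cause_id"];
  const effect = r["effect_id"];
  const confidence = r["confidence"];
  if (typeof cause !== "string" || typeof effect !== "string") return null;
  // Reject NaN and +/-Infinity before the value can reach any scoring math.
  if (typeof confidence !== "number" || !Number.isFinite(confidence)) return null;
  // Assumption: confidence is expected in [0, 1]; out-of-range values are rejected.
  if (confidence < 0 || confidence > 1) return null;
  return { cause_id: cause, effect_id: effect, confidence };
}
```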

Sources: src/causal.ts

Causal Memory Properties

Property	Description
`cause_id`	Memory that triggers outcome
`effect_id`	Resulting memory
`confidence`	Causal link strength
`evidence_count`	Episodes supporting this link

Validation Feedback Loop

The closed-loop feedback system enables continuous improvement of memory accuracy through agent validation.

Validation Flow

graph TD
    A[Memory Recall] --> B[Agent Uses Memory]
    B --> C[Validation Request]
    C --> D{Helpful?}
    D -->|Yes| E[Reinforce: helpful]
    D -->|No| F{Wrong?}
    F -->|Yes| G[Diminish: wrong]
    F -->|No| H[Mark: used]
    E --> I[Update Salience & Stats]
    G --> I
    H --> I

Sources: CHANGELOG.md

Validation API

Endpoint	Method	Description
`/v1/validate`	POST	Canonical validation endpoint
`/v1/mark-used`	POST	Legacy alias (defaults to `outcome=used`)

The memory_validate MCP tool accepts outcomes: helpful, wrong, or used.

Sources: CHANGELOG.md

Recall and Retrieval

Memory recall uses hybrid retrieval combining vector similarity and full-text search to balance precision and recall.

Retrieval Modes

Mode	Description
`hybrid`	Vector similarity + FTS (default)
`vector`	FTS-bypass fast path

The hybrid mode has been the default since v0.22.0, replacing the removed hybrid_strict mode (a silent alias with no behavioral difference).

Sources: CHANGELOG.md

Recall Factors

When ranking results, Audrey considers:

Semantic similarity - Vector distance from query
Recency - Time since creation or last retrieval
Confidence - Current confidence score
Salience - Effective importance (affect-adjusted)
Agent relevance - Scope and ownership
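
One simple way to combine factors like these is a weighted sum. The sketch below is purely illustrative: the weights, the `Candidate` shape, and the treatment of scope as a hard gate are all assumptions, since this page does not give Audrey's actual ranking formula.

```typescript
interface Candidate {
  id: string;
  similarity: number; // 0..1, higher is closer to the query
  recency: number;    // 0..1, higher is more recent
  confidence: number; // 0..1
  salience: number;   // 0..1, affect-adjusted
  inScope: boolean;   // visible to the requesting agent
}

// Assumed weights for illustration only.
const WEIGHTS = { similarity: 0.5, recency: 0.15, confidence: 0.2, salience: 0.15 };

function rankScore(c: Candidate): number {
  // Assumption: scope acts as a gate rather than a weighted term.
  if (!c.inScope) return 0;
  return (
    WEIGHTS.similarity * c.similarity +
    WEIGHTS.recency * c.recency +
    WEIGHTS.confidence * c.confidence +
    WEIGHTS.salience * c.salience
  );
}

// Sort a copy by descending score and keep the top `limit` entries.
function rankCandidates(candidates: Candidate[], limit: number): Candidate[] {
  return [...candidates].sort((a, b) => rankScore(b) - rankScore(a)).slice(0, limit);
}
```

Under these weights, a slightly less similar memory can outrank a closer match when it is fresher, more confident, and more salient.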

Tool-Trace Learning

Audrey learns from tool execution traces, converting tool results into memory events that inform future actions.

Tool-Trace Memory Cycle

graph TD
    A[Tool Execution] --> B[Capture Tool Trace]
    B --> C[Extract Results & Errors]
    C --> D{Successful?}
    D -->|Yes| E[Encode Success Pattern]
    D -->|No| F[Encode Failure Pattern]
    E --> G[Episodic Memory]
    F --> G
    G --> H[Consolidation]
    H --> I[Procedural Memory]

The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes.

Sources: README.md

Memory Capsule

The Memory Capsule provides a turn-sized memory packet containing categorized sections relevant to the current context.

Capsule Sections

Section	Content
`must_follow`	Trusted rules and critical constraints
`risks`	Identified dangers and warnings
`procedures`	Known action procedures
`user_preferences`	Stated and inferred preferences
`uncertain_or_disputed`	Contested or low-confidence knowledge
`recent_changes`	Freshly updated memories
`project_facts`	Default for semantic/episodic

Sources: src/capsule.ts

Capsule Generation

Capsule sections are determined by memory type, tags, source trust level, state, confidence, and recency:

// From src/capsule.ts
determineSections(
  entry: MemoryEntry,
  result: RecallResult,
  tags: string[],
  recentWindowMs: number
): Array<keyof MemoryCapsule['sections']>

Trusted sources include direct-observation and told-by-user; these can populate must_follow sections.
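
The routing rules can be approximated as follows. This sketch simplifies the documented signature (it takes a flat entry plus an explicit `now`), and the tag names `must-follow`, `risk`, `procedure`, and `preference` as well as the 0.5 low-confidence threshold are assumptions. It is not the implementation in `src/capsule.ts`.

```typescript
type Section =
  | "must_follow" | "risks" | "procedures" | "user_preferences"
  | "uncertain_or_disputed" | "recent_changes" | "project_facts";

interface Entry {
  memory_type: "episodic" | "semantic" | "procedural";
  source: string;
  state: "active" | "disputed" | "context_dependent" | "superseded";
  confidence: number;
  created_at: number; // epoch milliseconds
}

const TRUSTED_SOURCES = ["direct-observation", "told-by-user"];

function sectionsFor(entry: Entry, tags: string[], recentWindowMs: number, now: number): Section[] {
  const out: Section[] = [];
  const has = (t: string): boolean => tags.includes(t);
  // Only trusted sources may populate must_follow.
  if (has("must-follow") && TRUSTED_SOURCES.includes(entry.source)) out.push("must_follow");
  if (has("risk")) out.push("risks");
  if (entry.memory_type === "procedural" || has("procedure")) out.push("procedures");
  if (has("preference")) out.push("user_preferences");
  // Assumed threshold: below 0.5 counts as low confidence.
  if (entry.state === "disputed" || entry.confidence < 0.5) out.push("uncertain_or_disputed");
  if (now - entry.created_at <= recentWindowMs) out.push("recent_changes");
  // Fallback bucket when nothing more specific applies.
  if (out.length === 0) out.push("project_facts");
  return out;
}
```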

Sources: src/capsule.ts

Guard Integration

The Memory Guard uses the Memory Model to enforce pre-action checks, returning allow, warn, or block decisions with evidence.

Guard Decision Flow

graph TD
    A[Action Request] --> B[Preflight Check]
    B --> C[Recall Relevant Memory]
    C --> D[Apply Reflexes]
    D --> E{Blocking Reflex?}
    E -->|Yes| F[BLOCK]
    E -->|No| G{Warning Reflex?}
    G -->|Yes| H[WARN]
    G -->|No| I[ALLOW]

The Guard decision reuses existing preflight and reflex machinery without performing two independent recall passes.
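
The precedence in the flow above (block beats warn, warn beats allow) can be sketched in a few lines. `decideFromReflexes` and the `Reflex` shape are hypothetical. The sketch assumes reflexes were already produced by a single recall pass and only shows how they collapse into one verdict.

```typescript
type ReflexType = "block" | "warn" | "guide";
type Verdict = "block" | "warn" | "allow";

interface Reflex {
  response_type: ReflexType;
  reason: string;
}

interface GuardVerdict {
  decision: Verdict;
  evidence: string[];
}

function decideFromReflexes(reflexes: Reflex[]): GuardVerdict {
  const reasons = (t: ReflexType): string[] =>
    reflexes.filter((r) => r.response_type === t).map((r) => r.reason);

  // Any blocking reflex wins outright.
  const blocks = reasons("block");
  if (blocks.length > 0) return { decision: "block", evidence: blocks };

  // Otherwise any warning downgrades the verdict to warn.
  const warnings = reasons("warn");
  if (warnings.length > 0) return { decision: "warn", evidence: warnings };

  // Guides never stop an action; they travel along as advice.
  return { decision: "allow", evidence: reasons("guide") };
}
```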

Sources: CHANGELOG.md

Summary

Audrey's Memory Model provides a comprehensive cognitive architecture for AI agents:

Multi-type storage with episodic, semantic, and procedural memories
Dynamic confidence that evolves through use and validation
Consolidation that transforms experience into knowledge
Decay that prevents stale information from dominating
Affect that weights memories by emotional importance
Interference tracking that maintains truth in the face of contradictions
Causal inference that extracts cause-effect relationships
Closed-loop validation that continuously improves accuracy

This architecture ensures agents remember what matters, forget what doesn't, and maintain coherent, actionable knowledge across sessions.

Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)

Audrey Guard

Related topics: Core Memory Operations, Preflight and Reflexes

Overview

Audrey Guard is the headline memory loop in the Audrey system—a memory-before-action enforcement mechanism that checks AI agents' intended operations against accumulated memory before execution. It serves as a firewall layer that can allow, warn, or block tool invocations based on historical evidence, prior failures, project rules, and risk patterns.

The Guard operates by retrieving relevant memories through semantic recall, evaluating them against the proposed action, and returning a structured decision with supporting evidence. This enables agents to avoid repeating past mistakes, respect project-specific rules, and make informed decisions grounded in durable context.

Sources: README.md

Purpose and Scope

Audrey Guard addresses a fundamental problem: agents forget the exact mistakes they made yesterday. They repeat broken commands, lose project-specific rules, miss contradictions, and treat every new session like a cold start.

Guard's scope encompasses:

Concern	Description
Failure Prevention	Block or warn on repeated failures identified through `memory_recall`
Risk Awareness	Surface prior failures, risks, and warnings as preflight evidence
Rule Enforcement	Check must-follow rules and procedures before action
Evidence Generation	Return structured decisions with provenance metadata
Closed-Loop Validation	Validate whether the memory helped after action execution

Sources: README.md

Architecture

High-Level Components

graph TD
    A[Agent Tool Call] --> B[Audrey Guard]
    B --> C[memory_preflight]
    C --> D[memory_recall]
    D --> E[SQLite Store<br/>Episodic + Semantic + Procedural]
    C --> F[Rule Evaluation]
    F --> G[Reflex Pattern Matching]
    C --> H[Decision Engine]
    H --> I[block<br/>warn<br/>allow]
    I --> J[Evidence Capsule]
    J --> K[Agent Action Execution]
    K --> L[memory_validate]
    L --> M[Outcome: helpful<br/>used<br/>wrong]
    M --> E

Guard Decision Flow

The Guard evaluates incoming tool actions through a multi-stage pipeline:

Action Canonicalization - Normalize the tool name and action string
Semantic Recall - Query memory store for relevant past experiences
Risk Assessment - Evaluate prior failures, warnings, and risks
Rule Matching - Check against must-follow rules and procedures
Decision Synthesis - Combine signals into block/warn/allow verdict
Evidence Packaging - Return decision with provenance and references

Sources: src/reflexes.ts

CLI Interface

Command Syntax

audrey guard --tool <tool_name> "<action_command>" [options]

Core Options

Option	Description	Default
`--tool <name>`	The tool category (e.g., `Bash`, `Write`, `Edit`)	Required
`<action>`	The specific action string to evaluate	Required
`--cwd <path>`	Working directory for context	Current directory
`--session-id <id>`	Session identifier for event correlation	Auto-generated
`--hook`	Run in hook mode (for agent integration)	`false`
`--fail-on-warn`	Treat warnings as errors (exit code non-zero)	`false`
`--strict`	Enable strict preflight evaluation	`false`
`--json`	Output results as JSON	`false`
`--explain`	Include detailed explanation in output	`false`
`--include-capsule`	Embed full memory capsule in response	`false`

Sources: mcp-server/index.ts

Example Usage

# Block a repeated failed deploy
audrey guard --tool Bash "npm run deploy"

# Warn on risky file operations
audrey guard --tool Write --strict "database.sql"

# Hook mode for Claude Code integration
audrey guard --tool Bash --hook "rm -rf node_modules"

SDK Integration

Sync Client

import Audrey from 'audrey';

const brain = new Audrey({
  baseUrl: 'http://127.0.0.1:7437',
  agent: 'support-agent',
});

const decision = await brain.beforeAction({
  tool: 'Bash',
  action: 'npm run deploy',
});

console.log(decision.decision); // 'block' | 'warn' | 'allow'
console.log(decision.evidence); // Array of MemoryEvidence
brain.close();

Preflight Options

Option	Type	Description
`tool`	`string`	Tool category being evaluated
`action`	`string`	Action string to preflight
`sessionId`	`string`	Correlation ID for event tracking
`mode`	`'standard' | 'strict'`	Evaluation strictness
`includeCapsule`	`boolean`	Include full memory capsule
`failureWindowHours`	`number`	Hours to look back for failures
`recentChangeWindowHours`	`number`	Hours for recent-change rules

Sources: src/routes.ts

Decision Outcomes

Verdict Types

Decision	Description	Agent Behavior
`block`	Action is prohibited based on memory	Must not execute
`warn`	Action has risk indicators	Should pause and confirm
`allow`	No memory conflicts detected	May proceed

Decision Display Mapping

function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
  if (result.decision === 'block') return 'block';
  if (result.decision === 'caution') return 'warn';
  return 'allow';
}

Sources: mcp-server/index.ts

Memory Preflight

The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes. It builds a structured preflight report containing:

Capsule Sections

Section	Content Source	Trigger Condition
`recent_changes`	Memories within recent-change window	Created or reinforced recently
`must_follow`	Must-follow rules	Tagged as must-follow
`procedures`	Procedural memories + procedures	Matching query or tagged
`user_preferences`	User-stated preferences	User-told or tagged
`risks`	Risk-tagged memories + recent failures	Tagged risk or 7-day failures
`uncertain_or_disputed`	Low-confidence or disputed memories	Low confidence or disputed state

Sources: src/capsule.ts

Reflex System

Memory reflexes convert remembered evidence into trigger-response guidance that agents can follow.

Reflex Response Types

Type	Description
`block`	Strict prohibition based on evidence
`warn`	Caution signal with context
`guide`	Recommended action or approach

Reflex Report Generation

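function summarizeReflexes(decision: PreflightDecision, reflexes: MemoryReflex[]): string {
  // Count reflexes by response type
  const blocks = reflexes.filter(r => r.response_type === 'block').length;
  const warnings = reflexes.filter(r => r.response_type === 'warn').length;
  const guides = reflexes.filter(r => r.response_type === 'guide').length;
  // Returns a human-readable summary. The wording below is illustrative;
  // the exact string built in src/reflexes.ts is not shown in this manual.
  return `${blocks} block, ${warnings} warn, ${guides} guide`;
}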

Sources: src/reflexes.ts

Validation Loop

After action execution, agents validate whether the memory helped:

Outcome Types

Outcome	Meaning	Effect
`helpful`	Memory was correct and beneficial	Increases salience
`used`	Memory was referenced	Updates usage metrics
`wrong`	Memory was incorrect	Triggers decay or dispute

Validation Endpoint

POST /v1/event

{
  "outcome": "helpful",
  "receipt_id": "receipt-from-preflight",
  "evidence_feedback": {
    "evidence-id-1": "used",
    "evidence-id-2": "helpful"
  }
}

Sources: src/routes.ts

Failure Decay

Starting from version 1.0.1, Audrey Guard implements failure decay to prevent stale blocks:

Configuration	Default	Behavior
`failureDecayDays`	`7`	Same-action failures older than window treated as stale

To restore pre-1.0.1 blocking behavior (permanent blocks):

const controller = new MemoryController({
  failureDecayDays: 0,
});
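
The staleness rule can be expressed as a single comparison. The sketch below is an assumption-labeled illustration: `isFailureStale` is a hypothetical helper, and treating `failureDecayDays: 0` as "never stale" follows the description above rather than any code shown on this page.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns true when a same-action failure is old enough to stop blocking.
function isFailureStale(failedAt: Date, now: Date, failureDecayDays: number = 7): boolean {
  // 0 (or less) disables decay, so old failures keep blocking indefinitely.
  if (failureDecayDays <= 0) return false;
  return now.getTime() - failedAt.getTime() > failureDecayDays * DAY_MS;
}
```

With the default window, a failure from eight days ago is stale while one from six days ago still counts.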

Sources: CHANGELOG.md

Security Considerations

HTTP API Security

Default bind address changed from 0.0.0.0 to 127.0.0.1
Refuses to start on non-loopback without AUDREY_API_KEY or AUDREY_ALLOW_NO_AUTH=1
API key comparison uses crypto.timingSafeEqual to prevent timing attacks
/v1/recall and /v1/capsule no longer spread the raw request body into caller options

Sources: CHANGELOG.md

Hook Configuration Safety

The audrey promote --yes command refuses to write .claude/rules/*.md outside process.cwd() unless the target path is in AUDREY_PROMOTE_ROOTS. This prevents prompt-injection attacks via malicious MCP callers.

Tool Trace Handling

Tool traces are recorded through PostToolUse hooks with redaction applied:

Redaction - Sensitive fields (API keys, tokens, credentials) are masked
Action Key Generation - Deterministic ID for trace correlation
Event Recording - Tool inputs/outputs stored with session context
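
The first two steps above can be sketched as below. Both functions are hypothetical stand-ins: the sensitive-key pattern is an assumption, and the 32-bit FNV-1a-style hash (computed over UTF-16 code units) merely illustrates a deterministic key. This page does not say which scheme Audrey uses.

```typescript
// Assumed pattern for key names that should be masked.
const SENSITIVE_KEY = /(api[_-]?key|token|secret|password|credential)/i;

// Recursively mask values whose key looks sensitive.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = SENSITIVE_KEY.test(k) ? "[REDACTED]" : redact(v);
    }
    return out;
  }
  return value;
}

// Deterministic key: the same tool and (trimmed) action always hash to the same ID.
function actionKey(tool: string, action: string): string {
  let h = 0x811c9dc5; // FNV offset basis
  const input = `${tool}\u0000${action.trim()}`;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // FNV prime, kept as unsigned 32-bit
  }
  return h.toString(16).padStart(8, "0");
}
```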

Sources: mcp-server/index.ts

Demo Scenario: Repeated Failure

The repeated-failure demo demonstrates Guard's blocking behavior:

npx audrey demo --scenario repeated-failure

This no-key, no-network demo:

Creates a temporary memory store
Records a failed deploy with the fix
Teaches Audrey the failure pattern
Shows Guard blocking the repeated attempt with evidence

Sources: README.md

Core Memory Operations

Related topics: Memory Model, Audrey Guard, Preflight and Reflexes, Data Storage

This page documents the fundamental memory operations in Audrey: encoding, recall, hybrid retrieval, capsule generation, and impact tracking. Together, these operations form the core pipeline that enables agents to store, retrieve, and learn from persistent memory across sessions.

Overview

Audrey's Core Memory Operations handle the complete lifecycle of memory within the system. The operations are designed around a local-first, SQLite-backed architecture that provides semantic search capabilities without requiring external vector databases or hosted services.

graph LR
    A[Encode] -->|store| B[(SQLite)]
    B -->|recall| C[Recall]
    C -->|hybrid| D[Hybrid Search]
    D -->|compose| E[Capsule]
    E -->|track| F[Impact]
    F -->|reinforce| A

The primary design goals are:

Durability: All memories persist in local SQLite storage
Semantic Search: Vector embeddings enable similarity-based recall
Hybrid Retrieval: Combines vector and keyword search for accuracy
Feedback Loop: Impact tracking enables continuous memory reinforcement

Memory Types

Audrey distinguishes between three primary memory types that influence retrieval behavior and storage strategy.

Memory Type	Description	Typical Content
`episodic`	Specific observations and session facts	Tool results, error messages, user feedback
`semantic`	Consolidated principles extracted from evidence	Learned rules, best practices, project conventions
`procedural`	Remembered ways to act, avoid, or verify	Deployment procedures, recovery steps, verification commands

Each memory type has distinct promotion criteria. Procedural memories can be promoted to rules with lower evidence thresholds, while semantic memories require higher confidence and evidence counts before promotion.

Sources: src/recall.ts:15-17

Memory Encoding

The encoding operation transforms raw observations into persistent memory entries. When encoding, Audrey generates embeddings, assigns salience scores, and stores metadata that enables future retrieval.

Encode Process Flow

graph TD
    A[Input: Raw Text] --> B[Generate Embedding]
    B --> C[Calculate Salience]
    C --> D[Assign Memory Type]
    D --> E[Tag Analysis]
    E --> F[Store in SQLite]
    F --> G[Update Vector Index]

Encode Options

The encode operation accepts several configuration parameters:

Parameter	Type	Default	Purpose
`source`	`string`	`'direct-observation'`	Origin of the memory
`memory_type`	`string`	`'episodic'`	Classification of memory content
`tags`	`string[]`	`[]`	Categorical labels for filtering
`wait_for_consolidation`	`boolean`	`false`	Opt-in read-after-write semantics

Sources: src/encode.ts

Source Types

Memory sources indicate provenance and affect how memories are treated during recall:

Source	Trust Level	Description
`direct-observation`	High	Agent's own observations from tool execution
`told-by-user`	High	Explicit user-provided information
`inferred`	Medium	AI-inferred conclusions
`external`	Low	Information from external systems

Trusted sources (direct-observation, told-by-user) can populate must-follow sections in capsules, while untrusted sources are flagged as uncertain or disputed.

Sources: src/capsule.ts:18-20

Memory Recall

Recall is the primary mechanism for retrieving relevant memories based on semantic similarity and keyword matching. The recall operation searches across all memory types using configurable retrieval strategies.

Retrieval Modes

Audrey supports three retrieval modes that determine how search results are computed:

Mode	Description	Use Case
`hybrid`	Combines vector similarity with FTS keyword matching (default)	Balanced accuracy for general queries
`vector`	Pure semantic similarity using embeddings	When keywords are ambiguous
`keyword`	Full-text search only, bypasses vector index	Fast, keyword-exact matching

Sources: src/recall.ts:12-14

Recall Architecture

graph TD
    A[Query Input] --> B{Mode Check}
    B -->|hybrid| C[Vector Pass]
    B -->|hybrid| D[Keyword Pass]
    B -->|vector| E[Vector Pass Only]
    B -->|keyword| F[Keyword Pass Only]
    C --> G[Merge & Score]
    D --> G
    G --> H[Filter by Confidence]
    E --> H
    F --> H
    H --> I[Apply Filters]
    I --> J[Return Results]

Recall Options

The recall operation accepts a comprehensive set of filtering and result-shaping options:

Parameter	Type	Default	Purpose
`minConfidence`	`number`	`0`	Minimum confidence threshold (0-1)
`types`	`MemoryType[]`	`['episodic', 'semantic', 'procedural']`	Memory types to search
`limit`	`number`	`10`	Maximum results to return
`includeProvenance`	`boolean`	`false`	Include source metadata
`includeDormant`	`boolean`	`false`	Include decayed/inactive memories
`tags`	`string[]`	`undefined`	Filter by tags
`sources`	`string[]`	`undefined`	Filter by source type
`after`	`Date`	`undefined`	Filter memories created after date
`before`	`Date`	`undefined`	Filter memories created before date
`includePrivate`	`boolean`	`false`	Include agent-private memories
`retrieval`	`string`	`'hybrid'`	Retrieval mode selection
`scope`	`'agent' | 'shared'`	`'agent'`	Memory scope filter

Sources: src/recall.ts:5-22

RecallFilters Structure

interface RecallFilters {
  tags?: string[];
  sources?: string[];
  after?: Date;
  before?: Date;
  agent?: string;  // Filtered by scope when scope === 'agent'
}

Filters are combined with AND logic—memories must match all specified filters to be included in results.
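The AND combination can be sketched as a predicate that rejects a memory as soon as any supplied filter fails. The `MemoryRecord` shape and the `matchesFilters` name are hypothetical, and the sketch assumes that a tag filter matches when the memory carries at least one of the listed tags, which the source does not specify.

```typescript
interface RecallFilters {
  tags?: string[];
  sources?: string[];
  after?: Date;
  before?: Date;
  agent?: string;
}

// Hypothetical record shape, for illustration only.
interface MemoryRecord {
  tags: string[];
  source: string;
  createdAt: Date;
  agent: string;
}

function matchesFilters(memory: MemoryRecord, filters: RecallFilters): boolean {
  // Assumption: a tag filter matches when at least one listed tag is present.
  if (filters.tags?.length && !filters.tags.some((t) => memory.tags.includes(t))) return false;
  if (filters.sources?.length && !filters.sources.includes(memory.source)) return false;
  if (filters.after && memory.createdAt.getTime() <= filters.after.getTime()) return false;
  if (filters.before && memory.createdAt.getTime() >= filters.before.getTime()) return false;
  if (filters.agent && memory.agent !== filters.agent) return false;
  return true; // every supplied filter passed
}
```

Filters that are left undefined are skipped, so an empty filter object matches every memory.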

Agent and Scope Filtering

The scope parameter determines which memories are accessible:

agent (default): Only memories associated with the requesting agent
shared: Memories marked as shared across agents

When scope is 'agent', the agent filter is automatically set to the requesting agent's identity. This ensures memory isolation between different agents.

Hybrid Search

Hybrid search combines vector similarity and full-text search to achieve more accurate recall results than either method alone.

Hybrid Recall Pipeline

graph LR
    A[Query] --> B[Embedding Model]
    A --> C[FTS Index]
    B --> D[Vector Scores]
    C --> E[Keyword Scores]
    D --> F[Score Normalization]
    E --> F
    F --> G[Weighted Merge]
    G --> H[Ranked Results]

Score Merging Strategy

The hybrid approach normalizes scores from both vector and keyword passes before merging. This ensures that memories matched by keywords are not overshadowed by high vector similarity scores, and vice versa.

Sources: src/hybrid-recall.ts
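One way to picture the normalize-then-merge step is min-max normalization followed by a weighted sum. This is a sketch under stated assumptions: the source does not say which normalization or weights `src/hybrid-recall.ts` uses, so the min-max choice, the 0.6/0.4 weights, and the function names below are all hypothetical.

```typescript
type Scores = Map<string, number>; // memory id -> raw score from one pass

// Min-max normalization to [0, 1]; if all scores are equal, each maps to 1.
function normalize(scores: Scores): Scores {
  const values = Array.from(scores.values());
  const min = Math.min(...values);
  const max = Math.max(...values);
  const out: Scores = new Map();
  scores.forEach((s, id) => {
    out.set(id, max === min ? 1 : (s - min) / (max - min));
  });
  return out;
}

// Hypothetical weights; a memory missing from one pass contributes 0 for that pass.
function mergeHybrid(vector: Scores, keyword: Scores, wVector = 0.6, wKeyword = 0.4) {
  const v = normalize(vector);
  const k = normalize(keyword);
  const ids = new Set<string>([...Array.from(v.keys()), ...Array.from(k.keys())]);
  return Array.from(ids)
    .map((id) => ({ id, score: wVector * (v.get(id) ?? 0) + wKeyword * (k.get(id) ?? 0) }))
    .sort((a, b) => b.score - a.score);
}
```

Because both passes are rescaled to the same range before weighting, a strong keyword hit with a middling vector score can outrank a memory that only the vector pass found.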

Full-Text Search Integration

The FTS module provides keyword-based search capabilities:

interface FTSResult {
  memory_id: string;
  rank: number;
  snippet?: string;
}

FTS uses SQLite's built-in FTS5 extension for fast keyword matching. The FTS index is updated synchronously during encoding to ensure keyword search reflects current memory state.

Sources: src/fts.ts
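For readers unfamiliar with FTS5, the sketch below shows one way free text could be turned into a safe `MATCH` expression. FTS5 treats double-quoted strings as literal terms and escapes an embedded double quote by doubling it. The helper name, the OR-joining policy, and the `memories_fts` table name are assumptions for illustration, not details taken from `src/fts.ts`.

```typescript
// Turn free text into an FTS5 MATCH expression: each whitespace-separated token
// becomes a quoted string (embedded double quotes are doubled), joined with OR.
function toFts5Query(input: string): string {
  return input
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => `"${token.replace(/"/g, '""')}"`)
    .join(" OR ");
}

// Hypothetical table name; the query string should be bound as a parameter.
const FTS_SQL =
  "SELECT rowid, rank FROM memories_fts WHERE memories_fts MATCH ? ORDER BY rank LIMIT ?";
```

Quoting every token keeps characters such as `-` or `:` in user input from being parsed as FTS5 operators.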

Memory Capsule

The capsule is a turn-sized memory packet that organizes relevant memories into actionable sections for agent consumption. It synthesizes recall results into a structured format optimized for quick agent review.

Capsule Sections

Section	Purpose	Trigger Conditions
`must_follow`	High-priority directives	Trusted source + must-follow tags
`uncertain_or_disputed`	Flagged content requiring verification	Low confidence, disputed state, or untrusted source
`risks`	Known risks and hazards	Risk-related tags
`procedures`	Step-by-step instructions	Procedural memory type or procedure tags
`user_preferences`	User-specific preferences	Preference tags or told-by-user source
`project_facts`	Consolidated project knowledge	Semantic memories with no other section match
`recent_changes`	Recently updated information	Memories within recent time window

Sources: src/capsule.ts:22-38

Section Determination Logic

graph TD
    A[Memory Entry] --> B{Source Trusted?}
    B -->|Yes| C{Has Must-Follow Tags?}
    B -->|No| D[Uncertain/Disputed]
    C -->|Yes| E[Must-Follow Section]
    C -->|No| F{Has Risk Tags?}
    D --> F
    F -->|Yes| G[Risks Section]
    F -->|No| H{Procedural Type?}
    H -->|Yes| I[Procedures Section]
    H -->|No| J{Has Preference Tags?}
    J -->|Yes| K[User Preferences]
    J -->|No| L{Uncertain State?}
    L -->|Yes| M[Uncertain/Disputed]
    L -->|No| N[Project Facts]

Capsule Structure

interface MemoryCapsule {
  generated_at: string;
  sections: {
    must_follow?: MemorySection;
    uncertain_or_disputed?: MemorySection;
    risks?: MemorySection;
    procedures?: MemorySection;
    user_preferences?: MemorySection;
    project_facts?: MemorySection;
    recent_changes?: MemorySection;
  };
}

Tag-Based Section Assignment

Capsule generation uses predefined tag sets to categorize memories:

Tag Set	Matching Tags
`MUST_FOLLOW_TAGS`	Critical directives that must be followed
`RISK_TAGS`	Risk-related keywords
`PROCEDURE_TAGS`	Procedure-related keywords
`PREFERENCE_TAGS`	User preference keywords

Sources: src/capsule.ts:12-15

Impact Tracking

Impact tracking closes the feedback loop by recording whether recalled memories proved useful. This enables continuous reinforcement of valuable memories and decay of misleading ones.

Outcome Types

Outcome	Description	Effect on Memory
`helpful`	Memory drove a correct action	Increases salience, bumps retrieval_count
`wrong`	Memory was misleading	Decreases salience, bumps challenge_count
`used`	Memory was referenced	Small positive salience delta

Sources: src/impact.ts
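The table gives only the direction of each effect, so the following sketch fills in hypothetical step sizes to show how an outcome might update a memory's counters. `applyOutcome`, `MemoryStats`, and the delta values are assumptions for this example; only the field names `retrieval_count` and `challenge_count` and the direction of each change come from the table.

```typescript
type Outcome = "helpful" | "wrong" | "used";

interface MemoryStats {
  salience: number;
  retrieval_count: number;
  challenge_count: number;
}

// Hypothetical step sizes; the source documents only the sign of each change.
const SALIENCE_DELTA: Record<Outcome, number> = { helpful: 0.1, wrong: -0.2, used: 0.02 };

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

function applyOutcome(stats: MemoryStats, outcome: Outcome): MemoryStats {
  return {
    // Salience stays inside [0, 1] regardless of how many outcomes accumulate.
    salience: clamp01(stats.salience + SALIENCE_DELTA[outcome]),
    retrieval_count: stats.retrieval_count + (outcome === "helpful" ? 1 : 0),
    challenge_count: stats.challenge_count + (outcome === "wrong" ? 1 : 0),
  };
}
```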

Impact Report Structure

interface ImpactReport {
  generatedAt: string;
  windowDays: number;
  totals: {
    episodic: number;
    semantic: number;
    procedural: number;
  };
  validatedTotal: number;
  validatedInWindow: number;
  byType: {
    episodic: { validated: number; recent: number };
    semantic: { validated: number; recent: number; challenged: number };
    procedural: { validated: number; recent: number };
  };
  outcomeBreakdownInWindow: {
    helpful: number;
    wrong: number;
    used: number;
  };
  topUsed: MemoryStat[];
  weakest: MemoryStat[];
  recentActivity: MemoryStat[];
}

Impact Metrics

The impact system tracks several key metrics:

usage_count: Number of times a memory was successfully used
salience: Computed importance score based on reinforcement history
validation events: Recorded outcomes linked to specific recall events
challenge_count: Number of times a memory was marked as wrong

This data feeds into consolidation and decay processes, ensuring that frequently useful memories remain prominent while stale or misleading memories lose authority over time.

Data Flow Summary

graph LR
    subgraph Encode
        A1[Text Input] --> A2[Embed]
        A2 --> A3[Salience]
        A3 --> A4[Store]
    end
    
    subgraph Recall
        B1[Query] --> B2[Hybrid Search]
        B2 --> B3[Score & Rank]
        B3 --> B4[Filter]
        B4 --> B5[Recall Results]
    end
    
    subgraph Capsule
        C1[Recall Results] --> C2[Section Analysis]
        C2 --> C3[Tag Matching]
        C3 --> C4[Capsule Output]
    end
    
    subgraph Impact
        D1[Agent Feedback] --> D2[Outcome Recording]
        D2 --> D3[Reinforce/Decay]
        D3 --> A3
    end
    
    A4 --> B5
    B5 --> C1
    C4 --> D1

Configuration Considerations

When deploying Audrey's core memory operations, consider these configuration points:

Setting	Recommendation	Impact
`AUDREY_EMBEDDING_PROVIDER`	Pin explicitly	Determines embedding quality
`AUDREY_LLM_PROVIDER`	Pin explicitly	Affects consolidation quality
`AUDREY_DATA_DIR`	Separate per tenant/environment	Ensures isolation and backup simplicity
Retrieval mode	Use `hybrid` for most cases	Balances precision and recall
`wait_for_consolidation`	Enable for critical writes	Guarantees read-after-write consistency

The core memory operations interact with several supporting systems:

Guard: Uses preflight checks before tool execution
Reflexes: Trigger-response patterns derived from memory
Consolidation: Extracts semantic memories from episodic evidence
Decay: Reduces authority of stale memories over time
Promotion: Converts high-value memories to Claude rules

Sources: [src/recall.ts:15-17](https://github.com/Evilander/Audrey/blob/main/src/recall.ts)

Preflight and Reflexes

Related topics: Audrey Guard, Core Memory Operations

Overview

Preflight and Reflexes form Audrey's core decision-making loop for AI agents. Before any tool action executes, Audrey performs a preflight check that consults memory to determine whether the action should be allowed, warned about, or blocked entirely.

The system operates as Audrey's "memory firewall"—a security and guidance layer that prevents agents from repeating mistakes, reinforces learned behaviors, and surfaces relevant context before sensitive operations. This mechanism transforms episodic and semantic memories into actionable guidance that agents can evaluate in real-time.

Architecture

graph TD
    A[Agent Action Request] --> B[Preflight Check]
    B --> C{Memory Recall}
    C --> D[Episodic Memory]
    C --> E[Semantic Memory]
    C --> F[Procedural Memory]
    D --> G[Memory Reflexes]
    E --> G
    F --> G
    G --> H{Decision?}
    H -->|Match Found| I[Return Reflex Result]
    H -->|No Match| J[Allow Action]
    I --> K[block]
    I --> L[warn]
    I --> M[guide]
    K --> N[Block with Evidence]
    L --> O[Warn with Guidance]
    M --> P[Proceed with Hints]

Memory Reflexes

Memory Reflexes are the atomic decision units within the Preflight system. Each reflex contains a trigger condition, a response type, and optional guidance content.

Response Types

Response Type	Decision	Description
`block`	`block`	Prevents the action entirely; returns blocking evidence
`warn`	`caution`	Allows action but presents warning with recommendations
`guide`	`allow`	Provides informational guidance without blocking

Sources: src/reflexes.ts:1-50
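The mapping in the table can be combined into a single decision when several reflexes match. The sketch below assumes that the strictest matched reflex wins and that no match means `allow`; that aggregation rule is an assumption consistent with the table, not something the source states, and `decide` is a hypothetical name.

```typescript
type ResponseType = "block" | "warn" | "guide";
type Decision = "allow" | "caution" | "block";

// Response type -> decision, as listed in the table above.
const DECISION_FOR: Record<ResponseType, Decision> = {
  block: "block",
  warn: "caution",
  guide: "allow",
};

const SEVERITY: Record<Decision, number> = { allow: 0, caution: 1, block: 2 };

// Assumption: the strictest matched reflex decides; an empty list yields "allow".
function decide(responseTypes: ResponseType[]): Decision {
  return responseTypes
    .map((r) => DECISION_FOR[r])
    .reduce<Decision>((worst, d) => (SEVERITY[d] > SEVERITY[worst] ? d : worst), "allow");
}
```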

Reflex Structure

interface MemoryReflex {
  response_type: 'block' | 'warn' | 'guide';
  triggered_by: string;       // Memory tag or rule identifier
  message: string;            // Human-readable explanation
  recommended_action?: string; // Suggested alternative
  memory_ids: string[];        // Source memories that triggered this reflex
  confidence: number;         // Reflex confidence score
}

Preflight Process

Decision Flow

The preflight process evaluates incoming actions against three memory types and returns a consolidated decision:

graph LR
    A[Action + Context] --> B[Tag Extraction]
    B --> C{Must-Follow Rules?}
    C -->|Yes| D[BLOCK or UNCERTAIN]
    C -->|No| E{Risk Tags?}
    E -->|Yes| F[Add to WARN]
    E -->|No| G{Procedures?}
    G -->|Yes| H[Add GUIDANCE]
    G -->|No| I{Preferences?}
    I -->|Yes| J[Include in Capsule]
    I -->|No| K[Default ALLOW]

Decision Types

Decision	Meaning	Exit Code Behavior
`allow`	Action proceeds normally	Continue execution
`caution`	Action proceeds with warning	Log warning, continue
`block`	Action is prevented	Return error, halt

Sources: mcp-server/index.ts:80-95

Memory Capsule Integration

Preflight builds a Memory Capsule—a structured context bundle that aggregates relevant memories by category. The capsule sections determine which memories appear in the agent's context window.

interface MemoryCapsule {
  sections: {
    must_follow: MemoryReflex[];        // Critical rules
    recent_changes: MemoryReflex[];     // New learnings
    procedures: MemoryReflex[];        // How-to guidance
    user_preferences: MemoryReflex[];   // Stated preferences
    risks: MemoryReflex[];              // Warnings and hazards
    uncertain_or_disputed: MemoryReflex[]; // Low-confidence or contested
    project_facts: MemoryReflex[];      // Relevant facts
  };
  triggered_by: string;
  generated_at: string;
}

Sources: src/capsule.ts:1-50

Tag-Based Classification

Memory reflexes are classified using tag matching against predefined tag sets:

Tag Set	Purpose	Associated Section
`MUST_FOLLOW_TAGS`	Critical rules that must be obeyed	`must_follow`
`RISK_TAGS`	Potential hazards or warnings	`risks`
`PROCEDURE_TAGS`	Step-by-step guidance	`procedures`
`PREFERENCE_TAGS`	User-stated preferences	`user_preferences`

Sources: src/capsule.ts:50-100

Building Reflex Reports

Report Generation

The buildReflexReport function constructs a complete preflight report from an action:

export async function buildReflexReport(
  audrey: Audrey,
  action: string,
  options: ReflexOptions = {},
): Promise<MemoryReflexReport>

Report Structure

interface MemoryReflexReport {
  decision: 'allow' | 'caution' | 'block';
  reflexes: MemoryReflex[];
  capsule: MemoryCapsule;
  summary: string;           // Human-readable summary
  triggered_at: string;      // ISO timestamp
  session_id?: string;      // Optional session context
}

Sources: src/reflexes.ts:80-120

Summarization Logic

The summarizeReflexes function generates human-readable summaries:

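function summarizeReflexes(
  decision: PreflightDecision,
  reflexes: MemoryReflex[],
): string {
  // Count matched reflexes by response type.
  const blocks = reflexes.filter(r => r.response_type === 'block').length;
  const warnings = reflexes.filter(r => r.response_type === 'warn').length;
  const guides = reflexes.filter(r => r.response_type === 'guide').length;

  // Documented output format: "Stop: 2 blocking, 1 warning, 3 guidance matched."
  // Other prefixes are "Slow down" and "Proceed".
  // The lines below are an assumed completion (prefix-to-decision mapping inferred),
  // not the source's exact code.
  const prefix = decision === 'block' ? 'Stop' : decision === 'caution' ? 'Slow down' : 'Proceed';
  return `${prefix}: ${blocks} blocking, ${warnings} warning, ${guides} guidance matched.`;
}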

Validation Layer

Before reflexes are applied, the validation layer ensures response integrity:

Response Validation

// From src/validate.ts
// Validates LLM response shape before reading fields
// - Rejects non-object/array conditions
// - Only counts new evidence toward supporting_count
// - Throws on malformed response shapes

Sources: src/validate.ts

Validation Behavior

Check	Invalid Condition	Behavior
Response Shape	Non-object/array	Reject and throw
Evidence Count	Missing supporting_count	Skip from count
Confidence	Non-finite value	Reject in causal module

Sources: CHANGELOG.md
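The checks in the table can be pictured as a guard that runs before any field of an LLM response is read. This is a sketch of one reading of those rules: the `CausalResponse` shape and the `validateResponse` name are hypothetical, and the real logic in `src/validate.ts` may differ.

```typescript
// Hypothetical response shape, for illustration only.
interface CausalResponse {
  confidence: number;
  conditions?: unknown;
}

function validateResponse(raw: unknown): CausalResponse {
  // Reject anything that is not a plain object before reading fields.
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new TypeError("response must be a plain object");
  }
  const record = raw as Record<string, unknown>;

  // Reject NaN and Infinity as well as non-numbers.
  const confidence = record.confidence;
  if (typeof confidence !== "number" || !Number.isFinite(confidence)) {
    throw new TypeError("confidence must be a finite number");
  }

  // Conditions, when present, must be an object or an array.
  const conditions = record.conditions;
  if (conditions !== undefined && (typeof conditions !== "object" || conditions === null)) {
    throw new TypeError("conditions must be an object or array");
  }
  return { confidence, conditions };
}
```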

CLI Integration

Guard Command

The audrey guard command exposes preflight checks via terminal:

audrey guard --tool Bash "npm run deploy"

Guard Options

Option	Description
`--tool <name>`	Tool name being invoked
`--action <command>`	Specific action/command
`--cwd <path>`	Working directory
`--session-id <id>`	Session identifier
`--files <paths>`	Files affected by action
`--json`	Output results as JSON
`--strict`	Fail on warnings
`--include-capsule`	Include full memory capsule
`--explain`	Show reasoning breakdown

Sources: mcp-server/index.ts:40-70

Display Mapping

The CLI maps internal decisions to display messages:

function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
  if (result.decision === 'block') return 'block';
  if (result.decision === 'caution') return 'warn';
  return 'allow';
}

Configuration Options

Environment Variables

Variable	Default	Purpose
`AUDREY_CONTEXT_BUDGET_CHARS`	`4000`	Memory capsule character budget
`AUDREY_ENABLE_ADMIN_TOOLS`	`0`	Enable export/import/forget routes
`AUDREY_DEBUG`	`0`	Print MCP info logs

Runtime Options

interface ReflexOptions {
  agent?: string;                    // Agent identifier
  sessionId?: string;               // Session context
  includeCapsule?: boolean;          // Include full capsule
  includePreflight?: boolean;       // Include preflight details
  context?: Record<string, string>; // Additional context
  mood?: MoodConfig;                // Emotional context
}

Memory Types and Section Assignment

Assignment Logic

graph TD
    A[Memory Entry] --> B{Source Trusted?}
    B -->|Yes + Must-Follow Tags| C[must_follow section]
    B -->|No + Must-Follow Tags| D[uncertain_or_disputed]
    B -->|Risk Tags| E[risks section]
    B -->|Procedural Type/Tags| F[procedures section]
    B -->|Preference Tags| G[user_preferences section]
    A --> H{State or Low Confidence?}
    H -->|disputed/context_dependent/confidence<0.55| I[uncertain_or_disputed]
    A --> J{Recent Window?}
    J -->|Yes| K[recent_changes section]
    J -->|No| L[Default: project_facts]

Threshold Values

Condition	Threshold	Section Assignment
Confidence (disputed)	< 0.55	`uncertain_or_disputed`
Recent Window	7 days	`recent_changes`
Tool Failure	7 days	`risks`
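The assignment rules and thresholds above can be sketched as a single classifier. Several parts are assumptions: the tag sets hold placeholder values (the real contents live in `src/capsule.ts`), the check order is one plausible precedence, and the sketch returns a single section even though the diagram suggests a memory may qualify for more than one. The 0.55 confidence threshold, the 7-day window, and the trusted sources come from this document.

```typescript
type Section =
  | "must_follow" | "uncertain_or_disputed" | "risks" | "procedures"
  | "user_preferences" | "recent_changes" | "project_facts";

interface Entry {
  type: "episodic" | "semantic" | "procedural";
  source: string;
  tags: string[];
  state: string;
  confidence: number;
  createdAt: Date;
}

// Placeholder tag sets; the actual tags are not listed in this document.
const MUST_FOLLOW_TAGS = new Set<string>(["must-follow"]);
const RISK_TAGS = new Set<string>(["risk"]);
const PROCEDURE_TAGS = new Set<string>(["procedure"]);
const PREFERENCE_TAGS = new Set<string>(["preference"]);
const TRUSTED_SOURCES = new Set<string>(["direct-observation", "told-by-user"]);
const RECENT_WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

const hasTag = (e: Entry, set: Set<string>): boolean => e.tags.some((t) => set.has(t));

function assignSection(e: Entry, now: Date = new Date()): Section {
  const uncertain =
    e.state === "disputed" || e.state === "context_dependent" || e.confidence < 0.55;
  // Must-follow tags only count when the source is trusted.
  if (hasTag(e, MUST_FOLLOW_TAGS)) {
    return TRUSTED_SOURCES.has(e.source) ? "must_follow" : "uncertain_or_disputed";
  }
  if (uncertain) return "uncertain_or_disputed";
  if (hasTag(e, RISK_TAGS)) return "risks";
  if (e.type === "procedural" || hasTag(e, PROCEDURE_TAGS)) return "procedures";
  if (hasTag(e, PREFERENCE_TAGS)) return "user_preferences";
  if (now.getTime() - e.createdAt.getTime() <= RECENT_WINDOW_MS) return "recent_changes";
  return "project_facts";
}
```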

Data Flow Example

sequenceDiagram
    participant Agent
    participant MCP as MCP Server
    participant Audrey
    participant Memory
    participant Reflex

    Agent->>MCP: tool_use(Bash, "rm -rf /")
    MCP->>Audrey: buildPreflight(audrey, action)
    Audrey->>Memory: recall(action, context)
    Memory-->>Audrey: MemoryReflex[]
    Audrey->>Reflex: classifyAndScore(reflexes)
    Reflex-->>Audrey: MemoryReflexReport
    Audrey-->>MCP: PreflightDecision
    MCP-->>Agent: block/caution/allow response

Security Considerations

HTTP Endpoint Protection

The preflight system includes security hardening for HTTP access:

REST endpoints default to loopback-only binding
API key comparison uses crypto.timingSafeEqual to prevent timing attacks
Options like includePrivate: true cannot be passed via HTTP bodies
Non-loopback binding requires explicit AUDREY_API_KEY

Sources: src/routes.ts
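The constant-time comparison mentioned above can be sketched with Node's `crypto` module. `timingSafeEqual` throws when its inputs differ in length, so this sketch hashes both values first to get equal-length buffers. Whether Audrey hashes before comparing is not stated in the source, and `apiKeyMatches` is a hypothetical name.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both values so the buffers always have the same length (32 bytes),
// then compare them without leaking timing information about the contents.
function apiKeyMatches(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain `===` comparison can return early at the first differing character, which is the timing signal this approach is meant to avoid.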

Recall Sanitization

HTTP /v1/recall and /v1/capsule endpoints sanitize input through sanitizeRecallOptions():

// Allowed keys only
const ALLOWED_KEYS = ['limit', 'agent', 'tags', 'sources', 'after', 'before', 'context', 'mood', 'retrieval', 'scope'];

Any keys not in the allowlist are silently dropped before processing.
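A minimal sketch of allowlist sanitization follows, assuming the body arrives as parsed JSON. The key list is copied from the snippet above; the function body is illustrative and is named `sanitizeRecallOptionsSketch` to make clear that it is not the implementation in the source.

```typescript
const ALLOWED_KEYS = [
  "limit", "agent", "tags", "sources", "after",
  "before", "context", "mood", "retrieval", "scope",
] as const;

// Copy only allowlisted keys from an untrusted HTTP body; everything else is dropped.
function sanitizeRecallOptionsSketch(body: unknown): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  if (typeof body !== "object" || body === null || Array.isArray(body)) return clean;
  const input = body as Record<string, unknown>;
  for (const key of ALLOWED_KEYS) {
    if (Object.prototype.hasOwnProperty.call(input, key)) clean[key] = input[key];
  }
  return clean;
}
```

Because `includePrivate` is not on the list, a request body that sets it reaches the recall layer without that key.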

Component	File	Role
Rules Compiler	`src/rules-compiler.ts`	Compiles memories into Claude rules
Validation	`src/validate.ts`	Validates LLM response integrity
Impact Tracking	`src/impact.ts`	Tracks reflex effectiveness over time
Memory Capsule	`src/capsule.ts`	Structures context bundles

Data Storage

Related topics: System Architecture, Core Memory Operations

Overview

Audrey's data storage layer is built as a local-first, SQLite-backed persistence system designed for AI agent memory continuity. The storage architecture eliminates external database dependencies while providing vector similarity search capabilities through sqlite-vec, enabling semantic memory retrieval without cloud infrastructure.

The storage system serves as the foundation for Audrey's multi-type memory model, supporting episodic, semantic, and procedural memory with built-in confidence tracking, contradiction handling, and temporal decay mechanisms.

Storage Architecture

Core Technology Stack

Component	Technology	Purpose
Primary Database	SQLite	Structured memory storage, ACID transactions
Vector Search	sqlite-vec	Semantic similarity search on embeddings
Data Directory	`AUDREY_DATA_DIR`	Tenant/environment isolation boundary

The storage backend runs entirely locally, requiring no hosted database services. Each tenant or environment should use a dedicated AUDREY_DATA_DIR to maintain isolation boundaries.

Sources: README.md

Database Schema Design

Audrey maintains three primary memory tables that correspond to its memory model:

erDiagram
    MEMORIES ||--o{ VECTORS : contains
    MEMORIES {
        string id PK
        string content
        string memory_type
        float confidence
        float salience
        string state
        int evidence_count
        int usage_count
        timestamp created_at
        timestamp last_used_at
    }
    VECTORS {
        int rowid
        float vector
        text content
    }

Memory Type Storage

Memory Type	Description	Key Attributes
episodic	Specific observations, tool results, session facts	`source`, `tags`, `created_at`
semantic	Consolidated principles from repeated evidence	`evidence_count`, `supporting_count`, `contradicting_count`
procedural	Remembered procedures, actions to avoid or retry	`usage_count`, `failure_prevented`, `tags`

Sources: src/promote.ts

Memory Model Implementation

Episodic Memory Storage

Episodic memories capture specific observations and session-level facts. These entries are created during direct agent interactions and tool executions.

Key storage characteristics:

High-volume insertion during active sessions
Temporal ordering via created_at timestamps
Tag-based categorization for filtered retrieval
Source attribution (direct-observation, told-by-user)

Semantic Memory Storage

Semantic memories represent consolidated principles extracted from accumulated episodic evidence. The promotion system converts episodic memories into semantic rules when confidence thresholds are met.

The promotion criteria for semantic memories include:

Minimum evidence count threshold (default: 3)
Zero contradicting evidence
State must be active

Sources: src/promote.ts:78-92

Procedural Memory Storage

Procedural memories track action sequences and their outcomes. These are distinguished by usage tracking and failure prevention metrics.

Procedural candidates are promoted when:

evidence_count >= minEvidence
contradicting_count === 0
retrieval_count > 0 OR failure_prevented > 0
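The semantic and procedural promotion criteria above can be written as two predicates. The `Candidate` shape and function names are hypothetical, and the sketch assumes the default evidence threshold of 3 applies to both kinds, which the source states only for semantic memories.

```typescript
interface Candidate {
  state: string;
  evidence_count: number;
  contradicting_count: number;
  retrieval_count: number;
  failure_prevented: number;
}

const DEFAULT_MIN_EVIDENCE = 3;

// Semantic: enough evidence, no contradictions, and an active state.
function isSemanticCandidate(c: Candidate, minEvidence = DEFAULT_MIN_EVIDENCE): boolean {
  return c.state === "active" && c.evidence_count >= minEvidence && c.contradicting_count === 0;
}

// Procedural: enough evidence, no contradictions, and some sign of real use.
function isProceduralCandidate(c: Candidate, minEvidence = DEFAULT_MIN_EVIDENCE): boolean {
  return (
    c.evidence_count >= minEvidence &&
    c.contradicting_count === 0 &&
    (c.retrieval_count > 0 || c.failure_prevented > 0)
  );
}
```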

Confidence and Salience Tracking

Confidence Scoring

Confidence scores are computed from supporting versus contradicting evidence:

confidence = supporting / max(evidence, 1)

The confidence value is clamped to the range [0, 1] to prevent invalid states. Negative salience values from malformed arousal calculations are also clamped.

Sources: CHANGELOG.md
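As a worked example of the formula, 3 supporting observations out of 4 total give 3 / 4 = 0.75, and a memory with no evidence yet gives 0 / max(0, 1) = 0 rather than a division by zero. A minimal sketch, with `computeConfidence` as a hypothetical name:

```typescript
// confidence = supporting / max(evidence, 1), clamped to [0, 1].
function computeConfidence(supporting: number, evidence: number): number {
  const raw = supporting / Math.max(evidence, 1);
  return Math.min(1, Math.max(0, raw));
}
```

The clamp matters only for inconsistent inputs, such as a supporting count that exceeds the evidence count.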

Salience System

Salience represents the importance and emotional weight of memories, influencing recall priority. The effectiveSalience calculation factors in:

Base salience from evidence strength
Temporal decay over time
Arousal/affect resonance from recent memories
Validation feedback (helpful, wrong, used outcomes)

Validation Feedback Loop

The closed-loop validation system updates salience based on memory utility:

Outcome	Salience Effect	Counts Updated
`helpful`	Increases	`retrieval_count`, salience
`wrong`	Decreases	`challenge_count` (semantic only)
`used`	Neutral/slight	`usage_count`

Sources: CHANGELOG.md

Retrieval and Search

Hybrid Recall Architecture

Audrey implements a hybrid retrieval strategy combining vector similarity with keyword matching:

graph TD
    A[Query Input] --> B[Embedding Provider]
    B --> C[Vector Similarity Search]
    A --> D[Full-Text Search FTS]
    C --> E[Confidence Scoring]
    D --> E
    E --> F[Memory Filtering]
    F --> G[Ranked Results]

Retrieval Modes

Mode	Behavior
`hybrid` (default)	Combines vector + FTS for balanced recall
`vector`	Pure semantic similarity, bypasses FTS
`keyword`	Skips vector pass, uses FTS only

The vector mode serves as a fast path when FTS overhead is unacceptable.

Sources: src/recall.ts

Filtering Capabilities

Recall operations support multiple filter dimensions:

tags: Array of tag values to match
sources: Array of source identifiers
after/before: Temporal bounds via ISO timestamps
scope: shared or agent-scoped memories
types: Filter by memory type (episodic/semantic/procedural)

Private Memory Isolation

The includePrivate flag controls access to agent-specific private memories. The HTTP API implements an allowlist-based sanitizer (sanitizeRecallOptions()) that prevents bypassing private-memory ACL controls through body options.

Sources: src/routes.ts

Data Lifecycle

Consolidation and Decay

The audrey dream command triggers memory consolidation:

Episodic memories are evaluated for principle extraction
Low-confidence or conflicting memories undergo decay
Stale memories lose retrieval authority over time
Contradicting claims are tracked rather than silently overwritten

Sources: README.md

Rollback Operations

The rollback system (src/rollback.ts) updates memories with verification:

Checks .changes to confirm affected rows
Aggregates real counts rather than assuming success
Reports failure when targeted IDs don't exist

Sources: CHANGELOG.md

Reembedding

When embedding models or dimensions change, reembedding regenerates all vector representations:

Chunks embeddings into 256-row batches
Labels failures by kind and row range
Provides clear error messages for partial failures

Sources: CHANGELOG.md
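The batching and failure labeling described above might look like the following. The 256-row batch size comes from the list; the `Batch` shape, the function names, and the message format are assumptions made for illustration.

```typescript
interface Batch {
  start: number; // first row index in the batch (inclusive)
  end: number;   // last row index in the batch (inclusive)
}

// Split totalRows rows into consecutive batches of at most batchSize rows.
function toBatches(totalRows: number, batchSize = 256): Batch[] {
  const batches: Batch[] = [];
  for (let start = 0; start < totalRows; start += batchSize) {
    batches.push({ start, end: Math.min(start + batchSize, totalRows) - 1 });
  }
  return batches;
}

// Hypothetical message format: label a failure by memory kind and row range.
function describeFailure(kind: string, batch: Batch): string {
  return `${kind} reembed failed for rows ${batch.start}-${batch.end}`;
}
```

Carrying the row range with each batch means a partial failure can be reported and retried without reprocessing rows that already succeeded.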

Import and Export

Data Portability

Audrey supports full data export and import for:

Snapshot restoration to fresh stores
Backup before configuration changes
Migration between environments

Exported snapshots should only be restored into an empty Audrey store with a fresh AUDREY_DATA_DIR, to prevent data corruption.

Sources: python/README.md

Export Process

Export operations create portable snapshots containing:

All memory records (episodic, semantic, procedural)
Associated metadata (timestamps, confidence scores)
Configuration state

Import Validation

Import operations verify store emptiness before restoration:

isDatabaseEmpty() // Checks both records and vector tables

Sources: CHANGELOG.md

Security Considerations

Credential Protection

Raw credentials and API keys must be excluded from encoded memory content. Audrey provides redaction functionality to prevent sensitive data exposure:

const SENSITIVE_KEY_PATTERN = /(password|secret|api[_-]?key|auth[_-]?token|...)$/i;

Sources: src/redact.ts
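Key-based redaction can be sketched as below. The pattern here uses only the alternatives visible in the snippet above; the ones elided by `...` are deliberately left out, so this covers less than the real pattern. `redactSketch` and the `[REDACTED]` placeholder are hypothetical.

```typescript
// Subset of the pattern shown above; the elided alternatives are not reconstructed.
const SENSITIVE_KEY_SUBSET = /(password|secret|api[_-]?key|auth[_-]?token)$/i;

// Replace the value of any key whose name ends with a sensitive suffix.
function redactSketch(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(record)) {
    out[key] = SENSITIVE_KEY_SUBSET.test(key) ? "[REDACTED]" : record[key];
  }
  return out;
}
```

The trailing `$` anchors the match to the end of the key name, so `DB_PASSWORD` matches while `password_hint` does not.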

API Security

`audrey serve` binds to 127.0.0.1 by default (previously 0.0.0.0)
Non-loopback hosts require AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks

Sources: CHANGELOG.md
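The binding rules above amount to a startup check along these lines. The environment variable names come from this document; the set of hosts treated as loopback, the error message, and the `assertSafeBinding` name are assumptions.

```typescript
// Assumption: these are the hosts treated as loopback.
const LOOPBACK_HOSTS = new Set<string>(["127.0.0.1", "::1", "localhost"]);

// Returns the host to bind, or throws when a non-loopback host has no auth configured.
function assertSafeBinding(env: Record<string, string | undefined>): string {
  const host = env.AUDREY_HOST ?? "127.0.0.1";
  if (LOOPBACK_HOSTS.has(host)) return host;
  if (env.AUDREY_API_KEY || env.AUDREY_ALLOW_NO_AUTH === "1") return host;
  throw new Error(`Refusing to bind ${host} without AUDREY_API_KEY or AUDREY_ALLOW_NO_AUTH=1`);
}
```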

Production Recommendations

Recommendation	Rationale
Set one `AUDREY_DATA_DIR` per tenant	Isolation boundary
Pin embedding and LLM providers	Reproducibility
Backup before provider changes	Data integrity
Keep credentials out of memory content	Security
Use `AUDREY_API_KEY` for network exposure	Access control

Configuration

Environment Variables

Variable	Default	Purpose
`AUDREY_DATA_DIR`	-	Data directory path (required for isolation)
`AUDREY_EMBEDDING_PROVIDER`	-	Embedding model provider
`AUDREY_LLM_PROVIDER`	-	LLM provider for memory operations
`AUDREY_API_KEY`	-	API authentication key
`AUDREY_HOST`	`127.0.0.1`	Network binding address
`AUDREY_ALLOW_NO_AUTH`	`0`	Allow unauthenticated access

Sources: README.md

Memory Model - Multi-type memory architecture
Recall System - Retrieval and search mechanisms
Guard Loop - Pre-action memory checking
Impact Analysis - Memory effectiveness tracking

Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)

MCP Server

Related topics: System Architecture, REST API

Overview

The Audrey MCP Server is a Model Context Protocol (MCP) stdio server that provides a local-first memory layer for AI agents. It enables agents to encode experiences into persistent memory, recall relevant context before actions, and maintain a durable memory state across sessions.

The server exposes 20+ tools plus status, recent, and principles resources, along with briefing, recall, and reflection prompts. It communicates via stdio (standard input/output), so it works with MCP hosts such as Claude Code, Claude Desktop, Cursor, Windsurf, and VS Code.

Sources: README.md

Architecture

System Context

graph TD
    subgraph "MCP Hosts"
        A[Claude Code]
        B[Claude Desktop]
        C[Cursor]
        D[Windsurf]
        E[VS Code]
        F[Other MCP Clients]
    end
    
    subgraph "Audrey MCP Server"
        G[MCP stdio Server]
        H[Tool Handlers]
        I[Resource Providers]
        J[Prompt Templates]
    end
    
    subgraph "Audrey Core"
        K[Memory Store<br/>SQLite + sqlite-vec]
        L[Embedding Engine<br/>ONNX]
        M[Retrieval Engine]
    end
    
    A --> G
    B --> G
    C --> G
    D --> G
    E --> G
    F --> G
    
    G --> H
    G --> I
    G --> J
    
    H --> K
    I --> K
    J --> K
    
    K --> L
    K --> M

Server Initialization Flow

sequenceDiagram
    participant Host as MCP Host
    participant Server as McpServer
    participant Audrey as Audrey Instance
    participant Store as Memory Store
    
    Host->>Server: Start Process
    Server->>Audrey: Initialize with config
    Audrey->>Store: Open SQLite + sqlite-vec
    alt Warmup Enabled
        Audrey->>Store: Pre-compute embeddings
    end
    Server->>Server: registerHostResources()
    Server->>Server: registerHostPrompts()
    Server->>Server: Register Tools
    Server->>Host: Ready (stdio)

Server Components

Core Server Setup

The MCP server is initialized with a name and version, then configured with resources, prompts, and tools:

const server = new McpServer({
  name: SERVER_NAME,
  version: VERSION,
});

registerHostResources(server, audrey);
registerHostPrompts(server);

Sources: mcp-server/index.ts:101-106

Tool Registry

The server registers the following tool categories:

Category	Tools	Purpose
Memory Operations	`memory_encode`, `memory_recall`	Store and retrieve memories
Memory Management	`memory_import`, `memory_export`, `memory_forget`	Data management
Impact Tracking	`mark_used`, `impact_report`	Memory utility tracking
Promotion	`promote_memory`, `rule_review`	Memory-to-rules conversion

Tools Reference

memory_encode

Encodes new information into the memory store with diagnostic support.

Parameters:

Parameter	Type	Required	Description
`content`	string	Yes	The memory content to encode
`source`	string	No	Source identifier (e.g., "direct-observation", "told-by-user")
`tags`	string[]	No	Classification tags
`salience`	number	No	Importance weight (0-1)
`private`	boolean	No	Mark as private memory
`context`	object	No	Additional context metadata
`affect`	object	No	Emotional/valence metadata
`wait_for_consolidation`	boolean	No	Opt-in read-after-write semantics (default: false)

Returns: Tool result with memory ID, content, source, and optionally diagnostics.

Sources: mcp-server/index.ts:108-133
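To make the parameter table concrete, the sketch below builds an MCP `tools/call` request for `memory_encode`. The JSON-RPC envelope follows the MCP convention of passing `name` and `arguments` in `params`; the argument names come from the table, while `buildEncodeRequest` and the range check on `salience` are illustrative assumptions.

```typescript
interface MemoryEncodeArgs {
  content: string;
  source?: string;
  tags?: string[];
  salience?: number;
  private?: boolean;
  context?: Record<string, unknown>;
  affect?: Record<string, unknown>;
  wait_for_consolidation?: boolean;
}

// Hypothetical helper: wrap the arguments in a JSON-RPC tools/call request.
function buildEncodeRequest(id: number, args: MemoryEncodeArgs) {
  // The table documents salience as a 0-1 weight; reject values outside it early.
  if (args.salience !== undefined && (args.salience < 0 || args.salience > 1)) {
    throw new RangeError("salience must be within 0-1");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "memory_encode", arguments: args },
  };
}

const encodeRequest = buildEncodeRequest(1, {
  content: "Deploys must run from the release branch",
  source: "told-by-user",
  tags: ["deploy"],
  wait_for_consolidation: true, // opt in to read-after-write semantics
});
```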

memory_recall

Retrieves relevant memories based on a query with filtering options.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Search query
`limit`	number	No	Maximum results (default: 10)
`types`	string[]	No	Filter by memory types
`min_confidence`	number	No	Minimum confidence threshold
`tags`	string[]	No	Filter by tags
`sources`	string[]	No	Filter by sources
`after`	string	No	ISO timestamp lower bound
`before`	string	No	ISO timestamp upper bound
`context`	object	No	Context filtering
`mood`	string	No	Mood-based filtering

Returns: Array of recall results with confidence scores and metadata.

Sources: mcp-server/index.ts:135-142

Additional Tools

Tool	Purpose
`memory_import`	Import memory snapshots
`memory_export`	Export memory snapshots
`memory_forget`	Delete specific memories
`mark_used`	Record memory utility
`impact_report`	Generate impact analytics
`promote_memory`	Convert memory to rule
`rule_review`	Review promotion candidates

Command Line Interface

Command Routing

The MCP server entry point (mcp-server/index.ts) handles CLI subcommands before starting the stdio loop:

graph TD
    A[audrey CLI] --> B{Subcommand?}
    
    B -->|--help / -h / help| C[printHelp]
    B -->|--version / -v / version| D[printVersion]
    B -->|install| E[install]
    B -->|uninstall| F[uninstall]
    B -->|mcp-config| G[printMcpConfig]
    B -->|hook-config| H[printHookConfig]
    B -->|demo| I[runDemoCommand]
    B -->|reembed| J[reembed]
    B -->|dream| K[dream]
    B -->|greeting| L[greeting]
    B -->|NONE| M[Start MCP Server]
    
    C --> N[Exit 0]
    D --> N
    E --> N
    F --> N
    G --> N
    H --> N
    I --> N
    J --> N
    K --> N
    L --> N
    M --> O[stdio Loop]

Sources: mcp-server/index.ts:200-240

Available Subcommands

Command	Description
`audrey install`	Register Audrey with host MCP configuration
`audrey uninstall`	Remove Audrey from host configuration
`audrey mcp-config`	Print MCP server configuration
`audrey hook-config`	Generate Claude Code hook configurations
`audrey demo`	Run interactive demonstration
`audrey reembed`	Regenerate embeddings
`audrey dream`	Run memory consolidation and decay
`audrey greeting`	Display greeting message
`audrey doctor`	Run diagnostic checks
`audrey guard`	Check memory before action
`audrey status`	Show memory system status
`audrey promote`	Promote memories to rules
`audrey impact`	Generate impact reports
`audrey observe-tool`	Monitor tool execution

Help and Version Short-Circuit

Help and version flags MUST short-circuit before falling through to the MCP server. A user running audrey --help should see help, not be dropped into a stdio loop: Sources: mcp-server/index.ts:201-206

if (subcommand === '--help' || subcommand === '-h' || subcommand === 'help') {
  printHelp();
  process.exit(0);
} else if (subcommand === '--version' || subcommand === '-v' || subcommand === 'version') {
  printVersion();
  process.exit(0);
}

Configuration Management

MCP Host Configuration

The config.ts module provides functions to generate host-specific MCP configurations: Sources: mcp-server/config.ts:1-50

export function formatMcpHostConfig(
  host: string | undefined = 'generic',
  env: Record<string, string | undefined> = process.env,
): string

Supported Hosts:

Host	Config Format	Notes
`codex`	TOML	GitHub MCP config style
`claude-code`	JSON	Claude Code MCP settings
`claude-desktop`	JSON	Claude Desktop config
`cursor`	JSON	Cursor MCP config
`windsurf`	JSON	Windsurf MCP config
`vscode`	JSON	VS Code MCP config
`jetbrains`	JSON	JetBrains MCP config
`generic`	JSON	Generic MCP fallback

Installation Arguments

The buildInstallArgs() function generates CLI arguments for installing the MCP server: Sources: mcp-server/config.ts:52-66

export function buildInstallArgs(
  env: Record<string, string | undefined> = process.env,
  options: McpEnvOptions = {},
): string[]

Generated Output Example:

mcp add -s user audrey -e AUDREY_AGENT=claude-code -e AUDREY_DATA_DIR=... -- node /path/to/mcp-entrypoint
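
To make the shape of that argument list concrete, here is a minimal sketch that reproduces the example output from an environment map. `sketchInstallArgs` is a hypothetical stand-in, not the real `buildInstallArgs()`; the choice of forwarded variables and the explicit `entrypoint` parameter are assumptions made for illustration.

```typescript
// Hypothetical re-creation of the example argument list; not Audrey's implementation.
function sketchInstallArgs(
  env: Record<string, string | undefined>,
  entrypoint: string,
): string[] {
  const args = ["mcp", "add", "-s", "user", "audrey"];
  // Assumption: only these two variables are forwarded, as in the example output.
  for (const key of ["AUDREY_AGENT", "AUDREY_DATA_DIR"]) {
    const value = env[key];
    if (value !== undefined && value !== "") {
      args.push("-e", `${key}=${value}`);
    }
  }
  // Everything after "--" is the command the host runs to start the server.
  args.push("--", "node", entrypoint);
  return args;
}
```

Returning an array rather than a joined string keeps each argument intact when it is handed to a process spawner, so paths containing spaces need no extra quoting.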

Install Guide Generation

The formatInstallGuide() function generates human-readable installation instructions: Sources: mcp-server/index.ts:18-48

export function formatInstallGuide(
  host: string,
  env: Record<string, string | undefined> = process.env,
  dryRun = false,
): string

Output Sections:

Title (with dry-run or config-only indicator)
No-modification notice
Generated MCP config
Generated Claude Code hook config (for Claude Code host)
Next steps

Host-Specific Resources

Resource Registration

Resources are registered per-host to provide context: Sources: mcp-server/index.ts:103

registerHostResources(server, audrey);
registerHostPrompts(server);

Available Resources

Resource	Type	Description
`status`	Resource	Current system status
`recent`	Resource	Recent memory activity
`principles`	Resource	Core operational principles

Available Prompts

Prompt	Purpose
`briefing`	Get current session briefing
`recall`	Perform focused recall
`reflection`	Generate self-reflection

Error Handling

Tool Error Response

Tool handlers return structured error responses: Sources: mcp-server/index.ts:100

function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: `[audrey] error: ${err}` }],
    isError: true,
  };
}

Tool Success Response

Tool handlers return structured success responses with optional diagnostics: Sources: mcp-server/index.ts:99

function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}
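
The two helpers are typically paired inside a handler so that every code path returns a well-formed result. The sketch below shows one way that could look. The `CallToolResult` type is a minimal local stand-in for the MCP SDK type, `runTool` is a hypothetical wrapper, and the handler is synchronous for brevity where real handlers would be async.

```typescript
// Minimal local stand-in for the MCP SDK result type, so the sketch is self-contained.
type CallToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean;
  _meta?: Record<string, unknown>;
};

function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: "text", text: `[audrey] error: ${err}` }],
    isError: true,
  };
}

function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: "text", text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}

// Hypothetical wrapper: run the tool body and convert any throw into toolError.
function runTool(work: () => unknown): CallToolResult {
  try {
    return toolResult(work());
  } catch (err) {
    return toolError(err);
  }
}
```

With this wrapper a thrown `Error` reaches the client as a text result flagged `isError: true` rather than as an unhandled exception.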

Environment Variables

Variable	Default	Description
`AUDREY_AGENT`	`claude-code`	Host agent identifier
`AUDREY_DATA_DIR`	Platform-specific	Data directory path
`AUDREY_PROFILE`	`0`	Enable profiling diagnostics
`AUDREY_DEBUG`	`0`	Enable debug logging
`AUDREY_DISABLE_WARMUP`	`0`	Skip embedding warmup
`AUDREY_API_KEY`	unset	REST API authentication
`AUDREY_HOST`	`127.0.0.1`	REST bind address
`AUDREY_PORT`	`7437`	REST server port
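
The defaults in this table can be sketched as a small parsing function. The `AudreyEnv` interface and `readAudreyEnv` are hypothetical names, and treating only the string `1` as "enabled" is an assumption based on the `0` defaults shown; the platform-specific data directory default is deliberately not reproduced.

```typescript
interface AudreyEnv {
  agent: string;
  dataDir: string | undefined; // platform-specific default not modeled here
  profile: boolean;
  debug: boolean;
  disableWarmup: boolean;
  apiKey: string | undefined;
  host: string;
  port: number;
}

// Read the documented variables from an env map, applying the table's defaults.
function readAudreyEnv(env: Record<string, string | undefined>): AudreyEnv {
  // Assumption: a flag is on only when it is exactly "1".
  const flag = (name: string): boolean => env[name] === "1";
  return {
    agent: env["AUDREY_AGENT"] ?? "claude-code",
    dataDir: env["AUDREY_DATA_DIR"],
    profile: flag("AUDREY_PROFILE"),
    debug: flag("AUDREY_DEBUG"),
    disableWarmup: flag("AUDREY_DISABLE_WARMUP"),
    apiKey: env["AUDREY_API_KEY"],
    host: env["AUDREY_HOST"] ?? "127.0.0.1",
    port: Number(env["AUDREY_PORT"] ?? "7437"),
  };
}
```

Passing the map explicitly, instead of reading `process.env` inside the function, matches the `env` parameter style used by `formatMcpHostConfig` and keeps the function easy to test.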

Performance Characteristics

v0.22.0 Performance Metrics

Operation	Before	After	Improvement
Encode response (p50)	24.7ms	15.2ms	~40% faster
Cold-start first encode	525ms	28ms (with warmup)	~18.7x faster
Hybrid recall (p50)	30.2ms	14.3ms	~2.1x faster

Optimization Details

Eliminated 3 of 4 redundant embedding calls during encode
Validation, interference, and affect resonance reuse the main content vector
Background embedding warmup at MCP boot reduces cold-start latency Sources: CHANGELOG.md

Security

API Key Timing Safety

HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks. Sources: CHANGELOG.md
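
A constant-time comparison with Node's `crypto.timingSafeEqual` can be sketched as follows. This is not Audrey's code: `apiKeysMatch` is a hypothetical helper, and hashing both inputs first is one common way to satisfy `timingSafeEqual`, which throws when its two buffers differ in length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare two secrets without leaking, through timing, where they first differ.
function apiKeysMatch(provided: string, expected: string): boolean {
  // SHA-256 digests are always 32 bytes, so the length precondition always holds.
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain `!==` on strings can return as soon as the first differing character is found, which is the behavior a timing attack tries to measure.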

Recall Options Sanitization

HTTP /v1/recall and /v1/capsule sanitize request bodies to prevent ACL bypass. Sources: CHANGELOG.md

Promote Path Restrictions

audrey promote --yes restricts writes to process.cwd() unless the target is in AUDREY_PROMOTE_ROOTS, which limits where a prompt-injected promote call can write. Sources: CHANGELOG.md
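
A path restriction of this kind usually comes down to a containment check. The sketch below shows one way to write it with Node's `path` module. `isInside` and `mayWrite` are hypothetical helpers, and passing the extra roots as an array is an assumption, since the format of `AUDREY_PROMOTE_ROOTS` is not described here. The check is purely lexical and does not resolve symlinks.

```typescript
import * as path from "node:path";

// True when target is root itself or lies somewhere beneath it.
function isInside(root: string, target: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(target));
  if (rel === "") return true;
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Allow a write only under the working directory or an explicitly listed root.
function mayWrite(target: string, cwd: string, extraRoots: string[] = []): boolean {
  return [cwd, ...extraRoots].some((root) => isInside(root, target));
}
```

Resolving both paths before comparing means a target such as `/work/app/../other/x` is judged by where it really points, not by its prefix.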

Profiling Mode

When AUDREY_PROFILE=1, tools return diagnostic metadata: Sources: mcp-server/index.ts:110-120

if (profileEnabled) {
  const { id, diagnostics } = await audrey.encodeWithDiagnostics({
    content,
    source,
    tags,
    salience,
    private: isPrivate,
    context,
    affect,
    waitForConsolidation: wait_for_consolidation,
  });
  return toolResult({ id, content, source, private: isPrivate ?? false }, diagnostics);
}

Diagnostic Data Includes:

Per-stage timing information
Embedding generation time
Retrieval latency breakdown

README.md - Main project documentation
CHANGELOG.md - Version history and release notes
src/capsule.ts - Memory capsule generation
src/rules-compiler.ts - Memory-to-rules promotion
src/impact.ts - Impact analytics reporting

Source: https://github.com/Evilander/Audrey / Human Manual

REST API

Related topics: System Architecture, MCP Server

The Audrey REST API provides a local-first HTTP interface for memory management operations, enabling external agents and services to interact with Audrey's memory system without direct database access.

Overview

Audrey's REST API is built on Hono, a lightweight web framework based on Web Standard APIs that runs on Node.js as well as edge runtimes. The API runs as a sidecar service that wraps the core memory engine with HTTP endpoints for encoding, recalling, and managing memory entries.

Key characteristics:

Local-first design with no external database dependencies
SQLite + sqlite-vec for storage and vector search
Bearer token authentication for non-loopback access
Type-safe request/response handling

Sources: README.md:60

Architecture

graph TD
    A[Client<br>Python/JS SDK] --> B[REST API<br>Hono Server]
    B --> C[Audrey Core Engine]
    C --> D[SQLite<br>Memory Store]
    C --> E[sqlite-vec<br>Vector Index]
    B --> F[/health]
    B --> G[/v1/recall]
    B --> H[/v1/capsule]
    B --> I[/v1/encode]
    B --> J[Admin Routes<br>/v1/import<br>/v1/export]

Server Configuration

Environment Variables

Variable	Default	Description
`AUDREY_HOST`	`127.0.0.1`	REST sidecar bind address. Set to `0.0.0.0` only with `AUDREY_API_KEY`.
`AUDREY_PORT`	`7437`	Port for the REST server to listen on.
`AUDREY_API_KEY`	unset	Bearer token required for non-loopback REST traffic.
`AUDREY_ALLOW_NO_AUTH`	`0`	Set to `1` to allow non-loopback bind without an API key. Not recommended.
`AUDREY_ENABLE_ADMIN_TOOLS`	`0`	Set to `1` to enable export, import, and forget routes/tools. Disabled by default.
`AUDREY_PRAGMA_DEFAULTS`	`1`	Set to `0` to revert SQLite PRAGMA tuning to better-sqlite3 defaults.
`AUDREY_DEBUG`	`0`	Set to `1` to print MCP info logs. Errors always log.

Sources: README.md:44-52

Starting the Server

# Default (loopback only)
npx audrey serve

# With explicit port
AUDREY_PORT=8080 npx audrey serve

# Network-exposed with API key
AUDREY_HOST=0.0.0.0 AUDREY_API_KEY=secret npx audrey serve

Sources: python/README.md:18

API Endpoints

Health Check

Method	Path	Description
`GET`	`/health`	Returns server health status

Response:

{
  "status": "ok",
  "version": "0.22.1",
  "timestamp": "2026-04-30T12:00:00.000Z"
}

Sources: README.md:58

Memory Operations

Method	Path	Description
`POST`	`/v1/encode`	Store a new memory entry
`POST`	`/v1/recall`	Retrieve relevant memories by query
`POST`	`/v1/capsule`	Get a turn-sized memory packet
`POST`	`/v1/mark-used`	Mark memory as used with outcome feedback
`POST`	`/v1/observe-tool`	Record tool execution results
`POST`	`/v1/before-action`	Preflight check before tool execution
`POST`	`/v1/validate`	Validate memory helpfulness

Sources: README.md:58, src/routes.ts:1-50

Request Body Schema

The REST API accepts a unified RouteBody type with optional fields:

type RouteBody = {
  action?: string;
  query?: string;
  tool?: string;
  session_id?: string;
  sessionId?: string;
  cwd?: string;
  files?: string[];
  strict?: boolean;
  limit?: number;
  budget_chars?: number;
  budgetChars?: number;
  mode?: PreflightOptions['mode'];
  failure_window_hours?: number;
  recent_failure_window_hours?: number;
  recentFailureWindowHours?: number;
  recent_change_window_hours?: number;
  recentChangeWindowHours?: number;
  include_capsule?: boolean;
  includeCapsule?: boolean;
  include_status?: boolean;
  includeStatus?: boolean;
  record_event?: boolean;
  recordEvent?: boolean;
  include_preflight?: boolean;
  includePreflight?: boolean;
  receipt_id?: string;
  receiptId?: string;
  input?: unknown;
  output?: unknown;
  outcome?: EventOutcome;
  error_summary?: string;
  errorSummary?: string;
  metadata?: Record<string, unknown>;
  retain_details?: boolean;
  retainDetails?: boolean;
  evidence_feedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
  evidenceFeedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
};

Sources: src/routes.ts:5-46
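
Many `RouteBody` fields appear twice, once in snake_case and once in camelCase, so a handler has to collapse each pair into a single value. The sketch below shows one possible way to do that. `pick` and `normalize` are hypothetical helpers, and giving the snake_case spelling precedence is an assumption; the real precedence in src/routes.ts may differ.

```typescript
type Body = Record<string, unknown>;

// Return the first defined value among the given alias keys.
function pick<T>(body: Body, ...keys: string[]): T | undefined {
  for (const key of keys) {
    if (body[key] !== undefined) return body[key] as T;
  }
  return undefined;
}

// Collapse a few alias pairs from RouteBody into single camelCase fields.
function normalize(body: Body) {
  return {
    sessionId: pick<string>(body, "session_id", "sessionId"),
    budgetChars: pick<number>(body, "budget_chars", "budgetChars"),
    includeCapsule: pick<boolean>(body, "include_capsule", "includeCapsule"),
  };
}
```

Accepting both spellings lets Python clients send snake_case and JavaScript clients send camelCase against the same routes.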

Security Model

Authentication

Non-loopback REST traffic requires a Bearer token:

curl -X POST http://localhost:7437/v1/recall \
  -H "Authorization: Bearer your-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"query": "deploy failures"}'
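
The same request can be prepared from TypeScript. The sketch below only builds the URL and request options so that it stays free of network access; `buildRecallRequest` and `PreparedRequest` are hypothetical names, and the body contains just the `query` field used in the curl example.

```typescript
interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a POST /v1/recall request; the Authorization header is added only when a key is given.
function buildRecallRequest(
  baseUrl: string,
  query: string,
  apiKey?: string,
): PreparedRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return {
    url: new URL("/v1/recall", baseUrl).toString(),
    init: { method: "POST", headers, body: JSON.stringify({ query }) },
  };
}

// Usage (not executed here):
//   const { url, init } = buildRecallRequest("http://127.0.0.1:7437", "deploy failures", "your-secret-token");
//   const response = await fetch(url, init);
```

Setting `Content-Type: application/json` explicitly matters because HTTP clients otherwise label a raw string body as form data or plain text.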

Security measures:

HTTP API key comparison uses crypto.timingSafeEqual instead of string !== to prevent timing attacks Sources: README.md:41
Server defaults to binding 127.0.0.1 (was 0.0.0.0) Sources: README.md:40
Refuses to start on non-loopback host without AUDREY_API_KEY unless AUDREY_ALLOW_NO_AUTH=1

Recall Options Sanitization

HTTP /v1/recall and /v1/capsule no longer body-spread caller options into internal calls. The sanitizeRecallOptions() function implements an allowlist that drops anything not in a known-safe key set:

export function sanitizeRecallOptions(options: unknown): SanitizedRecallOptions {
  // Only allows: budget_chars, limit, retrieval, includePrivate, sessionId
}

This prevents bypassing private-memory ACL and integrity controls via includePrivate: true or confidenceConfig overrides in HTTP bodies.

Sources: README.md:39-40, CHANGELOG.md:0.22.1
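
The allowlist technique itself can be sketched in a few lines. The key set below is purely illustrative and is not the real allowlist; the authoritative set lives in `sanitizeRecallOptions()`. `pickAllowed` and `ILLUSTRATIVE_SAFE_KEYS` are hypothetical names.

```typescript
// Illustrative key set only; it does not claim to match Audrey's real allowlist.
const ILLUSTRATIVE_SAFE_KEYS: readonly string[] = ["limit", "budget_chars"];

// Copy only allowlisted own properties; everything else in the body is dropped.
function pickAllowed(
  options: unknown,
  allowed: readonly string[] = ILLUSTRATIVE_SAFE_KEYS,
): Record<string, unknown> {
  if (typeof options !== "object" || options === null || Array.isArray(options)) {
    return {};
  }
  const source = options as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      out[key] = source[key];
    }
  }
  return out;
}
```

Iterating over the allowlist rather than over the caller's keys means an unexpected field can never reach the internal call, which is the difference from spreading the request body.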

Admin Tools

Admin routes (/v1/import, /v1/export, /v1/forget) are disabled by default. Enable with:

AUDREY_ENABLE_ADMIN_TOOLS=1 npx audrey serve

Sources: README.md:48

Client Integration

Python SDK

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

# Encode a memory
memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)

# Create snapshot for backup
snapshot = brain.snapshot()
brain.close()

Async usage:

import asyncio
from audrey_memory import AsyncAudrey

async def main() -> None:
    async with AsyncAudrey(base_url="http://127.0.0.1:7437") as brain:
        await brain.health()
        await brain.encode("Deploy failed due to OOM", source="direct-observation")
        await brain.recall("deploy failure", limit=3)

asyncio.run(main())

Sources: python/README.md:22-45

Connection URL Correction

Note: The Python client's DEFAULT_BASE_URL was corrected from http://127.0.0.1:3487 to http://127.0.0.1:7437 in v0.22.1 to match the TS server's default port.

Sources: CHANGELOG.md:0.22.1

Impact Reporting

Impact analytics are available through the audrey impact CLI, which calls internal Audrey methods and reports the following metrics:

Metric	Description
Total memories by type	episodic, semantic, procedural counts
All-time validated count	Memories validated as helpful/wrong
Recent validations	Validation activity in time window
Top-N most-used memories	Memories with highest `usage_count`
Weakest-N memories	Lowest salience candidates for forgetting
Recent activity timeline	`last_used_at` based activity log

# Basic report
audrey impact

# JSON output for automation
audrey impact --json

# Custom window and limits
audrey impact --window 30 --limit 20

Sources: src/impact.ts:1-50, CHANGELOG.md:0.22.1

Deployment

Docker

# docker-compose.yml
services:
  audrey:
    image: ghcr.io/evilander/audrey:latest
    ports:
      - "7437:7437"
    environment:
      - AUDREY_API_KEY=your-secret-token
      - AUDREY_HOST=0.0.0.0
    volumes:
      - audrey-data:/data

# Named volumes must be declared at the top level
volumes:
  audrey-data:

Doctor Check

The audrey doctor command validates REST server configuration:

npx audrey doctor

Checks performed:

Check	Description
`serve-bind-safety`	Validates bind address with auth configuration
`node-runtime`	Node.js version >= 20
`entrypoint-exists`	MCP stdio entrypoint file exists
`data-dir`	Data directory accessibility
`embedding`	Embedding provider configuration
`llm`	LLM provider configuration

graph TD
    A[audrey doctor] --> B{Is bind loopback?}
    B -->|Yes| C[✅ loopback only]
    B -->|No| D{Has AUDREY_API_KEY?}
    D -->|Yes| E[✅ non-loopback with API key]
    D -->|No| F{Has AUDREY_ALLOW_NO_AUTH?}
    F -->|Yes| G[⚠️ warning - network exposure]
    F -->|No| H[❌ error - refuse to start]

Sources: mcp-server/index.ts:100-130
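
The decision tree above maps directly onto a small function. This is a sketch of the logic in the diagram, not the doctor implementation: `checkBind`, `isLoopback`, and the `BindVerdict` labels are hypothetical, and the loopback test (`localhost`, `::1`, or any `127.x.x.x` address) is an assumption.

```typescript
type BindVerdict = "ok-loopback" | "ok-api-key" | "warn-no-auth" | "refuse";

// Assumption: these are the hosts treated as loopback.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "::1" || host.startsWith("127.");
}

// Walk the diagram: loopback, then API key, then the explicit no-auth override.
function checkBind(env: Record<string, string | undefined>): BindVerdict {
  const host = env["AUDREY_HOST"] ?? "127.0.0.1";
  if (isLoopback(host)) return "ok-loopback";
  if (env["AUDREY_API_KEY"]) return "ok-api-key";
  if (env["AUDREY_ALLOW_NO_AUTH"] === "1") return "warn-no-auth";
  return "refuse";
}
```

The order of the checks matters: the no-auth override is consulted only after both safer options have failed, so setting it never weakens a configuration that already has an API key.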

Error Handling

Common Issues

Error	Cause	Solution
Connection refused	Wrong port or host	Check `AUDREY_PORT` and `AUDREY_HOST`
401 Unauthorized	Missing/invalid API key	Provide `Authorization: Bearer <token>` header
404 Not Found	Wrong endpoint	Use `/v1/*` routes, not `/openapi.json` or `/docs`
Validation error	Malformed request body	Check RouteBody schema

Status Codes

Code	Meaning
`200`	Success
`400`	Bad request (malformed body)
`401`	Unauthorized (missing/invalid API key)
`404`	Endpoint not found
`500`	Internal server error

Note: The /openapi.json and /docs routes are not currently wired up. The actual surface is /health plus the /v1/* routes, which is what the README documents.

Sources: CHANGELOG.md:0.22.1, README.md:58

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 14 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors

Severity: medium
Finding: Installation risk is backed by a source signal: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.1

2. Installation risk: Audrey Guard 0.23.0

Severity: medium
Finding: Installation risk is backed by a source signal: Audrey Guard 0.23.0. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.23.0

3. Installation risk: v0.16.0

Severity: medium
Finding: Installation risk is backed by a source signal: v0.16.0. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.0

4. Installation risk: v0.16.1 — Windows MCP fix

Severity: medium
Finding: Installation risk is backed by a source signal: v0.16.1 — Windows MCP fix. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.1

5. Installation risk: v0.17.0

Severity: medium
Finding: Installation risk is backed by a source signal: v0.17.0. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.17.0

6. Configuration risk: Configuration risk needs validation

Severity: medium
Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.host_targets | github_repo:1161444210 | https://github.com/Evilander/Audrey | host_targets=mcp_host, claude, claude_code

7. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | github_repo:1161444210 | https://github.com/Evilander/Audrey | README/documentation is current enough for a first validation pass.

8. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:1161444210 | https://github.com/Evilander/Audrey | last_activity_observed missing

9. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium

10. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.scoring_risks | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium

11. Security or permission risk: Audrey 1.0.0

Severity: medium
Finding: Security or permission risk is backed by a source signal: Audrey 1.0.0. Treat it as a review item until the current version is checked.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.0

12. Security or permission risk: v0.22.2 — correctness pass + legitimate benchmarking

Severity: medium
Finding: Security or permission risk is backed by a source signal: v0.22.2 — correctness pass + legitimate benchmarking. Treat it as a review item until the current version is checked.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.22.2

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Audrey with real data or production workflows.

Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured vali - github / github_release
Audrey 1.0.0 - github / github_release
Audrey Guard 0.23.0 - github / github_release
v0.22.2 — correctness pass + legitimate benchmarking - github / github_release
v0.17.0 - github / github_release
v0.16.1 — Windows MCP fix - github / github_release
v0.16.0 - github / github_release
Configuration risk needs validation - GitHub / issue

Source: Project Pack community evidence and pitfall evidence