Doramagic Project Pack · Human Manual

Audrey

Related topics: System Architecture, Memory Model, Quick Start Guide

Audrey Overview

Audrey is a local-first memory firewall for AI agents. It provides a durable, SQLite-backed memory layer that enables AI agents to remember past mistakes, learned principles, and project-specific rules across sessions. Audrey acts as a continuity layer that sits under any local or sidecar agent loop, preventing agents from repeating the same mistakes and enabling smarter, more context-aware behavior.

Sources: README.md:1-10

What Problem Audrey Solves

AI agents typically suffer from "cold start" problems—they treat every new session as if they've never interacted with the project before. They repeat broken commands, lose project-specific rules, miss contradictions, and forget the exact mistakes they made yesterday.

Audrey addresses this by implementing a closed feedback loop:

  1. Record what happened during agent actions
  2. Remember what mattered from those events
  3. Check before new actions using stored memories
  4. Return decisions (allow, warn, or block) with evidence
  5. Validate whether the memory helped improve outcomes

Sources: README.md:25-40
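
The five steps above can be sketched as a toy loop. This is only an illustration under stated assumptions, not Audrey's implementation: `ToyMemory`, its exact-match lookup, and the rule that a remembered failure produces `warn` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical in-memory stand-in for Audrey's SQLite-backed store."""
    failures: dict = field(default_factory=dict)   # action -> evidence text
    feedback: list = field(default_factory=list)   # (action, outcome) pairs

    def record(self, action: str, ok: bool, detail: str) -> None:
        # Steps 1-2: record the event and remember only what mattered (failures).
        if not ok:
            self.failures[action] = detail

    def check(self, action: str) -> dict:
        # Steps 3-4: check before acting and return a decision with evidence.
        evidence = self.failures.get(action)
        if evidence is None:
            return {"decision": "allow", "evidence": []}
        return {"decision": "warn", "evidence": [evidence]}

    def validate(self, action: str, outcome: str) -> None:
        # Step 5: note whether the memory helped ("helpful", "used", "wrong").
        self.feedback.append((action, outcome))


memory = ToyMemory()
memory.record("npm run deploy", ok=False, detail="deploy failed: OOM error")
print(memory.check("npm run deploy"))
print(memory.check("npm test"))
memory.validate("npm run deploy", "helpful")
```

A real deployment would replace the exact-match lookup with semantic recall, but the shape of the loop (record, check, decide, validate) stays the same.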

Architecture Overview

Audrey is built with a layered architecture that separates concerns between memory storage, retrieval, governance, and agent integration.

graph TD
    subgraph Client Layer
        CLI[CLI Tool<br>npx audrey]
        PythonSDK[Python SDK<br>audrey_memory]
        MCPServer[MCP Server]
    end
    
    subgraph Integration Layer
        Hooks[Claude Code Hooks<br>PreToolUse/PostToolUse]
        MCPConfig[MCP Config<br>Codex, VSCode, etc.]
    end
    
    subgraph Core Engine
        Guard[Audrey Guard<br>Memory-before-action]
        Routes[REST API<br>/v1/*]
    end
    
    subgraph Memory Layer
        SQLite[(SQLite<br>WAL Mode)]
        Episodic[Episodic<br>Memory]
        Semantic[Semantic<br>Memory]
        Procedural[Procedural<br>Memory]
    end
    
    subgraph Embedding
        ONNX[ONNX Runtime<br>Local Embeddings]
        Providers[Cloud Providers<br>OpenAI, Anthropic]
    end
    
    CLI --> Routes
    PythonSDK --> Routes
    MCPServer --> Routes
    Hooks --> Guard
    MCPConfig --> MCPServer
    Guard --> Routes
    Routes --> SQLite
    SQLite --> Episodic
    SQLite --> Semantic
    SQLite --> Procedural
    Routes --> ONNX
    Routes --> Providers

Sources: README.md:1-50

Core Components

Memory Types

Audrey manages three distinct types of memory, each serving a different purpose:

| Memory Type | Purpose | Examples |
| --- | --- | --- |
| Episodic | Records specific events and outcomes | "Deploy failed at 3:42 PM with OOM error" |
| Semantic | Stores learned facts, principles, and rules | "Stripe rate limits are 100 req/s" |
| Procedural | Captures how-to knowledge and workflows | "To deploy, run npm run deploy after npm test" |

Each memory type can be tagged, sourced, and validated independently. Memories gain salience through usage—memories that are repeatedly helpful become more prominent, while unused memories decay over time.

Sources: README.md:40-55
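
To make the idea of decay concrete, here is a minimal sketch of time-based salience decay. The exponential form and the 30-day half-life are assumptions chosen for illustration; the source does not state which formula Audrey uses.

```python
import math

HALF_LIFE_DAYS = 30.0  # assumed value, not taken from Audrey


def decayed_salience(salience: float, days_unused: float) -> float:
    """Halve salience for every HALF_LIFE_DAYS a memory goes unused."""
    if days_unused < 0:
        raise ValueError("days_unused must be non-negative")
    return salience * math.pow(0.5, days_unused / HALF_LIFE_DAYS)


for days in (0, 30, 90):
    print(days, round(decayed_salience(0.8, days), 3))
```

An exponential curve is a common choice because a memory never drops abruptly to zero; it fades until some pruning threshold removes it.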

Audrey Guard

Audrey Guard is the core decision-making component that checks memories before agent actions execute. It implements a preflight check that returns structured decisions:

graph LR
    Action[Agent Action<br>tool + parameters] --> Guard
    Guard --> Recall[Recall Relevant<br>Memories]
    Recall --> Decision{Decision}
    Decision -->|No issues| ALLOW[allow]
    Decision -->|Potential risk| WARN[warn<br>+ evidence]
    Decision -->|Dangerous| BLOCK[block<br>+ reason]
    Decision -->|Uncertain| QUERY[Query<br>Human]

The Guard returns a decision with supporting evidence, allowing the agent to make informed choices. When set to strict mode, warnings are treated as blocks.

Sources: README.md:25-35
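
The strict-mode rule can be expressed as a small mapping. The function names below are hypothetical; only the three decision values and the "warnings become blocks" behavior come from the text above.

```python
VALID_DECISIONS = ("allow", "warn", "block")


def effective_decision(decision: str, strict: bool = False) -> str:
    """Return the decision the caller should act on."""
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown guard decision: {decision!r}")
    # Strict mode escalates warnings to blocks; allow and block pass through.
    if strict and decision == "warn":
        return "block"
    return decision


def may_execute(decision: str, strict: bool = False) -> bool:
    """True when the action is permitted to run."""
    return effective_decision(decision, strict) != "block"


print(effective_decision("warn"), effective_decision("warn", strict=True))
```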

Memory Capsule

The Memory Capsule is a structured response format that bundles contextual information for agent preflight checks:

| Section | Description |
| --- | --- |
| recent_changes | Memories created/modified within the recent-change window |
| must_follow | Critical rules tagged as mandatory |
| procedures | Step-by-step workflows relevant to the query |
| user_preferences | Explicitly stated user preferences |
| risks | Warnings and risk indicators |
| uncertain_or_disputed | Low-confidence or contested memories |

Sources: src/capsule.ts:1-50
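
A capsule can be pictured as a dictionary with one list per section, filled until a character budget runs out. The sketch below assumes entries are plain strings and that sections are filled in the order listed above; both are simplifications, and `build_capsule` is a hypothetical helper rather than the logic in src/capsule.ts.

```python
SECTIONS = [
    "recent_changes",
    "must_follow",
    "procedures",
    "user_preferences",
    "risks",
    "uncertain_or_disputed",
]


def build_capsule(entries: dict, budget_chars: int = 4000) -> dict:
    """Copy entries into their sections until the character budget is spent."""
    capsule = {name: [] for name in SECTIONS}
    remaining = budget_chars
    for name in SECTIONS:
        for text in entries.get(name, []):
            if len(text) > remaining:
                continue  # skip entries that no longer fit in the budget
            capsule[name].append(text)
            remaining -= len(text)
    return capsule


demo = build_capsule(
    {"must_follow": ["Run npm test before deploy"], "risks": ["Deploy hit OOM yesterday"]},
    budget_chars=30,
)
print(demo["must_follow"], demo["risks"])
```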

Impact Tracking

Audrey tracks the effectiveness of its memories through a closed validation loop:

graph TD
    Action[Action with Memory] --> Outcome{Outcome}
    Outcome -->|helpful| Boost[Boost salience<br>+usage_count]
    Outcome -->|used| Maintain[Maintain salience]
    Outcome -->|wrong| Challenge[Challenge memory<br>Decrease salience]
    Boost --> Consolidation[Consolidation<br>Dream cycle]
    Maintain --> Consolidation
    Challenge --> Consolidation
    Consolidation --> Principles[New Semantic<br>Principles]

Outcome types:

  • helpful: The memory contributed to a successful outcome
  • used: The memory was consulted but didn't directly contribute
  • wrong: The memory led to an incorrect decision

Sources: src/impact.ts:1-60
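
The three outcome types map onto simple bookkeeping. In this sketch the `MemoryStats` fields and the fixed step of 0.1 are assumptions for illustration; the diagram above only states the direction of each change.

```python
from dataclasses import dataclass


@dataclass
class MemoryStats:
    salience: float
    usage_count: int = 0
    challenged: bool = False


def apply_outcome(stats: MemoryStats, outcome: str, step: float = 0.1) -> MemoryStats:
    """Update a memory's stats in place according to the reported outcome."""
    if outcome == "helpful":
        # Boost salience (capped at 1.0) and count the use.
        stats.salience = min(1.0, stats.salience + step)
        stats.usage_count += 1
    elif outcome == "used":
        pass  # consulted but not decisive: salience is maintained
    elif outcome == "wrong":
        # Challenge the memory and lower its salience (floored at 0.0).
        stats.salience = max(0.0, stats.salience - step)
        stats.challenged = True
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return stats


print(apply_outcome(MemoryStats(salience=0.5), "wrong"))
```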

Installation Methods

Audrey supports multiple installation patterns depending on your use case.

CLI Installation

For direct terminal usage:

npx audrey doctor          # Verify setup
npx audrey demo --scenario repeated-failure  # Run demo
npx audrey guard --tool Bash "npm run deploy"  # Check before action

Sources: README.md:55-65

MCP Server Integration

For integration with agents like Codex, Claude Desktop, Cursor, and VS Code:

# Generate MCP configuration
npx audrey mcp-config codex
npx audrey mcp-config generic
npx audrey mcp-config vscode

Sources: mcp-server/index.ts:1-40

Claude Code Hooks

For Claude Code, install directly and configure memory-before-action hooks:

npx audrey install
claude mcp list

# Apply hooks to project or user scope
npx audrey hook-config claude-code --apply --scope project  # Project-local
npx audrey hook-config claude-code --apply --scope user     # User-wide

The generated hooks include:

  • PreToolUse: Runs audrey guard --hook --fail-on-warn
  • PostToolUse: Records successful tool executions
  • PostToolUseFailure: Records failed tool executions

Sources: README.md:67-85

Python SDK

For Python-based agent integrations:

pip install audrey-memory

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

# Encode new memories
memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)

# Close connection
brain.close()

Sources: python/README.md:1-50

REST API Reference

The Audrey REST API exposes core memory operations via HTTP.

Endpoints Overview

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Server health check |
| POST | /v1/encode | Store a new memory |
| POST | /v1/recall | Retrieve memories by semantic similarity |
| POST | /v1/preflight | Memory-before-action check |
| POST | /v1/validate | Submit outcome feedback |
| POST | /v1/impact | Get impact statistics |

Sources: src/routes.ts:1-80

Core API Operations

#### Encode Memory

interface EncodeRequest {
  content: string;           // The memory content
  memory_type: 'episodic' | 'semantic' | 'procedural';
  source: string;            // e.g., "direct-observation", "told-by-user"
  tags?: string[];
  private?: boolean;        // Agent-only memory
  wait_for_consolidation?: boolean;
}

Sources: src/routes.ts:80-120
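
For callers without the SDK, an encode request can be assembled with the Python standard library alone. The helper below is a hypothetical sketch: the field names follow the EncodeRequest interface above and the Bearer header follows the curl examples in this manual, but the function itself is not part of Audrey.

```python
import json
import urllib.request

MEMORY_TYPES = {"episodic", "semantic", "procedural"}


def build_encode_request(base_url, api_key, content, memory_type, source, tags=None):
    """Return a ready-to-send POST request for /v1/encode (nothing is sent here)."""
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory_type: {memory_type!r}")
    body = {"content": content, "memory_type": memory_type, "source": source}
    if tags:
        body["tags"] = list(tags)
    return urllib.request.Request(
        f"{base_url}/v1/encode",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_encode_request(
    "http://127.0.0.1:7437", "secret",
    "Stripe returns HTTP 429 above 100 req/s",
    memory_type="semantic", source="direct-observation", tags=["stripe"],
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it to a running sidecar.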

#### Recall Memories

interface RecallRequest {
  query: string;
  limit?: number;           // Default: 5
  budget_chars?: number;    // Context budget
  retrieval?: 'hybrid' | 'vector';  // Default: hybrid
  mood?: {                  // Optional affect configuration
    min_valence?: number;
    min_arousal?: number;
  };
}

#### Preflight Check

interface PreflightRequest {
  tool: string;             // e.g., "Bash", "Write"
  action: string;           // The specific action/command
  session_id?: string;
  cwd?: string;
  include_capsule?: boolean;
  include_preflight?: boolean;
  record_event?: boolean;
}

#### Validate Outcome

interface ValidateRequest {
  receipt_id: string;       // From preflight response
  outcome: 'helpful' | 'used' | 'wrong';
  evidence_feedback?: Record<string, 'helpful' | 'used' | 'wrong'>;
  metadata?: Record<string, unknown>;
}

Sources: src/routes.ts:120-200
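
Preflight and validation form a pair: the receipt returned by the first is fed into the second. The sketch below builds both request bodies. It assumes the preflight response is a JSON object with a top-level `receipt_id` key, which the interfaces above imply but do not spell out, and both helper names are hypothetical.

```python
from typing import Optional

OUTCOMES = {"helpful", "used", "wrong"}


def preflight_body(tool: str, action: str, cwd: Optional[str] = None,
                   include_capsule: bool = False) -> dict:
    """Build a /v1/preflight body, omitting unset optional fields."""
    body = {"tool": tool, "action": action, "include_capsule": include_capsule}
    if cwd is not None:
        body["cwd"] = cwd
    return body


def validate_body(preflight_response: dict, outcome: str,
                  evidence_feedback: Optional[dict] = None) -> dict:
    """Build a /v1/validate body from a preflight response (assumed shape)."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    body = {"receipt_id": preflight_response["receipt_id"], "outcome": outcome}
    if evidence_feedback:
        unknown = set(evidence_feedback.values()) - OUTCOMES
        if unknown:
            raise ValueError(f"unknown evidence outcomes: {sorted(unknown)}")
        body["evidence_feedback"] = dict(evidence_feedback)
    return body


print(preflight_body("Bash", "npm run deploy"))
print(validate_body({"receipt_id": "r-1"}, "helpful"))
```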

Configuration

Environment Variables

| Variable | Default | Purpose |
| --- | --- | --- |
| AUDREY_DATA_DIR | ~/.audrey | SQLite data directory (set per tenant/agent) |
| AUDREY_EMBEDDING_PROVIDER | onnx | Embedding provider: onnx, openai, anthropic, google |
| AUDREY_LLM_PROVIDER | openai | LLM provider for consolidation |
| AUDREY_MODEL | varies | Specific model to use |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_PORT | 7437 | REST sidecar port |
| AUDREY_API_KEY | unset | Bearer token for non-loopback access |
| AUDREY_ALLOW_NO_AUTH | 0 | Allow non-loopback without API key (not recommended) |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget routes |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_PROFILE | 0 | Emit per-stage diagnostics |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup at boot |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Default capsule character budget |

Sources: README.md:150-180

Data Isolation

SQLite uses WAL mode without an advisory lock, so two processes sharing a directory will contend on writes. Isolation is a hard requirement for multi-agent setups.

Important: Set a distinct AUDREY_DATA_DIR per tenant, agent identity, or concurrent host to avoid write contention.

Sources: README.md:55-60
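
One way to satisfy the isolation requirement is to derive the data directory from the tenant and agent identity before launching each process. The directory layout and helper names below are assumptions for illustration; only the AUDREY_DATA_DIR variable comes from the documentation.

```python
import os
import re
from pathlib import Path


def _slug(value: str) -> str:
    """Reduce an identifier to a single safe path component."""
    cleaned = re.sub(r"[^A-Za-z0-9_.-]+", "-", value).strip("-.")
    if not cleaned:
        raise ValueError(f"cannot derive a directory name from {value!r}")
    return cleaned


def data_dir_for(tenant: str, agent: str, root: Path) -> Path:
    """Return a distinct data directory per (tenant, agent) pair."""
    return root / _slug(tenant) / _slug(agent)


def env_for(tenant: str, agent: str, root: Path) -> dict:
    """Copy the current environment and point AUDREY_DATA_DIR at the pair's directory."""
    env = dict(os.environ)
    env["AUDREY_DATA_DIR"] = str(data_dir_for(tenant, agent, root))
    return env


print(data_dir_for("acme", "support agent", Path("/var/lib/audrey")))
```

The returned environment can be passed to `subprocess.run(..., env=env_for(...))` so that each agent process writes to its own SQLite files.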

Security

Redaction

Audrey automatically redacts sensitive information from stored memories and logs:

| Class | Patterns |
| --- | --- |
| api_key | api_key, apiKey, API_KEY patterns |
| password | password, passwd, pwd |
| token | token, bearer_token, access_token, jwt |
| secret | secret, client_secret, private_key |

The redaction system walks JSON structures recursively and applies pattern matching to both keys and values.

Sources: src/redact.ts:1-60
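
The recursive walk can be illustrated in a few lines. The regular expressions below are simplified stand-ins written for this sketch, not the patterns in src/redact.ts: keys are matched against the classes in the table, and string values are scanned for Bearer tokens only.

```python
import re

# Simplified key pattern covering the four classes in the table above.
SENSITIVE_KEY = re.compile(
    r"api[_-]?key|password|passwd|pwd|token|jwt|secret|private_key",
    re.IGNORECASE,
)
# Simplified value pattern: a Bearer scheme followed by a token.
BEARER_VALUE = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+")
MASK = "[REDACTED]"


def redact(value):
    """Return a redacted copy of a JSON-like structure."""
    if isinstance(value, dict):
        result = {}
        for key, item in value.items():
            if isinstance(key, str) and SENSITIVE_KEY.search(key):
                result[key] = MASK  # mask the whole value under a sensitive key
            else:
                result[key] = redact(item)
        return result
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return BEARER_VALUE.sub(f"Bearer {MASK}", value)
    return value  # numbers, booleans, and None pass through unchanged


sample = {"apiKey": "abc", "steps": [{"password": "x", "note": "Authorization: Bearer abc.def"}], "n": 1}
print(redact(sample))
```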

Access Control

  • HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks
  • audrey serve defaults to binding 127.0.0.1 (previously 0.0.0.0)
  • Non-loopback bind requires AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
  • Private memories have ACL enforcement at the recall endpoint
  • sanitizeRecallOptions() allowlists HTTP body parameters to prevent option injection

Sources: CHANGELOG.md:1-30
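
Two of these controls have direct analogues in the Python standard library, shown here as a sketch. `hmac.compare_digest` plays the role that `crypto.timingSafeEqual` plays in Node, and the allowlist is an assumed set based on the RecallRequest fields documented above; the real `sanitizeRecallOptions()` may accept a different set.

```python
import hmac

# Assumed allowlist, taken from the RecallRequest fields in this manual.
ALLOWED_RECALL_KEYS = {"query", "limit", "budget_chars", "retrieval", "mood"}


def api_key_matches(provided: str, expected: str) -> bool:
    """Compare keys in constant time to avoid leaking how many characters match."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def sanitize_recall_options(body: dict) -> dict:
    """Keep only allowlisted keys so callers cannot inject internal options."""
    return {key: value for key, value in body.items() if key in ALLOWED_RECALL_KEYS}


print(sanitize_recall_options({"query": "deploy", "include_private": True}))
```

An allowlist is safer than a denylist here because any option added to the engine later stays unreachable over HTTP until it is explicitly permitted.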

Production Readiness

Release Gates

npm run release:gate           # Full release checklist
npm run python:release:check   # Python package verification
npm run bench:guard:card       # Guard performance benchmarks
npm run bench:guard:validate   # Guard accuracy validation
npx audrey doctor              # Runtime health check
npx audrey status --json --fail-on-unhealthy

Production Checklist

  • Set one AUDREY_DATA_DIR per tenant, environment, or isolation boundary
  • Pin AUDREY_EMBEDDING_PROVIDER and AUDREY_LLM_PROVIDER explicitly
  • Back up the SQLite data directory before provider or dimension changes
  • Keep API keys and raw credentials out of encoded memory content
  • Set AUDREY_API_KEY if the REST sidecar is reachable beyond the local process boundary
  • Run audrey dream on a schedule for consolidation and decay
  • Add application-level encryption, retention, access control, and audit logging for regulated environments

Sources: README.md:100-125

Memory Lifecycle

graph TD
    Event[Agent Event<br>Action/Failure] --> Encode[Encode Memory<br>episodic]
    Encode --> Salience[Initial Salience<br>from confidence]
    Salience --> Usage[Usage Cycle]
    
    Usage --> Preflight[Preflight Check]
    Preflight --> Decision[Guard Decision]
    Decision --> Action[Execute Action]
    Action --> Outcome{Outcome}
    
    Outcome -->|Success| Boost[Boost Salience<br/>usage_count++]
    Outcome -->|Partial| Maintain[Maintain]
    Outcome -->|Failure| Challenge[Challenge<br/>decay confidence]
    
    Boost --> Consolidation{Dream Cycle}
    Maintain --> Consolidation
    Challenge --> Consolidation
    
    Consolidation --> Principle[Extract Principle<br/>semantic memory]
    Consolidation --> Decay[Apply Decay<br/>unused memories]
    
    Principle --> NewMemory[New Semantic<br/>Memory]
    Decay --> Prune[Prune Very Low<br/>salience memories]

Dream Cycle

The memory_dream operation consolidates episodes into principles and applies decay:

  • Consolidation: Groups related episodic memories into higher-level semantic principles
  • Decay: Reduces salience of memories that haven't been used recently
  • Challenge: Flags memories that led to wrong outcomes for review

Sources: README.md:40-50
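
The consolidation step can be pictured as grouping related episodes and promoting groups with enough support. The sketch below groups by shared tag and uses a fixed support threshold; both choices are assumptions, and the configuration table above indicates that Audrey uses an LLM provider for consolidation rather than a rule like this one.

```python
from collections import defaultdict


def principle_candidates(episodes: list, min_support: int = 3) -> dict:
    """Group episode contents by tag and keep tags seen at least min_support times."""
    by_tag = defaultdict(list)
    for episode in episodes:
        for tag in episode.get("tags", []):
            by_tag[tag].append(episode["content"])
    return {tag: contents for tag, contents in by_tag.items() if len(contents) >= min_support}


episodes = [
    {"content": "Deploy failed: OOM", "tags": ["deploy"]},
    {"content": "Deploy failed: OOM again", "tags": ["deploy"]},
    {"content": "Deploy passed with 4 GB heap", "tags": ["deploy", "memory"]},
    {"content": "Stripe returned HTTP 429", "tags": ["stripe"]},
]
print(principle_candidates(episodes))
```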

Supported Integrations

| Agent/IDE | Integration Method | Features |
| --- | --- | --- |
| Claude Code | Hooks + MCP | Full memory-guard loop |
| Codex | MCP Config | Memory recall |
| Claude Desktop | MCP | Memory access |
| Cursor | MCP Config | Memory recall |
| Windsurf | MCP Config | Memory recall |
| VS Code | MCP Config | Memory recall |
| JetBrains | MCP Config | Memory recall |
| Ollama | Generic MCP | Memory recall |
| Custom Agents | REST API / Python SDK | Full integration |

Sources: README.md:10-20

Key Files Reference

| File | Purpose |
| --- | --- |
| src/audrey.ts | Core Audrey class with memory operations |
| src/routes.ts | REST API route handlers |
| src/capsule.ts | Memory capsule builder |
| src/impact.ts | Impact tracking and validation |
| src/redact.ts | Sensitive data redaction |
| src/rules-compiler.ts | Rule file generation from memories |
| mcp-server/index.ts | MCP server and CLI commands |
| python/ | Python SDK implementation |

Sources: src/audrey.ts, src/routes.ts, src/capsule.ts, src/impact.ts, src/redact.ts, src/rules-compiler.ts, mcp-server/index.ts, python/README.md


Quick Start Guide

Related topics: Audrey Overview

Overview

Audrey is a local-first memory firewall for AI agents that provides a durable memory layer they can check before executing tools. This guide covers installation, configuration, and basic usage patterns across all supported surfaces: CLI, REST API, JavaScript SDK, and Python client.

Prerequisites

| Requirement | Version/Details |
| --- | --- |
| Node.js | v18+ recommended |
| npm | v8+ |
| Python | 3.9+ (for Python SDK) |
| SQLite | Built-in (bundled) |
| Docker | Optional for containerized deployment |

Installation

CLI Installation

npm install -g audrey

Verify installation:

audrey --version

Python SDK Installation

pip install audrey-memory

Sources: python/README.md:1-20

Quick Setup

1. Run the Health Check

audrey doctor

This command validates the installation and checks for any configuration issues.

Sources: mcp-server/index.ts:85-120

2. Start the REST API Server

npx audrey serve

By default, the server binds to 127.0.0.1:7437. Configure using environment variables:

| Environment Variable | Default | Description |
| --- | --- | --- |
| AUDREY_PORT | 7437 | REST API port |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_API_KEY | unset | Bearer token for non-loopback traffic |
| AUDREY_DATA_DIR | ~/.audrey | Data directory path |

Sources: README.md:1-50

3. Install MCP Configuration

For Claude Code integration:

audrey install --host claude-code

Generate MCP config without applying:

audrey mcp-config --host claude-code --dry-run

Apply project hooks:

audrey hook-config claude-code --apply --scope project

Apply user hooks:

audrey hook-config claude-code --apply --scope user

Sources: mcp-server/index.ts:40-75

Core Usage Patterns

Memory Guard Workflow

The guard workflow is the primary safety loop: it records events, checks memory before each action, and returns a decision.

graph TD
    A[Agent Action] --> B[audrey guard --tool Bash npm run deploy]
    B --> C{Memory Check}
    C -->|allow| D[Execute Action]
    C -->|warn| E[Execute with Warning]
    C -->|block| F[Block Action]
    D --> G[Record Outcome]
    E --> G
    F --> G
    G --> H[Memory Consolidation]

Execute the guard command:

audrey guard --tool Bash "npm run deploy"

Parameters:

| Parameter | Description |
| --- | --- |
| --tool | Tool name (Bash, Write, Read, etc.) |
| --session-id | Session identifier |
| --files | File paths involved |
| --strict | Fail on warning |

Sources: mcp-server/index.ts:150-200

Encoding Memories

#### Via REST API

curl -X POST http://127.0.0.1:7437/v1/encode \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Stripe returns HTTP 429 above 100 req/s",
    "source": "direct-observation",
    "tags": ["stripe", "rate-limit"]
  }'

#### Via Python SDK

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

Sources: python/README.md:25-45

Recalling Memories

#### Via REST API

curl -X POST http://127.0.0.1:7437/v1/recall \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "stripe rate limits",
    "limit": 5,
    "retrieval": "hybrid"
  }'

#### Via Python SDK

results = brain.recall("stripe rate limits", limit=5)

#### Available Recall Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | required | Search query |
| limit | number | 10 | Maximum results |
| budget_chars | number | 4000 | Context budget in characters |
| retrieval | string | "hybrid" | "hybrid" or "vector" mode |
| include_private | boolean | false | Include private memories |
| agent | string | - | Filter by agent name |

Sources: src/routes.ts:1-50

Getting Memory Capsule

The capsule endpoint returns a turn-sized memory packet containing relevant context:

curl -X POST http://127.0.0.1:7437/v1/capsule \
  -H "Authorization: Bearer secret" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "current deploy status",
    "budget_chars": 4000
  }'

The capsule contains sections:

  • recent_changes - Memories from recent window
  • must_follow - Critical rules
  • procedures - Step-by-step memories
  • user_preferences - User-stated preferences
  • risks - Warnings and risks
  • uncertain_or_disputed - Low-confidence items

Sources: src/capsule.ts:1-60

Check Health

curl http://127.0.0.1:7437/v1/status

The same check with the async Python client:

import asyncio
from audrey_memory import AsyncAudrey

async def main():
    async with AsyncAudrey(base_url="http://127.0.0.1:7437", api_key="secret") as brain:
        health = await brain.health()
        print(health)

asyncio.run(main())

Sources: python/README.md:50-70

Advanced CLI Commands

Dream - Consolidate Memory

audrey dream

Triggers the memory consolidation process.

Reembed - Rebuild Vector Indices

audrey reembed

Rebuilds embedding indices after schema changes.

Observe Tool

Record tool execution results:

audrey observe-tool --tool Bash --input '{"command": "npm test"}' --output '{"exitCode": 0}'

Impact Report

Generate memory impact analysis:

audrey impact --window-days 30 --limit 10

The report includes:

  • Memory counts by type (episodic, semantic, procedural)
  • Validated memories count
  • Outcome breakdown (helpful, wrong, used)
  • Top used memories
  • Weakest memories by salience
  • Recent activity

Sources: src/impact.ts:1-80

Promote - Extract Rules

Promote memory candidates to reviewable Markdown files:

audrey promote --yes

Rules are saved to .claude/rules/ with YAML front matter for traceability.

Check Status

audrey status

Displays:

  • Current mood (valence, arousal)
  • Memory counts
  • Learned principles
  • Recent memories
  • Unresolved threads

Sources: mcp-server/index.ts:200-280

MCP Server Tools

Audrey provides 20 tools via MCP stdio protocol:

| Tool | Purpose |
| --- | --- |
| memory_encode | Record new memories |
| memory_recall | Retrieve relevant memories |
| memory_capsule | Get turn-sized context packet |
| preflight_check | Validate before action |
| record_outcome | Record action results |
| promote_memory | Convert to persistent rule |
| impact_report | Analyze memory effectiveness |

Resources include:

  • status - System health
  • recent - Recent memories
  • principles - Semantic memories

Prompts include:

  • briefing - Current context
  • memory-recall - Search memories
  • memory-reflection - Self-analysis

Sources: README.md:60-90

Configuration Reference

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| AUDREY_PORT | 7437 | REST API port |
| AUDREY_HOST | 127.0.0.1 | Bind address |
| AUDREY_API_KEY | unset | Bearer token |
| AUDREY_DATA_DIR | ~/.audrey | Data directory |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import routes |
| AUDREY_PROMOTE_ROOTS | unset | Extra write roots |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_PROFILE | 0 | Emit per-stage timings |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Default capsule budget |

Sources: README.md:40-55

Privacy Controls

By default, private memories are ACL-protected:

  • include_private: true is restricted in the HTTP API
  • confidenceConfig overrides are blocked via sanitizeRecallOptions()

For full control, use the SDK directly or enable admin tools:

AUDREY_ENABLE_ADMIN_TOOLS=1 audrey serve

Sources: CHANGELOG.md:1-30

Deployment Options

Docker

docker run -p 7437:7437 \
  -e AUDREY_API_KEY=secret \
  -v audrey-data:/root/.audrey \
  audrey:latest

Docker Compose

Use the provided docker-compose.yml for persistent deployments with volume mounts.

Host-Specific Setup

Generate platform-specific MCP configurations:

audrey mcp-config --host claude-code
audrey mcp-config --host cursor
audrey mcp-config --host windsurf

Sources: README.md:80-100

Next Steps

  • Review audrey doctor output for any warnings
  • Configure AUDREY_API_KEY for production deployments
  • Set up MCP integration for your preferred IDE/agent
  • Explore memory types: episodic, semantic, procedural
  • Enable impact tracking to measure memory effectiveness

For detailed API documentation, see the REST API endpoints at /v1/* when the server is running.

Sources: [python/README.md:1-20](https://github.com/Evilander/Audrey/blob/main/python/README.md)

System Architecture

Related topics: Audrey Overview, Memory Model, Data Storage, MCP Server, REST API

Overview

Audrey is a local-first memory runtime designed to give AI agents persistent, queryable memory across sessions. It operates as a stateful infrastructure layer that records observations, consolidates principles, and provides memory-before-action checks through multiple interfaces.

The system is built around a closed-loop safety architecture where every tool action can be validated against stored memory before execution, returning allow, warn, or block decisions with supporting evidence.

Core Design Principles

| Principle | Description |
| --- | --- |
| Local-first | All data stored in local SQLite; no external database required |
| Agent-agnostic | Works with Codex, Claude Code, Cursor, Windsurf, VS Code, JetBrains, Ollama, and custom agents |
| Safety loop | Pre-action validation through Audrey Guard before tool execution |
| Isolation per tenant | One AUDREY_DATA_DIR per tenant/agent/isolation boundary |
| Privacy-by-default | Audrey Guard redacts tool traces; private memory ACL enforcement |

High-Level Component Architecture

graph TD
    subgraph "Agent Host"
        A[AI Agent<br/>Claude Code / Codex / Cursor]
    end
    
    subgraph "Audrey Runtime"
        CLI[CLI<br/>doctor, demo, guard,<br/>install, status, dream]
        MCP[MCP Server<br/>stdio interface]
        REST[REST API<br/>Hono server :7437]
        SDK[JS SDK<br/>TypeScript/Node]
        PY[Python SDK<br/>audrey-memory]
    end
    
    subgraph "Core Engine"
        AUD[Audrey Core<br/>encode, recall, consolidate,<br/>validate, impact]
        MEM[(Memory Store<br/>SQLite + sqlite-vec)]
        EMB[Embedding Engine<br/>ONNX runtime]
        CAUSAL[Causal Validation<br/>confidence scoring]
    end
    
    A <--> MCP
    A <--> REST
    CLI --> MCP
    CLI --> REST
    SDK --> REST
    PY --> REST
    AUD <--> MEM
    AUD <--> EMB
    AUD <--> CAUSAL

Component Specifications

MCP Server (`mcp-server/index.ts`)

The MCP stdio server provides 20+ tools, three resources (status, recent, principles), and three prompts (briefing, recall, reflection).

| Interface Type | Count | Purpose |
| --- | --- | --- |
| Tools | 20+ | encode, recall, capsule, guard, promote, impact, dream, reembed, observe-tool |
| Resources | 3 | status, recent, principles |
| Prompts | 3 | briefing, recall, reflection |

The server processes CLI arguments before entering stdio mode so that it can handle --help, --version, and subcommands such as install, mcp-config, and hook-config.

Sources: mcp-server/index.ts:1-100

REST API (`src/routes.ts`)

A Hono-based HTTP server exposes the following endpoints:

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /health | GET | Health check |
| /v1/encode | POST | Store memory with source, tags, salience |
| /v1/recall | POST | Retrieve relevant context |
| /v1/capsule | POST | Get turn-sized memory packet |
| /v1/status | GET | Runtime status |
| /v1/observe | POST | Record tool outcome |
| /v1/validate | POST | Validate memory usefulness |

Security: HTTP recall/capsule routes use sanitizeRecallOptions() to prevent private-memory ACL bypass via caller-supplied options. API key comparison uses crypto.timingSafeEqual to prevent timing attacks.

Sources: src/routes.ts:1-80

Audrey Core (`src/audrey.ts`)

The central engine handling memory operations:

graph LR
    ENC[encode] --> VEC[Vector Embedding]
    ENC --> DB[(SQLite)]
    REC[recall] --> VEC
    REC --> DB
    VEC --> EMB[Embedding Engine]
    DB --> CAUSAL[Causal Validation]
    CAUSAL --> CONF[Confidence Scoring]

Key operations:

  • encode(): Stores episodic, semantic, or procedural memory with vector embedding
  • recall(): Retrieves memories using hybrid (vector + FTS) search
  • consolidate(): Extracts principles from repeated evidence
  • decay(): Reduces authority of stale, low-confidence memories
  • beforeAction(): Guard check returning allow/warn/block
  • afterAction(): Records tool execution outcomes

Storage Layer (`src/db.ts`)

Storage is SQLite with the sqlite-vec extension for vector search.

| Feature | Configuration |
| --- | --- |
| Mode | WAL (Write-Ahead Logging) |
| Concurrency | No advisory lock; single writer per AUDREY_DATA_DIR |
| Indexing | sqlite-vec for vector similarity; FTS for full-text |
| Isolation | One directory per tenant required |

The AUDREY_PRAGMA_DEFAULTS environment variable (default 1) applies custom PRAGMA tuning. Set to 0 to revert to better-sqlite3 defaults.

Embedding Engine (`src/embedding.ts`)

By default, embeddings are computed locally with ONNX Runtime, without external API calls.

| Feature | Behavior |
| --- | --- |
| Warmup | Background embedding warmup at MCP boot (skippable with AUDREY_DISABLE_WARMUP=1) |
| Cold-start | First encode: ~525ms cold, ~28ms warm |
| Verbosity | AUDREY_ONNX_VERBOSE=1 restores EP-assignment warnings |
| Reuse | Validation, interference, affect resonance reuse main content vector |

Performance targets (v0.22.0):

  • Encode p50: 15.2ms (40% faster than prior)
  • Hybrid recall p50: 14.3ms (2.1x faster)
  • Embedding reuse eliminated 3 of 4 redundant calls

Memory Model Architecture

graph TD
    subgraph "Memory Types"
        EPI[Episodic<br/>Specific observations,<br/>tool results, facts]
        SEM[Semantic<br/>Consolidated principles<br/>from evidence]
        PROC[Procedural<br/>Remembered ways to act,<br/>avoid, retry, verify]
    end
    
    subgraph "Memory Properties"
        AFF[Affect & Salience<br/>Emotional weight, importance]
        DEC[Interference & Decay<br/>Stale/conflicting lose authority]
        CON[Contradiction Handling<br/>Competing claims tracked]
    end
    
    EPI --> AFF
    SEM --> AFF
    PROC --> AFF
    EPI --> DEC
    SEM --> CON
    CON --> DEC

Memory Types

| Type | Description | Example |
| --- | --- | --- |
| Episodic | Specific observations, tool results, session facts | "Stripe returns HTTP 429 above 100 req/s" |
| Semantic | Consolidated principles from repeated evidence | "Always check rate limits before batch operations" |
| Procedural | Remembered ways to act, avoid, retry, verify | "Retry with exponential backoff on network failures" |

Capsule Generation (`src/capsule.ts`)

The capsule endpoint assembles a turn-sized memory packet with sections:

graph TD
    CAP[POST /v1/capsule] --> SEC[Section Assigner]
    SEC --> R[recent_changes<br/>Created/reinforced recently]
    SEC --> M[must_follow<br/>Critical rules]
    SEC --> P[procedures<br/>Procedural memories]
    SEC --> U[user_preferences<br/>Stated or tagged preferences]
    SEC --> RK[risks<br/>Warnings and recent failures]
    SEC --> UN[uncertain_or_disputed<br/>Disputed or low-confidence]

Each entry includes a reason field explaining why it was included. Recent tool failures (last 7 days) are automatically added to risks when includeRisks is enabled.

Audrey Guard Safety Loop

sequenceDiagram
    participant Agent
    participant Guard as Audrey Guard
    participant Memory as Memory Store
    participant LLM as LLM Provider
    
    Agent->>Guard: tool + action
    Guard->>Memory: recall(relevant)
    Memory-->>Guard: context entries
    Guard->>LLM: preflight check
    LLM-->>Guard: decision + evidence
    Guard-->>Agent: allow/warn/block + reasoning
    Agent->>Guard: outcome (success/failure/wrong)
    Guard->>Memory: record outcome

Guard Modes

| Mode | Behavior |
| --- | --- |
| allow | Action proceeds normally |
| warn | Action allowed but user notified |
| block | Action prevented with evidence |
| caution | Maps to warn display |

CLI usage:

audrey guard --tool Bash "npm run deploy"
audrey guard --hook --fail-on-warn  # For hook integration

Validation Pipeline

The causal validation system (via src/causal.ts and src/validate.ts) evaluates whether stored memories actually helped:

  1. Confidence scoring uses the reinforcement formula from confidence.ts
  2. Evidence tracking updates usage_count and last_used_at
  3. Outcome classification: used, helpful, wrong
  4. Impact metrics aggregated by memory type

Deployment Architecture

graph LR
    subgraph "Deployment Options"
        NPM[npm package<br/>npx audrey]
        DOCKER[Docker<br/>audrey-runtime]
        COMPOSE[Docker Compose<br/>Full stack]
        HOST[MCP Config<br/>Host-specific]
    end
    
    subgraph "Environment"
        ENV1[AUDREY_DATA_DIR]
        ENV2[AUDREY_LLM_PROVIDER]
        ENV3[AUDREY_EMBEDDING_PROVIDER]
        ENV4[AUDREY_API_KEY]
    end
    
    NPM --> ENV1
    DOCKER --> ENV1
    COMPOSE --> ENV1
    HOST --> ENV2
    HOST --> ENV3

Interface Options by Agent

| Agent | Integration |
| --- | --- |
| Claude Code | npx audrey install --host claude-code + hook-config |
| Claude Desktop | MCP config via npx audrey mcp-config generic |
| Codex | MCP config via npx audrey mcp-config codex |
| Cursor | MCP config |
| Windsurf | MCP config |
| VS Code | MCP config |
| JetBrains | MCP config |
| Ollama | MCP config |
| Custom | REST API or JS/Python SDK |

REST Sidecar Security

| Configuration | Bind Address | Auth Required |
| --- | --- | --- |
| Default | 127.0.0.1:7437 | No (loopback) |
| Production | 0.0.0.0:7437 | AUDREY_API_KEY required |
| Unsafe override | Any host | AUDREY_ALLOW_NO_AUTH=1 (not recommended) |

Setting the AUDREY_HOST environment variable to a non-loopback address explicitly opts in to network exposure.
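
The bind and authentication rules in the table reduce to a short startup check. The sketch below is a hypothetical reimplementation in Python for illustration; the function names and return labels are assumptions, and hostnames other than `localhost` are conservatively treated as non-loopback.

```python
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """True for localhost and loopback IP addresses such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are treated as non-loopback


def check_bind(host: str, api_key: Optional[str] = None, allow_no_auth: bool = False) -> str:
    """Return why the bind is acceptable, or raise if it must be refused."""
    if is_loopback(host):
        return "loopback"
    if api_key:
        return "api_key"
    if allow_no_auth:
        return "no_auth_override"  # permitted but not recommended
    raise ValueError("non-loopback bind requires AUDREY_API_KEY or AUDREY_ALLOW_NO_AUTH=1")


print(check_bind("127.0.0.1"), check_bind("0.0.0.0", api_key="secret"))
```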

CLI Architecture

graph TD
    CLI[audrey CLI] --> PARSE[Argument Parser]
    PARSE --> KNOWN[Known Subcommands]
    KNOWN --> INSTALL[install]
    KNOWN --> UNINSTALL[uninstall]
    KNOWN --> MCP[mcp-config]
    KNOWN --> HOOK[hook-config]
    KNOWN --> DOCTOR[doctor]
    KNOWN --> DEMO[demo]
    KNOWN --> GUARD[guard]
    KNOWN --> DREAM[dream]
    KNOWN --> REEMBED[reembed]
    KNOWN --> PROMOTE[promote]
    KNOWN --> IMPACT[impact]
    KNOWN --> UNKNOWN[Unknown/No subcommand]
    
    UNKNOWN --> TTY{Human TTY?}
    TTY -->|Yes| HELP[Print help]
    TTY -->|No| MCP_SERVER[Start MCP server]
    
    INSTALL --> HOST[Host-specific config]
    HOOK --> APPLY[Apply hooks to settings]
    PROMOTE --> WRITES[Write to project files]
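The dispatch logic in the diagram above can be sketched as a pure function. This is an illustrative sketch under assumptions: the function name and return shape are hypothetical, and the subcommand list is copied from the diagram rather than from the CLI source.

```typescript
// Subcommands as listed in the diagram; the real parser may accept more.
const KNOWN_SUBCOMMANDS_SKETCH: readonly string[] = [
  "install", "uninstall", "mcp-config", "hook-config", "doctor",
  "demo", "guard", "dream", "reembed", "promote", "impact",
];

type DispatchSketch =
  | { kind: "subcommand"; subcommand: string }
  | { kind: "help" }
  | { kind: "mcp-server" };

// argv holds the arguments after the executable name; isTty says whether a human terminal is attached.
function dispatchCliSketch(argv: string[], isTty: boolean): DispatchSketch {
  const first = argv.length > 0 ? argv[0] : undefined;
  if (first !== undefined && KNOWN_SUBCOMMANDS_SKETCH.indexOf(first) !== -1) {
    return { kind: "subcommand", subcommand: first };
  }
  // Unknown or missing subcommand: humans get help, non-TTY callers get the MCP server.
  return isTty ? { kind: "help" } : { kind: "mcp-server" };
}
```

Falling back to the MCP server when no TTY is attached lets an MCP host launch the binary with no arguments, while a person typing the bare command still sees help text.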

Key CLI Commands

| Command | Purpose |
| --- | --- |
| audrey doctor | Diagnose configuration issues |
| audrey status | Show runtime health |
| audrey demo | Run interactive demonstration |
| audrey guard | Check action against memory |
| audrey install | Register Audrey with host |
| audrey mcp-config | Generate MCP server configuration |
| audrey hook-config | Generate agent hook configuration |
| audrey dream | Trigger consolidation and decay |
| audrey reembed | Re-embed all memories |
| audrey promote | Write memories to project rules |
| audrey impact | Show memory effectiveness report |

Configuration Environment Variables

| Variable | Default | Purpose |
| --- | --- | --- |
| AUDREY_DATA_DIR | System temp | Memory storage directory |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_PORT | 7437 | REST sidecar port |
| AUDREY_API_KEY | unset | Bearer token for non-loopback |
| AUDREY_LLM_PROVIDER | Configured | LLM for causal/validation |
| AUDREY_EMBEDDING_PROVIDER | Configured | Embedding generation |
| AUDREY_EMBEDDING_MODEL | Configured | Model name for embeddings |
| AUDREY_EMBEDDING_DIM | Configured | Vector dimensions |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Capsule character budget |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_DEBUG | 0 | Enable MCP debug logs |
| AUDREY_PROFILE | 0 | Emit per-stage timings |
| AUDREY_PROMOTE_ROOTS | unset | Allowed write roots for promote |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget |

SDK Architecture

JavaScript SDK

Import directly from the audrey package in TypeScript or Node:

import Audrey from 'audrey';

const brain = new Audrey({ 
  baseUrl: 'http://127.0.0.1:7437',
  agent: 'support-agent'
});

await brain.encode('Deploy failed due to OOM', { 
  source: 'direct-observation' 
});

const results = await brain.recall('deploy failure', { limit: 5 });

Python SDK (`audrey-memory`)

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

Async clients are available via AsyncAudrey (asyncio).

Release Readiness Gates

graph LR
    GATE[release:gate] --> CI[CI Workflow]
    GATE --> PY[python:release:check]
    GATE --> BENCH[bench:guard:*]
    GATE --> DOCTOR[audrey doctor]
    GATE --> DEMO[audrey demo]

| Check | Purpose |
| --- | --- |
| npm run release:gate | Full release readiness checklist |
| npm run python:release:check | Python artifact verification |
| npm run bench:guard:card | Guard benchmark suite |
| npm run bench:guard:validate | Validation benchmarks |
| npx audrey doctor | Runtime diagnostics |
| npx audrey demo | Functional verification |

Data Flow: Encode to Recall

sequenceDiagram
    participant Client
    participant API as REST API<br/>/v1/encode
    participant Audrey as Audrey Core
    participant Embed as Embedding Engine
    participant DB as SQLite + vec
    
    Client->>API: POST /v1/encode<br/>content, source, tags
    API->>Audrey: encode(content, options)
    Audrey->>Embed: generateEmbedding(content)
    Embed-->>Audrey: vector[1536]
    Audrey->>DB: INSERT memory + vector
    DB-->>Audrey: memory_id
    Audrey-->>API: { id, confidence, ... }
    API-->>Client: { id, ... }

sequenceDiagram
    participant Client
    participant API as REST API<br/>/v1/recall
    participant Audrey as Audrey Core
    participant Embed as Embedding Engine
    participant DB as SQLite + vec
    participant Causal as Causal Validator
    
    Client->>API: POST /v1/recall<br/>query, limit, scope
    API->>Audrey: recall(query, options)
    Audrey->>Embed: generateEmbedding(query)
    Embed-->>Audrey: query_vector
    Audrey->>DB: hybrid search<br/>vector_similarity + FTS
    DB-->>Audrey: [entries]
    Audrey->>Causal: score(entries)
    Causal-->>Audrey: [scored_entries]
    Audrey-->>API: { results, ... }
    API-->>Client: { results, ... }

Key Architectural Decisions

| Decision | Rationale |
| --- | --- |
| Local-only storage | Eliminates dependency on external services; ensures data isolation |
| SQLite + sqlite-vec | Proven reliability, no separate vector DB required |
| WAL mode without advisory lock | Performance for single-process; isolation required for multi-agent |
| Separate AUDREY_DATA_DIR per tenant | Hard isolation boundary; prevents cross-tenant contamination |
| REST sidecar defaulting to loopback | Security by default; non-loopback requires explicit opt-in |
| Embedding warmup at boot | Eliminates cold-start penalty (~18.7x improvement) |
| Closed-loop validation | Closed feedback loop lifts autopilot ALIVE dimension |

Source: https://github.com/Evilander/Audrey / Human Manual

Memory Model

Related topics: System Architecture, Core Memory Operations, Preflight and Reflexes

Audrey's Memory Model is a cognitive-inspired system that provides AI agents with persistent, evolving memory capabilities. Unlike simple vector databases, it implements a multi-layered memory architecture that mirrors human memory structures—episodic, semantic, and procedural—while incorporating affect, salience, and decay mechanisms to ensure memories remain relevant and actionable.

Architecture Overview

The Memory Model consists of several interconnected subsystems that work together to store, retrieve, consolidate, and forget information over time.

graph TD
    A[User/Agent Input] --> B[Episodic Memory]
    B --> C[Consolidation Process]
    C --> D[Semantic Memory]
    C --> E[Procedural Memory]
    D --> F[Confidence Scoring]
    E --> F
    B --> G[Affect Module]
    F --> G
    G --> H[Salience Calculation]
    H --> I[Recall Ranking]
    I --> J[Preflight Check]
    J --> K[Guard Decision]
    L[Interference] -.->F
    L -.->I
    M[Decay Engine] -.->D
    M -.->E

Sources: README.md

Memory Types

Audrey distinguishes between three primary memory types, each serving a distinct role in agent cognition.

Episodic Memory

Episodic memory stores specific observations, tool results, preferences, and session facts. These are the raw recordings of events and interactions that agents experience directly.

| Property | Description |
| --- | --- |
| memory_type | episode |
| source | direct-observation, told-by-user, retrieved |
| confidence | Initial high confidence that decays over time |
| retrieval_count | Number of times this memory was recalled |

Sources: src/capsule.ts

Semantic Memory

Semantic memory represents consolidated principles extracted from repeated evidence. These memories encode general knowledge and learned rules that persist beyond specific sessions.

| Property | Description |
| --- | --- |
| memory_type | semantic |
| confidence | Derived from supporting episode frequency |
| supporting_count | Number of episodes supporting this principle |
| challenge_count | Number of contradictory episodes |

Sources: src/causal.ts

Procedural Memory

Procedural memory stores remembered ways to act, avoid, retry, or verify. These encode action patterns and procedures that agents have learned through experience.

| Property | Description |
| --- | --- |
| memory_type | procedural |
| tags | procedure, retry, avoid, verify |
| confidence | Reinforced by successful outcomes |

Sources: src/capsule.ts

Confidence System

The confidence system is the foundational mechanism that determines memory reliability and recall priority. It incorporates multiple signals including recency, reinforcement, and affect.

Confidence Calculation

graph LR
    A[Base Confidence] --> B[Recency Decay]
    B --> C[Reinforcement Boost]
    C --> D[Affect Adjustment]
    D --> E[Interference Penalty]
    E --> F[Final Confidence]

Sources: src/confidence.ts

Recency Decay

Memory confidence decreases over time through a half-life decay mechanism. Memories become less authoritative unless reinforced through retrieval or validation.

// From src/confidence.ts
recencyDecay(halfLifeDays: number, createdAt: Date): number

| Parameter | Type | Description |
| --- | --- | --- |
| halfLifeDays | number | Days until confidence halves |
| createdAt | Date | Memory creation timestamp |

The decay function throws RangeError when halfLifeDays <= 0 to prevent NaN or Infinity results.

Sources: src/confidence.ts
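To make the half-life behavior concrete, here is a minimal sketch of an exponential half-life factor. It is an assumption that Audrey uses exactly the form 0.5^(age / halfLife); the function name and the extra now parameter are hypothetical and exist only so the example is deterministic.

```typescript
const MS_PER_DAY_SKETCH = 24 * 60 * 60 * 1000;

// Returns a multiplier in (0, 1]: 1 for a brand-new memory, 0.5 after one half-life.
// Assumes createdAt is a valid Date; invalid timestamps are not handled here.
function recencyDecaySketch(halfLifeDays: number, createdAt: Date, now: Date = new Date()): number {
  // Mirrors the documented guard: a non-positive half-life would yield NaN or Infinity.
  // The negated comparison also rejects NaN, which is an addition of this sketch.
  if (!(halfLifeDays > 0)) {
    throw new RangeError(`halfLifeDays must be > 0, got ${halfLifeDays}`);
  }
  const ageDays = Math.max(0, (now.getTime() - createdAt.getTime()) / MS_PER_DAY_SKETCH);
  return Math.pow(0.5, ageDays / halfLifeDays);
}
```

With a 30-day half-life, a memory created 30 days ago gets a factor of 0.5 and one created 60 days ago gets 0.25, so the factor can be multiplied into a base confidence directly.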

Reinforcement Formula

Validation outcomes reinforce or diminish memory confidence through the feedback loop:

| Outcome | Effect |
| --- | --- |
| helpful | Increases salience, bumps retrieval_count for semantic/procedural |
| wrong | Decreases salience, bumps challenge_count for semantic |
| used | Neutral signal with smaller salience delta |

The math reuses the existing reinforcement formula from confidence.ts.

Sources: CHANGELOG.md
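The outcome table can be sketched as a pure update function. The delta values below are invented for illustration and are not Audrey's real constants; the type and function names are likewise hypothetical. Only the direction of each effect and the per-type counter rules come from the table.

```typescript
type OutcomeSketch = "helpful" | "wrong" | "used";
type MemoryKindSketch = "episodic" | "semantic" | "procedural";

interface MemoryStatsSketch {
  type: MemoryKindSketch;
  salience: number; // kept in [0, 1]
  retrievalCount: number;
  challengeCount: number;
}

// Hypothetical deltas: "used" moves salience less than "helpful" or "wrong".
const SALIENCE_DELTA_SKETCH: Record<OutcomeSketch, number> = {
  helpful: 0.1,
  wrong: -0.1,
  used: 0.02,
};

function applyOutcomeSketch(m: MemoryStatsSketch, outcome: OutcomeSketch): MemoryStatsSketch {
  const salience = Math.min(1, Math.max(0, m.salience + SALIENCE_DELTA_SKETCH[outcome]));
  // retrieval_count is bumped only for semantic/procedural memories on "helpful".
  const bumpRetrieval = outcome === "helpful" && m.type !== "episodic";
  // challenge_count is bumped only for semantic memories on "wrong".
  const bumpChallenge = outcome === "wrong" && m.type === "semantic";
  return {
    type: m.type,
    salience,
    retrievalCount: m.retrievalCount + (bumpRetrieval ? 1 : 0),
    challengeCount: m.challengeCount + (bumpChallenge ? 1 : 0),
  };
}
```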

Consolidation System

Consolidation transforms episodic memories into semantic and procedural knowledge through periodic processing, often called "dream" mode.

Consolidation Workflow

graph TD
    A[Nightly Dream Process] --> B[Identify Repeated Episodes]
    B --> C[Extract Common Patterns]
    C --> D[Generate Semantic Principles]
    C --> E[Extract Procedures]
    D --> F[Create New Semantic Memory]
    E --> G[Create/Update Procedural Memory]
    F --> H[Link Supporting Episodes]
    G --> H

Sources: README.md

Consolidation Implementation

The consolidation process runs through memory_dream and is scheduled to ensure that consolidation and decay remain current.

// From src/consolidate.ts - conceptual interface
async function consolidate(audrey: Audrey, options?: ConsolidateOptions): Promise<ConsolidationResult>

Consolidation runs its SELECT queries inside the surrounding transaction, so concurrent writers cannot slip rows in or out between the read and the write.

Sources: CHANGELOG.md

Decay Engine

The decay engine implements forgetting curves that reduce memory authority over time, ensuring stale information doesn't dominate recall.

Decay Mechanism

graph LR
    A[Time Passes] --> B{Still Being Used?}
    B -->|Yes| C[Decay Paused]
    B -->|No| D[Gradual Decay]
    D --> E[Confidence Decreases]
    E --> F[Memory Becomes Less Authoritative]

Sources: src/decay.ts

Decay Parameters

| Parameter | Default | Purpose |
| --- | --- | --- |
| halfLifeDays | Configurable | Base decay rate |
| minConfidence | 0.1 | Floor value |
| decayEnabled | true | Global on/off |

Decay applies to semantic and procedural memories differently, with semantic memories decaying faster unless reinforced.

Sources: src/decay.ts

Affect and Salience

Affect (emotional weight and importance) influences salience, determining which memories demand attention and which fade into background knowledge.

Affect Module

graph TD
    A[Memory Event] --> B[Detect Emotional Signals]
    B --> C[Calculate Valence]
    B --> D[Calculate Arousal]
    C --> E[Determine Mood State]
    D --> E
    E --> F[Affect Boost/Penalty]
    F --> G[Effective Salience]

Sources: src/affect.ts

Salience Calculation

Effective salience is clamped to the range [0, 1] to prevent unbounded values from extreme arousal boosts. The formula considers:

  • Memory type (episodic, semantic, procedural)
  • Confidence level
  • Recency
  • Emotional valence and arousal

// From src/affect.ts
effectiveSalience(baseSalience: number, arousalBoost: number): number

The timeDeltaDays function guards against invalid created_at timestamps, so NaN does not propagate.

Sources: src/affect.ts
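A minimal sketch of the two guards described in this section follows. Combining base salience and arousal boost by addition is an assumption, as are the function names with the Sketch suffix and the choice to fall back to 0 for non-finite input or an unparseable timestamp.

```typescript
// Clamp the boosted salience into [0, 1] so extreme arousal cannot produce unbounded values.
function effectiveSalienceSketch(baseSalience: number, arousalBoost: number): number {
  const raw = baseSalience + arousalBoost; // assumption: additive boost
  if (!Number.isFinite(raw)) return 0;     // assumption: non-finite input collapses to 0
  return Math.min(1, Math.max(0, raw));
}

// Days elapsed since createdAt; an unparseable timestamp yields 0 instead of NaN.
function timeDeltaDaysSketch(createdAt: string, now: Date): number {
  const createdMs = Date.parse(createdAt);
  if (Number.isNaN(createdMs)) return 0;
  return Math.max(0, (now.getTime() - createdMs) / (24 * 60 * 60 * 1000));
}
```

Handling the invalid cases at the edge keeps every downstream ranking term finite, which matters because a single NaN would otherwise poison any weighted sum it enters.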

Interference Handling

Interference prevents conflicting or competing memories from silently overwriting each other, maintaining an accurate picture of contradictory knowledge.

Interference Types

graph TD
    A[New Memory] --> B{Conflicting Memory Exists?}
    B -->|Yes| C[Track Contradiction]
    B -->|No| D[Normal Storage]
    C --> E[Disputed State]
    E --> F[Monitor Both]
    F --> G[Resolution Through Validation]

Sources: src/interference.ts

Memory States for Contradictions

| State | Description |
| --- | --- |
| active | Default stable state |
| disputed | Competing claims detected |
| context_dependent | Truth depends on context |
| superseded | Older knowledge replaced |

When memories have contradictory content, both are preserved with appropriate states rather than silently overwriting.

Sources: src/capsule.ts
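One way to picture these states is as a small state machine. The sketch below is hypothetical: the four state names come from the table, but the transition rules, the resolution labels, and all type and function names are assumptions chosen to illustrate "preserve both, then resolve".

```typescript
type MemoryStateSketch = "active" | "disputed" | "context_dependent" | "superseded";

interface ClaimSketch {
  id: string;
  content: string;
  state: MemoryStateSketch;
}

// When a contradiction is detected, both claims are kept and marked disputed.
function recordContradictionSketch(existing: ClaimSketch, incoming: ClaimSketch): [ClaimSketch, ClaimSketch] {
  return [{ ...existing, state: "disputed" }, { ...incoming, state: "disputed" }];
}

type ResolutionSketch = "incoming_wins" | "existing_wins" | "both_valid";

// Later validation settles the dispute without deleting either claim.
function resolveContradictionSketch(
  pair: [ClaimSketch, ClaimSketch],
  resolution: ResolutionSketch,
): [ClaimSketch, ClaimSketch] {
  const [existing, incoming] = pair;
  switch (resolution) {
    case "incoming_wins":
      return [{ ...existing, state: "superseded" }, { ...incoming, state: "active" }];
    case "existing_wins":
      return [{ ...existing, state: "active" }, { ...incoming, state: "superseded" }];
    case "both_valid":
      return [{ ...existing, state: "context_dependent" }, { ...incoming, state: "context_dependent" }];
  }
}
```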

Causal Inference

The causal module extracts cause-effect relationships from episodic memory patterns, enabling agents to understand why certain actions lead to certain outcomes.

Causal Analysis

// From src/causal.ts - conceptual interface
async function analyzeCausalLinks(episodes: Episode[]): Promise<CausalRelationship[]>

The causal module validates LLM response shapes before reading fields and rejects non-finite confidence values.

Sources: src/causal.ts
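Shape validation of an LLM response can be sketched as a narrow parser that returns null on anything unexpected. The field names mirror the causal memory properties documented on this page, but the function name and the exact set of checked fields are assumptions, not the contents of src/causal.ts.

```typescript
interface CausalLinkSketch {
  cause_id: string;
  effect_id: string;
  confidence: number;
}

// Accepts untrusted JSON (for example, parsed LLM output) and returns a typed link or null.
function parseCausalLinkSketch(raw: unknown): CausalLinkSketch | null {
  if (typeof raw !== "object" || raw === null) return null;
  const rec = raw as Record<string, unknown>;
  const cause_id = rec["cause_id"];
  const effect_id = rec["effect_id"];
  const confidence = rec["confidence"];
  if (typeof cause_id !== "string" || typeof effect_id !== "string") return null;
  // Reject NaN and +/-Infinity so they never reach confidence math.
  if (typeof confidence !== "number" || !Number.isFinite(confidence)) return null;
  return { cause_id, effect_id, confidence };
}
```

Checking the shape before reading fields means a malformed model reply is dropped at the boundary rather than surfacing later as an undefined-property error or a NaN score.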

Causal Memory Properties

| Property | Description |
| --- | --- |
| cause_id | Memory that triggers outcome |
| effect_id | Resulting memory |
| confidence | Causal link strength |
| evidence_count | Episodes supporting this link |

Validation Feedback Loop

The closed-loop feedback system enables continuous improvement of memory accuracy through agent validation.

Validation Flow

graph TD
    A[Memory Recall] --> B[Agent Uses Memory]
    B --> C[Validation Request]
    C --> D{Helpful?}
    D -->|Yes| E[Reinforce: helpful]
    D -->|No| F{Wrong?}
    F -->|Yes| G[Diminish: wrong]
    F -->|No| H[Mark: used]
    E --> I[Update Salience & Stats]
    G --> I
    H --> I

Sources: CHANGELOG.md

Validation API

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/validate | POST | Canonical validation endpoint |
| /v1/mark-used | POST | Legacy alias (defaults to outcome=used) |

The memory_validate MCP tool accepts outcomes: helpful, wrong, or used.

Sources: CHANGELOG.md

Recall and Retrieval

Memory recall uses hybrid retrieval combining vector similarity and full-text search to balance precision and recall.

Retrieval Modes

| Mode | Description |
| --- | --- |
| hybrid | Vector similarity + FTS (default) |
| vector | FTS-bypass fast path |

The hybrid mode has been the default since v0.22.0. It replaced the removed hybrid_strict mode, which was a silent alias with no behavioral difference.

Sources: CHANGELOG.md

Recall Factors

When ranking results, Audrey considers:

  1. Semantic similarity - Vector distance from query
  2. Recency - Time since creation or last retrieval
  3. Confidence - Current confidence score
  4. Salience - Effective importance (affect-adjusted)
  5. Agent relevance - Scope and ownership
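The five factors above can be combined in many ways; a weighted sum is the simplest to illustrate. The weights, the assumption that every factor is pre-normalized to [0, 1], and all names below are hypothetical and are not taken from Audrey's ranking code.

```typescript
interface RecallCandidateSketch {
  id: string;
  similarity: number;     // vector similarity to the query
  recency: number;        // 1 = just created or retrieved, 0 = very old
  confidence: number;     // current confidence score
  salience: number;       // affect-adjusted importance
  agentRelevance: number; // 1 = owned by or scoped to the requesting agent
}

// Hypothetical weights that sum to 1.
const RANK_WEIGHTS_SKETCH = {
  similarity: 0.4,
  recency: 0.15,
  confidence: 0.2,
  salience: 0.15,
  agentRelevance: 0.1,
};

function rankScoreSketch(c: RecallCandidateSketch): number {
  const w = RANK_WEIGHTS_SKETCH;
  return (
    w.similarity * c.similarity +
    w.recency * c.recency +
    w.confidence * c.confidence +
    w.salience * c.salience +
    w.agentRelevance * c.agentRelevance
  );
}

// Returns a new array sorted best-first; the input is left untouched.
function rankCandidatesSketch(cands: RecallCandidateSketch[]): RecallCandidateSketch[] {
  return cands.slice().sort((a, b) => rankScoreSketch(b) - rankScoreSketch(a));
}
```

In this sketch a slightly less similar memory can outrank a closer match when it is fresher and more trusted, which is the point of ranking on more than vector distance.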

Tool-Trace Learning

Audrey learns from tool execution traces, converting tool results into memory events that inform future actions.

Tool-Trace Memory Cycle

graph TD
    A[Tool Execution] --> B[Capture Tool Trace]
    B --> C[Extract Results & Errors]
    C --> D{Successful?}
    D -->|Yes| E[Encode Success Pattern]
    D -->|No| F[Encode Failure Pattern]
    E --> G[Episodic Memory]
    F --> G
    G --> H[Consolidation]
    H --> I[Procedural Memory]

The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes.

Sources: README.md

Memory Capsule

The Memory Capsule provides a turn-sized memory packet containing categorized sections relevant to the current context.

Capsule Sections

| Section | Content |
| --- | --- |
| must_follow | Trusted rules and critical constraints |
| risks | Identified dangers and warnings |
| procedures | Known action procedures |
| user_preferences | Stated and inferred preferences |
| uncertain_or_disputed | Contested or low-confidence knowledge |
| recent_changes | Freshly updated memories |
| project_facts | Default for semantic/episodic |

Sources: src/capsule.ts

Capsule Generation

Capsule sections are determined by memory type, tags, source trust level, state, confidence, and recency:

// From src/capsule.ts
determineSections(
  entry: MemoryEntry,
  result: RecallResult,
  tags: string[],
  recentWindowMs: number
): Array<keyof MemoryCapsule['sections']>

Trusted sources include direct-observation and told-by-user; these can populate must_follow sections.

Sources: src/capsule.ts
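The following simplified sketch shows how such a section assignment might look. It is not the determineSections implementation: the entry shape, the tag names ("must-follow", "risk", "procedure", "preference"), and the 0.5 low-confidence threshold are assumptions; only the section names and the trusted-source rule come from this page.

```typescript
type CapsuleSectionSketch =
  | "must_follow" | "risks" | "procedures" | "user_preferences"
  | "uncertain_or_disputed" | "recent_changes" | "project_facts";

interface CapsuleEntrySketch {
  memoryType: "episodic" | "semantic" | "procedural";
  source: string;
  state: string;
  confidence: number;
  createdAtMs: number;
  tags: string[];
}

const TRUSTED_SOURCES_SKETCH = ["direct-observation", "told-by-user"];

function determineSectionsSketch(e: CapsuleEntrySketch, nowMs: number, recentWindowMs: number): CapsuleSectionSketch[] {
  const out: CapsuleSectionSketch[] = [];
  const has = (t: string): boolean => e.tags.indexOf(t) !== -1;
  const trusted = TRUSTED_SOURCES_SKETCH.indexOf(e.source) !== -1;

  // Only trusted sources may populate must_follow.
  if (trusted && has("must-follow")) out.push("must_follow");
  if (has("risk")) out.push("risks");
  if (e.memoryType === "procedural" || has("procedure")) out.push("procedures");
  if (has("preference")) out.push("user_preferences");
  if (e.state === "disputed" || e.confidence < 0.5) out.push("uncertain_or_disputed");
  if (nowMs - e.createdAtMs <= recentWindowMs) out.push("recent_changes");
  // Anything that matched no specific section falls back to project_facts.
  if (out.length === 0) out.push("project_facts");
  return out;
}
```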

Guard Integration

The Memory Guard uses the Memory Model to enforce pre-action checks, returning allow, warn, or block decisions with evidence.

Guard Decision Flow

graph TD
    A[Action Request] --> B[Preflight Check]
    B --> C[Recall Relevant Memory]
    C --> D[Apply Reflexes]
    D --> E{Blocking Reflex?}
    E -->|Yes| F[BLOCK]
    E -->|No| G{Warning Reflex?}
    G -->|Yes| H[WARN]
    G -->|No| I[ALLOW]

The Guard decision reuses existing preflight and reflex machinery without performing two independent recall passes.

Sources: CHANGELOG.md
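The decision flow above reduces to a priority check over the reflexes produced by a single preflight pass. In this sketch the reflex shape and function name are hypothetical; the block-before-warn-before-allow ordering is the part taken from the diagram.

```typescript
type ReflexResponseSketch = "block" | "warn" | "guide";

interface ReflexSketch {
  response_type: ReflexResponseSketch;
  evidenceId: string; // hypothetical reference to the memory that triggered the reflex
}

interface GuardVerdictSketch {
  decision: "block" | "warn" | "allow";
  evidence: string[];
}

function decideFromReflexesSketch(reflexes: ReflexSketch[]): GuardVerdictSketch {
  // Any blocking reflex wins outright.
  const blocks = reflexes.filter(r => r.response_type === "block");
  if (blocks.length > 0) return { decision: "block", evidence: blocks.map(r => r.evidenceId) };
  // Otherwise any warning reflex downgrades the verdict to warn.
  const warns = reflexes.filter(r => r.response_type === "warn");
  if (warns.length > 0) return { decision: "warn", evidence: warns.map(r => r.evidenceId) };
  // Guide-only or empty reflex sets allow the action.
  return { decision: "allow", evidence: [] };
}
```

Because the verdict is derived from reflexes that were already computed, the guard does not need a second recall pass to produce its evidence list.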

Summary

Audrey's Memory Model provides a comprehensive cognitive architecture for AI agents:

  • Multi-type storage with episodic, semantic, and procedural memories
  • Dynamic confidence that evolves through use and validation
  • Consolidation that transforms experience into knowledge
  • Decay that prevents stale information from dominating
  • Affect that weights memories by emotional importance
  • Interference tracking that maintains truth in the face of contradictions
  • Causal inference that extracts cause-effect relationships
  • Closed-loop validation that continuously improves accuracy

This architecture ensures agents remember what matters, forget what doesn't, and maintain coherent, actionable knowledge across sessions.

Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)

Audrey Guard

Related topics: Core Memory Operations, Preflight and Reflexes

Overview

Audrey Guard is the headline memory loop in the Audrey system—a memory-before-action enforcement mechanism that checks AI agents' intended operations against accumulated memory before execution. It serves as a firewall layer that can allow, warn, or block tool invocations based on historical evidence, prior failures, project rules, and risk patterns.

The Guard operates by retrieving relevant memories through semantic recall, evaluating them against the proposed action, and returning a structured decision with supporting evidence. This enables agents to avoid repeating past mistakes, respect project-specific rules, and make informed decisions grounded in durable context.

Sources: README.md

Purpose and Scope

Audrey Guard addresses a fundamental problem: agents forget the exact mistakes they made yesterday. They repeat broken commands, lose project-specific rules, miss contradictions, and treat every new session like a cold start.

Guard's scope encompasses:

| Concern | Description |
| --- | --- |
| Failure Prevention | Block or warn on repeated failures identified through memory_recall |
| Risk Awareness | Surface prior failures, risks, and warnings as preflight evidence |
| Rule Enforcement | Check must-follow rules and procedures before action |
| Evidence Generation | Return structured decisions with provenance metadata |
| Closed-Loop Validation | Validate whether the memory helped after action execution |

Sources: README.md

Architecture

High-Level Components

graph TD
    A[Agent Tool Call] --> B[Audrey Guard]
    B --> C[memory_preflight]
    C --> D[memory_recall]
    D --> E[SQLite Store<br/>Episodic + Semantic + Procedural]
    C --> F[Rule Evaluation]
    F --> G[Reflex Pattern Matching]
    C --> H[Decision Engine]
    H --> I[block<br/>warn<br/>allow]
    I --> J[Evidence Capsule]
    J --> K[Agent Action Execution]
    K --> L[memory_validate]
    L --> M[Outcome: helpful<br/>used<br/>wrong]
    M --> E

Guard Decision Flow

The Guard evaluates incoming tool actions through a multi-stage pipeline:

  1. Action Canonicalization - Normalize the tool name and action string
  2. Semantic Recall - Query memory store for relevant past experiences
  3. Risk Assessment - Evaluate prior failures, warnings, and risks
  4. Rule Matching - Check against must-follow rules and procedures
  5. Decision Synthesis - Combine signals into block/warn/allow verdict
  6. Evidence Packaging - Return decision with provenance and references

Sources: src/reflexes.ts

CLI Interface

Command Syntax

audrey guard --tool <tool_name> "<action_command>" [options]

Core Options

| Option | Description | Default |
| --- | --- | --- |
| --tool <name> | The tool category (e.g., Bash, Write, Edit) | Required |
| <action> | The specific action string to evaluate | Required |
| --cwd <path> | Working directory for context | Current directory |
| --session-id <id> | Session identifier for event correlation | Auto-generated |
| --hook | Run in hook mode (for agent integration) | false |
| --fail-on-warn | Treat warnings as errors (exit code non-zero) | false |
| --strict | Enable strict preflight evaluation | false |
| --json | Output results as JSON | false |
| --explain | Include detailed explanation in output | false |
| --include-capsule | Embed full memory capsule in response | false |

Sources: mcp-server/index.ts

Example Usage

# Block a repeated failed deploy
audrey guard --tool Bash "npm run deploy"

# Warn on risky file operations
audrey guard --tool Write --strict "database.sql"

# Hook mode for Claude Code integration
audrey guard --tool Bash --hook "rm -rf node_modules"

SDK Integration

Sync Client

// JavaScript SDK: import from the audrey package (audrey-memory is the Python package)
import Audrey from 'audrey';

const brain = new Audrey({
  baseUrl: 'http://127.0.0.1:7437',
  agent: 'support-agent',
});

const decision = await brain.beforeAction({
  tool: 'Bash',
  action: 'npm run deploy',
});

console.log(decision.decision); // 'block' | 'warn' | 'allow'
console.log(decision.evidence); // Array of MemoryEvidence
brain.close();

Preflight Options

| Option | Type | Description |
| --- | --- | --- |
| tool | string | Tool category being evaluated |
| action | string | Action string to preflight |
| sessionId | string | Correlation ID for event tracking |
| mode | 'standard' \| 'strict' | Evaluation strictness |
| includeCapsule | boolean | Include full memory capsule |
| failureWindowHours | number | Hours to look back for failures |
| recentChangeWindowHours | number | Hours for recent-change rules |

Sources: src/routes.ts

Decision Outcomes

Verdict Types

| Decision | Description | Agent Behavior |
| --- | --- | --- |
| block | Action is prohibited based on memory | Must not execute |
| warn | Action has risk indicators | Should pause and confirm |
| allow | No memory conflicts detected | May proceed |

Decision Display Mapping

function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
  if (result.decision === 'block') return 'block';
  if (result.decision === 'caution') return 'warn';
  return 'allow';
}

Sources: mcp-server/index.ts

Memory Preflight

The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes. It builds a structured preflight report containing:

Capsule Sections

| Section | Content Source | Trigger Condition |
| --- | --- | --- |
| recent_changes | Memories within recent-change window | Created or reinforced recently |
| must_follow | Must-follow rules | Tagged as must-follow |
| procedures | Procedural memories + procedures | Matching query or tagged |
| user_preferences | User-stated preferences | User-told or tagged |
| risks | Risk-tagged memories + recent failures | Tagged risk or 7-day failures |
| uncertain_or_disputed | Low-confidence or disputed memories | Low confidence or disputed state |

Sources: src/capsule.ts

Reflex System

Memory reflexes convert remembered evidence into trigger-response guidance that agents can follow.

Reflex Response Types

| Type | Description |
| --- | --- |
| block | Strict prohibition based on evidence |
| warn | Caution signal with context |
| guide | Recommended action or approach |

Reflex Report Generation

function summarizeReflexes(decision: PreflightDecision, reflexes: MemoryReflex[]): string {
  const blocks = reflexes.filter(r => r.response_type === 'block').length;
  const warnings = reflexes.filter(r => r.response_type === 'warn').length;
  const guides = reflexes.filter(r => r.response_type === 'guide').length;
  // Returns human-readable summary
}

Sources: src/reflexes.ts

Validation Loop

After action execution, agents validate whether the memory helped:

Outcome Types

| Outcome | Meaning | Effect |
| --- | --- | --- |
| helpful | Memory was correct and beneficial | Increases salience |
| used | Memory was referenced | Updates usage metrics |
| wrong | Memory was incorrect | Triggers decay or dispute |

Validation Endpoint

POST /v1/event
{
  "outcome": "helpful",
  "receipt_id": "receipt-from-preflight",
  "evidence_feedback": {
    "evidence-id-1": "used",
    "evidence-id-2": "helpful"
  }
}

Sources: src/routes.ts

Failure Decay

Starting from version 1.0.1, Audrey Guard implements failure decay to prevent stale blocks:

| Configuration | Default | Behavior |
| --- | --- | --- |
| failureDecayDays | 7 | Same-action failures older than window treated as stale |

To restore pre-1.0.1 blocking behavior (permanent blocks):

const controller = new MemoryController({
  failureDecayDays: 0,
});

Sources: CHANGELOG.md
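The effect of failureDecayDays can be sketched as a filter over recorded failures. The record shape and function name are hypothetical; the 7-day default and the meaning of 0 (no decay, so failures never become stale) follow the description above.

```typescript
interface FailureRecordSketch {
  actionKey: string;  // hypothetical canonical key for "same action"
  failedAtMs: number; // epoch milliseconds
}

// Returns the failures for this action that should still count against it.
function activeFailuresSketch(
  failures: FailureRecordSketch[],
  actionKey: string,
  nowMs: number,
  failureDecayDays: number = 7,
): FailureRecordSketch[] {
  const sameAction = failures.filter(f => f.actionKey === actionKey);
  // 0 disables decay: every same-action failure stays relevant indefinitely.
  if (failureDecayDays === 0) return sameAction;
  const windowMs = failureDecayDays * 24 * 60 * 60 * 1000;
  return sameAction.filter(f => nowMs - f.failedAtMs <= windowMs);
}
```

With the default window, a deploy that failed a month ago no longer blocks today's attempt, while one that failed two days ago still does.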

Security Considerations

HTTP API Security

  • The default bind address is 127.0.0.1 (changed from 0.0.0.0)
  • The server refuses to start on a non-loopback address without AUDREY_API_KEY or AUDREY_ALLOW_NO_AUTH=1
  • API key comparison uses crypto.timingSafeEqual to prevent timing attacks
  • /v1/recall and /v1/capsule no longer spread the request body into caller options

Sources: CHANGELOG.md

Hook Configuration Safety

The audrey promote --yes command refuses to write .claude/rules/*.md outside process.cwd() unless the target path is in AUDREY_PROMOTE_ROOTS. This prevents prompt-injection attacks via malicious MCP callers.

Tool Trace Handling

Tool traces are recorded through PostToolUse hooks with redaction applied:

  1. Redaction - Sensitive fields (API keys, tokens, credentials) are masked
  2. Action Key Generation - Deterministic ID for trace correlation
  3. Event Recording - Tool inputs/outputs stored with session context

Sources: mcp-server/index.ts
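The redaction step can be illustrated with a recursive key-based masker. This is a sketch under assumptions: the key pattern, the "[REDACTED]" placeholder, and the function name are invented for illustration and may differ from what the PostToolUse hook actually masks.

```typescript
type JsonValueSketch =
  | string | number | boolean | null
  | JsonValueSketch[]
  | { [key: string]: JsonValueSketch };

// Hypothetical pattern for key names that usually carry secrets.
const SENSITIVE_KEY_SKETCH = /(api[_-]?key|token|secret|password|credential|authorization)/i;

// Returns a deep copy with the values of sensitive-looking keys masked.
function redactSketch(value: JsonValueSketch): JsonValueSketch {
  if (Array.isArray(value)) return value.map(item => redactSketch(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: JsonValueSketch } = {};
    for (const key of Object.keys(value)) {
      out[key] = SENSITIVE_KEY_SKETCH.test(key) ? "[REDACTED]" : redactSketch(value[key]);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Key-based masking is cheap and predictable, but it does not catch secrets embedded in free-text values such as a command line, so a real implementation would likely need value-level patterns as well.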

Demo Scenario: Repeated Failure

The repeated-failure demo demonstrates Guard's blocking behavior:

npx audrey demo --scenario repeated-failure

This no-key, no-network demo:

  1. Creates a temporary memory store
  2. Records a failed deploy with the fix
  3. Teaches Audrey the failure pattern
  4. Shows Guard blocking the repeated attempt with evidence

Sources: README.md

Core Memory Operations

Related topics: Memory Model, Audrey Guard, Preflight and Reflexes, Data Storage

This page documents the fundamental memory operations in Audrey: encoding, recall, hybrid retrieval, capsule generation, and impact tracking. Together, these operations form the core pipeline that enables agents to store, retrieve, and learn from persistent memory across sessions.

Overview

Audrey's Core Memory Operations handle the complete lifecycle of memory within the system. The operations are designed around a local-first, SQLite-backed architecture that provides semantic search capabilities without requiring external vector databases or hosted services.

graph LR
    A[Encode] -->|store| B[(SQLite)]
    B -->|recall| C[Recall]
    C -->|hybrid| D[Hybrid Search]
    D -->|compose| E[Capsule]
    E -->|track| F[Impact]
    F -->|reinforce| A

The primary design goals are:

  • Durability: All memories persist in local SQLite storage
  • Semantic Search: Vector embeddings enable similarity-based recall
  • Hybrid Retrieval: Combines vector and keyword search for accuracy
  • Feedback Loop: Impact tracking enables continuous memory reinforcement

Memory Types

Audrey distinguishes between three primary memory types that influence retrieval behavior and storage strategy.

| Memory Type | Description | Typical Content |
| --- | --- | --- |
| episodic | Specific observations and session facts | Tool results, error messages, user feedback |
| semantic | Consolidated principles extracted from evidence | Learned rules, best practices, project conventions |
| procedural | Remembered ways to act, avoid, or verify | Deployment procedures, recovery steps, verification commands |

Each memory type has distinct promotion criteria. Procedural memories can be promoted to rules with lower evidence thresholds, while semantic memories require higher confidence and evidence counts before promotion.

Sources: src/recall.ts:15-17

Memory Encoding

The encoding operation transforms raw observations into persistent memory entries. When encoding, Audrey generates embeddings, assigns salience scores, and stores metadata that enables future retrieval.

Encode Process Flow

graph TD
    A[Input: Raw Text] --> B[Generate Embedding]
    B --> C[Calculate Salience]
    C --> D[Assign Memory Type]
    D --> E[Tag Analysis]
    E --> F[Store in SQLite]
    F --> G[Update Vector Index]

Encode Options

The encode operation accepts several configuration parameters:

| Parameter | Type | Default | Purpose |
| --- | --- | --- | --- |
| source | string | 'direct-observation' | Origin of the memory |
| memory_type | string | 'episodic' | Classification of memory content |
| tags | string[] | [] | Categorical labels for filtering |
| wait_for_consolidation | boolean | false | Opt-in read-after-write semantics |

Sources: src/encode.ts

Source Types

Memory sources indicate provenance and affect how memories are treated during recall:

| Source | Trust Level | Description |
| --- | --- | --- |
| direct-observation | High | Agent's own observations from tool execution |
| told-by-user | High | Explicit user-provided information |
| inferred | Medium | AI-inferred conclusions |
| external | Low | Information from external systems |

Trusted sources (direct-observation, told-by-user) can populate must-follow sections in capsules, while untrusted sources are flagged as uncertain or disputed.

Sources: src/capsule.ts:18-20

Memory Recall

Recall is the primary mechanism for retrieving relevant memories based on semantic similarity and keyword matching. The recall operation searches across all memory types using configurable retrieval strategies.

Retrieval Modes

Audrey supports three retrieval modes that determine how search results are computed:

| Mode | Description | Use Case |
| --- | --- | --- |
| hybrid | Combines vector similarity with FTS keyword matching (default) | Balanced accuracy for general queries |
| vector | Pure semantic similarity using embeddings | When keywords are ambiguous |
| keyword | Full-text search only, bypasses vector index | Fast, keyword-exact matching |

Sources: src/recall.ts:12-14

Recall Architecture

graph TD
    A[Query Input] --> B{Mode Check}
    B -->|hybrid| C[Vector Pass]
    B -->|hybrid| D[Keyword Pass]
    B -->|vector| E[Vector Pass Only]
    B -->|keyword| F[Keyword Pass Only]
    C --> G[Merge & Score]
    D --> G
    G --> H[Filter by Confidence]
    H --> I[Apply Filters]
    I --> J[Return Results]

Recall Options

The recall operation accepts a comprehensive set of filtering and result-shaping options:

| Parameter | Type | Default | Purpose |
| --- | --- | --- | --- |
| minConfidence | number | 0 | Minimum confidence threshold (0-1) |
| types | MemoryType[] | ['episodic', 'semantic', 'procedural'] | Memory types to search |
| limit | number | 10 | Maximum results to return |
| includeProvenance | boolean | false | Include source metadata |
| includeDormant | boolean | false | Include decayed/inactive memories |
| tags | string[] | undefined | Filter by tags |
| sources | string[] | undefined | Filter by source type |
| after | Date | undefined | Filter memories created after date |
| before | Date | undefined | Filter memories created before date |
| includePrivate | boolean | false | Include agent-private memories |
| retrieval | string | 'hybrid' | Retrieval mode selection |
| scope | 'agent' \| 'shared' | 'agent' | Memory scope filter |

Sources: src/recall.ts:5-22

RecallFilters Structure

interface RecallFilters {
  tags?: string[];
  sources?: string[];
  after?: Date;
  before?: Date;
  agent?: string;  // Filtered by scope when scope === 'agent'
}

Filters are combined with AND logic—memories must match all specified filters to be included in results.
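The AND combination can be sketched as a single predicate. This is an illustration only: `StoredMemory` and `matchesFilters` are hypothetical names, and the choice that a memory satisfies the `tags` filter when it carries at least one of the listed tags is an assumption, not something the source states.

```typescript
interface RecallFilters {
  tags?: string[];
  sources?: string[];
  after?: Date;
  before?: Date;
  agent?: string;
}

// Hypothetical record shape, used only for this sketch.
interface StoredMemory {
  id: string;
  tags: string[];
  source: string;
  createdAt: Date;
  agent?: string;
}

// Every filter that is set must pass (AND); filters left undefined are ignored.
function matchesFilters(memory: StoredMemory, filters: RecallFilters): boolean {
  // Assumption: one matching tag is enough to satisfy the tags filter.
  if (filters.tags && !filters.tags.some((t) => memory.tags.includes(t))) return false;
  if (filters.sources && !filters.sources.includes(memory.source)) return false;
  if (filters.after && memory.createdAt.getTime() <= filters.after.getTime()) return false;
  if (filters.before && memory.createdAt.getTime() >= filters.before.getTime()) return false;
  if (filters.agent !== undefined && memory.agent !== filters.agent) return false;
  return true;
}
```

A memory that matches the tag filter but fails the source filter is excluded, which is the behavior the AND rule implies.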

Agent and Scope Filtering

The scope parameter determines which memories are accessible:

  • agent (default): Only memories associated with the requesting agent
  • shared: Memories marked as shared across agents

When scope is 'agent', the agent filter is automatically set to the requesting agent's identity. This ensures memory isolation between different agents.

Hybrid search combines vector similarity and full-text search to achieve more accurate recall results than either method alone.

Hybrid Recall Pipeline

graph LR
    A[Query] --> B[Embedding Model]
    A --> C[FTS Index]
    B --> D[Vector Scores]
    C --> E[Keyword Scores]
    D --> F[Score Normalization]
    E --> F
    F --> G[Weighted Merge]
    G --> H[Ranked Results]

Score Merging Strategy

The hybrid approach normalizes scores from both vector and keyword passes before merging. This ensures that memories matched by keywords are not overshadowed by high vector similarity scores, and vice versa.

Sources: src/hybrid-recall.ts
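A minimal sketch of normalize-then-merge follows. The source does not say which normalization or weights `src/hybrid-recall.ts` uses, so min-max normalization, the 0.5 default weight, and the names `normalizeScores` and `mergeHybrid` are all assumptions made for illustration.

```typescript
type ScoreMap = Map<string, number>;

// Min-max normalize raw scores into [0, 1] (assumed scheme).
// If all scores are equal, every entry maps to 1.
function normalizeScores(scores: ScoreMap): ScoreMap {
  const values = Array.from(scores.values());
  const out: ScoreMap = new Map();
  if (values.length === 0) return out;
  const min = Math.min(...values);
  const max = Math.max(...values);
  scores.forEach((score, id) => {
    out.set(id, max === min ? 1 : (score - min) / (max - min));
  });
  return out;
}

// Weighted merge of the two passes. A memory missing from one pass
// contributes 0 for that pass.
function mergeHybrid(
  vector: ScoreMap,
  keyword: ScoreMap,
  vectorWeight = 0.5, // assumed weight, not taken from the source
): Array<{ id: string; score: number }> {
  const v = normalizeScores(vector);
  const k = normalizeScores(keyword);
  const ids = new Set<string>();
  v.forEach((_score, id) => ids.add(id));
  k.forEach((_score, id) => ids.add(id));
  return Array.from(ids)
    .map((id) => ({
      id,
      score: vectorWeight * (v.get(id) ?? 0) + (1 - vectorWeight) * (k.get(id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

Because both passes are rescaled to the same range before weighting, a memory that ranks moderately in the vector pass but tops the keyword pass can still come out first.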

Full-Text Search Integration

The FTS module provides keyword-based search capabilities:

interface FTSResult {
  memory_id: string;
  rank: number;
  snippet?: string;
}

FTS uses SQLite's built-in FTS5 extension for fast keyword matching. The FTS index is updated synchronously during encoding to ensure keyword search reflects current memory state.

Sources: src/fts.ts

Memory Capsule

The capsule is a turn-sized memory packet that organizes relevant memories into actionable sections for agent consumption. It synthesizes recall results into a structured format optimized for quick agent review.

Capsule Sections

| Section | Purpose | Trigger Conditions |
|---------|---------|--------------------|
| must_follow | High-priority directives | Trusted source + must-follow tags |
| uncertain_or_disputed | Flagged content requiring verification | Low confidence, disputed state, or untrusted source |
| risks | Known risks and hazards | Risk-related tags |
| procedures | Step-by-step instructions | Procedural memory type or procedure tags |
| user_preferences | User-specific preferences | Preference tags or told-by-user source |
| project_facts | Consolidated project knowledge | Semantic memories with no other section match |
| recent_changes | Recently updated information | Memories within recent time window |

Sources: src/capsule.ts:22-38

Section Determination Logic

graph TD
    A[Memory Entry] --> B{Source Trusted?}
    B -->|Yes| C{Has Must-Follow Tags?}
    B -->|No| D[Uncertain/Disputed]
    C -->|Yes| E[Must-Follow Section]
    C -->|No| F{Has Risk Tags?}
    D --> F
    F -->|Yes| G[Risks Section]
    F -->|No| H{Procedural Type?}
    H -->|Yes| I[Procedures Section]
    H -->|No| J{Has Preference Tags?}
    J -->|Yes| K[User Preferences]
    J -->|No| L{Uncertain State?}
    L -->|Yes| M[Uncertain/Disputed]
    L -->|No| N[Project Facts]

Capsule Structure

interface MemoryCapsule {
  generated_at: string;
  sections: {
    must_follow?: MemorySection;
    uncertain_or_disputed?: MemorySection;
    risks?: MemorySection;
    procedures?: MemorySection;
    user_preferences?: MemorySection;
    project_facts?: MemorySection;
    recent_changes?: MemorySection;
  };
}

Tag-Based Section Assignment

Capsule generation uses predefined tag sets to categorize memories:

| Tag Set | Matching Tags |
|---------|---------------|
| MUST_FOLLOW_TAGS | Critical directives that must be followed |
| RISK_TAGS | Risk-related keywords |
| PROCEDURE_TAGS | Procedure-related keywords |
| PREFERENCE_TAGS | User preference keywords |

Sources: src/capsule.ts:12-15

Impact Tracking

Impact tracking closes the feedback loop by recording whether recalled memories proved useful. This enables continuous reinforcement of valuable memories and decay of misleading ones.

Outcome Types

| Outcome | Description | Effect on Memory |
|---------|-------------|------------------|
| helpful | Memory drove a correct action | Increases salience, bumps retrieval_count |
| wrong | Memory was misleading | Decreases salience, bumps challenge_count |
| used | Memory was referenced | Small positive salience delta |

Sources: src/impact.ts

Impact Report Structure

interface ImpactReport {
  generatedAt: string;
  windowDays: number;
  totals: {
    episodic: number;
    semantic: number;
    procedural: number;
  };
  validatedTotal: number;
  validatedInWindow: number;
  byType: {
    episodic: { validated: number; recent: number };
    semantic: { validated: number; recent: number; challenged: number };
    procedural: { validated: number; recent: number };
  };
  outcomeBreakdownInWindow: {
    helpful: number;
    wrong: number;
    used: number;
  };
  topUsed: MemoryStat[];
  weakest: MemoryStat[];
  recentActivity: MemoryStat[];
}

Impact Metrics

The impact system tracks several key metrics:

  • usage_count: Number of times a memory was successfully used
  • salience: Computed importance score based on reinforcement history
  • validation events: Recorded outcomes linked to specific recall events
  • challenge_count: Number of times a memory was marked as wrong

This data feeds into consolidation and decay processes, ensuring that frequently useful memories remain prominent while stale or misleading memories lose authority over time.
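The outcome handling can be sketched as a pure update function. The direction of each change follows the outcome table above; the delta magnitudes, the `MemoryStats` shape, and the name `applyOutcome` are assumptions for illustration and are not taken from `src/impact.ts`.

```typescript
type Outcome = "helpful" | "wrong" | "used";

// Hypothetical per-memory counters for this sketch.
interface MemoryStats {
  salience: number;
  retrieval_count: number;
  usage_count: number;
  challenge_count: number;
}

// Assumed magnitudes; only the signs come from the documentation.
const SALIENCE_DELTA: Record<Outcome, number> = {
  helpful: 0.1,
  wrong: -0.2,
  used: 0.02,
};

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Returns a new stats object; the input is left untouched.
function applyOutcome(stats: MemoryStats, outcome: Outcome): MemoryStats {
  const next: MemoryStats = {
    ...stats,
    salience: clamp01(stats.salience + SALIENCE_DELTA[outcome]),
  };
  if (outcome === "helpful") next.retrieval_count += 1;
  if (outcome === "wrong") next.challenge_count += 1;
  if (outcome === "used") next.usage_count += 1;
  return next;
}
```

Clamping keeps salience inside [0, 1] no matter how many outcomes are recorded in a row.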

Data Flow Summary

graph LR
    subgraph Encode
        A1[Text Input] --> A2[Embed]
        A2 --> A3[Salience]
        A3 --> A4[Store]
    end
    
    subgraph Recall
        B1[Query] --> B2[Hybrid Search]
        B2 --> B3[Score & Rank]
        B3 --> B4[Filter]
        B4 --> B5[Recall Results]
    end
    
    subgraph Capsule
        C1[Recall Results] --> C2[Section Analysis]
        C2 --> C3[Tag Matching]
        C3 --> C4[Capsule Output]
    end
    
    subgraph Impact
        D1[Agent Feedback] --> D2[Outcome Recording]
        D2 --> D3[Reinforce/Decay]
        D3 --> A3
    end
    
    A4 --> B5
    B5 --> C1
    C4 --> D1

Configuration Considerations

When deploying Audrey's core memory operations, consider these configuration points:

| Setting | Recommendation | Impact |
|---------|----------------|--------|
| AUDREY_EMBEDDING_PROVIDER | Pin explicitly | Determines embedding quality |
| AUDREY_LLM_PROVIDER | Pin explicitly | Affects consolidation quality |
| AUDREY_DATA_DIR | Separate per tenant/environment | Ensures isolation and backup simplicity |
| Retrieval mode | Use hybrid for most cases | Balances precision and recall |
| wait_for_consolidation | Enable for critical writes | Guarantees read-after-write consistency |

The core memory operations interact with several supporting systems:

  • Guard: Uses preflight checks before tool execution
  • Reflexes: Trigger-response patterns derived from memory
  • Consolidation: Extracts semantic memories from episodic evidence
  • Decay: Reduces authority of stale memories over time
  • Promotion: Converts high-value memories to Claude rules

Sources: [src/recall.ts:15-17](https://github.com/Evilander/Audrey/blob/main/src/recall.ts)

Preflight and Reflexes

Related topics: Audrey Guard, Core Memory Operations

Overview

Preflight and Reflexes form Audrey's core decision-making loop for AI agents. Before any tool action executes, Audrey performs a preflight check that consults memory to determine whether the action should be allowed, warned about, or blocked entirely.

The system operates as Audrey's "memory firewall"—a security and guidance layer that prevents agents from repeating mistakes, reinforces learned behaviors, and surfaces relevant context before sensitive operations. This mechanism transforms episodic and semantic memories into actionable guidance that agents can evaluate in real-time.

Architecture

graph TD
    A[Agent Action Request] --> B[Preflight Check]
    B --> C{Memory Recall}
    C --> D[Episodic Memory]
    C --> E[Semantic Memory]
    C --> F[Procedural Memory]
    D --> G[Memory Reflexes]
    E --> G
    F --> G
    G --> H{Decision?}
    H -->|Match Found| I[Return Reflex Result]
    H -->|No Match| J[Allow Action]
    I --> K[block]
    I --> L[warn]
    I --> M[guide]
    K --> N[Block with Evidence]
    L --> O[Warn with Guidance]
    M --> P[Proceed with Hints]

Memory Reflexes

Memory Reflexes are the atomic decision units within the Preflight system. Each reflex contains a trigger condition, a response type, and optional guidance content.

Response Types

| Response Type | Decision | Description |
|---------------|----------|-------------|
| block | block | Prevents the action entirely; returns blocking evidence |
| warn | caution | Allows action but presents warning with recommendations |
| guide | allow | Provides informational guidance without blocking |

Sources: src/reflexes.ts:1-50

Reflex Structure

interface MemoryReflex {
  response_type: 'block' | 'warn' | 'guide';
  triggered_by: string;       // Memory tag or rule identifier
  message: string;            // Human-readable explanation
  recommended_action?: string; // Suggested alternative
  memory_ids: string[];        // Source memories that triggered this reflex
  confidence: number;         // Reflex confidence score
}

Preflight Process

Decision Flow

The preflight process evaluates incoming actions against three memory types and returns a consolidated decision:

graph LR
    A[Action + Context] --> B[Tag Extraction]
    B --> C{Must-Follow Rules?}
    C -->|Yes| D[BLOCK or UNCERTAIN]
    C -->|No| E{Risk Tags?}
    E -->|Yes| F[Add to WARN]
    E -->|No| G{Procedures?}
    G -->|Yes| H[Add GUIDANCE]
    G -->|No| I{Preferences?}
    I -->|Yes| J[Include in Capsule]
    I -->|No| K[Default ALLOW]

Decision Types

| Decision | Meaning | Exit Code Behavior |
|----------|---------|--------------------|
| allow | Action proceeds normally | Continue execution |
| caution | Action proceeds with warning | Log warning, continue |
| block | Action is prevented | Return error, halt |

Sources: mcp-server/index.ts:80-95
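One way to collapse several reflexes into a single decision is to let the most severe response type win. The mapping from response type to decision (block to block, warn to caution, guide to allow) comes from the Response Types table; the aggregation rule and the name `decideFromReflexes` are assumptions for this sketch.

```typescript
type ResponseType = "block" | "warn" | "guide";
type PreflightDecision = "allow" | "caution" | "block";

// Most severe response wins (assumed rule): any block -> block,
// otherwise any warn -> caution, otherwise allow.
function decideFromReflexes(responseTypes: ResponseType[]): PreflightDecision {
  if (responseTypes.includes("block")) return "block";
  if (responseTypes.includes("warn")) return "caution";
  return "allow";
}
```

With no matching reflexes the result is `allow`, which lines up with the "No Match" branch of the architecture diagram.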

Memory Capsule Integration

Preflight builds a Memory Capsule—a structured context bundle that aggregates relevant memories by category. The capsule sections determine which memories appear in the agent's context window.

interface MemoryCapsule {
  sections: {
    must_follow: MemoryReflex[];        // Critical rules
    recent_changes: MemoryReflex[];     // New learnings
    procedures: MemoryReflex[];        // How-to guidance
    user_preferences: MemoryReflex[];   // Stated preferences
    risks: MemoryReflex[];              // Warnings and hazards
    uncertain_or_disputed: MemoryReflex[]; // Low-confidence or contested
    project_facts: MemoryReflex[];      // Relevant facts
  };
  triggered_by: string;
  generated_at: string;
}

Sources: src/capsule.ts:1-50

Tag-Based Classification

Memory reflexes are classified using tag matching against predefined tag sets:

| Tag Set | Purpose | Associated Section |
|---------|---------|--------------------|
| MUST_FOLLOW_TAGS | Critical rules that must be obeyed | must_follow |
| RISK_TAGS | Potential hazards or warnings | risks |
| PROCEDURE_TAGS | Step-by-step guidance | procedures |
| PREFERENCE_TAGS | User-stated preferences | user_preferences |

Sources: src/capsule.ts:50-100

Building Reflex Reports

Report Generation

The buildReflexReport function constructs a complete preflight report from an action:

export async function buildReflexReport(
  audrey: Audrey,
  action: string,
  options: ReflexOptions = {},
): Promise<MemoryReflexReport>

Report Structure

interface MemoryReflexReport {
  decision: 'allow' | 'caution' | 'block';
  reflexes: MemoryReflex[];
  capsule: MemoryCapsule;
  summary: string;           // Human-readable summary
  triggered_at: string;      // ISO timestamp
  session_id?: string;      // Optional session context
}

Sources: src/reflexes.ts:80-120

Summarization Logic

The summarizeReflexes function generates human-readable summaries:

function summarizeReflexes(
  decision: PreflightDecision,
  reflexes: MemoryReflex[],
): string {
  const blocks = reflexes.filter(r => r.response_type === 'block').length;
  const warnings = reflexes.filter(r => r.response_type === 'warn').length;
  const guides = reflexes.filter(r => r.response_type === 'guide').length;
  
  // Returns format: "Stop: 2 blocking, 1 warning, 3 guidance matched."
  // Or: "Slow down: ..." or "Proceed: ..."
}
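The excerpt above stops before the return statement. A self-contained completion might look like the sketch below, assuming that "Stop" pairs with `block`, "Slow down" with `caution`, and "Proceed" with `allow`; that pairing is inferred from the comment, not confirmed by the source. The function takes response types directly and is named `summarizeReflexesSketch` to keep it distinct from the real function.

```typescript
type SketchDecision = "allow" | "caution" | "block";
type SketchResponseType = "block" | "warn" | "guide";

// Assumed mapping from decision to the leading word of the summary.
const SUMMARY_PREFIX: Record<SketchDecision, string> = {
  block: "Stop",
  caution: "Slow down",
  allow: "Proceed",
};

function summarizeReflexesSketch(
  decision: SketchDecision,
  responseTypes: SketchResponseType[],
): string {
  const blocks = responseTypes.filter((r) => r === "block").length;
  const warnings = responseTypes.filter((r) => r === "warn").length;
  const guides = responseTypes.filter((r) => r === "guide").length;
  return `${SUMMARY_PREFIX[decision]}: ${blocks} blocking, ${warnings} warning, ${guides} guidance matched.`;
}
```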

Validation Layer

Before reflexes are applied, the validation layer ensures response integrity:

Response Validation

// From src/validate.ts
// Validates LLM response shape before reading fields
// - Rejects non-object/array conditions
// - Only counts new evidence toward supporting_count
// - Throws on malformed response shapes

Sources: src/validate.ts

Validation Behavior

| Check | Invalid Condition | Behavior |
|-------|-------------------|----------|
| Response Shape | Non-object/array | Reject and throw |
| Evidence Count | Missing supporting_count | Skip from count |
| Confidence | Non-finite value | Reject in causal module |

Sources: CHANGELOG.md

CLI Integration

Guard Command

The audrey guard command exposes preflight checks via terminal:

audrey guard --tool Bash "npm run deploy"

Guard Options

| Option | Description |
|--------|-------------|
| --tool <name> | Tool name being invoked |
| --action <command> | Specific action/command |
| --cwd <path> | Working directory |
| --session-id <id> | Session identifier |
| --files <paths> | Files affected by action |
| --json | Output results as JSON |
| --strict | Fail on warnings |
| --include-capsule | Include full memory capsule |
| --explain | Show reasoning breakdown |

Sources: mcp-server/index.ts:40-70

Display Mapping

The CLI maps internal decisions to display messages:

function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
  if (result.decision === 'block') return 'block';
  if (result.decision === 'caution') return 'warn';
  return 'allow';
}

Configuration Options

Environment Variables

| Variable | Default | Purpose |
|----------|---------|---------|
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Memory capsule character budget |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget routes |
| AUDREY_DEBUG | 0 | Print MCP info logs |

Runtime Options

interface ReflexOptions {
  agent?: string;                    // Agent identifier
  sessionId?: string;               // Session context
  includeCapsule?: boolean;          // Include full capsule
  includePreflight?: boolean;       // Include preflight details
  context?: Record<string, string>; // Additional context
  mood?: MoodConfig;                // Emotional context
}

Memory Types and Section Assignment

Assignment Logic

graph TD
    A[Memory Entry] --> B{Source Trusted?}
    B -->|Yes + Must-Follow Tags| C[must_follow section]
    B -->|No + Must-Follow Tags| D[uncertain_or_disputed]
    B -->|Risk Tags| E[risks section]
    B -->|Procedural Type/Tags| F[procedures section]
    B -->|Preference Tags| G[user_preferences section]
    A --> H{State or Low Confidence?}
    H -->|disputed/context_dependent/confidence<0.55| I[uncertain_or_disputed]
    A --> J{Recent Window?}
    J -->|Yes| K[recent_changes section]
    J -->|No| L[Default: project_facts]

Threshold Values

| Condition | Threshold | Section Assignment |
|-----------|-----------|--------------------|
| Confidence (disputed) | < 0.55 | uncertain_or_disputed |
| Recent Window | 7 days | recent_changes |
| Tool Failure | 7 days | risks |
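The assignment rules and thresholds can be read as a first-match classifier. The sketch below is one such reading under stated assumptions: the tag values are placeholders (the real contents of MUST_FOLLOW_TAGS and the other sets are not shown in this manual), each memory lands in exactly one section, and the order of checks is a guess. The tool-failure rule is left out because its inputs are not described. Only the trusted sources, the 0.55 confidence threshold, and the 7-day window come from the text.

```typescript
type CapsuleSection =
  | "must_follow" | "uncertain_or_disputed" | "risks" | "procedures"
  | "user_preferences" | "recent_changes" | "project_facts";

// Hypothetical input shape for this sketch.
interface CapsuleCandidate {
  type: "episodic" | "semantic" | "procedural";
  source: string;
  tags: string[];
  state: string;
  confidence: number;
  createdAt: Date;
}

const TRUSTED_SOURCES = ["direct-observation", "told-by-user"];
// Placeholder tag sets; the real values are not documented here.
const MUST_FOLLOW_TAGS = ["must-follow"];
const RISK_TAGS = ["risk"];
const PROCEDURE_TAGS = ["procedure"];
const PREFERENCE_TAGS = ["preference"];
const UNCERTAIN_CONFIDENCE = 0.55;
const RECENT_WINDOW_DAYS = 7;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function hasAny(tags: string[], set: string[]): boolean {
  return tags.some((t) => set.includes(t));
}

function assignSection(m: CapsuleCandidate, now: Date): CapsuleSection {
  if (hasAny(m.tags, MUST_FOLLOW_TAGS)) {
    // Untrusted must-follow claims are flagged instead of enforced.
    return TRUSTED_SOURCES.includes(m.source) ? "must_follow" : "uncertain_or_disputed";
  }
  if (hasAny(m.tags, RISK_TAGS)) return "risks";
  if (m.type === "procedural" || hasAny(m.tags, PROCEDURE_TAGS)) return "procedures";
  if (hasAny(m.tags, PREFERENCE_TAGS)) return "user_preferences";
  if (m.state === "disputed" || m.state === "context_dependent" || m.confidence < UNCERTAIN_CONFIDENCE) {
    return "uncertain_or_disputed";
  }
  const ageDays = (now.getTime() - m.createdAt.getTime()) / MS_PER_DAY;
  if (ageDays <= RECENT_WINDOW_DAYS) return "recent_changes";
  return "project_facts";
}
```

Passing `now` explicitly keeps the recent-window check deterministic.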

Data Flow Example

sequenceDiagram
    participant Agent
    participant MCP as MCP Server
    participant Audrey
    participant Memory
    participant Reflex

    Agent->>MCP: tool_use(Bash, "rm -rf /")
    MCP->>Audrey: buildPreflight(audrey, action)
    Audrey->>Memory: recall(action, context)
    Memory-->>Audrey: MemoryReflex[]
    Audrey->>Reflex: classifyAndScore(reflexes)
    Reflex-->>Audrey: MemoryReflexReport
    Audrey-->>MCP: PreflightDecision
    MCP-->>Agent: block/caution/allow response

Security Considerations

HTTP Endpoint Protection

The preflight system includes security hardening for HTTP access:

  • REST endpoints default to loopback-only binding
  • API key comparison uses crypto.timingSafeEqual to prevent timing attacks
  • Options like includePrivate: true cannot be passed via HTTP bodies
  • Non-loopback binding requires explicit AUDREY_API_KEY

Sources: src/routes.ts
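A constant-time key check with `crypto.timingSafeEqual` can be sketched as follows. Node's `timingSafeEqual` throws when the two buffers differ in length, so this sketch hashes both keys with SHA-256 first to get equal-length inputs. The hashing step and the name `apiKeysMatch` are assumptions; the source only states that `timingSafeEqual` is used.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare two API keys without leaking how many leading characters match.
// Hashing first gives both buffers the same length (32 bytes), which
// timingSafeEqual requires.
function apiKeysMatch(provided: string, expected: string): boolean {
  const providedDigest = createHash("sha256").update(provided).digest();
  const expectedDigest = createHash("sha256").update(expected).digest();
  return timingSafeEqual(providedDigest, expectedDigest);
}
```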

Recall Sanitization

HTTP /v1/recall and /v1/capsule endpoints sanitize input through sanitizeRecallOptions():

// Allowed keys only
const ALLOWED_KEYS = ['limit', 'agent', 'tags', 'sources', 'after', 'before', 'context', 'mood', 'retrieval', 'scope'];

Any keys not in the allowlist are silently dropped before processing.
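An allowlist sanitizer of this kind can be sketched in a few lines. The key list is copied from the excerpt above; the function body and the name `sanitizeRecallOptionsSketch` are illustrative and may differ from the real `sanitizeRecallOptions()`.

```typescript
const ALLOWED_RECALL_KEYS = [
  "limit", "agent", "tags", "sources", "after",
  "before", "context", "mood", "retrieval", "scope",
];

// Copy only allowlisted keys from an untrusted request body.
// Anything else, including includePrivate, is dropped silently.
function sanitizeRecallOptionsSketch(body: unknown): Record<string, unknown> {
  if (typeof body !== "object" || body === null || Array.isArray(body)) return {};
  const input = body as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of ALLOWED_RECALL_KEYS) {
    if (Object.prototype.hasOwnProperty.call(input, key)) out[key] = input[key];
  }
  return out;
}
```

Building the output from the allowlist, instead of deleting known-bad keys from the input, means a newly added privileged option stays unreachable over HTTP by default.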

| Component | File | Role |
|-----------|------|------|
| Rules Compiler | src/rules-compiler.ts | Compiles memories into Claude rules |
| Validation | src/validate.ts | Validates LLM response integrity |
| Impact Tracking | src/impact.ts | Tracks reflex effectiveness over time |
| Memory Capsule | src/capsule.ts | Structures context bundles |

Sources: [src/reflexes.ts:1-50](https://github.com/Evilander/Audrey/blob/main/src/reflexes.ts)

Data Storage

Related topics: System Architecture, Core Memory Operations

Overview

Audrey's data storage layer is built as a local-first, SQLite-backed persistence system designed for AI agent memory continuity. The storage architecture eliminates external database dependencies while providing vector similarity search capabilities through sqlite-vec, enabling semantic memory retrieval without cloud infrastructure.

The storage system serves as the foundation for Audrey's multi-type memory model, supporting episodic, semantic, and procedural memory with built-in confidence tracking, contradiction handling, and temporal decay mechanisms.

Storage Architecture

Core Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Primary Database | SQLite | Structured memory storage, ACID transactions |
| Vector Search | sqlite-vec | Semantic similarity search on embeddings |
| Data Directory | AUDREY_DATA_DIR | Tenant/environment isolation boundary |

The storage backend runs entirely locally, requiring no hosted database services. Each tenant or environment should use a dedicated AUDREY_DATA_DIR to maintain isolation boundaries.

Sources: README.md

Database Schema Design

Audrey maintains three primary memory tables that correspond to its memory model:

erDiagram
    MEMORIES ||--o{ VECTORS : contains
    MEMORIES {
        string id PK
        string content
        string memory_type
        float confidence
        float salience
        string state
        int evidence_count
        int usage_count
        timestamp created_at
        timestamp last_used_at
    }
    VECTORS {
        int rowid
        float vector
        text content
    }

#### Memory Type Storage

| Memory Type | Description | Key Attributes |
|-------------|-------------|----------------|
| episodic | Specific observations, tool results, session facts | source, tags, created_at |
| semantic | Consolidated principles from repeated evidence | evidence_count, supporting_count, contradicting_count |
| procedural | Remembered procedures, actions to avoid or retry | usage_count, failure_prevented, tags |

Sources: src/promote.ts

Memory Model Implementation

Episodic Memory Storage

Episodic memories capture specific observations and session-level facts. These entries are created during direct agent interactions and tool executions.

Key storage characteristics:

  • High-volume insertion during active sessions
  • Temporal ordering via created_at timestamps
  • Tag-based categorization for filtered retrieval
  • Source attribution (direct-observation, told-by-user)

Semantic Memory Storage

Semantic memories represent consolidated principles extracted from accumulated episodic evidence. The promotion system converts episodic memories into semantic rules when confidence thresholds are met.

The promotion criteria for semantic memories include:

  • Minimum evidence count threshold (default: 3)
  • Zero contradicting evidence
  • State must be active

Sources: src/promote.ts:78-92

Procedural Memory Storage

Procedural memories track action sequences and their outcomes. These are distinguished by usage tracking and failure prevention metrics.

Procedural candidates are promoted when:

  • evidence_count >= minEvidence
  • contradicting_count === 0
  • retrieval_count > 0 OR failure_prevented > 0
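The semantic and procedural criteria can be combined into one eligibility check. This is a sketch: `PromotionCandidate` and `isPromotable` are hypothetical names, and reusing the default of 3 for procedural `minEvidence` is an assumption (the source gives that default only for semantic memories).

```typescript
// Hypothetical candidate shape for this sketch.
interface PromotionCandidate {
  type: "semantic" | "procedural";
  state: string;
  evidence_count: number;
  contradicting_count: number;
  retrieval_count: number;
  failure_prevented: number;
}

function isPromotable(c: PromotionCandidate, minEvidence = 3): boolean {
  // Shared criteria: enough evidence and no contradictions.
  if (c.evidence_count < minEvidence) return false;
  if (c.contradicting_count !== 0) return false;
  // Semantic memories must also be active.
  if (c.type === "semantic") return c.state === "active";
  // Procedural memories must have been retrieved or have prevented a failure.
  return c.retrieval_count > 0 || c.failure_prevented > 0;
}
```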

Confidence and Salience Tracking

Confidence Scoring

Confidence scores are computed from supporting versus contradicting evidence:

confidence = supporting / max(evidence, 1)

The confidence value is clamped to the range [0, 1] to prevent invalid states. Negative salience values from malformed arousal calculations are also clamped.

Sources: CHANGELOG.md
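The formula and the clamp can be written out directly. The function name `computeConfidence` and the fallback to 0 for non-finite input are assumptions for this sketch.

```typescript
// confidence = supporting / max(evidence, 1), clamped to [0, 1].
function computeConfidence(supporting: number, evidence: number): number {
  const raw = supporting / Math.max(evidence, 1);
  // Assumed fallback: treat NaN or Infinity as no confidence.
  if (!Number.isFinite(raw)) return 0;
  return Math.min(1, Math.max(0, raw));
}
```

For example, 3 supporting observations out of 4 give 0.75, while a memory with no evidence at all gets 0 because the `max(evidence, 1)` guard avoids division by zero.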

Salience System

Salience represents the importance and emotional weight of memories, influencing recall priority. The effectiveSalience calculation factors in:

  • Base salience from evidence strength
  • Temporal decay over time
  • Arousal/affect resonance from recent memories
  • Validation feedback (helpful, wrong, used outcomes)

Validation Feedback Loop

The closed-loop validation system updates salience based on memory utility:

| Outcome | Salience Effect | Counts Updated |
|---------|-----------------|----------------|
| helpful | Increases | retrieval_count, salience |
| wrong | Decreases | challenge_count (semantic only) |
| used | Neutral/slight | usage_count |

Sources: CHANGELOG.md

Hybrid Recall Architecture

Audrey implements a hybrid retrieval strategy combining vector similarity with keyword matching:

graph TD
    A[Query Input] --> B[Embedding Provider]
    B --> C[Vector Similarity Search]
    A --> D[Full-Text Search FTS]
    C --> E[Confidence Scoring]
    D --> E
    E --> F[Memory Filtering]
    F --> G[Ranked Results]

#### Retrieval Modes

| Mode | Behavior |
|------|----------|
| hybrid (default) | Combines vector + FTS for balanced recall |
| vector | Pure semantic similarity, bypasses FTS |
| keyword | Skips vector pass, uses FTS only |

The vector mode serves as a fast path when FTS overhead is unacceptable.

Sources: src/recall.ts

Filtering Capabilities

Recall operations support multiple filter dimensions:

  • tags: Array of tag values to match
  • sources: Array of source identifiers
  • after/before: Temporal bounds via ISO timestamps
  • scope: shared or agent-scoped memories
  • types: Filter by memory type (episodic/semantic/procedural)

Private Memory Isolation

The includePrivate flag controls access to agent-specific private memories. The HTTP API implements an allowlist-based sanitizer (sanitizeRecallOptions()) that prevents bypassing private-memory ACL controls through body options.

Sources: src/routes.ts

Data Lifecycle

Consolidation and Decay

The audrey dream command triggers memory consolidation:

  1. Episodic memories are evaluated for principle extraction
  2. Low-confidence or conflicting memories undergo decay
  3. Stale memories lose retrieval authority over time
  4. Contradicting claims are tracked rather than silently overwritten

Sources: README.md

Rollback Operations

The rollback system (src/rollback.ts) updates memories with verification:

  • Checks .changes to confirm affected rows
  • Aggregates real counts rather than assuming success
  • Reports failure when targeted IDs don't exist

Sources: CHANGELOG.md

Reembedding

When embedding models or dimensions change, reembedding regenerates all vector representations:

  • Chunks embeddings into 256-row batches
  • Labels failures by kind and row range
  • Provides clear error messages for partial failures

Sources: CHANGELOG.md
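Batching with labeled failures can be sketched like this. The 256-row batch size comes from the list above; the `BatchFailure` shape, the synchronous callback, and the function names are assumptions, and the real reembed path presumably calls an embedding provider asynchronously.

```typescript
// Hypothetical failure label: which kind of memory and which rows failed.
interface BatchFailure {
  kind: string;
  startRow: number;
  endRow: number;
  message: string;
}

// Split [0, totalRows) into inclusive row ranges of at most batchSize rows.
function batchRanges(totalRows: number, batchSize = 256): Array<{ startRow: number; endRow: number }> {
  const ranges: Array<{ startRow: number; endRow: number }> = [];
  for (let start = 0; start < totalRows; start += batchSize) {
    ranges.push({ startRow: start, endRow: Math.min(start + batchSize, totalRows) - 1 });
  }
  return ranges;
}

// Run every batch, collecting failures instead of aborting on the first one.
function reembedInBatches(
  kind: string,
  totalRows: number,
  embedBatch: (startRow: number, endRow: number) => void,
): BatchFailure[] {
  const failures: BatchFailure[] = [];
  for (const range of batchRanges(totalRows)) {
    try {
      embedBatch(range.startRow, range.endRow);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push({ kind, startRow: range.startRow, endRow: range.endRow, message });
    }
  }
  return failures;
}
```

Reporting the kind and row range per failure lets a partial failure be retried for just the affected rows.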

Import and Export

Data Portability

Audrey supports full data export and import for:

  • Snapshot restoration to fresh stores
  • Backup before configuration changes
  • Migration between environments

Exported snapshots should only be restored into empty Audrey stores with a fresh AUDREY_DATA_DIR to prevent data corruption.

Sources: python/README.md

Export Process

Export operations create portable snapshots containing:

  • All memory records (episodic, semantic, procedural)
  • Associated metadata (timestamps, confidence scores)
  • Configuration state

Import Validation

Import operations verify store emptiness before restoration:

isDatabaseEmpty() // Checks both records and vector tables

Sources: CHANGELOG.md

Security Considerations

Credential Protection

Raw credentials and API keys must be excluded from encoded memory content. Audrey provides redaction functionality to prevent sensitive data exposure:

const SENSITIVE_KEY_PATTERN = /(password|secret|api[_-]?key|auth[_-]?token|...)$/i;

Sources: src/redact.ts
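Key-based redaction with a pattern like this can be sketched as a recursive walk. The pattern in the excerpt is elided, so the sketch uses only the alternatives that are visible and does not try to reproduce the full list. The `[REDACTED]` placeholder and the name `redactSensitive` are assumptions.

```typescript
// Only the alternatives visible in the excerpt; the real pattern lists more.
const VISIBLE_SENSITIVE_KEY_PATTERN = /(password|secret|api[_-]?key|auth[_-]?token)$/i;

// Replace values whose key name ends in a sensitive word, at any depth.
function redactSensitive(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => redactSensitive(item));
  if (typeof value === "object" && value !== null) {
    const input = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(input)) {
      out[key] = VISIBLE_SENSITIVE_KEY_PATTERN.test(key)
        ? "[REDACTED]"
        : redactSensitive(input[key]);
    }
    return out;
  }
  return value;
}
```

Because the pattern is anchored with `$`, a key such as `db_password` is redacted while `password_hint` is not.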

API Security

  • audrey serve defaults to binding 127.0.0.1 (previously 0.0.0.0)
  • Non-loopback hosts require AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
  • HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks

Sources: CHANGELOG.md

Production Recommendations

| Recommendation | Rationale |
|----------------|-----------|
| Set one AUDREY_DATA_DIR per tenant | Isolation boundary |
| Pin embedding and LLM providers | Reproducibility |
| Backup before provider changes | Data integrity |
| Keep credentials out of memory content | Security |
| Use AUDREY_API_KEY for network exposure | Access control |

Configuration

Environment Variables

| Variable | Default | Purpose |
|----------|---------|---------|
| AUDREY_DATA_DIR | - | Data directory path (required for isolation) |
| AUDREY_EMBEDDING_PROVIDER | - | Embedding model provider |
| AUDREY_LLM_PROVIDER | - | LLM provider for memory operations |
| AUDREY_API_KEY | - | API authentication key |
| AUDREY_HOST | 127.0.0.1 | Network binding address |
| AUDREY_ALLOW_NO_AUTH | 0 | Allow unauthenticated access |

Sources: README.md

Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)

MCP Server

Related topics: System Architecture, REST API

Overview

The Audrey MCP Server is a Model Context Protocol (MCP) stdio server that provides a local-first memory layer for AI agents. It enables agents to encode experiences into persistent memory, recall relevant context before actions, and maintain a durable memory state across sessions. Sources: README.md

The server exposes 20+ tools plus status, recent, and principles resources, along with briefing, recall, and reflection prompts. It communicates via stdio (standard input/output), making it compatible with MCP-compatible hosts like Claude Code, Claude Desktop, Cursor, Windsurf, and VS Code. Sources: README.md

Architecture

System Context

graph TD
    subgraph "MCP Hosts"
        A[Claude Code]
        B[Claude Desktop]
        C[Cursor]
        D[Windsurf]
        E[VS Code]
        F[Other MCP Clients]
    end
    
    subgraph "Audrey MCP Server"
        G[MCP stdio Server]
        H[Tool Handlers]
        I[Resource Providers]
        J[Prompt Templates]
    end
    
    subgraph "Audrey Core"
        K[Memory Store<br/>SQLite + sqlite-vec]
        L[Embedding Engine<br/>ONNX]
        M[Retrieval Engine]
    end
    
    A --> G
    B --> G
    C --> G
    D --> G
    E --> G
    F --> G
    
    G --> H
    G --> I
    G --> J
    
    H --> K
    I --> K
    J --> K
    
    K --> L
    K --> M

Server Initialization Flow

sequenceDiagram
    participant Host as MCP Host
    participant Server as McpServer
    participant Audrey as Audrey Instance
    participant Store as Memory Store
    
    Host->>Server: Start Process
    Server->>Audrey: Initialize with config
    Audrey->>Store: Open SQLite + sqlite-vec
    alt Warmup Enabled
        Audrey->>Store: Pre-compute embeddings
    end
    Server->>Server: registerHostResources()
    Server->>Server: registerHostPrompts()
    Server->>Server: Register Tools
    Server->>Host: Ready (stdio)

Server Components

Core Server Setup

The MCP server is initialized with a name and version, then configured with resources, prompts, and tools: Sources: mcp-server/index.ts:101-106

const server = new McpServer({
  name: SERVER_NAME,
  version: VERSION,
});

registerHostResources(server, audrey);
registerHostPrompts(server);

Tool Registry

The server registers the following tool categories:

| Category | Tools | Purpose |
|----------|-------|---------|
| Memory Operations | memory_encode, memory_recall | Store and retrieve memories |
| Memory Management | memory_import, memory_export, memory_forget | Data management |
| Impact Tracking | mark_used, impact_report | Memory utility tracking |
| Promotion | promote_memory, rule_review | Memory-to-rules conversion |

Tools Reference

memory_encode

Encodes new information into the memory store with diagnostic support.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| content | string | Yes | The memory content to encode |
| source | string | No | Source identifier (e.g., "direct-observation", "told-by-user") |
| tags | string[] | No | Classification tags |
| salience | number | No | Importance weight (0-1) |
| private | boolean | No | Mark as private memory |
| context | object | No | Additional context metadata |
| affect | object | No | Emotional/valence metadata |
| wait_for_consolidation | boolean | No | Opt-in read-after-write semantics (default: false) |

Returns: Tool result with memory ID, content, source, and optionally diagnostics. Sources: mcp-server/index.ts:108-133

memory_recall

Retrieves relevant memories based on a query with filtering options.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | Yes | Search query |
| limit | number | No | Maximum results (default: 10) |
| types | string[] | No | Filter by memory types |
| min_confidence | number | No | Minimum confidence threshold |
| tags | string[] | No | Filter by tags |
| sources | string[] | No | Filter by sources |
| after | string | No | ISO timestamp lower bound |
| before | string | No | ISO timestamp upper bound |
| context | object | No | Context filtering |
| mood | string | No | Mood-based filtering |

Returns: Array of recall results with confidence scores and metadata. Sources: mcp-server/index.ts:135-142

Additional Tools

| Tool | Purpose |
|------|---------|
| memory_import | Import memory snapshots |
| memory_export | Export memory snapshots |
| memory_forget | Delete specific memories |
| mark_used | Record memory utility |
| impact_report | Generate impact analytics |
| promote_memory | Convert memory to rule |
| rule_review | Review promotion candidates |

Command Line Interface

Command Routing

The MCP server entry point (mcp-server/index.ts) handles CLI subcommands before starting the stdio loop: Sources: mcp-server/index.ts:200-240

graph TD
    A[audrey CLI] --> B{Subcommand?}
    
    B -->|--help / -h / help| C[printHelp]
    B -->|--version / -v / version| D[printVersion]
    B -->|install| E[install]
    B -->|uninstall| F[uninstall]
    B -->|mcp-config| G[printMcpConfig]
    B -->|hook-config| H[printHookConfig]
    B -->|demo| I[runDemoCommand]
    B -->|reembed| J[reembed]
    B -->|dream| K[dream]
    B -->|greeting| L[greeting]
    B -->|NONE| M[Start MCP Server]
    
    C --> N[Exit 0]
    D --> N
    E --> N
    F --> N
    G --> N
    H --> N
    I --> N
    J --> N
    K --> N
    L --> N
    M --> O[stdio Loop]

Available Subcommands

| Command | Description |
|---------|-------------|
| audrey install | Register Audrey with host MCP configuration |
| audrey uninstall | Remove Audrey from host configuration |
| audrey mcp-config | Print MCP server configuration |
| audrey hook-config | Generate Claude Code hook configurations |
| audrey demo | Run interactive demonstration |
| audrey reembed | Regenerate embeddings |
| audrey dream | Generate reflection/memory consolidation |
| audrey greeting | Display greeting message |
| audrey doctor | Run diagnostic checks |
| audrey guard | Check memory before action |
| audrey status | Show memory system status |
| audrey promote | Promote memories to rules |
| audrey impact | Generate impact reports |
| audrey observe-tool | Monitor tool execution |

Help and Version Short-Circuit

Help and version flags MUST short-circuit before falling through to the MCP server. A user running audrey --help should see help, not be dropped into a stdio loop.

Sources: mcp-server/index.ts:201-206

if (subcommand === '--help' || subcommand === '-h' || subcommand === 'help') {
  printHelp();
  process.exit(0);
} else if (subcommand === '--version' || subcommand === '-v' || subcommand === 'version') {
  printVersion();
  process.exit(0);
}

Configuration Management

MCP Host Configuration

The config.ts module provides functions to generate host-specific MCP configurations.

Sources: mcp-server/config.ts:1-50

export function formatMcpHostConfig(
  host: string | undefined = 'generic',
  env: Record<string, string | undefined> = process.env,
): string

Supported Hosts:

| Host | Config Format | Notes |
| --- | --- | --- |
| codex | TOML | GitHub MCP config style |
| claude-code | JSON | Claude Code MCP settings |
| claude-desktop | JSON | Claude Desktop config |
| cursor | JSON | Cursor MCP config |
| windsurf | JSON | Windsurf MCP config |
| vscode | JSON | VS Code MCP config |
| jetbrains | JSON | JetBrains MCP config |
| generic | JSON | Generic MCP fallback |

Installation Arguments

The buildInstallArgs() function generates CLI arguments for installing the MCP server.

Sources: mcp-server/config.ts:52-66

export function buildInstallArgs(
  env: Record<string, string | undefined> = process.env,
  options: McpEnvOptions = {},
): string[]

Generated Output Example:

mcp add -s user audrey -e AUDREY_AGENT=claude-code -e AUDREY_DATA_DIR=... -- node /path/to/mcp-entrypoint
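
The shape of that argument list can be illustrated with a small sketch. This is not the real buildInstallArgs() from mcp-server/config.ts: the function name sketchInstallArgs, the explicit entrypoint parameter, and the choice of which variables to forward are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real buildInstallArgs() may forward other variables.
type Env = Record<string, string | undefined>;

function sketchInstallArgs(env: Env, entrypoint: string): string[] {
  const args = ['mcp', 'add', '-s', 'user', 'audrey'];
  // Forward only the variables that are actually set (assumed behavior).
  for (const key of ['AUDREY_AGENT', 'AUDREY_DATA_DIR']) {
    const value = env[key];
    if (value !== undefined && value !== '') {
      args.push('-e', `${key}=${value}`);
    }
  }
  // "--" separates the host CLI's flags from the server launch command.
  args.push('--', 'node', entrypoint);
  return args;
}
```

Returning an array rather than a joined string keeps each argument intact when the list is handed to a process spawner, which avoids shell-quoting problems with paths that contain spaces.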

Install Guide Generation

The formatInstallGuide() function generates human-readable installation instructions.

Sources: mcp-server/index.ts:18-48

export function formatInstallGuide(
  host: string,
  env: Record<string, string | undefined> = process.env,
  dryRun = false,
): string

Output Sections:

  1. Title (with dry-run or config-only indicator)
  2. No-modification notice
  3. Generated MCP config
  4. Generated Claude Code hook config (for Claude Code host)
  5. Next steps

Host-Specific Resources

Resource Registration

Resources are registered per-host to provide context.

Sources: mcp-server/index.ts:103

registerHostResources(server, audrey);
registerHostPrompts(server);

Available Resources

| Resource | Type | Description |
| --- | --- | --- |
| status | Resource | Current system status |
| recent | Resource | Recent memory activity |
| principles | Resource | Core operational principles |

Available Prompts

| Prompt | Purpose |
| --- | --- |
| briefing | Get current session briefing |
| recall | Perform focused recall |
| reflection | Generate self-reflection |

Error Handling

Tool Error Response

Tool handlers return structured error responses.

Sources: mcp-server/index.ts:100

function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: `[audrey] error: ${err}` }],
    isError: true,
  };
}

Tool Success Response

Tool handlers return structured success responses with optional diagnostics.

Sources: mcp-server/index.ts:99

function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}
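
How the two helpers might combine inside a handler can be sketched as follows. The runTool wrapper is hypothetical and does not appear in the source, and the CallToolResult type here is a minimal local stand-in for the MCP SDK type, included only so the example is self-contained.

```typescript
// Minimal local stand-in for the MCP SDK's CallToolResult (assumption).
type CallToolResult = {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
  _meta?: Record<string, unknown> | undefined;
};

function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: `[audrey] error: ${err}` }],
    isError: true,
  };
}

function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}

// Hypothetical wrapper: run the handler body and convert any thrown
// error into a structured error result instead of rejecting.
async function runTool<T>(work: () => Promise<T>): Promise<CallToolResult> {
  try {
    return toolResult(await work());
  } catch (err) {
    return toolError(err);
  }
}
```

With a wrapper like this, a failing handler still resolves with isError set to true, so the client receives a readable message rather than a dropped request.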

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| AUDREY_AGENT | claude-code | Host agent identifier |
| AUDREY_DATA_DIR | Platform-specific | Data directory path |
| AUDREY_PROFILE | 0 | Enable profiling diagnostics |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_API_KEY | unset | REST API authentication |
| AUDREY_HOST | 127.0.0.1 | REST bind address |
| AUDREY_PORT | 7437 | REST server port |

Performance Characteristics

v0.22.0 Performance Metrics

| Operation | Before | After | Improvement |
| --- | --- | --- | --- |
| Encode response (p50) | 24.7ms | 15.2ms | ~40% faster |
| Cold-start first encode | 525ms | 28ms (with warmup) | ~18.7x faster |
| Hybrid recall (p50) | 30.2ms | 14.3ms | ~2.1x faster |

Optimization Details

  • Eliminated 3 of 4 redundant embedding calls during encode
  • Validation, interference, and affect resonance reuse the main content vector
  • Background embedding warmup at MCP boot reduces cold-start latency

Sources: CHANGELOG.md

Security

API Key Timing Safety

HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks.

Sources: CHANGELOG.md
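
A constant-time comparison of this kind can be sketched as below. The helper name apiKeyMatches and the hashing step are assumptions for illustration; the source confirms only that crypto.timingSafeEqual is used, not how Audrey prepares its inputs.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hypothetical helper, not Audrey's actual code.
function apiKeyMatches(provided: string, expected: string): boolean {
  // timingSafeEqual throws when the buffers differ in length, so both
  // values are hashed to fixed-size SHA-256 digests before comparing.
  const a = createHash('sha256').update(provided).digest();
  const b = createHash('sha256').update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A plain === comparison can return as soon as the first differing character is found, which can leak how much of a guessed key is correct. Comparing fixed-size digests in constant time avoids that signal.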

Recall Options Sanitization

HTTP /v1/recall and /v1/capsule sanitize request bodies to prevent ACL bypass.

Sources: CHANGELOG.md

Promote Path Restrictions

audrey promote --yes restricts writes to process.cwd() unless the target is listed in AUDREY_PROMOTE_ROOTS, so a prompt-injected promotion cannot write to arbitrary paths.

Sources: CHANGELOG.md
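
A path-containment check in this spirit might look like the sketch below. The function names are hypothetical, and the sketch takes the extra roots as an array because the format of AUDREY_PROMOTE_ROOTS is not described in the source.

```typescript
import * as path from 'node:path';

// True when target resolves to root itself or to a path beneath it.
function isInside(root: string, target: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(target));
  if (rel === '') return true; // same directory
  // A ".." first segment or an absolute result means target is outside root.
  const escapes = rel === '..' || rel.startsWith('..' + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Hypothetical policy check: allow writes under cwd or any extra root.
function isPromoteWriteAllowed(
  target: string,
  cwd: string,
  extraRoots: string[] = [],
): boolean {
  return [cwd, ...extraRoots].some((root) => isInside(root, target));
}
```

Resolving both paths before comparing is what defeats inputs such as rules/../../etc/passwd, which look local as strings but resolve outside the working directory.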

Profiling Mode

When AUDREY_PROFILE=1, tools return diagnostic metadata.

Sources: mcp-server/index.ts:110-120

if (profileEnabled) {
  const { id, diagnostics } = await audrey.encodeWithDiagnostics({
    content,
    source,
    tags,
    salience,
    private: isPrivate,
    context,
    affect,
    waitForConsolidation: wait_for_consolidation,
  });
  return toolResult({ id, content, source, private: isPrivate ?? false }, diagnostics);
}

Diagnostic Data Includes:

  • Per-stage timing information
  • Embedding generation time
  • Retrieval latency breakdown

Source: https://github.com/Evilander/Audrey / Human Manual

REST API

Related topics: System Architecture, MCP Server

The Audrey REST API provides a local-first HTTP interface for memory management operations, enabling external agents and services to interact with Audrey's memory system without direct database access.

Overview

Audrey's REST API is built on Hono, a lightweight web framework that runs on multiple JavaScript runtimes, including Node.js. The API serves as a sidecar service that wraps the core memory engine with HTTP endpoints for encoding, recalling, and managing memory entries.

Key characteristics:

  • Local-first design with no external database dependencies
  • SQLite + sqlite-vec for storage and vector search
  • Bearer token authentication for non-loopback access
  • Type-safe request/response handling

Sources: README.md:60

Architecture

graph TD
    A[Client<br>Python/JS SDK] --> B[REST API<br>Hono Server]
    B --> C[Audrey Core Engine]
    C --> D[SQLite<br>Memory Store]
    C --> E[sqlite-vec<br>Vector Index]
    B --> F["/health"]
    B --> G["/v1/recall"]
    B --> H["/v1/capsule"]
    B --> I["/v1/encode"]
    B --> J["Admin Routes<br>/v1/import<br>/v1/export<br>/v1/forget"]

Server Configuration

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address. Set to 0.0.0.0 only with AUDREY_API_KEY. |
| AUDREY_PORT | 7437 | Port for the REST server to listen on. |
| AUDREY_API_KEY | unset | Bearer token required for non-loopback REST traffic. |
| AUDREY_ALLOW_NO_AUTH | 0 | Set to 1 to allow non-loopback bind without an API key. Not recommended. |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Set to 1 to enable export, import, and forget routes/tools. Disabled by default. |
| AUDREY_PRAGMA_DEFAULTS | 1 | Set to 0 to revert SQLite PRAGMA tuning to better-sqlite3 defaults. |
| AUDREY_DEBUG | 0 | Set to 1 to print MCP info logs. Errors always log. |

Sources: README.md:44-52

Starting the Server

# Default (loopback only)
npx audrey serve

# With explicit port
AUDREY_PORT=8080 npx audrey serve

# Network-exposed with API key
AUDREY_HOST=0.0.0.0 AUDREY_API_KEY=secret npx audrey serve

Sources: python/README.md:18

API Endpoints

Health Check

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Returns server health status |

Response:

{
  "status": "ok",
  "version": "0.22.1",
  "timestamp": "2026-04-30T12:00:00.000Z"
}

Sources: README.md:58

Memory Operations

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/encode | Store a new memory entry |
| POST | /v1/recall | Retrieve relevant memories by query |
| POST | /v1/capsule | Get a turn-sized memory packet |
| POST | /v1/mark-used | Mark memory as used with outcome feedback |
| POST | /v1/observe-tool | Record tool execution results |
| POST | /v1/before-action | Preflight check before tool execution |
| POST | /v1/validate | Validate memory helpfulness |

Sources: README.md:58, src/routes.ts:1-50

Request Body Schema

The REST API accepts a unified RouteBody type with optional fields:

type RouteBody = {
  action?: string;
  query?: string;
  tool?: string;
  session_id?: string;
  sessionId?: string;
  cwd?: string;
  files?: string[];
  strict?: boolean;
  limit?: number;
  budget_chars?: number;
  budgetChars?: number;
  mode?: PreflightOptions['mode'];
  failure_window_hours?: number;
  recent_failure_window_hours?: number;
  recentFailureWindowHours?: number;
  recent_change_window_hours?: number;
  recentChangeWindowHours?: number;
  include_capsule?: boolean;
  includeCapsule?: boolean;
  include_status?: boolean;
  includeStatus?: boolean;
  record_event?: boolean;
  recordEvent?: boolean;
  include_preflight?: boolean;
  includePreflight?: boolean;
  receipt_id?: string;
  receiptId?: string;
  input?: unknown;
  output?: unknown;
  outcome?: EventOutcome;
  error_summary?: string;
  errorSummary?: string;
  metadata?: Record<string, unknown>;
  retain_details?: boolean;
  retainDetails?: boolean;
  evidence_feedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
  evidenceFeedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
};

Sources: src/routes.ts:5-46
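
Most RouteBody fields come in a snake_case and a camelCase spelling. One way a handler could collapse each pair into a single value is sketched below. The helper names firstDefined and normalizeAliases are hypothetical, the AliasedBody type covers only three of the pairs, and the precedence (camelCase first) is an assumption, not documented behavior.

```typescript
// Subset of RouteBody, limited to three alias pairs for illustration.
type AliasedBody = {
  session_id?: string;
  sessionId?: string;
  budget_chars?: number;
  budgetChars?: number;
  include_capsule?: boolean;
  includeCapsule?: boolean;
};

// Returns the first argument that is not undefined.
function firstDefined<T>(...values: (T | undefined)[]): T | undefined {
  return values.find((v) => v !== undefined);
}

// Hypothetical normalizer: camelCase wins when both spellings are present.
function normalizeAliases(body: AliasedBody) {
  return {
    sessionId: firstDefined(body.sessionId, body.session_id),
    budgetChars: firstDefined(body.budgetChars, body.budget_chars),
    includeCapsule: firstDefined(body.includeCapsule, body.include_capsule),
  };
}
```

Checking against undefined rather than truthiness matters here: a caller that sends include_capsule: false or budget_chars: 0 has made an explicit choice that should not be skipped.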

Security Model

Authentication

Non-loopback REST traffic requires a Bearer token:

curl -H "Authorization: Bearer your-secret-token" \
  -H "Content-Type: application/json" \
  http://localhost:7437/v1/recall \
  -d '{"query": "deploy failures"}'

Security measures:

Recall Options Sanitization

HTTP /v1/recall and /v1/capsule no longer body-spread caller options into internal calls. The sanitizeRecallOptions() function implements an allowlist that drops anything not in a known-safe key set:

export function sanitizeRecallOptions(options: unknown): SanitizedRecallOptions {
  // Keeps only known-safe keys such as budget_chars, limit, retrieval, and sessionId.
  // Everything else, including includePrivate and confidenceConfig, is dropped.
}

This prevents bypassing private-memory ACL and integrity controls via includePrivate: true or confidenceConfig overrides in HTTP bodies.

Sources: README.md:39-40, CHANGELOG.md:0.22.1
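
The allowlist idea can be shown with a short sketch. This is not the real sanitizeRecallOptions(): the function name sketchSanitize and the two-key SAFE_KEYS set are hypothetical and chosen only to demonstrate the technique.

```typescript
// Hypothetical key set; the real allowlist lives in the Audrey source.
const SAFE_KEYS = ['limit', 'budget_chars'] as const;
type SafeKey = (typeof SAFE_KEYS)[number];

function sketchSanitize(options: unknown): Partial<Record<SafeKey, unknown>> {
  const out: Partial<Record<SafeKey, unknown>> = {};
  // Anything that is not a plain object yields an empty option set.
  if (typeof options !== 'object' || options === null || Array.isArray(options)) {
    return out;
  }
  const input = options as Record<string, unknown>;
  // Copy only allowlisted keys; unknown keys are never forwarded.
  for (const key of SAFE_KEYS) {
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      out[key] = input[key];
    }
  }
  return out;
}
```

Building a fresh object from an allowlist is safer than deleting known-bad keys from the caller's object, because an option added to the engine later stays unreachable over HTTP until someone adds it to the list deliberately.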

Admin Tools

Admin routes (/v1/import, /v1/export, /v1/forget) are disabled by default. Enable with:

AUDREY_ENABLE_ADMIN_TOOLS=1 npx audrey serve

Sources: README.md:48

Client Integration

Python SDK

from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

# Encode a memory
memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)

# Create snapshot for backup
snapshot = brain.snapshot()
brain.close()

Async usage:

import asyncio
from audrey_memory import AsyncAudrey

async def main() -> None:
    async with AsyncAudrey(base_url="http://127.0.0.1:7437") as brain:
        await brain.health()
        await brain.encode("Deploy failed due to OOM", source="direct-observation")
        await brain.recall("deploy failure", limit=3)

asyncio.run(main())

Sources: python/README.md:22-45

Connection URL Correction

Note: Python client DEFAULT_BASE_URL was corrected from http://127.0.0.1:3487 to http://127.0.0.1:7437 in v0.22.1 to match the TS server's default port.

Sources: CHANGELOG.md:0.22.1

Impact Reporting

Impact analytics are available through the audrey impact CLI, which calls internal Audrey methods. The report covers:

| Report section | Description |
| --- | --- |
| Total memories by type | episodic, semantic, procedural counts |
| All-time validated count | Memories validated as helpful/wrong |
| Recent validations | Validation activity in time window |
| Top-N most-used memories | Memories with highest usage_count |
| Weakest-N memories | Lowest salience candidates for forgetting |
| Recent activity timeline | Activity log based on last_used_at |

# Basic report
audrey impact

# JSON output for automation
audrey impact --json

# Custom window and limits
audrey impact --window 30 --limit 20

Sources: src/impact.ts:1-50, CHANGELOG.md:0.22.1

Deployment

Docker

# docker-compose.yml
services:
  audrey:
    image: ghcr.io/evilander/audrey:latest
    ports:
      - "7437:7437"
    environment:
      - AUDREY_API_KEY=your-secret-token
      - AUDREY_HOST=0.0.0.0
    volumes:
      - audrey-data:/data

# Named volumes must be declared at the top level.
volumes:
  audrey-data:

Doctor Check

The audrey doctor command validates REST server configuration:

npx audrey doctor

Checks performed:

| Check | Description |
| --- | --- |
| serve-bind-safety | Validates bind address with auth configuration |
| node-runtime | Node.js version >= 20 |
| entrypoint-exists | MCP stdio entrypoint file exists |
| data-dir | Data directory accessibility |
| embedding | Embedding provider configuration |
| llm | LLM provider configuration |

graph TD
    A[audrey doctor] --> B{Is bind loopback?}
    B -->|Yes| C[✅ loopback only]
    B -->|No| D{Has AUDREY_API_KEY?}
    D -->|Yes| E[✅ non-loopback with API key]
    D -->|No| F{Has AUDREY_ALLOW_NO_AUTH?}
    F -->|Yes| G[⚠️ warning - network exposure]
    F -->|No| H[❌ error - refuse to start]

Sources: mcp-server/index.ts:100-130
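
The decision flow in the diagram can be written out as a small function. This is a sketch under stated assumptions: the names checkBindSafety, isLoopback, and BindVerdict are hypothetical, and the loopback test (localhost, ::1, or a 127.x address) is an assumption about what the real check treats as local.

```typescript
type BindVerdict = 'ok-loopback' | 'ok-api-key' | 'warn-no-auth' | 'error-refuse';

// Assumed definition of a loopback bind address.
function isLoopback(host: string): boolean {
  return host === 'localhost' || host === '::1' || host.startsWith('127.');
}

// Hypothetical mirror of the serve-bind-safety decision tree.
function checkBindSafety(env: Record<string, string | undefined>): BindVerdict {
  const host = env.AUDREY_HOST ?? '127.0.0.1';
  if (isLoopback(host)) return 'ok-loopback';
  if (env.AUDREY_API_KEY) return 'ok-api-key';
  if (env.AUDREY_ALLOW_NO_AUTH === '1') return 'warn-no-auth';
  return 'error-refuse';
}
```

The ordering carries the policy: an API key is only consulted once the bind address is known to be reachable from the network, and the opt-out flag is the last resort before refusing to start.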

Error Handling

Common Issues

| Error | Cause | Solution |
| --- | --- | --- |
| Connection refused | Wrong port or host | Check AUDREY_PORT and AUDREY_HOST |
| 401 Unauthorized | Missing/invalid API key | Provide Authorization: Bearer <token> header |
| 404 Not Found | Wrong endpoint | Use /v1/* routes, not /openapi.json or /docs |
| Validation error | Malformed request body | Check RouteBody schema |

Status Codes

| Code | Meaning |
| --- | --- |
| 200 | Success |
| 400 | Bad request (malformed body) |
| 401 | Unauthorized (missing/invalid API key) |
| 404 | Endpoint not found |
| 500 | Internal server error |

Note: /openapi.json and /docs routes are not currently wired. The README matches the actual surface (/health + /v1/*).

Sources: CHANGELOG.md:0.22.1, README.md:58


Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 14 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.1

2. Installation risk: Audrey Guard 0.23.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: Audrey Guard 0.23.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.23.0

3. Installation risk: v0.16.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v0.16.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.0

4. Installation risk: v0.16.1 — Windows MCP fix

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v0.16.1 — Windows MCP fix. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.1

5. Installation risk: v0.17.0

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: v0.17.0. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.17.0

6. Configuration risk: Configuration risk needs validation

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.host_targets | github_repo:1161444210 | https://github.com/Evilander/Audrey | host_targets=mcp_host, claude, claude_code

7. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:1161444210 | https://github.com/Evilander/Audrey | README/documentation is current enough for a first validation pass.

8. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:1161444210 | https://github.com/Evilander/Audrey | last_activity_observed missing

9. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium

10. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium

11. Security or permission risk: Audrey 1.0.0

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: Audrey 1.0.0. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.0

12. Security or permission risk: v0.22.2 — correctness pass + legitimate benchmarking

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: v0.22.2 — correctness pass + legitimate benchmarking. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.22.2

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 8 (count of project-level external discussion links exposed on this manual page)

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Audrey with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence