Doramagic Project Pack · Human Manual
Audrey
Related topics: System Architecture, Memory Model, Quick Start Guide
Audrey Overview
Audrey is a local-first memory firewall for AI agents. It provides a durable, SQLite-backed memory layer that enables AI agents to remember past mistakes, learned principles, and project-specific rules across sessions. Audrey acts as a continuity layer that sits under any local or sidecar agent loop, preventing agents from repeating the same mistakes and enabling smarter, more context-aware behavior.
Sources: README.md:1-10
What Problem Audrey Solves
AI agents typically suffer from "cold start" problems: they treat every new session as if they had never interacted with the project before. They repeat broken commands, lose project-specific rules, miss contradictions, and forget the exact mistakes they made yesterday.
Audrey addresses this by implementing a closed feedback loop:
- Record what happened during agent actions
- Remember what mattered from those events
- Check before new actions using stored memories
- Return decisions (allow, warn, or block) with evidence
- Validate whether the memory helped improve outcomes
Sources: README.md:25-40
Architecture Overview
Audrey is built with a layered architecture that separates concerns between memory storage, retrieval, governance, and agent integration.
graph TD
subgraph Client Layer
CLI[CLI Tool<br>npx audrey]
PythonSDK[Python SDK<br>audrey_memory]
MCPServer[MCP Server]
end
subgraph Integration Layer
Hooks[Claude Code Hooks<br>PreToolUse/PostToolUse]
MCPConfig[MCP Config<br>Codex, VSCode, etc.]
end
subgraph Core Engine
Guard[Audrey Guard<br>Memory-before-action]
Routes[REST API<br>/v1/*]
end
subgraph Memory Layer
SQLite[(SQLite<br>WAL Mode)]
Episodic[Episodic<br>Memory]
Semantic[Semantic<br>Memory]
Procedural[Procedural<br>Memory]
end
subgraph Embedding
ONNX[ONNX Runtime<br>Local Embeddings]
Providers[Cloud Providers<br>OpenAI, Anthropic]
end
CLI --> Routes
PythonSDK --> Routes
MCPServer --> Routes
Hooks --> Guard
MCPConfig --> MCPServer
Guard --> Routes
Routes --> SQLite
SQLite --> Episodic
SQLite --> Semantic
SQLite --> Procedural
Routes --> ONNX
Routes --> Providers
Sources: README.md:1-50
Core Components
Memory Types
Audrey manages three distinct types of memory, each serving a different purpose:
| Memory Type | Purpose | Examples |
|---|---|---|
| Episodic | Records specific events and outcomes | "Deploy failed at 3:42 PM with OOM error" |
| Semantic | Stores learned facts, principles, and rules | "Stripe rate limits are 100 req/s" |
| Procedural | Captures how-to knowledge and workflows | "To deploy, run npm run deploy after npm test" |
Each memory type can be tagged, sourced, and validated independently. Memories gain salience through usage: memories that are repeatedly helpful become more prominent, while unused memories decay over time.
Sources: README.md:40-55
Audrey Guard
Audrey Guard is the core decision-making component that checks memories before agent actions execute. It implements a preflight check that returns structured decisions:
graph LR
Action[Agent Action<br>tool + parameters] --> Guard
Guard --> Recall[Recall Relevant<br>Memories]
Recall --> Decision{Decision}
Decision -->|No issues| ALLOW[allow]
Decision -->|Potential risk| WARN[warn<br>+ evidence]
Decision -->|Dangerous| BLOCK[block<br>+ reason]
Decision -->|Uncertain| QUERY[Query<br>Human]
The Guard returns a decision with supporting evidence, allowing the agent to make informed choices. When set to strict mode, warnings are treated as blocks.
Sources: README.md:25-35
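The strict-mode rule can be illustrated with a small sketch. The function names `effective_decision` and `should_run` are hypothetical and not part of Audrey; the sketch only assumes the three decision values named above and the stated rule that strict mode treats a warning as a block.

```python
from typing import Literal

Decision = Literal["allow", "warn", "block"]


def effective_decision(decision: str, strict: bool = False) -> Decision:
    """Normalize a guard decision; in strict mode a warning becomes a block."""
    if decision not in ("allow", "warn", "block"):
        raise ValueError(f"unknown guard decision: {decision!r}")
    if strict and decision == "warn":
        return "block"
    return decision  # type: ignore[return-value]


def should_run(decision: str, strict: bool = False) -> bool:
    """Hypothetical helper: an agent proceeds unless the effective decision is a block."""
    return effective_decision(decision, strict) != "block"
```

A caller would typically surface the evidence attached to a warn decision before proceeding, which this sketch leaves out.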
Memory Capsule
The Memory Capsule is a structured response format that bundles contextual information for agent preflight checks:
| Section | Description |
|---|---|
| recent_changes | Memories created/modified within the recent-change window |
| must_follow | Critical rules tagged as mandatory |
| procedures | Step-by-step workflows relevant to the query |
| user_preferences | Explicitly stated user preferences |
| risks | Warnings and risk indicators |
| uncertain_or_disputed | Low-confidence or contested memories |
Sources: src/capsule.ts:1-50
Impact Tracking
Audrey tracks the effectiveness of its memories through a closed validation loop:
graph TD
Action[Action with Memory] --> Outcome{Outcome}
Outcome -->|helpful| Boost[Boost salience<br>+usage_count]
Outcome -->|used| Maintain[Maintain salience]
Outcome -->|wrong| Challenge[Challenge memory<br>Decrease salience]
Boost --> Consolidation[Consolidation<br>Dream cycle]
Maintain --> Consolidation
Challenge --> Consolidation
Consolidation --> Principles[New Semantic<br>Principles]
Outcome types:
- helpful: The memory contributed to a successful outcome
- used: The memory was consulted but didn't directly contribute
- wrong: The memory led to an incorrect decision
Sources: src/impact.ts:1-60
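As a rough illustration of how the three outcome types could feed back into salience, the sketch below applies one update per outcome. The step sizes `BOOST` and `PENALTY`, the 0 to 1 salience range, and the `MemoryStats` type are assumptions made for this example; the actual formula in src/impact.ts is not shown in this manual.

```python
from dataclasses import dataclass

BOOST = 0.1    # hypothetical increment for a helpful outcome
PENALTY = 0.2  # hypothetical decrement for a wrong outcome


@dataclass(frozen=True)
class MemoryStats:
    salience: float      # assumed to lie in [0.0, 1.0]
    usage_count: int = 0


def apply_outcome(mem: MemoryStats, outcome: str) -> MemoryStats:
    """Return updated stats for one validated outcome."""
    if outcome == "helpful":
        # Boost salience and count the use.
        return MemoryStats(min(1.0, mem.salience + BOOST), mem.usage_count + 1)
    if outcome == "used":
        # Consulted but not decisive: salience is maintained.
        return mem
    if outcome == "wrong":
        # Challenge the memory by lowering its salience.
        return MemoryStats(max(0.0, mem.salience - PENALTY), mem.usage_count)
    raise ValueError(f"unknown outcome: {outcome!r}")
```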
Installation Methods
Audrey supports multiple installation patterns depending on your use case.
CLI Installation
For direct terminal usage:
npx audrey doctor # Verify setup
npx audrey demo --scenario repeated-failure # Run demo
npx audrey guard --tool Bash "npm run deploy" # Check before action
Sources: README.md:55-65
MCP Server Integration
For integration with agents like Codex, Claude Desktop, Cursor, and VS Code:
# Generate MCP configuration
npx audrey mcp-config codex
npx audrey mcp-config generic
npx audrey mcp-config vscode
Sources: mcp-server/index.ts:1-40
Claude Code Hooks
For Claude Code, install directly and configure memory-before-action hooks:
npx audrey install
claude mcp list
# Apply hooks to project or user scope
npx audrey hook-config claude-code --apply --scope project # Project-local
npx audrey hook-config claude-code --apply --scope user # User-wide
The generated hooks include:
- PreToolUse: Runs audrey guard --hook --fail-on-warn
- PostToolUse: Records successful tool executions
- PostToolUseFailure: Records failed tool executions
Sources: README.md:67-85
Python SDK
For Python-based agent integrations:
pip install audrey-memory
from audrey_memory import Audrey
brain = Audrey(
base_url="http://127.0.0.1:7437",
api_key="secret",
agent="support-agent",
)
# Encode new memories
memory_id = brain.encode(
"Stripe returns HTTP 429 above 100 req/s",
source="direct-observation",
tags=["stripe", "rate-limit"],
)
# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)
# Close connection
brain.close()
Sources: python/README.md:1-50
REST API Reference
The Audrey REST API exposes core memory operations via HTTP.
Endpoints Overview
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Server health check |
| POST | /v1/encode | Store a new memory |
| POST | /v1/recall | Retrieve memories by semantic similarity |
| POST | /v1/preflight | Memory-before-action check |
| POST | /v1/validate | Submit outcome feedback |
| POST | /v1/impact | Get impact statistics |
Sources: src/routes.ts:1-80
Core API Operations
#### Encode Memory
interface EncodeRequest {
content: string; // The memory content
memory_type: 'episodic' | 'semantic' | 'procedural';
source: string; // e.g., "direct-observation", "told-by-user"
tags?: string[];
private?: boolean; // Agent-only memory
wait_for_consolidation?: boolean;
}
Sources: src/routes.ts:80-120
#### Recall Memories
interface RecallRequest {
query: string;
limit?: number; // Default: 5
budget_chars?: number; // Context budget
retrieval?: 'hybrid' | 'vector'; // Default: hybrid
mood?: { // Optional affect configuration
min_valence?: number;
min_arousal?: number;
};
}
#### Preflight Check
interface PreflightRequest {
tool: string; // e.g., "Bash", "Write"
action: string; // The specific action/command
session_id?: string;
cwd?: string;
include_capsule?: boolean;
include_preflight?: boolean;
record_event?: boolean;
}
#### Validate Outcome
interface ValidateRequest {
receipt_id: string; // From preflight response
outcome: 'helpful' | 'used' | 'wrong';
evidence_feedback?: Record<string, 'helpful' | 'used' | 'wrong'>;
metadata?: Record<string, unknown>;
}
Sources: src/routes.ts:120-200
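For callers without an SDK, the preflight and validate requests can be assembled with the Python standard library. This is a minimal sketch: `build_post` is a hypothetical helper, the receipt ID is a placeholder, and the shape of the preflight response (including the name of the field that carries the receipt ID) is not documented here, so the sketch only builds the requests and does not send or parse anything.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "http://127.0.0.1:7437"  # default sidecar address from the configuration table


def build_post(path: str, payload: dict, api_key: Optional[str] = None) -> Request:
    """Build (but do not send) a JSON POST request for an Audrey REST endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(payload).encode("utf-8")
    return Request(BASE_URL + path, data=body, headers=headers, method="POST")


# Fields follow the PreflightRequest and ValidateRequest interfaces above.
preflight = build_post(
    "/v1/preflight",
    {"tool": "Bash", "action": "npm run deploy"},
    api_key="secret",
)
validate = build_post(
    "/v1/validate",
    {"receipt_id": "RECEIPT_ID_PLACEHOLDER", "outcome": "helpful"},
    api_key="secret",
)
```

Passing either request to `urllib.request.urlopen` would send it to a running sidecar.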
Configuration
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| AUDREY_DATA_DIR | ~/.audrey | SQLite data directory (set per tenant/agent) |
| AUDREY_EMBEDDING_PROVIDER | onnx | Embedding provider: onnx, openai, anthropic, google |
| AUDREY_LLM_PROVIDER | openai | LLM provider for consolidation |
| AUDREY_MODEL | varies | Specific model to use |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_PORT | 7437 | REST sidecar port |
| AUDREY_API_KEY | unset | Bearer token for non-loopback access |
| AUDREY_ALLOW_NO_AUTH | 0 | Allow non-loopback without API key (not recommended) |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget routes |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_PROFILE | 0 | Emit per-stage diagnostics |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup at boot |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Default capsule character budget |
Sources: README.md:150-180
Data Isolation
SQLite uses WAL mode without an advisory lock, so two processes sharing a directory will contend on writes. Isolation is a hard requirement for multi-agent setups.
Important: Set a distinct AUDREY_DATA_DIR per tenant, agent identity, or concurrent host to avoid write contention.
Sources: README.md:55-60
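One way to meet the isolation requirement is to derive the data directory from the tenant and agent identity before launching each process. The sketch below is illustrative only: `data_dir_for` and `env_for` are hypothetical helpers, and the `base/tenant/agent` directory layout is an assumption rather than a convention Audrey prescribes.

```python
import os
from pathlib import Path


def data_dir_for(base: str, tenant: str, agent: str) -> Path:
    """Build a distinct data directory per tenant and agent (assumed layout)."""
    for part in (tenant, agent):
        # Reject empty names and anything that could escape the base directory.
        if not part or part in (".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"unsafe path component: {part!r}")
    return Path(base) / tenant / agent


def env_for(base: str, tenant: str, agent: str) -> dict:
    """Copy the current environment and point AUDREY_DATA_DIR at the isolated directory."""
    env = dict(os.environ)
    env["AUDREY_DATA_DIR"] = str(data_dir_for(base, tenant, agent))
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting a sidecar for that agent.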
Security
Redaction
Audrey automatically redacts sensitive information from stored memories and logs:
| Class | Patterns |
|---|---|
| api_key | api_key, apiKey, API_KEY patterns |
| password | password, passwd, pwd |
| token | token, bearer_token, access_token, jwt |
| secret | secret, client_secret, private_key |
The redaction system walks JSON structures recursively and applies pattern matching to both keys and values.
Sources: src/redact.ts:1-60
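The recursive walk described above can be sketched in a few lines. This is not the code in src/redact.ts: the regular expressions, the `[REDACTED]` placeholder, and the function name `redact` are assumptions chosen to mirror the pattern classes in the table.

```python
import re
from typing import Any

# Assumed key patterns, loosely based on the classes listed in the table.
SENSITIVE_KEY = re.compile(
    r"(api[_-]?key|passw(or)?d|pwd|token|jwt|secret|private[_-]?key)",
    re.IGNORECASE,
)
# Assumed value pattern: "name=value" or "name: value" inside free text.
INLINE_SECRET = re.compile(
    r"\b(api[_-]?key|password|passwd|pwd|token|secret)\s*[=:]\s*\S+",
    re.IGNORECASE,
)


def redact(value: Any) -> Any:
    """Return a copy of a JSON-like structure with sensitive keys and values masked."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if isinstance(k, str) and SENSITIVE_KEY.search(k) else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return INLINE_SECRET.sub(lambda m: f"{m.group(1)}=[REDACTED]", value)
    return value
```

Substring matching on keys is deliberately broad in this sketch, so a key such as `tokenizer` would also be masked; a production redactor would need tighter patterns.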
Access Control
- HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks
- audrey serve defaults to binding 127.0.0.1 (was 0.0.0.0)
- Non-loopback bind requires AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
- Private memories have ACL enforcement at the recall endpoint
- sanitizeRecallOptions() allowlists HTTP body parameters to prevent option injection
Sources: CHANGELOG.md:1-30
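Two of these rules translate directly into small checks. The sketch below uses Python's `hmac.compare_digest` as the analogue of Node's `crypto.timingSafeEqual`, and models the bind rule as a pure function. The helper names are hypothetical, and the real server logic may differ in detail.

```python
import hmac
import ipaddress
from typing import Optional


def keys_match(provided: str, expected: str) -> bool:
    """Compare API keys in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def is_loopback(host: str) -> bool:
    """True for localhost and loopback IP addresses such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unresolved hostnames are treated as non-loopback here


def bind_allowed(host: str, api_key: Optional[str], allow_no_auth: bool = False) -> bool:
    """A non-loopback bind needs an API key or an explicit no-auth override."""
    return is_loopback(host) or bool(api_key) or allow_no_auth
```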
Production Readiness
Release Gates
npm run release:gate # Full release checklist
npm run python:release:check # Python package verification
npm run bench:guard:card # Guard performance benchmarks
npm run bench:guard:validate # Guard accuracy validation
npx audrey doctor # Runtime health check
npx audrey status --json --fail-on-unhealthy
Production Checklist
- Set one AUDREY_DATA_DIR per tenant, environment, or isolation boundary
- Pin AUDREY_EMBEDDING_PROVIDER and AUDREY_LLM_PROVIDER explicitly
- Back up the SQLite data directory before provider or dimension changes
- Keep API keys and raw credentials out of encoded memory content
- Use AUDREY_API_KEY if the REST sidecar is reachable beyond local process boundary
- Run audrey dream on a schedule for consolidation and decay
- Add application-level encryption, retention, access control, and audit logging for regulated environments
Sources: README.md:100-125
Memory Lifecycle
graph TD
Event[Agent Event<br>Action/Failure] --> Encode[Encode Memory<br>episodic]
Encode --> Salience[Initial Salience<br>from confidence]
Salience --> Usage[Usage Cycle]
Usage --> Preflight[Preflight Check]
Preflight --> Decision[Guard Decision]
Decision --> Action[Execute Action]
Action --> Outcome{Outcome}
Outcome -->|Success| Boost[Boost Salience<br/>usage_count++]
Outcome -->|Partial| Maintain[Maintain]
Outcome -->|Failure| Challenge[Challenge<br/>decay confidence]
Boost --> Consolidation{Dream Cycle}
Maintain --> Consolidation
Challenge --> Consolidation
Consolidation --> Principle[Extract Principle<br/>semantic memory]
Consolidation --> Decay[Apply Decay<br/>unused memories]
Principle --> NewMemory[New Semantic<br/>Memory]
Decay --> Prune[Prune Very Low<br/>salience memories]
Dream Cycle
The memory_dream operation consolidates episodes into principles and applies decay:
- Consolidation: Groups related episodic memories into higher-level semantic principles
- Decay: Reduces salience of memories that haven't been used recently
- Challenge: Flags memories that led to wrong outcomes for review
Sources: README.md:40-50
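The decay and prune steps can be pictured with a simple exponential model. Everything numeric here is an assumption for illustration (a 30-day half-life and a 0.05 prune threshold); this manual does not state the decay formula or thresholds Audrey actually uses.

```python
from typing import Dict, Tuple


def decayed_salience(salience: float, days_since_use: float, half_life_days: float = 30.0) -> float:
    """Halve salience every `half_life_days` without use (assumed model)."""
    return salience * 0.5 ** (days_since_use / half_life_days)


def dream_pass(memories: Dict[str, Tuple[float, float]], prune_below: float = 0.05) -> Dict[str, float]:
    """Apply decay to {id: (salience, days_since_use)} and drop very low salience entries."""
    kept = {}
    for mem_id, (salience, days) in memories.items():
        updated = decayed_salience(salience, days)
        if updated >= prune_below:
            kept[mem_id] = updated
    return kept
```

Under these assumed numbers, a memory with salience 0.06 that has gone unused for 60 days falls to 0.015 and is pruned, while a memory used today keeps its salience.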
Supported Integrations
| Agent/IDE | Integration Method | Features |
|---|---|---|
| Claude Code | Hooks + MCP | Full memory-guard loop |
| Codex | MCP Config | Memory recall |
| Claude Desktop | MCP | Memory access |
| Cursor | MCP Config | Memory recall |
| Windsurf | MCP Config | Memory recall |
| VS Code | MCP Config | Memory recall |
| JetBrains | MCP Config | Memory recall |
| Ollama | Generic MCP | Memory recall |
| Custom Agents | REST API / Python SDK | Full integration |
Sources: README.md:10-20
Key Files Reference
| File | Purpose |
|---|---|
| src/audrey.ts | Core Audrey class with memory operations |
| src/routes.ts | REST API route handlers |
| src/capsule.ts | Memory capsule builder |
| src/impact.ts | Impact tracking and validation |
| src/redact.ts | Sensitive data redaction |
| src/rules-compiler.ts | Rule file generation from memories |
| mcp-server/index.ts | MCP server and CLI commands |
| python/ | Python SDK implementation |
Sources: src/audrey.ts, src/routes.ts, src/capsule.ts, src/impact.ts, src/redact.ts, src/rules-compiler.ts, mcp-server/index.ts, python/README.md
Quick Start Guide
Related topics: Audrey Overview
Overview
Audrey is a local-first memory firewall for AI agents that provides a durable memory layer they can check before executing tools. This guide covers installation, configuration, and basic usage patterns across all supported surfaces: CLI, REST API, JavaScript SDK, and Python client.
Prerequisites
| Requirement | Version/Details |
|---|---|
| Node.js | v18+ recommended |
| npm | v8+ |
| Python | 3.9+ (for Python SDK) |
| SQLite | Built-in (bundled) |
| Docker | Optional for containerized deployment |
Installation
CLI Installation
npm install -g audrey
Verify installation:
audrey --version
Python SDK Installation
pip install audrey-memory
Sources: python/README.md:1-20
Quick Setup
1. Run the Health Check
audrey doctor
This command validates the installation and checks for any configuration issues.
Sources: mcp-server/index.ts:85-120
2. Start the REST API Server
npx audrey serve
By default, the server binds to 127.0.0.1:7437. Configure using environment variables:
| Environment Variable | Default | Description |
|---|---|---|
| AUDREY_PORT | 7437 | REST API port |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_API_KEY | unset | Bearer token for non-loopback traffic |
| AUDREY_DATA_DIR | ~/.audrey | Data directory path |
Sources: README.md:1-50
3. Install MCP Configuration
For Claude Code integration:
audrey install --host claude-code
Generate MCP config without applying:
audrey mcp-config --host claude-code --dry-run
Apply project hooks:
audrey hook-config claude-code --apply --scope project
Apply user hooks:
audrey hook-config claude-code --apply --scope user
Sources: mcp-server/index.ts:40-75
Core Usage Patterns
Memory Guard Workflow
The guard workflow is the primary safety loop: it records events, checks memory before each action, and returns a decision:
graph TD
A[Agent Action] --> B[audrey guard --tool Bash npm run deploy]
B --> C{Memory Check}
C -->|allow| D[Execute Action]
C -->|warn| E[Execute with Warning]
C -->|block| F[Block Action]
D --> G[Record Outcome]
E --> G
F --> G
G --> H[Memory Consolidation]
Execute the guard command:
audrey guard --tool Bash "npm run deploy"
Parameters:
| Parameter | Description |
|---|---|
| --tool | Tool name (Bash, Write, Read, etc.) |
| --session-id | Session identifier |
| --files | File paths involved |
| --strict | Fail on warning |
Sources: mcp-server/index.ts:150-200
Encoding Memories
#### Via REST API
curl -X POST http://127.0.0.1:7437/v1/encode \
-H "Authorization: Bearer secret" \
-H "Content-Type: application/json" \
-d '{
"content": "Stripe returns HTTP 429 above 100 req/s",
"source": "direct-observation",
"tags": ["stripe", "rate-limit"]
}'
#### Via Python SDK
from audrey_memory import Audrey
brain = Audrey(
base_url="http://127.0.0.1:7437",
api_key="secret",
agent="support-agent",
)
memory_id = brain.encode(
"Stripe returns HTTP 429 above 100 req/s",
source="direct-observation",
tags=["stripe", "rate-limit"],
)
Sources: python/README.md:25-45
Recalling Memories
#### Via REST API
curl -X POST http://127.0.0.1:7437/v1/recall \
-H "Authorization: Bearer secret" \
-H "Content-Type: application/json" \
-d '{
"query": "stripe rate limits",
"limit": 5,
"retrieval": "hybrid"
}'
#### Via Python SDK
results = brain.recall("stripe rate limits", limit=5)
#### Available Recall Options
| Option | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| limit | number | 10 | Maximum results |
| budget_chars | number | 4000 | Context budget in characters |
| retrieval | string | "hybrid" | "hybrid" or "vector" mode |
| include_private | boolean | false | Include private memories |
| agent | string | - | Filter by agent name |
Sources: src/routes.ts:1-50
Getting Memory Capsule
The capsule endpoint returns a turn-sized memory packet containing relevant context:
curl -X POST http://127.0.0.1:7437/v1/capsule \
-H "Authorization: Bearer secret" \
-H "Content-Type: application/json" \
-d '{
"query": "current deploy status",
"budget_chars": 4000
}'
The capsule contains sections:
- recent_changes - Memories from recent window
- must_follow - Critical rules
- procedures - Step-by-step memories
- user_preferences - User-stated preferences
- risks - Warnings and risks
- uncertain_or_disputed - Low-confidence items
Sources: src/capsule.ts:1-60
Check Health
curl http://127.0.0.1:7437/v1/status
Async version:
import asyncio
from audrey_memory import AsyncAudrey

async def main():
    async with AsyncAudrey(base_url="http://127.0.0.1:7437", api_key="secret") as brain:
        health = await brain.health()
        print(health)

asyncio.run(main())
Sources: python/README.md:50-70
Advanced CLI Commands
Dream - Consolidate Memory
audrey dream
Triggers the memory consolidation process.
Reembed - Rebuild Vector Indices
audrey reembed
Rebuilds embedding indices after schema changes.
Observe Tool
Record tool execution results:
audrey observe-tool --tool Bash --input '{"command": "npm test"}' --output '{"exitCode": 0}'
Impact Report
Generate memory impact analysis:
audrey impact --window-days 30 --limit 10
The report includes:
- Memory counts by type (episodic, semantic, procedural)
- Validated memories count
- Outcome breakdown (helpful, wrong, used)
- Top used memories
- Weakest memories by salience
- Recent activity
Sources: src/impact.ts:1-80
Promote - Extract Rules
Promote memory candidates to reviewable Markdown files:
audrey promote --yes
Rules are saved to .claude/rules/ with YAML front matter for traceability.
Check Status
audrey status
Displays:
- Current mood (valence, arousal)
- Memory counts
- Learned principles
- Recent memories
- Unresolved threads
Sources: mcp-server/index.ts:200-280
MCP Server Tools
Audrey provides 20 tools via MCP stdio protocol:
| Tool | Purpose |
|---|---|
| memory_encode | Record new memories |
| memory_recall | Retrieve relevant memories |
| memory_capsule | Get turn-sized context packet |
| preflight_check | Validate before action |
| record_outcome | Record action results |
| promote_memory | Convert to persistent rule |
| impact_report | Analyze memory effectiveness |
Resources include:
- status - System health
- recent - Recent memories
- principles - Semantic memories
- briefing - Current context
Prompts include:
- memory-recall - Search memories
- memory-reflection - Self-analysis
Sources: README.md:60-90
Configuration Reference
Environment Variables
| Variable | Default | Description |
|---|---|---|
| AUDREY_PORT | 7437 | REST API port |
| AUDREY_HOST | 127.0.0.1 | Bind address |
| AUDREY_API_KEY | unset | Bearer token |
| AUDREY_DATA_DIR | ~/.audrey | Data directory |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import routes |
| AUDREY_PROMOTE_ROOTS | unset | Extra write roots |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_PROFILE | 0 | Emit per-stage timings |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Default capsule budget |
Sources: README.md:40-55
Privacy Controls
By default, private memories are ACL-protected:
- include_private: true is restricted in HTTP API
- confidenceConfig overrides are blocked via sanitizeRecallOptions()
For full control, use the SDK directly or enable admin tools:
AUDREY_ENABLE_ADMIN_TOOLS=1 audrey serve
Sources: CHANGELOG.md:1-30
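The allowlisting idea behind these controls can be shown with a short sketch. It is not Audrey's sanitizeRecallOptions(): the function name `sanitize_recall_body`, the exact set of allowed keys, and the validation rules are assumptions for illustration. The point is that only known-safe keys survive, so caller-supplied options such as include_private or confidenceConfig never reach the recall logic.

```python
from typing import Any, Dict

# Assumed allowlist; the real set of permitted HTTP body keys may differ.
ALLOWED_RECALL_KEYS = frozenset({"query", "limit", "budget_chars", "retrieval", "agent"})


def sanitize_recall_body(body: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only allowlisted keys and validate the ones that carry enumerated values."""
    clean = {k: v for k, v in body.items() if k in ALLOWED_RECALL_KEYS}
    query = clean.get("query")
    if not isinstance(query, str) or not query:
        raise ValueError("query must be a non-empty string")
    if clean.get("retrieval", "hybrid") not in ("hybrid", "vector"):
        raise ValueError("retrieval must be 'hybrid' or 'vector'")
    return clean
```

An allowlist fails closed: a newly added internal option stays unreachable over HTTP until someone deliberately adds it to the set.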
Deployment Options
Docker
docker run -p 7437:7437 \
-e AUDREY_API_KEY=secret \
-v audrey-data:/root/.audrey \
audrey:latest
Docker Compose
Use the provided docker-compose.yml for persistent deployments with volume mounts.
Host-Specific Setup
Generate platform-specific MCP configurations:
audrey mcp-config --host claude-code
audrey mcp-config --host cursor
audrey mcp-config --host windsurf
Sources: README.md:80-100
Next Steps
- Review audrey doctor output for any warnings
- Configure AUDREY_API_KEY for production deployments
- Set up MCP integration for your preferred IDE/agent
- Explore memory types: episodic, semantic, procedural
- Enable impact tracking to measure memory effectiveness
For detailed API documentation, see the REST API endpoints at /v1/* when the server is running.
Sources: [python/README.md:1-20](https://github.com/Evilander/Audrey/blob/main/python/README.md)
System Architecture
Related topics: Audrey Overview, Memory Model, Data Storage, MCP Server, REST API
Overview
Audrey is a local-first memory runtime designed to give AI agents persistent, queryable memory across sessions. It operates as a stateful infrastructure layer that records observations, consolidates principles, and provides memory-before-action checks through multiple interfaces.
The system is built around a closed-loop safety architecture where every tool action can be validated against stored memory before execution, returning allow, warn, or block decisions with supporting evidence.
Core Design Principles
| Principle | Description |
|---|---|
| Local-first | All data stored in local SQLite; no external database required |
| Agent-agnostic | Works with Codex, Claude Code, Cursor, Windsurf, VS Code, JetBrains, Ollama, and custom agents |
| Safety loop | Pre-action validation through Audrey Guard before tool execution |
| Isolation per tenant | One AUDREY_DATA_DIR per tenant/agent/isolation boundary |
| Privacy-by-default | Audrey Guard redacts tool traces; private memory ACL enforcement |
High-Level Component Architecture
graph TD
subgraph "Agent Host"
A[AI Agent<br/>Claude Code / Codex / Cursor]
end
subgraph "Audrey Runtime"
CLI[CLI<br/>doctor, demo, guard,<br/>install, status, dream]
MCP[MCP Server<br/>stdio interface]
REST[REST API<br/>Hono server :7437]
SDK[JS SDK<br/>TypeScript/Node]
PY[Python SDK<br/>audrey-memory]
end
subgraph "Core Engine"
AUD[Audrey Core<br/>encode, recall, consolidate,<br/>validate, impact]
MEM[(Memory Store<br/>SQLite + sqlite-vec)]
EMB[Embedding Engine<br/>ONNX runtime]
CAUSAL[Causal Validation<br/>confidence scoring]
end
A <--> MCP
A <--> REST
CLI --> MCP
CLI --> REST
SDK --> REST
PY --> REST
AUD <--> MEM
AUD <--> EMB
AUD <--> CAUSAL
Component Specifications
MCP Server (`mcp-server/index.ts`)
The MCP stdio server provides 20+ tools plus status, recent, principles resources and briefing/recall/reflection prompts.
| Interface Type | Count | Purpose |
|---|---|---|
| Tools | 20+ | encode, recall, capsule, guard, promote, impact, dream, reembed, observe-tool |
| Resources | 3 | status, recent, principles |
| Prompts | 3 | briefing, recall, reflection |
The server processes CLI arguments before entering stdio mode to handle --help, --version, and subcommands like install, mcp-config, hook-config.
Sources: mcp-server/index.ts:1-100
REST API (`src/routes.ts`)
Hono-based HTTP server exposing the following endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| /health | GET | Health check |
| /v1/encode | POST | Store memory with source, tags, salience |
| /v1/recall | POST | Retrieve relevant context |
| /v1/capsule | POST | Get turn-sized memory packet |
| /v1/status | GET | Runtime status |
| /v1/observe | POST | Record tool outcome |
| /v1/validate | POST | Validate memory usefulness |
Security: HTTP recall/capsule routes use sanitizeRecallOptions() to prevent private-memory ACL bypass via caller-supplied options. API key comparison uses crypto.timingSafeEqual to prevent timing attacks.
Sources: src/routes.ts:1-80
Audrey Core (`src/audrey.ts`)
The central engine handling memory operations:
graph LR
ENC[encode] --> VEC[Vector Embedding]
ENC --> DB[(SQLite)]
REC[recall] --> VEC
REC --> DB
VEC --> EMB[Embedding Engine]
DB --> CAUSAL[Causal Validation]
CAUSAL --> CONF[Confidence Scoring]
Key operations:
- encode(): Stores episodic, semantic, or procedural memory with vector embedding
- recall(): Retrieves memories using hybrid (vector + FTS) search
- consolidate(): Extracts principles from repeated evidence
- decay(): Reduces authority of stale, low-confidence memories
- beforeAction(): Guard check returning allow/warn/block
- afterAction(): Records tool execution outcomes
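How hybrid recall merges vector and full-text results is not specified in this manual. One common way to combine two ranked lists is reciprocal rank fusion (RRF), sketched below as a stand-in technique rather than as Audrey's method. The function name and the memory IDs are hypothetical; `k = 60` is the constant conventionally used with RRF.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["m1", "m2", "m3"]  # hypothetical vector-similarity ranking
fts_hits = ["m2", "m4", "m1"]     # hypothetical full-text ranking
fused = reciprocal_rank_fusion([vector_hits, fts_hits])
```

Items that appear in both lists (here `m1` and `m2`) rise above items found by only one retriever, which is the usual motivation for a hybrid mode.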
Storage Layer (`src/db.ts`)
SQLite with sqlite-vec extension for vector search.
| Feature | Configuration |
|---|---|
| Mode | WAL (Write-Ahead Logging) |
| Concurrency | No advisory lock; single writer per AUDREY_DATA_DIR |
| Indexing | sqlite-vec for vector similarity; FTS for full-text |
| Isolation | One directory per tenant required |
The AUDREY_PRAGMA_DEFAULTS environment variable (default 1) applies custom PRAGMA tuning. Set to 0 to revert to better-sqlite3 defaults.
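To see what WAL mode looks like at the SQLite level, the sketch below enables it on a throwaway database using Python's built-in `sqlite3` module. The file name and the `notes` table are hypothetical, and the specific PRAGMAs Audrey applies when AUDREY_PRAGMA_DEFAULTS is enabled are not listed in this manual, so only `journal_mode` is shown.

```python
import os
import sqlite3
import tempfile


def open_wal(path: str):
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_wal(os.path.join(tmp, "demo.db"))  # hypothetical file name
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    conn.commit()
    row_count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    conn.close()
```

WAL lets readers proceed while a writer is active, but it still permits only one writer at a time, which is why each data directory should have a single writing process.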
Embedding Engine (`src/embedding.ts`)
ONNX runtime for local vector embedding without external API calls by default.
| Feature | Behavior |
|---|---|
| Warmup | Background embedding warmup at MCP boot (skippable with AUDREY_DISABLE_WARMUP=1) |
| Cold-start | First encode: ~525ms cold, ~28ms warm |
| Verbosity | AUDREY_ONNX_VERBOSE=1 restores EP-assignment warnings |
| Reuse | Validation, interference, affect resonance reuse main content vector |
Performance targets (v0.22.0):
- Encode p50: 15.2ms (40% faster than prior)
- Hybrid recall p50: 14.3ms (2.1x faster)
- Embedding reuse eliminated 3 of 4 redundant calls
Memory Model Architecture
graph TD
subgraph "Memory Types"
EPI[Episodic<br/>Specific observations,<br/>tool results, facts]
SEM[Semantic<br/>Consolidated principles<br/>from evidence]
PROC[Procedural<br/>Remembered ways to act,<br/>avoid, retry, verify]
end
subgraph "Memory Properties"
AFF[Affect & Salience<br/>Emotional weight, importance]
DEC[Interference & Decay<br/>Stale/conflicting lose authority]
CON[Contradiction Handling<br/>Competing claims tracked]
end
EPI --> AFF
SEM --> AFF
PROC --> AFF
EPI --> DEC
SEM --> CON
CON --> DEC
Memory Types
| Type | Description | Example |
|---|---|---|
| Episodic | Specific observations, tool results, session facts | "Stripe returns HTTP 429 above 100 req/s" |
| Semantic | Consolidated principles from repeated evidence | "Always check rate limits before batch operations" |
| Procedural | Remembered ways to act, avoid, retry, verify | "Retry with exponential backoff on network failures" |
Capsule Generation (`src/capsule.ts`)
The capsule endpoint assembles a turn-sized memory packet with sections:
graph TD
CAP[POST /v1/capsule] --> SEC[Section Assigner]
SEC --> R[recent_changes<br/>Created/reinforced recently]
SEC --> M[must_follow<br/>Critical rules]
SEC --> P[procedures<br/>Procedural memories]
SEC --> U[user_preferences<br/>Stated or tagged preferences]
SEC --> RK[risks<br/>Warnings and recent failures]
SEC --> UN[uncertain_or_disputed<br/>Disputed or low-confidence]
Each section includes a reason field explaining why the entry was included. Recent tool failures (last 7 days) are automatically added to risks when includeRisks is enabled.
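The 7-day failure window and the per-entry reason field can be sketched as a filter over recorded tool events. The event shape (`summary`, `status`, `at`), the reason wording, and the function name are assumptions for illustration; src/capsule.ts may structure entries differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def recent_failure_risks(events: List[Dict], now: datetime, window_days: int = 7) -> List[Dict]:
    """Turn failed tool events inside the window into risk entries with a reason."""
    cutoff = now - timedelta(days=window_days)
    return [
        {
            "content": event["summary"],
            "reason": f"tool failure within the last {window_days} days",
        }
        for event in events
        if event["status"] == "failed" and event["at"] >= cutoff
    ]


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
events = [
    {"summary": "npm run deploy failed (OOM)", "status": "failed", "at": now - timedelta(days=2)},
    {"summary": "old migration failure", "status": "failed", "at": now - timedelta(days=30)},
    {"summary": "npm test passed", "status": "ok", "at": now - timedelta(days=1)},
]
risks = recent_failure_risks(events, now)
```

Only the deploy failure from two days ago lands in `risks`; the 30-day-old failure is outside the window and the passing test is not a failure.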
Audrey Guard Safety Loop
sequenceDiagram
participant Agent
participant Guard as Audrey Guard
participant Memory as Memory Store
participant LLM as LLM Provider
Agent->>Guard: tool + action
Guard->>Memory: recall(relevant)
Memory-->>Guard: context entries
Guard->>LLM: preflight check
LLM-->>Guard: decision + evidence
Guard-->>Agent: allow/warn/block + reasoning
Agent->>Guard: outcome (success/failure/wrong)
Guard->>Memory: record outcome
Guard Modes
| Mode | Behavior |
|---|---|
| allow | Action proceeds normally |
| warn | Action allowed but user notified |
| block | Action prevented with evidence |
| caution | Maps to warn display |
CLI usage:
audrey guard --tool Bash "npm run deploy"
audrey guard --hook --fail-on-warn # For hook integration
Validation Pipeline
The causal validation system (via src/causal.ts and src/validate.ts) evaluates whether stored memories actually helped:
- Confidence scoring uses reinforcement formula from confidence.ts
- Evidence tracking updates usage_count and last_used_at
- Outcome classification: used, helpful, wrong
- Impact metrics aggregated by memory type
Deployment Architecture
graph LR
subgraph "Deployment Options"
NPM[npm package<br/>npx audrey]
DOCKER[Docker<br/>audrey-runtime]
COMPOSE[Docker Compose<br/>Full stack]
HOST[MCP Config<br/>Host-specific]
end
subgraph "Environment"
ENV1[AUDREY_DATA_DIR]
ENV2[AUDREY_LLM_PROVIDER]
ENV3[AUDREY_EMBEDDING_PROVIDER]
ENV4[AUDREY_API_KEY]
end
NPM --> ENV1
DOCKER --> ENV1
COMPOSE --> ENV1
HOST --> ENV2
HOST --> ENV3
Interface Options by Agent
| Agent | Integration |
|---|---|
| Claude Code | npx audrey install --host claude-code + hook-config |
| Claude Desktop | MCP config via npx audrey mcp-config generic |
| Codex | MCP config via npx audrey mcp-config codex |
| Cursor | MCP config |
| Windsurf | MCP config |
| VS Code | MCP config |
| JetBrains | MCP config |
| Ollama | MCP config |
| Custom | REST API or JS/Python SDK |
REST Sidecar Security
| Configuration | Bind Address | Auth Required |
|---|---|---|
| Default | 127.0.0.1:7437 | No (loopback) |
| Production | 0.0.0.0:7437 | AUDREY_API_KEY required |
| Unsafe override | Any host | AUDREY_ALLOW_NO_AUTH=1 (not recommended) |
Setting the AUDREY_HOST environment variable to a non-loopback address explicitly opts in to network exposure.
CLI Architecture
graph TD
CLI[audrey CLI] --> PARSE[Argument Parser]
PARSE --> KNOWN[Known Subcommands]
KNOWN --> INSTALL[install]
KNOWN --> UNINSTALL[uninstall]
KNOWN --> MCP[mcp-config]
KNOWN --> HOOK[hook-config]
KNOWN --> DOCTOR[doctor]
KNOWN --> DEMO[demo]
KNOWN --> GUARD[guard]
KNOWN --> DREAM[dream]
KNOWN --> REEMBED[reembed]
KNOWN --> PROMOTE[promote]
KNOWN --> IMPACT[impact]
KNOWN --> UNKNOWN[Unknown/No subcommand]
UNKNOWN --> TTY{Human TTY?}
TTY -->|Yes| HELP[Print help]
TTY -->|No| MCP_SERVER[Start MCP server]
INSTALL --> HOST[Host-specific config]
HOOK --> APPLY[Apply hooks to settings]
PROMOTE --> WRITES[Write to project files]
Key CLI Commands
| Command | Purpose |
|---|---|
| audrey doctor | Diagnose configuration issues |
| audrey status | Show runtime health |
| audrey demo | Run interactive demonstration |
| audrey guard | Check action against memory |
| audrey install | Register Audrey with host |
| audrey mcp-config | Generate MCP server configuration |
| audrey hook-config | Generate agent hook configuration |
| audrey dream | Trigger consolidation and decay |
| audrey reembed | Re-embed all memories |
| audrey promote | Write memories to project rules |
| audrey impact | Show memory effectiveness report |
Configuration Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| AUDREY_DATA_DIR | System temp | Memory storage directory |
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address |
| AUDREY_PORT | 7437 | REST sidecar port |
| AUDREY_API_KEY | unset | Bearer token for non-loopback |
| AUDREY_LLM_PROVIDER | Configured | LLM for causal/validation |
| AUDREY_EMBEDDING_PROVIDER | Configured | Embedding generation |
| AUDREY_EMBEDDING_MODEL | Configured | Model name for embeddings |
| AUDREY_EMBEDDING_DIM | Configured | Vector dimensions |
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Capsule character budget |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_DEBUG | 0 | Enable MCP debug logs |
| AUDREY_PROFILE | 0 | Emit per-stage timings |
| AUDREY_PROMOTE_ROOTS | unset | Allowed write roots for promote |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget |
SDK Architecture
JavaScript SDK
Direct TypeScript/Node import from audrey package:
import Audrey from 'audrey';
const brain = new Audrey({
baseUrl: 'http://127.0.0.1:7437',
agent: 'support-agent'
});
await brain.encode('Deploy failed due to OOM', {
source: 'direct-observation'
});
const results = await brain.recall('deploy failure', { limit: 5 });
Python SDK (`audrey-memory`)
from audrey_memory import Audrey
brain = Audrey(
base_url="http://127.0.0.1:7437",
api_key="secret",
agent="support-agent",
)
memory_id = brain.encode(
"Stripe returns HTTP 429 above 100 req/s",
source="direct-observation",
tags=["stripe", "rate-limit"],
)
Async clients are available via AsyncAudrey, with asyncio support.
Release Readiness Gates
graph LR
GATE[release:gate] --> CI[CI Workflow]
GATE --> PY[python:release:check]
GATE --> BENCH[bench:guard:*]
GATE --> DOCTOR[audrey doctor]
GATE --> DEMO[audrey demo]
| Check | Purpose |
|---|---|
| npm run release:gate | Full release readiness checklist |
| npm run python:release:check | Python artifact verification |
| npm run bench:guard:card | Guard benchmark suite |
| npm run bench:guard:validate | Validation benchmarks |
| npx audrey doctor | Runtime diagnostics |
| npx audrey demo | Functional verification |
Data Flow: Encode to Recall
sequenceDiagram
participant Client
participant API as REST API<br/>/v1/encode
participant Audrey as Audrey Core
participant Embed as Embedding Engine
participant DB as SQLite + vec
Client->>API: POST /v1/encode<br/>content, source, tags
API->>Audrey: encode(content, options)
Audrey->>Embed: generateEmbedding(content)
Embed-->>Audrey: vector[1536]
Audrey->>DB: INSERT memory + vector
DB-->>Audrey: memory_id
Audrey-->>API: { id, confidence, ... }
API-->>Client: { id, ... }
sequenceDiagram
participant Client
participant API as REST API<br/>/v1/recall
participant Audrey as Audrey Core
participant Embed as Embedding Engine
participant DB as SQLite + vec
participant Causal as Causal Validator
Client->>API: POST /v1/recall<br/>query, limit, scope
API->>Audrey: recall(query, options)
Audrey->>Embed: generateEmbedding(query)
Embed-->>Audrey: query_vector
Audrey->>DB: hybrid search<br/>vector_similarity + FTS
DB-->>Audrey: [entries]
Audrey->>Causal: score(entries)
Causal-->>Audrey: [scored_entries]
Audrey-->>API: { results, ... }
API-->>Client: { results, ... }
Key Architectural Decisions
| Decision | Rationale |
|---|---|
| Local-only storage | Eliminates dependency on external services; ensures data isolation |
| SQLite + sqlite-vec | Proven reliability, no separate vector DB required |
| WAL mode without advisory lock | Performance for single-process; isolation required for multi-agent |
| Separate AUDREY_DATA_DIR per tenant | Hard isolation boundary; prevents cross-tenant contamination |
| REST sidecar defaulting to loopback | Security by default; non-loopback requires explicit opt-in |
| Embedding warmup at boot | Eliminates cold-start penalty (~18.7x improvement) |
| Closed-loop validation | Closed feedback loop lifts autopilot ALIVE dimension |
Source: https://github.com/Evilander/Audrey / Human Manual
Memory Model
Related topics: System Architecture, Core Memory Operations, Preflight and Reflexes
Audrey's Memory Model is a cognitive-inspired system that provides AI agents with persistent, evolving memory capabilities. Unlike simple vector databases, it implements a multi-layered memory architecture that mirrors human memory structures—episodic, semantic, and procedural—while incorporating affect, salience, and decay mechanisms to ensure memories remain relevant and actionable.
Architecture Overview
The Memory Model consists of several interconnected subsystems that work together to store, retrieve, consolidate, and forget information over time.
graph TD
A[User/Agent Input] --> B[Episodic Memory]
B --> C[Consolidation Process]
C --> D[Semantic Memory]
C --> E[Procedural Memory]
D --> F[Confidence Scoring]
E --> F
B --> G[Affect Module]
F --> G
G --> H[Salience Calculation]
H --> I[Recall Ranking]
I --> J[Preflight Check]
J --> K[Guard Decision]
L[Interference] -.->F
L -.->I
M[Decay Engine] -.->D
M -.->E
Sources: README.md
Memory Types
Audrey distinguishes between three primary memory types, each serving a distinct role in agent cognition.
Episodic Memory
Episodic memory stores specific observations, tool results, preferences, and session facts. These are the raw recordings of events and interactions that agents experience directly.
| Property | Description |
|---|---|
| memory_type | episode |
| source | direct-observation, told-by-user, retrieved |
| confidence | Initial high confidence that decays over time |
| retrieval_count | Number of times this memory was recalled |
Sources: src/capsule.ts
Semantic Memory
Semantic memory represents consolidated principles extracted from repeated evidence. These memories encode general knowledge and learned rules that persist beyond specific sessions.
| Property | Description |
|---|---|
| memory_type | semantic |
| confidence | Derived from supporting episode frequency |
| supporting_count | Number of episodes supporting this principle |
| challenge_count | Number of contradictory episodes |
Sources: src/causal.ts
Procedural Memory
Procedural memory stores remembered ways to act, avoid, retry, or verify. These encode action patterns and procedures that agents have learned through experience.
| Property | Description |
|---|---|
| memory_type | procedural |
| tags | procedure, retry, avoid, verify |
| confidence | Reinforced by successful outcomes |
Sources: src/capsule.ts
Confidence System
The confidence system is the foundational mechanism that determines memory reliability and recall priority. It incorporates multiple signals including recency, reinforcement, and affect.
Confidence Calculation
graph LR
A[Base Confidence] --> B[Recency Decay]
B --> C[Reinforcement Boost]
C --> D[Affect Adjustment]
D --> E[Interference Penalty]
E --> F[Final Confidence]
Sources: src/confidence.ts
Recency Decay
Memory confidence decreases over time through a half-life decay mechanism. Memories become less authoritative unless reinforced through retrieval or validation.
// From src/confidence.ts
recencyDecay(halfLifeDays: number, createdAt: Date): number
| Parameter | Type | Description |
|---|---|---|
| halfLifeDays | number | Days until confidence halves |
| createdAt | Date | Memory creation timestamp |
The decay function throws RangeError when halfLifeDays <= 0 to prevent NaN or Infinity results.
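As an illustration, a half-life decay can be written as 0.5 raised to the ratio of age to half-life. The Python sketch below assumes that exponential form (this manual only states that confidence halves every halfLifeDays). The explicit `now` parameter is an addition for testability, and Python's ValueError stands in for the RangeError mentioned above.

```python
from datetime import datetime, timedelta, timezone


def recency_decay(half_life_days, created_at, now):
    """Return a multiplier in (0, 1] that halves every half_life_days."""
    if half_life_days <= 0:
        # Guard against division by zero and negative exponents (NaN/Infinity).
        raise ValueError("half_life_days must be greater than 0")
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
one_half_life_ago = now - timedelta(days=30)
factor = recency_decay(30, one_half_life_ago, now)  # exactly one half-life has elapsed
```

Under this form a memory that is two half-lives old keeps a quarter of its base confidence unless it is reinforced.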
Sources: src/confidence.ts
Reinforcement Formula
Validation outcomes reinforce or diminish memory confidence through the feedback loop:
| Outcome | Effect |
|---|---|
helpful | Increases salience, bumps retrieval_count for semantic/procedural |
wrong | Decreases salience, bumps challenge_count for semantic |
used | Neutral signal with smaller salience delta |
The math reuses the existing reinforcement formula from confidence.ts.
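The outcome table can be read as a small state update. The sketch below is hypothetical: the `Memory` dataclass, the `apply_outcome` name, and the delta values are illustrative assumptions, since the real formula lives in confidence.ts and is not reproduced in this manual. Only the direction of each effect and the counters that change are taken from the table.

```python
from dataclasses import dataclass

# Hypothetical deltas: "used" moves salience less than "helpful" or "wrong".
SALIENCE_DELTA = {"helpful": 0.10, "wrong": -0.10, "used": 0.02}


@dataclass
class Memory:
    memory_type: str  # "episodic", "semantic", or "procedural"
    salience: float = 0.5
    retrieval_count: int = 0
    challenge_count: int = 0


def apply_outcome(memory, outcome):
    """Apply one validation outcome to a memory and return it."""
    if outcome not in SALIENCE_DELTA:
        raise ValueError("unknown outcome: " + repr(outcome))
    # Keep salience inside [0, 1] after applying the delta.
    memory.salience = min(1.0, max(0.0, memory.salience + SALIENCE_DELTA[outcome]))
    if outcome == "helpful" and memory.memory_type in ("semantic", "procedural"):
        memory.retrieval_count += 1
    elif outcome == "wrong" and memory.memory_type == "semantic":
        memory.challenge_count += 1
    return memory
```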
Sources: CHANGELOG.md
Consolidation System
Consolidation transforms episodic memories into semantic and procedural knowledge through periodic processing, often called "dream" mode.
Consolidation Workflow
graph TD
A[Nightly Dream Process] --> B[Identify Repeated Episodes]
B --> C[Extract Common Patterns]
C --> D[Generate Semantic Principles]
C --> E[Extract Procedures]
D --> F[Create New Semantic Memory]
E --> G[Create/Update Procedural Memory]
F --> H[Link Supporting Episodes]
G --> H
Sources: README.md
Consolidation Implementation
Consolidation runs through memory_dream and should be scheduled so that consolidation and decay stay current.
// From src/consolidate.ts - conceptual interface
async function consolidate(audrey: Audrey, options?: ConsolidateOptions): Promise<ConsolidationResult>
Consolidation performs its SELECTs inside the surrounding transaction, so concurrent writers cannot slip rows in or out between the read and the write.
Sources: CHANGELOG.md
Decay Engine
The decay engine implements forgetting curves that reduce memory authority over time, ensuring stale information doesn't dominate recall.
Decay Mechanism
graph LR
A[Time Passes] --> B{Still Being Used?}
B -->|Yes| C[Decay Paused]
B -->|No| D[Gradual Decay]
D --> E[Confidence Decreases]
E --> F[Memory Becomes Less Authoritative]
Sources: src/decay.ts
Decay Parameters
| Parameter | Default | Purpose |
|---|---|---|
| halfLifeDays | Configurable | Base decay rate |
| minConfidence | 0.1 | Floor value |
| decayEnabled | true | Global on/off |
Decay applies to semantic and procedural memories differently, with semantic memories decaying faster unless reinforced.
Sources: src/decay.ts
Affect and Salience
Affect (emotional weight and importance) influences salience, determining which memories demand attention and which fade into background knowledge.
Affect Module
graph TD
A[Memory Event] --> B[Detect Emotional Signals]
B --> C[Calculate Valence]
B --> D[Calculate Arousal]
C --> E[Determine Mood State]
D --> E
E --> F[Affect Boost/Penalty]
F --> G[Effective Salience]
Sources: src/affect.ts
Salience Calculation
Effective salience is clamped to the range [0, 1] to prevent unbounded values from extreme arousal boosts. The formula considers:
- Memory type (episodic, semantic, procedural)
- Confidence level
- Recency
- Emotional valence and arousal
// From src/affect.ts
effectiveSalience(baseSalience: number, arousalBoost: number): number
The timeDeltaDays function no longer propagates NaN from invalid created_at timestamps.
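To make the clamping concrete, here is a minimal Python sketch. Combining the base salience and the arousal boost by addition is an assumption, because the manual gives only the signature and the [0, 1] clamp; the real effectiveSalience in src/affect.ts may combine them differently.

```python
def effective_salience(base_salience, arousal_boost):
    """Combine base salience with an arousal boost and clamp to [0, 1]."""
    raw = base_salience + arousal_boost  # additive combination is an assumption
    return min(1.0, max(0.0, raw))


boosted = effective_salience(0.8, 0.5)    # would be 1.3 unclamped
damped = effective_salience(0.2, -0.5)    # would be -0.3 unclamped
in_range = effective_salience(0.5, 0.25)  # already inside the range
```

The clamp is what keeps an extreme arousal boost from letting one memory dominate ranking without bound.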
Sources: src/affect.ts
Interference Handling
Interference prevents conflicting or competing memories from silently overwriting each other, maintaining an accurate picture of contradictory knowledge.
Interference Types
graph TD
A[New Memory] --> B{Conflicting Memory Exists?}
B -->|Yes| C[Track Contradiction]
B -->|No| D[Normal Storage]
C --> E[Disputed State]
E --> F[Monitor Both]
F --> G[Resolution Through Validation]
Sources: src/interference.ts
Memory States for Contradictions
| State | Description |
|---|---|
| active | Default stable state |
| disputed | Competing claims detected |
| context_dependent | Truth depends on context |
| superseded | Older knowledge replaced |
When memories have contradictory content, both are preserved with appropriate states rather than silently overwriting.
Sources: src/capsule.ts
Causal Inference
The causal module extracts cause-effect relationships from episodic memory patterns, enabling agents to understand why certain actions lead to certain outcomes.
Causal Analysis
// From src/causal.ts - conceptual interface
async function analyzeCausalLinks(episodes: Episode[]): Promise<CausalRelationship[]>
The causal module validates LLM response shapes before reading fields and rejects non-finite confidence values.
Sources: src/causal.ts
Causal Memory Properties
| Property | Description |
|---|---|
| cause_id | Memory that triggers outcome |
| effect_id | Resulting memory |
| confidence | Causal link strength |
| evidence_count | Episodes supporting this link |
Validation Feedback Loop
The closed-loop feedback system enables continuous improvement of memory accuracy through agent validation.
Validation Flow
graph TD
A[Memory Recall] --> B[Agent Uses Memory]
B --> C[Validation Request]
C --> D{Helpful?}
D -->|Yes| E[Reinforce: helpful]
D -->|No| F{Wrong?}
F -->|Yes| G[Diminish: wrong]
F -->|No| H[Mark: used]
E --> I[Update Salience & Stats]
G --> I
H --> I
Sources: CHANGELOG.md
Validation API
| Endpoint | Method | Description |
|---|---|---|
| /v1/validate | POST | Canonical validation endpoint |
| /v1/mark-used | POST | Legacy alias (defaults to outcome=used) |
The memory_validate MCP tool accepts outcomes: helpful, wrong, or used.
Sources: CHANGELOG.md
Recall and Retrieval
Memory recall uses hybrid retrieval combining vector similarity and full-text search to balance precision and recall.
Retrieval Modes
| Mode | Description |
|---|---|
| hybrid | Vector similarity + FTS (default) |
| vector | FTS-bypass fast path |
The hybrid mode has been the default since v0.22.0. The hybrid_strict mode, which was a silent alias with no behavioral difference, has been removed.
Sources: CHANGELOG.md
Recall Factors
When ranking results, Audrey considers:
- Semantic similarity - Vector distance from query
- Recency - Time since creation or last retrieval
- Confidence - Current confidence score
- Salience - Effective importance (affect-adjusted)
- Agent relevance - Scope and ownership
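One way to picture how these factors combine is a weighted sum over normalized signals. The sketch below is illustrative only: the linear form, the weight values, and the `rank_score` and `rank` names are assumptions, and Audrey's actual ranking may differ. Each signal is assumed to be pre-normalized to the range [0, 1].

```python
# Hypothetical weights for the five recall factors; they sum to 1.0.
WEIGHTS = {
    "similarity": 0.40,
    "recency": 0.15,
    "confidence": 0.20,
    "salience": 0.15,
    "agent_relevance": 0.10,
}


def rank_score(signals):
    """Weighted sum of the signals; a missing signal counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


def rank(candidates, limit=5):
    """Sort candidates by descending score and keep the top `limit`."""
    ordered = sorted(candidates, key=lambda c: rank_score(c["signals"]), reverse=True)
    return ordered[:limit]


fresh = {"id": "fresh", "signals": {"similarity": 0.9, "recency": 0.9, "confidence": 0.8}}
stale = {"id": "stale", "signals": {"similarity": 0.9, "recency": 0.1, "confidence": 0.3}}
top = rank([stale, fresh], limit=1)
```

With equal similarity, the fresher and more confident memory ranks first, which is the behavior the factor list implies.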
Tool-Trace Learning
Audrey learns from tool execution traces, converting tool results into memory events that inform future actions.
Tool-Trace Memory Cycle
graph TD
A[Tool Execution] --> B[Capture Tool Trace]
B --> C[Extract Results & Errors]
C --> D{Successful?}
D -->|Yes| E[Encode Success Pattern]
D -->|No| F[Encode Failure Pattern]
E --> G[Episodic Memory]
F --> G
G --> H[Consolidation]
H --> I[Procedural Memory]
The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes.
Sources: README.md
Memory Capsule
The Memory Capsule provides a turn-sized memory packet containing categorized sections relevant to the current context.
Capsule Sections
| Section | Content |
|---|---|
| must_follow | Trusted rules and critical constraints |
| risks | Identified dangers and warnings |
| procedures | Known action procedures |
| user_preferences | Stated and inferred preferences |
| uncertain_or_disputed | Contested or low-confidence knowledge |
| recent_changes | Freshly updated memories |
| project_facts | Default for semantic/episodic |
Sources: src/capsule.ts
Capsule Generation
Capsule sections are determined by memory type, tags, source trust level, state, confidence, and recency:
// From src/capsule.ts
determineSections(
entry: MemoryEntry,
result: RecallResult,
tags: string[],
recentWindowMs: number
): Array<keyof MemoryCapsule['sections']>
Trusted sources include direct-observation and told-by-user; these can populate must_follow sections.
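The routing rules described in this section can be sketched in Python as follows. This is a simplified illustration, not a port of src/capsule.ts: the dictionary shape of `entry`, the tag names `must-follow`, `risk`, `procedure`, and `preference`, the 0.4 low-confidence threshold, and the choice to use project_facts only as a fallback are all assumptions.

```python
TRUSTED_SOURCES = {"direct-observation", "told-by-user"}
LOW_CONFIDENCE = 0.4  # hypothetical threshold


def determine_sections(entry, now_ms, recent_window_ms):
    """Return the capsule sections a memory entry should appear in."""
    tags = set(entry.get("tags", []))
    trusted = entry["source"] in TRUSTED_SOURCES
    sections = []
    if trusted and "must-follow" in tags:
        sections.append("must_follow")
    if "risk" in tags:
        sections.append("risks")
    if entry["memory_type"] == "procedural" or "procedure" in tags:
        sections.append("procedures")
    if "preference" in tags:
        sections.append("user_preferences")
    if entry["confidence"] < LOW_CONFIDENCE or entry["state"] == "disputed" or not trusted:
        sections.append("uncertain_or_disputed")
    if now_ms - entry["created_at_ms"] <= recent_window_ms:
        sections.append("recent_changes")
    # Assumption: project_facts applies only when nothing else matched.
    if not sections and entry["memory_type"] in ("semantic", "episodic"):
        sections.append("project_facts")
    return sections
```

Note how the same must-follow tag lands in different sections depending on source trust, which is the point of the trusted-source rule.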
Sources: src/capsule.ts
Guard Integration
The Memory Guard uses the Memory Model to enforce pre-action checks, returning allow, warn, or block decisions with evidence.
Guard Decision Flow
graph TD
A[Action Request] --> B[Preflight Check]
B --> C[Recall Relevant Memory]
C --> D[Apply Reflexes]
D --> E{Blocking Reflex?}
E -->|Yes| F[BLOCK]
E -->|No| G{Warning Reflex?}
G -->|Yes| H[WARN]
G -->|No| I[ALLOW]
The Guard decision reuses existing preflight and reflex machinery without performing two independent recall passes.
Sources: CHANGELOG.md
Summary
Audrey's Memory Model provides a comprehensive cognitive architecture for AI agents:
- Multi-type storage with episodic, semantic, and procedural memories
- Dynamic confidence that evolves through use and validation
- Consolidation that transforms experience into knowledge
- Decay that prevents stale information from dominating
- Affect that weights memories by emotional importance
- Interference tracking that maintains truth in the face of contradictions
- Causal inference that extracts cause-effect relationships
- Closed-loop validation that continuously improves accuracy
This architecture ensures agents remember what matters, forget what doesn't, and maintain coherent, actionable knowledge across sessions.
Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)
Audrey Guard
Related topics: Core Memory Operations, Preflight and Reflexes
Overview
Audrey Guard is the headline memory loop in the Audrey system—a memory-before-action enforcement mechanism that checks AI agents' intended operations against accumulated memory before execution. It serves as a firewall layer that can allow, warn, or block tool invocations based on historical evidence, prior failures, project rules, and risk patterns.
The Guard operates by retrieving relevant memories through semantic recall, evaluating them against the proposed action, and returning a structured decision with supporting evidence. This enables agents to avoid repeating past mistakes, respect project-specific rules, and make informed decisions grounded in durable context.
Sources: README.md
Purpose and Scope
Audrey Guard addresses a fundamental problem: agents forget the exact mistakes they made yesterday. They repeat broken commands, lose project-specific rules, miss contradictions, and treat every new session like a cold start.
Guard's scope encompasses:
| Concern | Description |
|---|---|
| Failure Prevention | Block or warn on repeated failures identified through memory_recall |
| Risk Awareness | Surface prior failures, risks, and warnings as preflight evidence |
| Rule Enforcement | Check must-follow rules and procedures before action |
| Evidence Generation | Return structured decisions with provenance metadata |
| Closed-Loop Validation | Validate whether the memory helped after action execution |
Sources: README.md
Architecture
High-Level Components
graph TD
A[Agent Tool Call] --> B[Audrey Guard]
B --> C[memory_preflight]
C --> D[memory_recall]
D --> E[SQLite Store<br/>Episodic + Semantic + Procedural]
C --> F[Rule Evaluation]
F --> G[Reflex Pattern Matching]
C --> H[Decision Engine]
H --> I[block<br/>warn<br/>allow]
I --> J[Evidence Capsule]
J --> K[Agent Action Execution]
K --> L[memory_validate]
L --> M[Outcome: helpful<br/>used<br/>wrong]
M --> E
Guard Decision Flow
The Guard evaluates incoming tool actions through a multi-stage pipeline:
- Action Canonicalization - Normalize the tool name and action string
- Semantic Recall - Query memory store for relevant past experiences
- Risk Assessment - Evaluate prior failures, warnings, and risks
- Rule Matching - Check against must-follow rules and procedures
- Decision Synthesis - Combine signals into block/warn/allow verdict
- Evidence Packaging - Return decision with provenance and references
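The decision synthesis step of the pipeline above can be sketched as a precedence rule: any blocking reflex wins, then any warning, otherwise allow. The Python below is a hedged illustration; the reflex dictionary shape (`response_type`, `memory_id`) and the returned structure are assumptions rather than the types used in src/reflexes.ts.

```python
def synthesize_decision(reflexes):
    """Collapse a list of reflexes into one verdict plus supporting evidence."""
    types = [r["response_type"] for r in reflexes]
    if "block" in types:
        decision = "block"
    elif "warn" in types:
        decision = "warn"
    else:
        decision = "allow"  # guide-only or empty reflex lists do not restrict
    # Evidence lists the memories behind the verdict (none for "allow").
    evidence = [r["memory_id"] for r in reflexes if r["response_type"] == decision]
    return {"decision": decision, "evidence": evidence}


verdict = synthesize_decision([
    {"response_type": "guide", "memory_id": "m1"},
    {"response_type": "warn", "memory_id": "m2"},
    {"response_type": "block", "memory_id": "m3"},
])
```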
Sources: src/reflexes.ts
CLI Interface
Command Syntax
audrey guard --tool <tool_name> "<action_command>" [options]
Core Options
| Option | Description | Default |
|---|---|---|
| --tool <name> | The tool category (e.g., Bash, Write, Edit) | Required |
| <action> | The specific action string to evaluate | Required |
| --cwd <path> | Working directory for context | Current directory |
| --session-id <id> | Session identifier for event correlation | Auto-generated |
| --hook | Run in hook mode (for agent integration) | false |
| --fail-on-warn | Treat warnings as errors (exit code non-zero) | false |
| --strict | Enable strict preflight evaluation | false |
| --json | Output results as JSON | false |
| --explain | Include detailed explanation in output | false |
| --include-capsule | Embed full memory capsule in response | false |
Sources: mcp-server/index.ts
Example Usage
# Block a repeated failed deploy
audrey guard --tool Bash "npm run deploy"
# Warn on risky file operations
audrey guard --tool Write --strict "database.sql"
# Hook mode for Claude Code integration
audrey guard --tool Bash --hook "rm -rf node_modules"
SDK Integration
Sync Client
import Audrey from 'audrey';
const brain = new Audrey({
  baseUrl: 'http://127.0.0.1:7437',
  agent: 'support-agent',
});
const decision = await brain.beforeAction({
  tool: 'Bash',
  action: 'npm run deploy',
});
console.log(decision.decision); // 'block' | 'warn' | 'allow'
console.log(decision.evidence); // Array of MemoryEvidence
brain.close();
Preflight Options
| Option | Type | Description |
|---|---|---|
| tool | string | Tool category being evaluated |
| action | string | Action string to preflight |
| sessionId | string | Correlation ID for event tracking |
| mode | `'standard' \| 'strict'` | Evaluation strictness |
| includeCapsule | boolean | Include full memory capsule |
| failureWindowHours | number | Hours to look back for failures |
| recentChangeWindowHours | number | Hours for recent-change rules |
Sources: src/routes.ts
Decision Outcomes
Verdict Types
| Decision | Description | Agent Behavior |
|---|---|---|
| block | Action is prohibited based on memory | Must not execute |
| warn | Action has risk indicators | Should pause and confirm |
| allow | No memory conflicts detected | May proceed |
Decision Display Mapping
function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
if (result.decision === 'block') return 'block';
if (result.decision === 'caution') return 'warn';
return 'allow';
}
Sources: mcp-server/index.ts
Memory Preflight
The memory_preflight function checks prior failures, risks, rules, and relevant procedures before an action executes. It builds a structured preflight report containing:
Capsule Sections
| Section | Content Source | Trigger Condition |
|---|---|---|
| recent_changes | Memories within recent-change window | Created or reinforced recently |
| must_follow | Must-follow rules | Tagged as must-follow |
| procedures | Procedural memories + procedures | Matching query or tagged |
| user_preferences | User-stated preferences | User-told or tagged |
| risks | Risk-tagged memories + recent failures | Tagged risk or 7-day failures |
| uncertain_or_disputed | Low-confidence or disputed memories | Low confidence or disputed state |
Sources: src/capsule.ts
Reflex System
Memory reflexes convert remembered evidence into trigger-response guidance that agents can follow.
Reflex Response Types
| Type | Description |
|---|---|
| block | Strict prohibition based on evidence |
| warn | Caution signal with context |
| guide | Recommended action or approach |
Reflex Report Generation
function summarizeReflexes(decision: PreflightDecision, reflexes: MemoryReflex[]): string {
const blocks = reflexes.filter(r => r.response_type === 'block').length;
const warnings = reflexes.filter(r => r.response_type === 'warn').length;
const guides = reflexes.filter(r => r.response_type === 'guide').length;
// Returns human-readable summary
}
Sources: src/reflexes.ts
Validation Loop
After action execution, agents validate whether the memory helped:
Outcome Types
| Outcome | Meaning | Effect |
|---|---|---|
| helpful | Memory was correct and beneficial | Increases salience |
| used | Memory was referenced | Updates usage metrics |
| wrong | Memory was incorrect | Triggers decay or dispute |
Validation Endpoint
POST /v1/event
{
"outcome": "helpful",
"receipt_id": "receipt-from-preflight",
"evidence_feedback": {
"evidence-id-1": "used",
"evidence-id-2": "helpful"
}
}
Sources: src/routes.ts
Failure Decay
Starting from version 1.0.1, Audrey Guard implements failure decay to prevent stale blocks:
| Configuration | Default | Behavior |
|---|---|---|
| failureDecayDays | 7 | Same-action failures older than window treated as stale |
To restore pre-1.0.1 blocking behavior (permanent blocks):
const controller = new MemoryController({
failureDecayDays: 0,
});
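The staleness rule can be illustrated with a short Python sketch. The function name `is_stale_failure` and the explicit `now` parameter are hypothetical. The 7-day default and the meaning of 0 (decay disabled, so failures block permanently) follow the description above.

```python
from datetime import datetime, timedelta, timezone


def is_stale_failure(failed_at, now, failure_decay_days=7):
    """Return True when a same-action failure is too old to justify a block."""
    if failure_decay_days <= 0:
        # 0 disables decay: failures never go stale (pre-1.0.1 behavior).
        return False
    return now - failed_at > timedelta(days=failure_decay_days)


now = datetime(2025, 6, 15, tzinfo=timezone.utc)
old_failure = now - timedelta(days=8)
recent_failure = now - timedelta(days=3)
```

With the default window, `old_failure` no longer blocks while `recent_failure` still does.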
Sources: CHANGELOG.md
Security Considerations
HTTP API Security
- Default bind address changed from 0.0.0.0 to 127.0.0.1
- Refuses to start on non-loopback without AUDREY_API_KEY or AUDREY_ALLOW_NO_AUTH=1
- API key comparison uses crypto.timingSafeEqual to prevent timing attacks
- /v1/recall and /v1/capsule no longer body-spread caller options
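For readers working from the Python side, the constant-time comparison idea looks like this. The sketch uses the standard library's `hmac.compare_digest`, which plays the same role as Node's crypto.timingSafeEqual. The `check_bearer` helper and the `Bearer ` header format are assumptions for illustration, not Audrey's server code.

```python
import hmac


def check_bearer(header_value, expected_key):
    """Validate an Authorization header value against the configured key."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):]
    # compare_digest takes time independent of where the inputs first differ.
    return hmac.compare_digest(presented.encode("utf-8"), expected_key.encode("utf-8"))
```

A plain `==` comparison can return early at the first mismatching character, which is the timing signal this approach avoids.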
Sources: CHANGELOG.md
Hook Configuration Safety
The audrey promote --yes command refuses to write .claude/rules/*.md outside process.cwd() unless the target path is in AUDREY_PROMOTE_ROOTS. This prevents prompt-injection attacks via malicious MCP callers.
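The containment check behind this rule can be sketched with `pathlib`. This is an illustration under assumptions: `is_allowed_target` is a hypothetical name, and the allowed roots are passed as an already-parsed list because the manual does not state how AUDREY_PROMOTE_ROOTS is delimited.

```python
from pathlib import Path


def is_allowed_target(target, cwd, promote_roots):
    """Return True if target resolves inside cwd or one of the allowed roots."""
    # Resolving first collapses "..", so traversal cannot escape the root.
    resolved = Path(cwd, target).resolve()
    allowed = [Path(cwd).resolve()] + [Path(root).resolve() for root in promote_roots]
    return any(resolved == root or root in resolved.parents for root in allowed)


inside = is_allowed_target(".claude/rules/deploy.md", "/work/proj", [])
escaped = is_allowed_target("../other/.claude/rules/deploy.md", "/work/proj", [])
opted_in = is_allowed_target("../other/.claude/rules/deploy.md", "/work/proj", ["/work/other"])
```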
Tool Trace Handling
Tool traces are recorded through PostToolUse hooks with redaction applied:
- Redaction - Sensitive fields (API keys, tokens, credentials) are masked
- Action Key Generation - Deterministic ID for trace correlation
- Event Recording - Tool inputs/outputs stored with session context
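The redaction and action-key steps can be sketched as below. Everything here is illustrative: the key pattern, the `[REDACTED]` placeholder, the whitespace normalization, and the 16-character truncated SHA-256 are assumptions, not the behavior of mcp-server/index.ts.

```python
import hashlib
import re

# Hypothetical pattern for field names that usually hold secrets.
SENSITIVE_KEY = re.compile(r"(api[_-]?key|token|secret|password|credential)", re.IGNORECASE)


def redact(value):
    """Recursively mask values stored under sensitive-looking keys."""
    if isinstance(value, dict):
        masked = {}
        for key, item in value.items():
            masked[key] = "[REDACTED]" if SENSITIVE_KEY.search(str(key)) else redact(item)
        return masked
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def action_key(tool, action):
    """Deterministic ID: the same tool and normalized action give the same key."""
    normalized = " ".join(action.split())
    digest = hashlib.sha256((tool + "\n" + normalized).encode("utf-8")).hexdigest()
    return digest[:16]


trace = redact({"tool": "Bash", "env": {"API_KEY": "sk-123", "path": "/tmp"}})
```

Because the key is derived only from the tool and the normalized action, a later guard check for the same command can be correlated with earlier traces.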
Sources: mcp-server/index.ts
Demo Scenario: Repeated Failure
The repeated-failure demo demonstrates Guard's blocking behavior:
npx audrey demo --scenario repeated-failure
This no-key, no-network demo:
- Creates a temporary memory store
- Records a failed deploy with the fix
- Teaches Audrey the failure pattern
- Shows Guard blocking the repeated attempt with evidence
Sources: README.md
See Also
- Memory Recall - Semantic retrieval system
- Memory Reflexes - Trigger-response guidance
- Impact Reporting - Memory effectiveness metrics
- Audrey Doctor - Runtime health verification
Source: https://github.com/Evilander/Audrey / Human Manual
Core Memory Operations
Related topics: Memory Model, Audrey Guard, Preflight and Reflexes, Data Storage
This page documents the fundamental memory operations in Audrey: encoding, recall, hybrid retrieval, capsule generation, and impact tracking. Together, these operations form the core pipeline that enables agents to store, retrieve, and learn from persistent memory across sessions.
Overview
Audrey's Core Memory Operations handle the complete lifecycle of memory within the system. The operations are designed around a local-first, SQLite-backed architecture that provides semantic search capabilities without requiring external vector databases or hosted services.
graph LR
A[Encode] -->|store| B[(SQLite)]
B -->|recall| C[Recall]
C -->|hybrid| D[Hybrid Search]
D -->|compose| E[Capsule]
E -->|track| F[Impact]
F -->|reinforce| A
The primary design goals are:
- Durability: All memories persist in local SQLite storage
- Semantic Search: Vector embeddings enable similarity-based recall
- Hybrid Retrieval: Combines vector and keyword search for accuracy
- Feedback Loop: Impact tracking enables continuous memory reinforcement
Memory Types
Audrey distinguishes between three primary memory types that influence retrieval behavior and storage strategy.
| Memory Type | Description | Typical Content |
|---|---|---|
| episodic | Specific observations and session facts | Tool results, error messages, user feedback |
| semantic | Consolidated principles extracted from evidence | Learned rules, best practices, project conventions |
| procedural | Remembered ways to act, avoid, or verify | Deployment procedures, recovery steps, verification commands |
Each memory type has distinct promotion criteria. Procedural memories can be promoted to rules with lower evidence thresholds, while semantic memories require higher confidence and evidence counts before promotion.
Sources: src/recall.ts:15-17
Memory Encoding
The encoding operation transforms raw observations into persistent memory entries. When encoding, Audrey generates embeddings, assigns salience scores, and stores metadata that enables future retrieval.
Encode Process Flow
graph TD
A[Input: Raw Text] --> B[Generate Embedding]
B --> C[Calculate Salience]
C --> D[Assign Memory Type]
D --> E[Tag Analysis]
E --> F[Store in SQLite]
F --> G[Update Vector Index]
Encode Options
The encode operation accepts several configuration parameters:
| Parameter | Type | Default | Purpose |
|---|---|---|---|
| source | string | 'direct-observation' | Origin of the memory |
| memory_type | string | 'episodic' | Classification of memory content |
| tags | string[] | [] | Categorical labels for filtering |
| wait_for_consolidation | boolean | false | Opt-in read-after-write semantics |
Sources: src/encode.ts
Source Types
Memory sources indicate provenance and affect how memories are treated during recall:
| Source | Trust Level | Description |
|---|---|---|
| direct-observation | High | Agent's own observations from tool execution |
| told-by-user | High | Explicit user-provided information |
| inferred | Medium | AI-inferred conclusions |
| external | Low | Information from external systems |
Trusted sources (direct-observation, told-by-user) can populate must-follow sections in capsules, while untrusted sources are flagged as uncertain or disputed.
Sources: src/capsule.ts:18-20
Memory Recall
Recall is the primary mechanism for retrieving relevant memories based on semantic similarity and keyword matching. The recall operation searches across all memory types using configurable retrieval strategies.
Retrieval Modes
Audrey supports three retrieval modes that determine how search results are computed:
| Mode | Description | Use Case |
|---|---|---|
| hybrid | Combines vector similarity with FTS keyword matching (default) | Balanced accuracy for general queries |
| vector | Pure semantic similarity using embeddings | When keywords are ambiguous |
| keyword | Full-text search only, bypasses vector index | Fast, keyword-exact matching |
Sources: src/recall.ts:12-14
Recall Architecture
graph TD
A[Query Input] --> B{Mode Check}
B -->|hybrid| C[Vector Pass]
B -->|hybrid| D[Keyword Pass]
B -->|vector| E[Vector Pass Only]
B -->|keyword| F[Keyword Pass Only]
C --> G[Merge & Score]
D --> G
G --> H[Filter by Confidence]
H --> I[Apply Filters]
I --> J[Return Results]
Recall Options
The recall operation accepts a comprehensive set of filtering and result-shaping options:
| Parameter | Type | Default | Purpose |
|---|---|---|---|
| minConfidence | number | 0 | Minimum confidence threshold (0-1) |
| types | MemoryType[] | ['episodic', 'semantic', 'procedural'] | Memory types to search |
| limit | number | 10 | Maximum results to return |
| includeProvenance | boolean | false | Include source metadata |
| includeDormant | boolean | false | Include decayed/inactive memories |
| tags | string[] | undefined | Filter by tags |
| sources | string[] | undefined | Filter by source type |
| after | Date | undefined | Filter memories created after date |
| before | Date | undefined | Filter memories created before date |
| includePrivate | boolean | false | Include agent-private memories |
| retrieval | string | 'hybrid' | Retrieval mode selection |
| scope | `'agent' \| 'shared'` | 'agent' | Memory scope filter |
Sources: src/recall.ts:5-22
RecallFilters Structure
interface RecallFilters {
tags?: string[];
sources?: string[];
after?: Date;
before?: Date;
agent?: string; // Filtered by scope when scope === 'agent'
}
Filters are combined with AND logic—memories must match all specified filters to be included in results.
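The AND semantics can be sketched as follows. This is a minimal illustration rather than Audrey's implementation: the StoredMemorySketch shape and the matchesFilters helper are hypothetical, and the sketch assumes that a tags or sources filter is satisfied when the memory carries at least one of the listed values, which the source does not specify.

```typescript
// Hypothetical memory shape for illustration; Audrey's real record type may differ.
interface StoredMemorySketch {
  id: string;
  tags: string[];
  source: string;
  createdAt: Date;
  agent: string;
}

interface RecallFilters {
  tags?: string[];
  sources?: string[];
  after?: Date;
  before?: Date;
  agent?: string;
}

// Returns true only when the memory passes every filter that is set (AND logic).
// Assumption: within `tags` and `sources`, matching any one listed value is enough.
function matchesFilters(memory: StoredMemorySketch, filters: RecallFilters): boolean {
  if (filters.tags?.length && !filters.tags.some((t) => memory.tags.includes(t))) return false;
  if (filters.sources?.length && !filters.sources.includes(memory.source)) return false;
  if (filters.after && memory.createdAt <= filters.after) return false;
  if (filters.before && memory.createdAt >= filters.before) return false;
  if (filters.agent !== undefined && memory.agent !== filters.agent) return false;
  return true;
}

const sampleMemory: StoredMemorySketch = {
  id: 'm1',
  tags: ['deploy', 'risk'],
  source: 'told-by-user',
  createdAt: new Date('2024-05-10T00:00:00Z'),
  agent: 'claude-code',
};
```

An unset filter never excludes a memory, so an empty filter object matches everything.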
Agent and Scope Filtering
The scope parameter determines which memories are accessible:
- agent (default): Only memories associated with the requesting agent
- shared: Memories marked as shared across agents
When scope is 'agent', the agent filter is automatically set to the requesting agent's identity. This ensures memory isolation between different agents.
Hybrid Search
Hybrid search combines vector similarity and full-text search to achieve more accurate recall results than either method alone.
Hybrid Recall Pipeline
graph LR
A[Query] --> B[Embedding Model]
A --> C[FTS Index]
B --> D[Vector Scores]
C --> E[Keyword Scores]
D --> F[Score Normalization]
E --> F
F --> G[Weighted Merge]
G --> H[Ranked Results]
Score Merging Strategy
The hybrid approach normalizes scores from both vector and keyword passes before merging. This ensures that memories matched by keywords are not overshadowed by high vector similarity scores, and vice versa.
Sources: src/hybrid-recall.ts
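The normalize-then-merge idea can be sketched as below. The source does not state which normalization or which weights src/hybrid-recall.ts uses, so min-max normalization and an equal 0.5/0.5 weighting are assumptions, and normalizeScores and mergeScores are hypothetical names. The sketch also assumes both passes report higher-is-better scores; SQLite FTS5's bm25-based rank is lower-is-better, so it would need to be negated before being fed in.

```typescript
type ScoreMap = Map<string, number>;

interface RankedMemory {
  id: string;
  score: number;
}

// Min-max normalization into [0, 1]; when all scores are equal, every entry gets 1.
function normalizeScores(scores: ScoreMap): ScoreMap {
  const values = Array.from(scores.values());
  const out: ScoreMap = new Map();
  if (values.length === 0) return out;
  const min = Math.min(...values);
  const max = Math.max(...values);
  scores.forEach((score, id) => {
    out.set(id, max === min ? 1 : (score - min) / (max - min));
  });
  return out;
}

// Weighted merge; a memory missing from one pass contributes 0 for that pass.
function mergeScores(vector: ScoreMap, keyword: ScoreMap, vectorWeight = 0.5): RankedMemory[] {
  const v = normalizeScores(vector);
  const k = normalizeScores(keyword);
  const ids = new Set<string>([...Array.from(v.keys()), ...Array.from(k.keys())]);
  return Array.from(ids)
    .map((id) => ({
      id,
      score: vectorWeight * (v.get(id) ?? 0) + (1 - vectorWeight) * (k.get(id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

// Memory 'b' is mid-ranked by vectors but top-ranked by keywords, so it wins the merge.
const rankedMemories = mergeScores(
  new Map([['a', 0.9], ['b', 0.5], ['c', 0.1]]),
  new Map([['b', 12], ['c', 3]]),
);
```

Normalizing first matters because the two passes produce scores on unrelated scales; without it, the raw keyword scores in this example would dominate the cosine-style vector scores.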
Full-Text Search Integration
The FTS module provides keyword-based search capabilities:
interface FTSResult {
memory_id: string;
rank: number;
snippet?: string;
}
FTS uses SQLite's built-in FTS5 extension for fast keyword matching. The FTS index is updated synchronously during encoding to ensure keyword search reflects current memory state.
Sources: src/fts.ts
Memory Capsule
The capsule is a turn-sized memory packet that organizes relevant memories into actionable sections for agent consumption. It synthesizes recall results into a structured format optimized for quick agent review.
Capsule Sections
| Section | Purpose | Trigger Conditions |
|---|---|---|
| must_follow | High-priority directives | Trusted source + must-follow tags |
| uncertain_or_disputed | Flagged content requiring verification | Low confidence, disputed state, or untrusted source |
| risks | Known risks and hazards | Risk-related tags |
| procedures | Step-by-step instructions | Procedural memory type or procedure tags |
| user_preferences | User-specific preferences | Preference tags or told-by-user source |
| project_facts | Consolidated project knowledge | Semantic memories with no other section match |
| recent_changes | Recently updated information | Memories within recent time window |
Sources: src/capsule.ts:22-38
Section Determination Logic
graph TD
A[Memory Entry] --> B{Source Trusted?}
B -->|Yes| C{Has Must-Follow Tags?}
B -->|No| D[Uncertain/Disputed]
C -->|Yes| E[Must-Follow Section]
C -->|No| F{Has Risk Tags?}
D --> F
F -->|Yes| G[Risks Section]
F -->|No| H{Procedural Type?}
H -->|Yes| I[Procedures Section]
H -->|No| J{Has Preference Tags?}
J -->|Yes| K[User Preferences]
J -->|No| L{Uncertain State?}
L -->|Yes| M[Uncertain/Disputed]
L -->|No| N[Project Facts]
Capsule Structure
interface MemoryCapsule {
generated_at: string;
sections: {
must_follow?: MemorySection;
uncertain_or_disputed?: MemorySection;
risks?: MemorySection;
procedures?: MemorySection;
user_preferences?: MemorySection;
project_facts?: MemorySection;
recent_changes?: MemorySection;
};
}
Tag-Based Section Assignment
Capsule generation uses predefined tag sets to categorize memories:
| Tag Set | Matching Tags |
|---|---|
| MUST_FOLLOW_TAGS | Critical directives that must be followed |
| RISK_TAGS | Risk-related keywords |
| PROCEDURE_TAGS | Procedure-related keywords |
| PREFERENCE_TAGS | User preference keywords |
Sources: src/capsule.ts:12-15
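A rough sketch of how tag sets and memory attributes could map to a single capsule section is shown below. It is not the logic in src/capsule.ts: the tag values inside each set, the CapsuleCandidate shape, the assignSection helper, and the precedence order are all assumptions, and the 0.55 confidence cut-off is borrowed from the threshold table in the Preflight and Reflexes section. The time-based recent_changes section and the told-by-user source rule are left out.

```typescript
type CapsuleSection =
  | 'must_follow'
  | 'uncertain_or_disputed'
  | 'risks'
  | 'procedures'
  | 'user_preferences'
  | 'project_facts';

// Hypothetical tag values; only the set names come from the document.
const MUST_FOLLOW_TAGS = new Set(['must-follow', 'policy']);
const RISK_TAGS = new Set(['risk', 'danger']);
const PROCEDURE_TAGS = new Set(['procedure', 'how-to']);
const PREFERENCE_TAGS = new Set(['preference']);

// Hypothetical input shape for illustration.
interface CapsuleCandidate {
  type: 'episodic' | 'semantic' | 'procedural';
  tags: string[];
  sourceTrusted: boolean;
  state: 'active' | 'disputed' | 'context_dependent';
  confidence: number;
}

const hasAnyTag = (tags: string[], set: Set<string>): boolean => tags.some((t) => set.has(t));

// Checks run from highest to lowest priority; the first match wins.
function assignSection(m: CapsuleCandidate): CapsuleSection {
  if (hasAnyTag(m.tags, MUST_FOLLOW_TAGS)) {
    // An untrusted source cannot issue binding directives, so it is flagged instead.
    return m.sourceTrusted ? 'must_follow' : 'uncertain_or_disputed';
  }
  if (hasAnyTag(m.tags, RISK_TAGS)) return 'risks';
  if (m.type === 'procedural' || hasAnyTag(m.tags, PROCEDURE_TAGS)) return 'procedures';
  if (hasAnyTag(m.tags, PREFERENCE_TAGS)) return 'user_preferences';
  if (m.state !== 'active' || m.confidence < 0.55) return 'uncertain_or_disputed';
  return 'project_facts';
}
```

Returning exactly one section per memory is a simplification; the real capsule builder may place a memory in more than one section.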
Impact Tracking
Impact tracking closes the feedback loop by recording whether recalled memories proved useful. This enables continuous reinforcement of valuable memories and decay of misleading ones.
Outcome Types
| Outcome | Description | Effect on Memory |
|---|---|---|
| helpful | Memory drove a correct action | Increases salience, bumps retrieval_count |
| wrong | Memory was misleading | Decreases salience, bumps challenge_count |
| used | Memory was referenced | Small positive salience delta |
Sources: src/impact.ts
Impact Report Structure
interface ImpactReport {
generatedAt: string;
windowDays: number;
totals: {
episodic: number;
semantic: number;
procedural: number;
};
validatedTotal: number;
validatedInWindow: number;
byType: {
episodic: { validated: number; recent: number };
semantic: { validated: number; recent: number; challenged: number };
procedural: { validated: number; recent: number };
};
outcomeBreakdownInWindow: {
helpful: number;
wrong: number;
used: number;
};
topUsed: MemoryStat[];
weakest: MemoryStat[];
recentActivity: MemoryStat[];
}
Impact Metrics
The impact system tracks several key metrics:
- usage_count: Number of times a memory was successfully used
- salience: Computed importance score based on reinforcement history
- validation events: Recorded outcomes linked to specific recall events
- challenge_count: Number of times a memory was marked as wrong
This data feeds into consolidation and decay processes, ensuring that frequently useful memories remain prominent while stale or misleading memories lose authority over time.
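The reinforcement step can be sketched as a pure update function. The source only gives the direction of each change, so the delta magnitudes below are invented for illustration, and MemoryCounters and applyOutcome are hypothetical names. Bumping usage_count on a used outcome follows the validation table in the Data Storage section.

```typescript
type ImpactOutcome = 'helpful' | 'wrong' | 'used';

// Hypothetical subset of per-memory counters named in the text.
interface MemoryCounters {
  salience: number;
  retrieval_count: number;
  challenge_count: number;
  usage_count: number;
}

// Assumed magnitudes; only the signs are grounded in the document.
const SALIENCE_DELTA: Record<ImpactOutcome, number> = {
  helpful: 0.1,
  wrong: -0.2,
  used: 0.02,
};

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Returns an updated copy; the input object is left untouched.
function applyOutcome(m: MemoryCounters, outcome: ImpactOutcome): MemoryCounters {
  const next = { ...m, salience: clamp01(m.salience + SALIENCE_DELTA[outcome]) };
  if (outcome === 'helpful') next.retrieval_count += 1;
  if (outcome === 'wrong') next.challenge_count += 1;
  if (outcome === 'used') next.usage_count += 1;
  return next;
}
```

Clamping keeps salience inside [0, 1] even after repeated wrong or helpful outcomes.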
Data Flow Summary
graph LR
subgraph Encode
A1[Text Input] --> A2[Embed]
A2 --> A3[Salience]
A3 --> A4[Store]
end
subgraph Recall
B1[Query] --> B2[Hybrid Search]
B2 --> B3[Score & Rank]
B3 --> B4[Filter]
B4 --> B5[Recall Results]
end
subgraph Capsule
C1[Recall Results] --> C2[Section Analysis]
C2 --> C3[Tag Matching]
C3 --> C4[Capsule Output]
end
subgraph Impact
D1[Agent Feedback] --> D2[Outcome Recording]
D2 --> D3[Reinforce/Decay]
D3 --> A3
end
A4 --> B5
B5 --> C1
C4 --> D1
Configuration Considerations
When deploying Audrey's core memory operations, consider these configuration points:
| Setting | Recommendation | Impact |
|---|---|---|
| AUDREY_EMBEDDING_PROVIDER | Pin explicitly | Determines embedding quality |
| AUDREY_LLM_PROVIDER | Pin explicitly | Affects consolidation quality |
| AUDREY_DATA_DIR | Separate per tenant/environment | Ensures isolation and backup simplicity |
| Retrieval mode | Use hybrid for most cases | Balances precision and recall |
| wait_for_consolidation | Enable for critical writes | Guarantees read-after-write consistency |
Related Operations
The core memory operations interact with several supporting systems:
- Guard: Uses preflight checks before tool execution
- Reflexes: Trigger-response patterns derived from memory
- Consolidation: Extracts semantic memories from episodic evidence
- Decay: Reduces authority of stale memories over time
- Promotion: Converts high-value memories to Claude rules
Sources: [src/recall.ts:15-17](https://github.com/Evilander/Audrey/blob/main/src/recall.ts)
Preflight and Reflexes
Related topics: Audrey Guard, Core Memory Operations
Overview
Preflight and Reflexes form Audrey's core decision-making loop for AI agents. Before any tool action executes, Audrey performs a preflight check that consults memory to determine whether the action should be allowed, warned about, or blocked entirely.
The system operates as Audrey's "memory firewall": a security and guidance layer that prevents agents from repeating mistakes, reinforces learned behaviors, and surfaces relevant context before sensitive operations. It turns episodic and semantic memories into guidance that agents can evaluate in real time.
Architecture
graph TD
A[Agent Action Request] --> B[Preflight Check]
B --> C{Memory Recall}
C --> D[Episodic Memory]
C --> E[Semantic Memory]
C --> F[Procedural Memory]
D --> G[Memory Reflexes]
E --> G
F --> G
G --> H{Decision?}
H -->|Match Found| I[Return Reflex Result]
H -->|No Match| J[Allow Action]
I --> K{block}
I --> L[warn]
I --> M[guide]
K --> N[Block with Evidence]
L --> O[Warn with Guidance]
M --> P[Proceed with Hints]
Memory Reflexes
Memory Reflexes are the atomic decision units within the Preflight system. Each reflex contains a trigger condition, a response type, and optional guidance content.
Response Types
| Response Type | Decision | Description |
|---|---|---|
| block | block | Prevents the action entirely; returns blocking evidence |
| warn | caution | Allows action but presents warning with recommendations |
| guide | allow | Provides informational guidance without blocking |
Sources: src/reflexes.ts:1-50
Reflex Structure
interface MemoryReflex {
response_type: 'block' | 'warn' | 'guide';
triggered_by: string; // Memory tag or rule identifier
message: string; // Human-readable explanation
recommended_action?: string; // Suggested alternative
memory_ids: string[]; // Source memories that triggered this reflex
confidence: number; // Reflex confidence score
}
Preflight Process
Decision Flow
The preflight process evaluates incoming actions against three memory types and returns a consolidated decision:
graph LR
A[Action + Context] --> B[Tag Extraction]
B --> C{Must-Follow Rules?}
C -->|Yes| D[BLOCK or UNCERTAIN]
C -->|No| E{Risk Tags?}
E -->|Yes| F[Add to WARN]
E -->|No| G{Procedures?}
G -->|Yes| H[Add GUIDANCE]
G -->|No| I{Preferences?}
I -->|Yes| J[Include in Capsule]
I -->|No| K[Default ALLOW]
Decision Types
| Decision | Meaning | Exit Code Behavior |
|---|---|---|
| allow | Action proceeds normally | Continue execution |
| caution | Action proceeds with warning | Log warning, continue |
| block | Action is prevented | Return error, halt |
Sources: mcp-server/index.ts:80-95
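Combining the response-type mapping with the decision table, the overall decision can be sketched as "the most severe matched reflex wins". That aggregation rule is an assumption read off the two tables, not a quote from the source, and decideFromReflexes is a hypothetical helper name.

```typescript
type ReflexResponseType = 'block' | 'warn' | 'guide';
type PreflightDecision = 'allow' | 'caution' | 'block';

// block outranks warn, which outranks guide; no reflexes at all means allow.
function decideFromReflexes(responseTypes: ReflexResponseType[]): PreflightDecision {
  if (responseTypes.includes('block')) return 'block';
  if (responseTypes.includes('warn')) return 'caution';
  return 'allow';
}
```

Under this rule a single block reflex halts the action regardless of how many guide or warn reflexes also matched.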
Memory Capsule Integration
Preflight builds a Memory Capsule—a structured context bundle that aggregates relevant memories by category. The capsule sections determine which memories appear in the agent's context window.
interface MemoryCapsule {
sections: {
must_follow: MemoryReflex[]; // Critical rules
recent_changes: MemoryReflex[]; // New learnings
procedures: MemoryReflex[]; // How-to guidance
user_preferences: MemoryReflex[]; // Stated preferences
risks: MemoryReflex[]; // Warnings and hazards
uncertain_or_disputed: MemoryReflex[]; // Low-confidence or contested
project_facts: MemoryReflex[]; // Relevant facts
};
triggered_by: string;
generated_at: string;
}
Sources: src/capsule.ts:1-50
Tag-Based Classification
Memory reflexes are classified using tag matching against predefined tag sets:
| Tag Set | Purpose | Associated Section |
|---|---|---|
| MUST_FOLLOW_TAGS | Critical rules that must be obeyed | must_follow |
| RISK_TAGS | Potential hazards or warnings | risks |
| PROCEDURE_TAGS | Step-by-step guidance | procedures |
| PREFERENCE_TAGS | User-stated preferences | user_preferences |
Sources: src/capsule.ts:50-100
Building Reflex Reports
Report Generation
The buildReflexReport function constructs a complete preflight report from an action:
export async function buildReflexReport(
audrey: Audrey,
action: string,
options: ReflexOptions = {},
): Promise<MemoryReflexReport>
Report Structure
interface MemoryReflexReport {
decision: 'allow' | 'caution' | 'block';
reflexes: MemoryReflex[];
capsule: MemoryCapsule;
summary: string; // Human-readable summary
triggered_at: string; // ISO timestamp
session_id?: string; // Optional session context
}
Sources: src/reflexes.ts:80-120
Summarization Logic
The summarizeReflexes function generates human-readable summaries:
function summarizeReflexes(
decision: PreflightDecision,
reflexes: MemoryReflex[],
): string {
const blocks = reflexes.filter(r => r.response_type === 'block').length;
const warnings = reflexes.filter(r => r.response_type === 'warn').length;
const guides = reflexes.filter(r => r.response_type === 'guide').length;
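// Returns format: "Stop: 2 blocking, 1 warning, 3 guidance matched."
// Or: "Slow down: ..." or "Proceed: ..."
// Reconstructed from the format comment above; assumes block -> "Stop",
// caution -> "Slow down", allow -> "Proceed". The wording in src/reflexes.ts may differ.
const lead = decision === 'block' ? 'Stop' : decision === 'caution' ? 'Slow down' : 'Proceed';
return `${lead}: ${blocks} blocking, ${warnings} warning, ${guides} guidance matched.`;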
}
Validation Layer
Before reflexes are applied, the validation layer ensures response integrity:
Response Validation
// From src/validate.ts
// Validates LLM response shape before reading fields
// - Rejects non-object/array conditions
// - Only counts new evidence toward supporting_count
// - Throws on malformed response shapes
Sources: src/validate.ts
Validation Behavior
| Check | Invalid Condition | Behavior |
|---|---|---|
| Response Shape | Non-object/array | Reject and throw |
| Evidence Count | Missing supporting_count | Skip from count |
| Confidence | Non-finite value | Reject in causal module |
Sources: CHANGELOG.md
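The shape checks in the table can be sketched as a guard that runs before any field is read. This is not the code in src/validate.ts: the field names content and confidence, the ValidatedPrinciple type, and the validateLlmResponse helper are hypothetical, and the sketch treats arrays as malformed, which is one reading of "non-object/array".

```typescript
// Hypothetical validated shape for illustration.
interface ValidatedPrinciple {
  content: string;
  confidence: number;
}

// Throws on malformed shapes instead of reading fields from an unknown value.
function validateLlmResponse(raw: unknown): ValidatedPrinciple {
  if (typeof raw !== 'object' || raw === null || Array.isArray(raw)) {
    throw new TypeError('LLM response must be a plain object');
  }
  const obj = raw as Record<string, unknown>;
  const content = obj['content'];
  const confidence = obj['confidence'];
  if (typeof content !== 'string') {
    throw new TypeError('LLM response is missing a string "content" field');
  }
  // Number.isFinite rejects NaN and +/-Infinity without coercing other types.
  if (typeof confidence !== 'number' || !Number.isFinite(confidence)) {
    throw new TypeError('LLM response "confidence" must be a finite number');
  }
  return { content, confidence };
}
```

Validating first means a truncated or hallucinated LLM payload fails loudly at the boundary rather than producing a memory with undefined fields.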
CLI Integration
Guard Command
The audrey guard command exposes preflight checks via terminal:
audrey guard --tool Bash "npm run deploy"
Guard Options
| Option | Description |
|---|---|
| --tool <name> | Tool name being invoked |
| --action <command> | Specific action/command |
| --cwd <path> | Working directory |
| --session-id <id> | Session identifier |
| --files <paths> | Files affected by action |
| --json | Output results as JSON |
| --strict | Fail on warnings |
| --include-capsule | Include full memory capsule |
| --explain | Show reasoning breakdown |
Sources: mcp-server/index.ts:40-70
Display Mapping
The CLI maps internal decisions to display messages:
function guardDisplayDecision(result: GuardCliResult): 'allow' | 'warn' | 'block' {
if (result.decision === 'block') return 'block';
if (result.decision === 'caution') return 'warn';
return 'allow';
}
Configuration Options
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| AUDREY_CONTEXT_BUDGET_CHARS | 4000 | Memory capsule character budget |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Enable export/import/forget routes |
| AUDREY_DEBUG | 0 | Print MCP info logs |
Runtime Options
interface ReflexOptions {
agent?: string; // Agent identifier
sessionId?: string; // Session context
includeCapsule?: boolean; // Include full capsule
includePreflight?: boolean; // Include preflight details
context?: Record<string, string>; // Additional context
mood?: MoodConfig; // Emotional context
}
Memory Types and Section Assignment
Assignment Logic
graph TD
A[Memory Entry] --> B{Source Trusted?}
B -->|Yes + Must-Follow Tags| C[must_follow section]
B -->|No + Must-Follow Tags| D[uncertain_or_disputed]
B -->|Risk Tags| E[risks section]
B -->|Procedural Type/Tags| F[procedures section]
B -->|Preference Tags| G[user_preferences section]
A --> H{State or Low Confidence?}
H -->|disputed/context_dependent/confidence<0.55| I[uncertain_or_disputed]
A --> J{Recent Window?}
J -->|Yes| K[recent_changes section]
J -->|No| L[Default: project_facts]
Threshold Values
| Condition | Threshold | Section Assignment |
|---|---|---|
| Confidence (disputed) | < 0.55 | uncertain_or_disputed |
| Recent Window | 7 days | recent_changes |
| Tool Failure | 7 days | risks |
Data Flow Example
sequenceDiagram
participant Agent
participant MCP as MCP Server
participant Audrey
participant Memory
participant Reflex
Agent->>MCP: tool_use(Bash, "rm -rf /")
MCP->>Audrey: buildPreflight(audrey, action)
Audrey->>Memory: recall(action, context)
Memory-->>Audrey: MemoryReflex[]
Audrey->>Reflex: classifyAndScore(reflexes)
Reflex-->>Audrey: MemoryReflexReport
Audrey-->>MCP: PreflightDecision
MCP-->>Agent: block/caution/allow response
Security Considerations
HTTP Endpoint Protection
The preflight system includes security hardening for HTTP access:
- REST endpoints default to loopback-only binding
- API key comparison uses crypto.timingSafeEqual to prevent timing attacks
- Options like includePrivate: true cannot be passed via HTTP bodies
- Non-loopback binding requires explicit AUDREY_API_KEY
Sources: src/routes.ts
Recall Sanitization
HTTP /v1/recall and /v1/capsule endpoints sanitize input through sanitizeRecallOptions():
// Allowed keys only
const ALLOWED_KEYS = ['limit', 'agent', 'tags', 'sources', 'after', 'before', 'context', 'mood', 'retrieval', 'scope'];
Any keys not in the allowlist are silently dropped before processing.
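An allowlist sanitizer of this kind can be sketched in a few lines. The function name and the key list come from the text above, but the body below is an assumed implementation for illustration, not the code in src/routes.ts.

```typescript
// Allowlist copied from the text above.
const ALLOWED_KEYS = [
  'limit', 'agent', 'tags', 'sources', 'after',
  'before', 'context', 'mood', 'retrieval', 'scope',
] as const;

// Copies only allowlisted keys; anything else (for example includePrivate) is dropped silently.
function sanitizeRecallOptions(body: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const key of ALLOWED_KEYS) {
    // Own-property check avoids picking up values inherited from the prototype chain.
    if (Object.prototype.hasOwnProperty.call(body, key)) {
      clean[key] = body[key];
    }
  }
  return clean;
}

const sanitizedBody = sanitizeRecallOptions({ limit: 5, includePrivate: true, scope: 'shared' });
```

Building a fresh object from the allowlist, rather than deleting known-bad keys, means options added to the recall API later stay unreachable over HTTP until they are explicitly allowed.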
Related Components
| Component | File | Role |
|---|---|---|
| Rules Compiler | src/rules-compiler.ts | Compiles memories into Claude rules |
| Validation | src/validate.ts | Validates LLM response integrity |
| Impact Tracking | src/impact.ts | Tracks reflex effectiveness over time |
| Memory Capsule | src/capsule.ts | Structures context bundles |
Sources: [src/reflexes.ts:1-50](https://github.com/Evilander/Audrey/blob/main/src/reflexes.ts)
Data Storage
Related topics: System Architecture, Core Memory Operations
Overview
Audrey's data storage layer is built as a local-first, SQLite-backed persistence system designed for AI agent memory continuity. The storage architecture eliminates external database dependencies while providing vector similarity search capabilities through sqlite-vec, enabling semantic memory retrieval without cloud infrastructure.
The storage system serves as the foundation for Audrey's multi-type memory model, supporting episodic, semantic, and procedural memory with built-in confidence tracking, contradiction handling, and temporal decay mechanisms.
Storage Architecture
Core Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Primary Database | SQLite | Structured memory storage, ACID transactions |
| Vector Search | sqlite-vec | Semantic similarity search on embeddings |
| Data Directory | AUDREY_DATA_DIR | Tenant/environment isolation boundary |
The storage backend runs entirely locally, requiring no hosted database services. Each tenant or environment should use a dedicated AUDREY_DATA_DIR to maintain isolation boundaries.
Sources: README.md
Database Schema Design
Audrey maintains three primary memory tables that correspond to its memory model:
erDiagram
MEMORIES ||--o{ VECTORS : contains
MEMORIES {
string id PK
string content
string memory_type
float confidence
float salience
string state
int evidence_count
int usage_count
timestamp created_at
timestamp last_used_at
}
VECTORS {
int rowid
float vector
text content
}
Memory Type Storage
| Memory Type | Description | Key Attributes |
|---|---|---|
| episodic | Specific observations, tool results, session facts | source, tags, created_at |
| semantic | Consolidated principles from repeated evidence | evidence_count, supporting_count, contradicting_count |
| procedural | Remembered procedures, actions to avoid or retry | usage_count, failure_prevented, tags |
Sources: src/promote.ts
Memory Model Implementation
Episodic Memory Storage
Episodic memories capture specific observations and session-level facts. These entries are created during direct agent interactions and tool executions.
Key storage characteristics:
- High-volume insertion during active sessions
- Temporal ordering via created_at timestamps
- Tag-based categorization for filtered retrieval
- Source attribution (direct-observation, told-by-user)
Semantic Memory Storage
Semantic memories represent consolidated principles extracted from accumulated episodic evidence. The promotion system converts episodic memories into semantic rules when confidence thresholds are met.
The promotion criteria for semantic memories include:
- Minimum evidence count threshold (default: 3)
- Zero contradicting evidence
- State must be active
Sources: src/promote.ts:78-92
Procedural Memory Storage
Procedural memories track action sequences and their outcomes. These are distinguished by usage tracking and failure prevention metrics.
Procedural candidates are promoted when:
- evidence_count >= minEvidence
- contradicting_count === 0
- retrieval_count > 0 OR failure_prevented > 0
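The procedural promotion conditions above translate directly into a predicate. The field names follow the text; the ProceduralCandidate type and isProceduralPromotable helper are hypothetical, and reusing the semantic default of 3 for minEvidence is an assumption.

```typescript
// Hypothetical candidate shape using the counters named in the text.
interface ProceduralCandidate {
  evidence_count: number;
  contradicting_count: number;
  retrieval_count: number;
  failure_prevented: number;
}

// All three conditions must hold; the last one is satisfied by either counter.
function isProceduralPromotable(c: ProceduralCandidate, minEvidence = 3): boolean {
  return (
    c.evidence_count >= minEvidence &&
    c.contradicting_count === 0 &&
    (c.retrieval_count > 0 || c.failure_prevented > 0)
  );
}
```

The final clause means a procedure is promoted only after it has actually been retrieved or has prevented a failure, not merely because enough evidence accumulated.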
Confidence and Salience Tracking
Confidence Scoring
Confidence scores are computed from supporting versus contradicting evidence:
confidence = supporting / max(evidence, 1)
The confidence value is clamped to the range [0, 1] to prevent invalid states. Negative salience values from malformed arousal calculations are also clamped.
Sources: CHANGELOG.md
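As a worked illustration of the formula and the clamp, the helper below computes confidence from two counts. The function name computeConfidence is hypothetical; the arithmetic follows the expression given above.

```typescript
// confidence = supporting / max(evidence, 1), clamped to [0, 1].
function computeConfidence(supporting: number, evidence: number): number {
  const raw = supporting / Math.max(evidence, 1);
  return Math.min(1, Math.max(0, raw));
}

// 3 supporting observations out of 4 total pieces of evidence.
const exampleConfidence = computeConfidence(3, 4);
```

The max(evidence, 1) guard avoids a division by zero for memories with no evidence yet, and the clamp covers inconsistent counts such as more supporting items than total evidence.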
Salience System
Salience represents the importance and emotional weight of memories, influencing recall priority. The effectiveSalience calculation factors in:
- Base salience from evidence strength
- Temporal decay over time
- Arousal/affect resonance from recent memories
- Validation feedback (helpful, wrong, used outcomes)
Validation Feedback Loop
The closed-loop validation system updates salience based on memory utility:
| Outcome | Salience Effect | Counts Updated |
|---|---|---|
| helpful | Increases | retrieval_count, salience |
| wrong | Decreases | challenge_count (semantic only) |
| used | Neutral/slight | usage_count |
Sources: CHANGELOG.md
Retrieval and Search
Hybrid Recall Architecture
Audrey implements a hybrid retrieval strategy combining vector similarity with keyword matching:
graph TD
A[Query Input] --> B[Embedding Provider]
B --> C[Vector Similarity Search]
A --> D[Full-Text Search FTS]
C --> E[Confidence Scoring]
D --> E
E --> F[Memory Filtering]
F --> G[Ranked Results]
Retrieval Modes
| Mode | Behavior |
|---|---|
| hybrid (default) | Combines vector + FTS for balanced recall |
| vector | Pure semantic similarity, bypasses FTS |
| keyword | Skips vector pass, uses FTS only |
The vector mode serves as a fast path when FTS overhead is unacceptable.
Sources: src/recall.ts
Filtering Capabilities
Recall operations support multiple filter dimensions:
- tags: Array of tag values to match
- sources: Array of source identifiers
- after/before: Temporal bounds via ISO timestamps
- scope: shared or agent-scoped memories
- types: Filter by memory type (episodic/semantic/procedural)
Private Memory Isolation
The includePrivate flag controls access to agent-specific private memories. The HTTP API implements an allowlist-based sanitizer (sanitizeRecallOptions()) that prevents bypassing private-memory ACL controls through body options.
Sources: src/routes.ts
Data Lifecycle
Consolidation and Decay
The audrey dream command triggers memory consolidation:
- Episodic memories are evaluated for principle extraction
- Low-confidence or conflicting memories undergo decay
- Stale memories lose retrieval authority over time
- Contradicting claims are tracked rather than silently overwritten
Sources: README.md
Rollback Operations
The rollback system (src/rollback.ts) updates memories with verification:
- Checks .changes to confirm affected rows
- Aggregates real counts rather than assuming success
- Reports failure when targeted IDs don't exist
Sources: CHANGELOG.md
Reembedding
When embedding models or dimensions change, reembedding regenerates all vector representations:
- Chunks embeddings into 256-row batches
- Labels failures by kind and row range
- Provides clear error messages for partial failures
Sources: CHANGELOG.md
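The batching and failure labelling described above can be sketched as follows. Only the 256-row batch size comes from the text; reembedRows, BatchFailure, and the embedBatch callback are hypothetical, and the callback is synchronous here purely to keep the sketch short (a real embedding provider call would be asynchronous).

```typescript
const BATCH_SIZE = 256; // batch size stated in the text above

interface BatchFailure {
  kind: string;     // which set of rows failed, e.g. 'episodic'
  startRow: number; // first row index of the failed batch (inclusive)
  endRow: number;   // last row index of the failed batch (inclusive)
  message: string;
}

interface ReembedResult {
  vectors: Array<number[] | null>; // null marks rows whose batch failed
  failures: BatchFailure[];
}

// embedBatch is a hypothetical stand-in for the embedding provider.
function reembedRows(
  kind: string,
  texts: string[],
  embedBatch: (batch: string[]) => number[][],
): ReembedResult {
  const vectors: Array<number[] | null> = texts.map(() => null);
  const failures: BatchFailure[] = [];
  for (let start = 0; start < texts.length; start += BATCH_SIZE) {
    const end = Math.min(start + BATCH_SIZE, texts.length);
    try {
      const embedded = embedBatch(texts.slice(start, end));
      embedded.forEach((vec, i) => {
        vectors[start + i] = vec;
      });
    } catch (err) {
      // Record which rows failed and keep going with the next batch.
      const message = err instanceof Error ? err.message : String(err);
      failures.push({ kind, startRow: start, endRow: end - 1, message });
    }
  }
  return { vectors, failures };
}
```

Labelling each failure with its kind and row range lets an operator retry only the affected rows instead of rerunning the whole reembedding pass.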
Import and Export
Data Portability
Audrey supports full data export and import for:
- Snapshot restoration to fresh stores
- Backup before configuration changes
- Migration between environments
Restore exported snapshots only into an empty Audrey store with a fresh AUDREY_DATA_DIR to prevent data corruption.
Sources: python/README.md
Export Process
Export operations create portable snapshots containing:
- All memory records (episodic, semantic, procedural)
- Associated metadata (timestamps, confidence scores)
- Configuration state
Import Validation
Import operations verify store emptiness before restoration:
isDatabaseEmpty() // Checks both records and vector tables
Sources: CHANGELOG.md
Security Considerations
Credential Protection
Raw credentials and API keys must be excluded from encoded memory content. Audrey provides redaction functionality to prevent sensitive data exposure:
const SENSITIVE_KEY_PATTERN = /(password|secret|api[_-]?key|auth[_-]?token|...)$/i;
Sources: src/redact.ts
API Security
- Audrey serve defaults to binding 127.0.0.1 (previously 0.0.0.0)
- Non-loopback hosts require AUDREY_API_KEY or explicit AUDREY_ALLOW_NO_AUTH=1
- HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks
Sources: CHANGELOG.md
Production Recommendations
| Recommendation | Rationale |
|---|---|
| Set one AUDREY_DATA_DIR per tenant | Isolation boundary |
| Pin embedding and LLM providers | Reproducibility |
| Backup before provider changes | Data integrity |
| Keep credentials out of memory content | Security |
| Use AUDREY_API_KEY for network exposure | Access control |
Configuration
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| AUDREY_DATA_DIR | - | Data directory path (required for isolation) |
| AUDREY_EMBEDDING_PROVIDER | - | Embedding model provider |
| AUDREY_LLM_PROVIDER | - | LLM provider for memory operations |
| AUDREY_API_KEY | - | API authentication key |
| AUDREY_HOST | 127.0.0.1 | Network binding address |
| AUDREY_ALLOW_NO_AUTH | 0 | Allow unauthenticated access |
Sources: README.md
Related Documentation
- Memory Model - Multi-type memory architecture
- Recall System - Retrieval and search mechanisms
- Guard Loop - Pre-action memory checking
- Impact Analysis - Memory effectiveness tracking
Sources: [README.md](https://github.com/Evilander/Audrey/blob/main/README.md)
MCP Server
Related topics: System Architecture, REST API
Overview
The Audrey MCP Server is a Model Context Protocol (MCP) stdio server that provides a local-first memory layer for AI agents. It enables agents to encode experiences into persistent memory, recall relevant context before actions, and maintain a durable memory state across sessions. Sources: README.md
The server exposes 20+ tools plus status, recent, and principles resources, along with briefing, recall, and reflection prompts. It communicates over stdio (standard input/output), so it works with MCP hosts such as Claude Code, Claude Desktop, Cursor, Windsurf, and VS Code. Sources: README.md
Architecture
System Context
graph TD
subgraph "MCP Hosts"
A[Claude Code]
B[Claude Desktop]
C[Cursor]
D[Windsurf]
E[VS Code]
F[Other MCP Clients]
end
subgraph "Audrey MCP Server"
G[MCP stdio Server]
H[Tool Handlers]
I[Resource Providers]
J[Prompt Templates]
end
subgraph "Audrey Core"
K[Memory Store<br/>SQLite + sqlite-vec]
L[Embedding Engine<br/>ONNX]
M[Retrieval Engine]
end
A --> G
B --> G
C --> G
D --> G
E --> G
F --> G
G --> H
G --> I
G --> J
H --> K
I --> K
J --> K
K --> L
K --> M
Server Initialization Flow
sequenceDiagram
participant Host as MCP Host
participant Server as McpServer
participant Audrey as Audrey Instance
participant Store as Memory Store
Host->>Server: Start Process
Server->>Audrey: Initialize with config
Audrey->>Store: Open SQLite + sqlite-vec
alt Warmup Enabled
Audrey->>Store: Pre-compute embeddings
end
Server->>Server: registerHostResources()
Server->>Server: registerHostPrompts()
Server->>Server: Register Tools
Server->>Host: Ready (stdio)
Server Components
Core Server Setup
The MCP server is initialized with a name and version, then configured with resources, prompts, and tools: Sources: mcp-server/index.ts:101-106
const server = new McpServer({
name: SERVER_NAME,
version: VERSION,
});
registerHostResources(server, audrey);
registerHostPrompts(server);
Tool Registry
The server registers the following tool categories:
| Category | Tools | Purpose |
|---|---|---|
| Memory Operations | memory_encode, memory_recall | Store and retrieve memories |
| Memory Management | memory_import, memory_export, memory_forget | Data management |
| Impact Tracking | mark_used, impact_report | Memory utility tracking |
| Promotion | promote_memory, rule_review | Memory-to-rules conversion |
Tools Reference
memory_encode
Encodes new information into the memory store with diagnostic support.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | The memory content to encode |
| source | string | No | Source identifier (e.g., "direct-observation", "told-by-user") |
| tags | string[] | No | Classification tags |
| salience | number | No | Importance weight (0-1) |
| private | boolean | No | Mark as private memory |
| context | object | No | Additional context metadata |
| affect | object | No | Emotional/valence metadata |
| wait_for_consolidation | boolean | No | Opt-in read-after-write semantics (default: false) |
Returns: Tool result with memory ID, content, source, and optionally diagnostics. Sources: mcp-server/index.ts:108-133
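For orientation, an MCP client invokes this tool with a JSON-RPC 2.0 tools/call request. The sketch below builds such a request from the parameters in the table; buildEncodeRequest and EncodeArgs are hypothetical names, the client-side salience range check is an added assumption, and the example content is invented.

```typescript
// Subset of the memory_encode parameters from the table above.
interface EncodeArgs {
  content: string;
  source?: string;
  tags?: string[];
  salience?: number;
  private?: boolean;
  wait_for_consolidation?: boolean;
}

// Builds a JSON-RPC 2.0 "tools/call" request in the shape MCP uses for tool invocation.
function buildEncodeRequest(id: number, args: EncodeArgs) {
  // Assumed client-side guard; the table documents salience as a 0-1 weight.
  if (args.salience !== undefined && (args.salience < 0 || args.salience > 1)) {
    throw new RangeError('salience must be between 0 and 1');
  }
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name: 'memory_encode', arguments: args },
  };
}

const encodeRequest = buildEncodeRequest(1, {
  content: 'npm run deploy fails unless NODE_ENV is set',
  source: 'direct-observation',
  tags: ['deploy', 'risk'],
  salience: 0.8,
});
```

In practice an MCP host or SDK constructs and sends this message; the sketch only shows how the documented parameters fit into the request body.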
memory_recall
Retrieves relevant memories based on a query with filtering options.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query |
| limit | number | No | Maximum results (default: 10) |
| types | string[] | No | Filter by memory types |
| min_confidence | number | No | Minimum confidence threshold |
| tags | string[] | No | Filter by tags |
| sources | string[] | No | Filter by sources |
| after | string | No | ISO timestamp lower bound |
| before | string | No | ISO timestamp upper bound |
| context | object | No | Context filtering |
| mood | string | No | Mood-based filtering |
Returns: Array of recall results with confidence scores and metadata. Sources: mcp-server/index.ts:135-142
Additional Tools
| Tool | Purpose |
|---|---|
| memory_import | Import memory snapshots |
| memory_export | Export memory snapshots |
| memory_forget | Delete specific memories |
| mark_used | Record memory utility |
| impact_report | Generate impact analytics |
| promote_memory | Convert memory to rule |
| rule_review | Review promotion candidates |
Command Line Interface
Command Routing
The MCP server entry point (mcp-server/index.ts) handles CLI subcommands before starting the stdio loop: Sources: mcp-server/index.ts:200-240
graph TD
A[audrey CLI] --> B{Subcommand?}
B -->|--help / -h / help| C[printHelp]
B -->|--version / -v / version| D[printVersion]
B -->|install| E[install]
B -->|uninstall| F[uninstall]
B -->|mcp-config| G[printMcpConfig]
B -->|hook-config| H[printHookConfig]
B -->|demo| I[runDemoCommand]
B -->|reembed| J[reembed]
B -->|dream| K[dream]
B -->|greeting| L[greeting]
B -->|NONE| M[Start MCP Server]
C --> N[Exit 0]
D --> N
E --> N
F --> N
G --> N
H --> N
I --> N
J --> N
K --> N
L --> N
M --> O[stdio Loop]
Available Subcommands
| Command | Description |
|---|---|
| audrey install | Register Audrey with host MCP configuration |
| audrey uninstall | Remove Audrey from host configuration |
| audrey mcp-config | Print MCP server configuration |
| audrey hook-config | Generate Claude Code hook configurations |
| audrey demo | Run interactive demonstration |
| audrey reembed | Regenerate embeddings |
| audrey dream | Generate reflection/memory consolidation |
| audrey greeting | Display greeting message |
| audrey doctor | Run diagnostic checks |
| audrey guard | Check memory before action |
| audrey status | Show memory system status |
| audrey promote | Promote memories to rules |
| audrey impact | Generate impact reports |
| audrey observe-tool | Monitor tool execution |
Help and Version Short-Circuit
Help and version flags MUST short-circuit before falling through to the MCP server. A user running audrey --help should see help, not be dropped into a stdio loop: Sources: mcp-server/index.ts:201-206
if (subcommand === '--help' || subcommand === '-h' || subcommand === 'help') {
printHelp();
process.exit(0);
} else if (subcommand === '--version' || subcommand === '-v' || subcommand === 'version') {
printVersion();
process.exit(0);
}
Configuration Management
MCP Host Configuration
The config.ts module provides functions to generate host-specific MCP configurations:
export function formatMcpHostConfig(
  host: string | undefined = 'generic',
  env: Record<string, string | undefined> = process.env,
): string
Sources: mcp-server/config.ts:1-50
Supported Hosts:
| Host | Config Format | Notes |
|---|---|---|
| codex | TOML | GitHub MCP config style |
| claude-code | JSON | Claude Code MCP settings |
| claude-desktop | JSON | Claude Desktop config |
| cursor | JSON | Cursor MCP config |
| windsurf | JSON | Windsurf MCP config |
| vscode | JSON | VS Code MCP config |
| jetbrains | JSON | JetBrains MCP config |
| generic | JSON | Generic MCP fallback |
Installation Arguments
The buildInstallArgs() function generates CLI arguments for installing the MCP server:
export function buildInstallArgs(
  env: Record<string, string | undefined> = process.env,
  options: McpEnvOptions = {},
): string[]
Sources: mcp-server/config.ts:52-66
Generated Output Example:
mcp add -s user audrey -e AUDREY_AGENT=claude-code -e AUDREY_DATA_DIR=... -- node /path/to/mcp-entrypoint
Install Guide Generation
The formatInstallGuide() function generates human-readable installation instructions:
export function formatInstallGuide(
  host: string,
  env: Record<string, string | undefined> = process.env,
  dryRun = false,
): string
Sources: mcp-server/index.ts:18-48
Output Sections:
- Title (with dry-run or config-only indicator)
- No-modification notice
- Generated MCP config
- Generated Claude Code hook config (for Claude Code host)
- Next steps
Host-Specific Resources
Resource Registration
Resources are registered per-host to provide context:
registerHostResources(server, audrey);
registerHostPrompts(server);
Sources: mcp-server/index.ts:103
Available Resources
| Resource | Type | Description |
|---|---|---|
| status | Resource | Current system status |
| recent | Resource | Recent memory activity |
| principles | Resource | Core operational principles |
Available Prompts
| Prompt | Purpose |
|---|---|
| briefing | Get current session briefing |
| recall | Perform focused recall |
| reflection | Generate self-reflection |
Error Handling
Tool Error Response
Tool handlers return structured error responses:
function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: `[audrey] error: ${err}` }],
    isError: true,
  };
}
Sources: mcp-server/index.ts:100
Tool Success Response
Tool handlers return structured success responses with optional diagnostics:
function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: 'text', text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}
Sources: mcp-server/index.ts:99
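To show how the two helpers fit together, here is a self-contained sketch of a handler wrapper. The CallToolResult type below is a simplified local stand-in for the MCP SDK type, and runTool is a hypothetical name; neither is taken from Audrey's source.

```typescript
// Simplified local stand-in for the MCP SDK's CallToolResult type (assumption).
type CallToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean | undefined;
  _meta?: Record<string, unknown> | undefined;
};

function toolResult(data: unknown, diagnostics?: unknown): CallToolResult {
  return {
    content: [{ type: "text", text: JSON.stringify(data) }],
    _meta: diagnostics ? { diagnostics } : undefined,
  };
}

function toolError(err: unknown): CallToolResult {
  return {
    content: [{ type: "text", text: `[audrey] error: ${String(err)}` }],
    isError: true,
  };
}

// Hypothetical wrapper: run a handler and convert any thrown error into toolError,
// so the MCP client always receives a structured result instead of a crash.
async function runTool(handler: () => Promise<unknown>): Promise<CallToolResult> {
  try {
    return toolResult(await handler());
  } catch (err) {
    return toolError(err);
  }
}
```

Keeping the conversion in one place means individual tool handlers can simply throw on failure.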
Environment Variables
| Variable | Default | Description |
|---|---|---|
| AUDREY_AGENT | claude-code | Host agent identifier |
| AUDREY_DATA_DIR | Platform-specific | Data directory path |
| AUDREY_PROFILE | 0 | Enable profiling diagnostics |
| AUDREY_DEBUG | 0 | Enable debug logging |
| AUDREY_DISABLE_WARMUP | 0 | Skip embedding warmup |
| AUDREY_API_KEY | unset | REST API authentication |
| AUDREY_HOST | 127.0.0.1 | REST bind address |
| AUDREY_PORT | 7437 | REST server port |
Performance Characteristics
v0.22.0 Performance Metrics
| Operation | Before | After | Improvement |
|---|---|---|---|
| Encode response (p50) | 24.7ms | 15.2ms | ~40% faster |
| Cold-start first encode | 525ms | 28ms (with warmup) | ~18.7x faster |
| Hybrid recall (p50) | 30.2ms | 14.3ms | ~2.1x faster |
Optimization Details
- Eliminated 3 of 4 redundant embedding calls during encode
- Validation, interference, and affect resonance reuse the main content vector
- Background embedding warmup at MCP boot reduces cold-start latency
Sources: CHANGELOG.md
Security
API Key Timing Safety
HTTP API key comparison uses crypto.timingSafeEqual to prevent timing attacks.
Sources: CHANGELOG.md
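A constant-time comparison with Node's crypto.timingSafeEqual might look like the sketch below. The helper name apiKeysMatch is hypothetical, and hashing both sides first is an assumption made here so the buffers have equal length (timingSafeEqual throws on a length mismatch); Audrey's actual implementation may differ.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: compare a presented key with the expected key in constant time.
// Hashing both sides yields equal-length buffers, which timingSafeEqual requires.
function apiKeysMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

Unlike a plain string comparison, this does not return early at the first differing character, so response time does not reveal how much of the key was correct.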
Recall Options Sanitization
HTTP /v1/recall and /v1/capsule sanitize request bodies to prevent ACL bypass.
Sources: CHANGELOG.md
Promote Path Restrictions
audrey promote --yes restricts writes to process.cwd() unless the target is in AUDREY_PROMOTE_ROOTS, which limits where a prompt-injected promote call can write.
Sources: CHANGELOG.md
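The containment rule can be illustrated with a small path check. This is a sketch under stated assumptions: isAllowedPromoteTarget is a hypothetical name, the extra roots are passed as an already-parsed array (how AUDREY_PROMOTE_ROOTS is parsed is not shown in the source), and POSIX path semantics are used so the example behaves the same on every platform.

```typescript
import { posix as path } from "node:path";

// Hypothetical check: is `target` inside `cwd` or one of the extra allowed roots?
function isAllowedPromoteTarget(
  target: string,
  cwd: string,
  extraRoots: string[] = [],
): boolean {
  const resolved = path.resolve(cwd, target);
  return [cwd, ...extraRoots].some((root) => {
    const rel = path.relative(path.resolve(root), resolved);
    // Inside the root when the relative path does not climb out of it.
    return rel === "" || (rel !== ".." && !rel.startsWith("../") && !path.isAbsolute(rel));
  });
}
```

Note that this sketch compares lexical paths only; it does not resolve symlinks, which a hardened implementation would likely need to consider.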
Profiling Mode
When AUDREY_PROFILE=1, tools return diagnostic metadata:
if (profileEnabled) {
  const { id, diagnostics } = await audrey.encodeWithDiagnostics({
    content,
    source,
    tags,
    salience,
    private: isPrivate,
    context,
    affect,
    waitForConsolidation: wait_for_consolidation,
  });
  return toolResult({ id, content, source, private: isPrivate ?? false }, diagnostics);
}
Sources: mcp-server/index.ts:110-120
Diagnostic Data Includes:
- Per-stage timing information
- Embedding generation time
- Retrieval latency breakdown
Related Documentation
- README.md - Main project documentation
- CHANGELOG.md - Version history and release notes
- src/capsule.ts - Memory capsule generation
- src/rules-compiler.ts - Memory-to-rules promotion
- src/impact.ts - Impact analytics reporting
Source: https://github.com/Evilander/Audrey / Human Manual
REST API
Related topics: System Architecture, MCP Server
The Audrey REST API provides a local-first HTTP interface for memory management operations, enabling external agents and services to interact with Audrey's memory system without direct database access.
Overview
Audrey's REST API is built on Hono, a lightweight, high-performance web framework for Edge environments. The API serves as a sidecar service that wraps the core memory engine with HTTP endpoints for encoding, recalling, and managing memory entries.
Key characteristics:
- Local-first design with no external database dependencies
- SQLite + sqlite-vec for storage and vector search
- Bearer token authentication for non-loopback access
- Type-safe request/response handling
Sources: README.md:60
Architecture
graph TD
A[Client<br>Python/JS SDK] --> B[REST API<br>Hono Server]
B --> C[Audrey Core Engine]
C --> D[SQLite<br>Memory Store]
C --> E[sqlite-vec<br>Vector Index]
B --> F[/health]
B --> G[/v1/recall]
B --> H[/v1/capsule]
B --> I[/v1/encode]
B --> J[Admin Routes<br>/v1/import<br>/v1/export]
Server Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| AUDREY_HOST | 127.0.0.1 | REST sidecar bind address. Set to 0.0.0.0 only with AUDREY_API_KEY. |
| AUDREY_PORT | 7437 | Port for the REST server to listen on. |
| AUDREY_API_KEY | unset | Bearer token required for non-loopback REST traffic. |
| AUDREY_ALLOW_NO_AUTH | 0 | Set to 1 to allow non-loopback bind without an API key. Not recommended. |
| AUDREY_ENABLE_ADMIN_TOOLS | 0 | Set to 1 to enable export, import, and forget routes/tools. Disabled by default. |
| AUDREY_PRAGMA_DEFAULTS | 1 | Set to 0 to revert SQLite PRAGMA tuning to better-sqlite3 defaults. |
| AUDREY_DEBUG | 0 | Set to 1 to print MCP info logs. Errors always log. |
Sources: README.md:44-52
Starting the Server
# Default (loopback only)
npx audrey serve
# With explicit port
AUDREY_PORT=8080 npx audrey serve
# Network-exposed with API key
AUDREY_HOST=0.0.0.0 AUDREY_API_KEY=secret npx audrey serve
Sources: python/README.md:18
API Endpoints
Health Check
| Method | Path | Description |
|---|---|---|
| GET | /health | Returns server health status |
Response:
{
  "status": "ok",
  "version": "0.22.1",
  "timestamp": "2026-04-30T12:00:00.000Z"
}
Sources: README.md:58
Memory Operations
| Method | Path | Description |
|---|---|---|
| POST | /v1/encode | Store a new memory entry |
| POST | /v1/recall | Retrieve relevant memories by query |
| POST | /v1/capsule | Get a turn-sized memory packet |
| POST | /v1/mark-used | Mark memory as used with outcome feedback |
| POST | /v1/observe-tool | Record tool execution results |
| POST | /v1/before-action | Preflight check before tool execution |
| POST | /v1/validate | Validate memory helpfulness |
Sources: README.md:58, src/routes.ts:1-50
Request Body Schema
The REST API accepts a unified RouteBody type with optional fields:
type RouteBody = {
  action?: string;
  query?: string;
  tool?: string;
  session_id?: string;
  sessionId?: string;
  cwd?: string;
  files?: string[];
  strict?: boolean;
  limit?: number;
  budget_chars?: number;
  budgetChars?: number;
  mode?: PreflightOptions['mode'];
  failure_window_hours?: number;
  recent_failure_window_hours?: number;
  recentFailureWindowHours?: number;
  recent_change_window_hours?: number;
  recentChangeWindowHours?: number;
  include_capsule?: boolean;
  includeCapsule?: boolean;
  include_status?: boolean;
  includeStatus?: boolean;
  record_event?: boolean;
  recordEvent?: boolean;
  include_preflight?: boolean;
  includePreflight?: boolean;
  receipt_id?: string;
  receiptId?: string;
  input?: unknown;
  output?: unknown;
  outcome?: EventOutcome;
  error_summary?: string;
  errorSummary?: string;
  metadata?: Record<string, unknown>;
  retain_details?: boolean;
  retainDetails?: boolean;
  evidence_feedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
  evidenceFeedback?: Record<string, 'used' | 'helpful' | 'wrong'>;
};
Sources: src/routes.ts:5-46
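Many RouteBody fields come in a snake_case and a camelCase spelling. A handler reading such a body could resolve each pair with a small helper like the one sketched below. The name pickAlias is hypothetical, and the precedence when both spellings are present (snake_case wins here) is an assumption for illustration; the source does not state which one Audrey prefers.

```typescript
// Hypothetical helper: read a field that may arrive as snake_case or camelCase.
// Precedence when both are present is an assumption (snake_case first).
function pickAlias<T>(
  body: Record<string, unknown>,
  snake: string,
  camel: string,
): T | undefined {
  const value = body[snake] !== undefined ? body[snake] : body[camel];
  return value as T | undefined;
}

// Example body mixing both spellings.
const exampleBody: Record<string, unknown> = {
  sessionId: "s-1",
  budget_chars: 4000,
  budgetChars: 9999,
};

const sessionId = pickAlias<string>(exampleBody, "session_id", "sessionId");
const budgetChars = pickAlias<number>(exampleBody, "budget_chars", "budgetChars");
```

Accepting both spellings lets Python clients (snake_case) and TypeScript clients (camelCase) send idiomatic payloads to the same route.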
Security Model
Authentication
Non-loopback REST traffic requires a Bearer token:
curl -H "Authorization: Bearer your-secret-token" \
  -H "Content-Type: application/json" \
  http://localhost:7437/v1/recall \
  -d '{"query": "deploy failures"}'
Security measures:
- HTTP API key comparison uses crypto.timingSafeEqual instead of string !== to prevent timing attacks. Sources: README.md:41
- Server defaults to binding 127.0.0.1 (was 0.0.0.0). Sources: README.md:40
- Refuses to start on a non-loopback host without AUDREY_API_KEY unless AUDREY_ALLOW_NO_AUTH=1
Recall Options Sanitization
HTTP /v1/recall and /v1/capsule no longer body-spread caller options into internal calls. The sanitizeRecallOptions() function implements an allowlist that drops anything not in a known-safe key set:
export function sanitizeRecallOptions(options: unknown): SanitizedRecallOptions {
// Only allows: budget_chars, limit, retrieval, includePrivate, sessionId
}
This prevents bypassing private-memory ACL and integrity controls via includePrivate: true or confidenceConfig overrides in HTTP bodies.
Sources: README.md:39-40, CHANGELOG.md:0.22.1
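The allowlist technique itself is simple to sketch: copy only known-safe keys out of an untrusted body and ignore everything else. The function pickAllowed and the key set below are illustrative assumptions; Audrey's real allowlist lives in sanitizeRecallOptions() and may contain different keys.

```typescript
// Hypothetical allowlist filter: copy only known-safe keys from an untrusted body.
function pickAllowed(
  input: unknown,
  allowed: ReadonlySet<string>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  // Anything that is not a plain object yields an empty options bag.
  if (typeof input !== "object" || input === null || Array.isArray(input)) return out;
  for (const [key, value] of Object.entries(input)) {
    if (allowed.has(key)) out[key] = value;
  }
  return out;
}

// Illustrative key set only, not Audrey's actual allowlist.
const SAFE_KEYS: ReadonlySet<string> = new Set(["limit", "budget_chars"]);

const cleaned = pickAllowed(
  { limit: 5, confidenceConfig: {}, unexpected: true },
  SAFE_KEYS,
);
```

An allowlist fails closed: a new internal option stays unreachable over HTTP until someone deliberately adds it to the safe set, which is the opposite of spreading the request body into the internal call.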
Admin Tools
Admin routes (/v1/import, /v1/export, /v1/forget) are disabled by default. Enable with:
AUDREY_ENABLE_ADMIN_TOOLS=1 npx audrey serve
Sources: README.md:48
Client Integration
Python SDK
from audrey_memory import Audrey

brain = Audrey(
    base_url="http://127.0.0.1:7437",
    api_key="secret",
    agent="support-agent",
)

# Encode a memory
memory_id = brain.encode(
    "Stripe returns HTTP 429 above 100 req/s",
    source="direct-observation",
    tags=["stripe", "rate-limit"],
)

# Recall relevant memories
results = brain.recall("stripe rate limits", limit=5)

# Create snapshot for backup
snapshot = brain.snapshot()

brain.close()
Async usage:
import asyncio

from audrey_memory import AsyncAudrey


async def main() -> None:
    async with AsyncAudrey(base_url="http://127.0.0.1:7437") as brain:
        await brain.health()
        await brain.encode("Deploy failed due to OOM", source="direct-observation")
        await brain.recall("deploy failure", limit=3)


asyncio.run(main())
Sources: python/README.md:22-45
Connection URL Correction
Note: Python client DEFAULT_BASE_URL was corrected from http://127.0.0.1:3487 to http://127.0.0.1:7437 in v0.22.1 to match the TS server's default port.
Sources: CHANGELOG.md:0.22.1
Impact Reporting
Impact analytics are exposed through the audrey impact CLI, which calls internal Audrey methods. The report covers:
| Report field | Description |
|---|---|
| Total memories by type | episodic, semantic, procedural counts |
| All-time validated count | Memories validated as helpful/wrong |
| Recent validations | Validation activity in time window |
| Top-N most-used memories | Memories with highest usage_count |
| Weakest-N memories | Lowest salience candidates for forgetting |
| Recent activity timeline | last_used_at based activity log |
# Basic report
audrey impact
# JSON output for automation
audrey impact --json
# Custom window and limits
audrey impact --window 30 --limit 20
Sources: src/impact.ts:1-50, CHANGELOG.md:0.22.1
Deployment
Docker
# docker-compose.yml
services:
  audrey:
    image: ghcr.io/evilander/audrey:latest
    ports:
      - "7437:7437"
    environment:
      - AUDREY_API_KEY=your-secret-token
      - AUDREY_HOST=0.0.0.0
    volumes:
      - audrey-data:/data

# Named volumes must be declared at the top level
volumes:
  audrey-data:
Doctor Check
The audrey doctor command validates REST server configuration:
npx audrey doctor
Checks performed:
| Check | Description |
|---|---|
| serve-bind-safety | Validates bind address with auth configuration |
| node-runtime | Node.js version >= 20 |
| entrypoint-exists | MCP stdio entrypoint file exists |
| data-dir | Data directory accessibility |
| embedding | Embedding provider configuration |
| llm | LLM provider configuration |
graph TD
A[audrey doctor] --> B{Is bind loopback?}
B -->|Yes| C[✅ loopback only]
B -->|No| D{Has AUDREY_API_KEY?}
D -->|Yes| E[✅ non-loopback with API key]
D -->|No| F{Has AUDREY_ALLOW_NO_AUTH?}
F -->|Yes| G[⚠️ warning - network exposure]
F -->|No| H[❌ error - refuse to start]
Sources: mcp-server/index.ts:100-130
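The bind-safety decision flow above can be expressed as a pure function over the environment. This is a minimal sketch: checkBindSafety, isLoopback, and the BindCheck type are hypothetical names, and the loopback test covers only the common spellings; the real doctor check may be stricter.

```typescript
type BindCheck = { level: "ok" | "warn" | "error"; message: string };

// Hypothetical loopback test; covers the common spellings only.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "::1" || host.startsWith("127.");
}

// Mirrors the decision flow above. The environment is passed in so the
// function stays pure and testable.
function checkBindSafety(env: Record<string, string | undefined>): BindCheck {
  const host = env["AUDREY_HOST"] ?? "127.0.0.1";
  if (isLoopback(host)) return { level: "ok", message: "loopback only" };
  if (env["AUDREY_API_KEY"]) return { level: "ok", message: "non-loopback with API key" };
  if (env["AUDREY_ALLOW_NO_AUTH"] === "1") {
    return { level: "warn", message: "network exposure" };
  }
  return { level: "error", message: "refuse to start" };
}
```

Passing the environment as an argument means the same function could back both the doctor report and the server's startup guard.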
Error Handling
Common Issues
| Error | Cause | Solution |
|---|---|---|
| Connection refused | Wrong port or host | Check AUDREY_PORT and AUDREY_HOST |
| 401 Unauthorized | Missing/invalid API key | Provide Authorization: Bearer <token> header |
| 404 Not Found | Wrong endpoint | Use /v1/* routes, not /openapi.json or /docs |
| Validation error | Malformed request body | Check RouteBody schema |
Status Codes
| Code | Meaning |
|---|---|
| 200 | Success |
| 400 | Bad request (malformed body) |
| 401 | Unauthorized (missing/invalid API key) |
| 404 | Endpoint not found |
| 500 | Internal server error |
Note: /openapi.json and /docs routes are not currently wired. The README matches the actual surface (/health + /v1/*).
Sources: CHANGELOG.md:0.22.1, README.md:58
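On the client side, the status codes and remedies listed above can be folded into one lookup so callers surface a useful hint instead of a bare number. The function hintForStatus is a hypothetical helper for illustration, not part of either SDK.

```typescript
// Hypothetical client-side helper: map an HTTP status to the remedy listed above.
function hintForStatus(status: number): string {
  switch (status) {
    case 200:
      return "ok";
    case 400:
      return "Check the request body against the RouteBody schema";
    case 401:
      return "Provide an Authorization: Bearer <token> header";
    case 404:
      return "Use /health or /v1/* routes; /openapi.json and /docs are not wired";
    case 500:
      return "Internal server error";
    default:
      return `Unexpected status ${status}`;
  }
}
```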
Sources: README.md:60
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Doramagic extracted 14 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors
- Severity: medium
- Finding: Installation risk is backed by a source signal: Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured validate errors. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.1
2. Installation risk: Audrey Guard 0.23.0
- Severity: medium
- Finding: Installation risk is backed by a source signal: Audrey Guard 0.23.0. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.23.0
3. Installation risk: v0.16.0
- Severity: medium
- Finding: Installation risk is backed by a source signal: v0.16.0. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.0
4. Installation risk: v0.16.1 — Windows MCP fix
- Severity: medium
- Finding: Installation risk is backed by a source signal: v0.16.1 — Windows MCP fix. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.16.1
5. Installation risk: v0.17.0
- Severity: medium
- Finding: Installation risk is backed by a source signal: v0.17.0. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.17.0
6. Configuration risk: Configuration risk needs validation
- Severity: medium
- Finding: Configuration risk is backed by a source signal: Configuration risk needs validation. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.host_targets | github_repo:1161444210 | https://github.com/Evilander/Audrey | host_targets=mcp_host, claude, claude_code
7. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:1161444210 | https://github.com/Evilander/Audrey | README/documentation is current enough for a first validation pass.
8. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:1161444210 | https://github.com/Evilander/Audrey | last_activity_observed missing
9. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium
10. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:1161444210 | https://github.com/Evilander/Audrey | no_demo; severity=medium
11. Security or permission risk: Audrey 1.0.0
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: Audrey 1.0.0. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v1.0.0
12. Security or permission risk: v0.22.2 — correctness pass + legitimate benchmarking
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: v0.22.2 — correctness pass + legitimate benchmarking. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/Evilander/Audrey/releases/tag/v0.22.2
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Audrey with real data or production workflows.
- Audrey 1.0.1 — honest GuardBench gate, Guard time decay, structured vali - github / github_release
- Audrey 1.0.0 - github / github_release
- Audrey Guard 0.23.0 - github / github_release
- v0.22.2 — correctness pass + legitimate benchmarking - github / github_release
- v0.17.0 - github / github_release
- v0.16.1 — Windows MCP fix - github / github_release
- v0.16.0 - github / github_release
- Configuration risk needs validation - GitHub / issue
Source: Project Pack community evidence and pitfall evidence