Doramagic Project Pack · Human Manual

Velune-CLI

VELUNE CLI is an open-source AI engineering CLI that unifies local LLMs (Ollama), cloud AI providers, MCP servers, tools, memory, and project context into a single developer workflow. Build, code, automate, and orchestrate AI with one extensible, provider-agnostic command-line interface.

Repository Overview & System Architecture

Related topics: Council Orchestration, Memory Tiers & Retrieval, Providers, MCP & Plugin Extensibility, CLI Reference, Session Modes & Operations

Source: https://github.com/Surya-Hariharan/Velune-CLI / Human Manual

Council Orchestration, Memory Tiers & Retrieval

Related topics: Repository Overview & System Architecture, Providers, MCP & Plugin Extensibility, CLI Reference, Session Modes & Operations

Overview

Velune is a terminal-first AI developer CLI whose central abstraction is the Reasoning Council — a coordinated ensemble of role-specialized agents that together translate a natural-language prompt into a planned, executed, and reviewed change set. Around the Council, a five-tier memory system and a hybrid retrieval pipeline supply long-horizon context (repository structure, prior sessions, semantic recall) so prompts like *"fix the auth issue from yesterday"* can be grounded without the user re-explaining intent. Source: README.md

This page documents how those subsystems fit together: the agents that compose the Council, the memory tiers that feed them, the retrieval pipeline that selects context, and the CLI command surface (velune ask and velune run) that drives them.

Council Architecture & Agent Roles

The Council is composed of specialized agents that all derive from a shared BaseCouncilAgent and communicate via strongly-typed messages such as PlannerMessage and ReviewerMessage. Each agent is bound to a ModelDescriptor and a ModelProvider so orchestration can route different roles to different models based on capability profiles. Source: velune/cognition/agents/planner.py, velune/cognition/agents/reviewer.py, velune/cognition/council/base.py, velune/cognition/council/messages.py

Two agents are visible in the current source tree:

| Agent | Role | Output | Constraints |
| --- | --- | --- | --- |
| Planner | Decomposes the task into a strict JSON DAG of TaskStep records | TaskPlan written into CouncilState | Wall-clock budget enforcement; raises TimeoutError if planner_timeout_seconds is exceeded |
| Reviewer | Audits proposed code changes for logical flaws, security, performance, and alignment | JSON {passed, critical_issues, suggestions, confidence_rating} | Decision logic applied to determine retry vs. escalate |

The Planner's system prompt mandates raw JSON (no Markdown fences) describing a TaskPlan with a task_id and a list of steps, each carrying id, description, target_files, expected_outcome, and agent_role. Source: velune/cognition/agents/planner.py, velune/core/types/task.py
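As a rough illustration of the plan shape described above, the sketch below parses a raw JSON reply into dataclasses. The field names come from the text; the class layout, the `parse_task_plan` helper, and the sample payload are assumptions made for illustration and are not taken from velune/core/types/task.py.

```python
from __future__ import annotations

import json
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code-fence marker the planner is told not to emit


@dataclass
class TaskStep:
    id: str
    description: str
    target_files: list[str]
    expected_outcome: str
    agent_role: str


@dataclass
class TaskPlan:
    task_id: str
    steps: list[TaskStep]


def parse_task_plan(raw: str) -> TaskPlan:
    """Hypothetical helper: parse raw planner output, rejecting fenced replies."""
    text = raw.strip()
    if text.startswith(FENCE):
        raise ValueError("planner output must be raw JSON, not a fenced block")
    data = json.loads(text)
    steps = [TaskStep(**step) for step in data["steps"]]
    return TaskPlan(task_id=data["task_id"], steps=steps)


# Sample payload invented for this example.
example = """
{"task_id": "t-1",
 "steps": [{"id": "s1",
            "description": "Add a null check",
            "target_files": ["auth.py"],
            "expected_outcome": "No crash on an empty token",
            "agent_role": "coder"}]}
"""
plan = parse_task_plan(example)
print(plan.task_id, plan.steps[0].agent_role)
```

Rejecting fenced output at parse time is one way to enforce the "raw JSON only" rule; the real agent may handle this differently.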

All agents read and write a shared CouncilState object that tracks wall-clock budget, plan progress, and review decisions, giving the orchestration loop a single source of truth. Source: velune/cognition/state.py

Orchestration Flow

flowchart LR
    U[User prompt] --> CLI["velune ask / velune run"]
    CLI --> FW[CognitiveFirewall]
    FW --> R["Retrieval: BM25 + Vector + Graph"]
    R --> P[Planner]
    P --> C["Coder (agent_role)"]
    C --> RV[Reviewer]
    RV -->|pass| OUT[Result surfaced to user]
    RV -->|fail| P

The firewall guards every prompt before it reaches the Council; the Planner consumes retrieved context to build a plan; the Coder (selected via agent_role) executes steps; the Reviewer either approves the diff or loops it back to the Planner. Source: velune/cognition/firewall.py, velune/cli/commands/run.py
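The plan, code, review loop described above can be sketched with plain callables. This is a simplified model under stated assumptions: `run_council`, the `max_rounds` cap, and the stub agents are hypothetical and do not mirror Velune's actual orchestrator or its budget handling.

```python
from typing import Callable


def run_council(
    prompt: str,
    plan: Callable[[str, str], list],
    code: Callable[[list], str],
    review: Callable[[str], dict],
    max_rounds: int = 3,  # assumed cap; the source does not state one
) -> str:
    """Loop plan -> code -> review until the reviewer passes or rounds run out."""
    feedback = ""
    for _ in range(max_rounds):
        steps = plan(prompt, feedback)
        diff = code(steps)
        verdict = review(diff)
        if verdict["passed"]:
            return diff
        # A failed review is fed back to the planner for the next round.
        feedback = "; ".join(verdict.get("critical_issues", []))
    raise RuntimeError(f"review did not pass after {max_rounds} rounds")


# Stub agents, invented for this example.
def fake_plan(prompt, feedback):
    return [prompt] if not feedback else [prompt, f"address: {feedback}"]


def fake_code(steps):
    return "\n".join(steps)


def fake_review(diff):
    ok = "address:" in diff
    return {"passed": ok, "critical_issues": [] if ok else ["missing test"]}


result = run_council("fix login", fake_plan, fake_code, fake_review)
print(result)
```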

Memory Tiers

Velune maintains five memory tiers so that context persists beyond a single REPL turn and across sessions. Source: README.md

| Tier | Scope | Storage | Purpose |
| --- | --- | --- | --- |
| Working | Current conversation | In-process | TTL-evicted turn buffer for the active loop |
| Episodic | Session history | SQLite (~/.velune/) | "What did I do last run?" |
| Semantic | Past interactions | LanceDB / Qdrant (opt-in [rag] extra) | Vector recall over earlier work |
| Graph | Repository symbols | Local graph store | Symbol relationships and structural queries |
| Lineage | Decision history | Persisted | What was tried, why, and outcome |

The [rag] extra was introduced in the v0.9.2 release notes to keep pip install velune-cli lean by default; it pulls in lancedb, pyarrow, and qdrant-client only on demand. Source: README.md (release notes for v0.9.2)
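To make the Working tier's "TTL-evicted turn buffer" concrete, here is a minimal sketch of one way such a buffer could behave. The `WorkingMemory` class, its method names, and the injectable clock are assumptions for illustration; the source does not show Velune's implementation.

```python
import time
from collections import deque


class WorkingMemory:
    """Hypothetical in-process buffer that drops turns older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the example can control time
        self._turns = deque()  # (timestamp, text) pairs, oldest first

    def add(self, text):
        self._evict()
        self._turns.append((self.clock(), text))

    def turns(self):
        self._evict()
        return [text for _, text in self._turns]

    def _evict(self):
        cutoff = self.clock() - self.ttl
        while self._turns and self._turns[0][0] < cutoff:
            self._turns.popleft()


# Drive the buffer with a fake clock so the eviction is deterministic.
now = [0.0]
memory = WorkingMemory(ttl_seconds=10, clock=lambda: now[0])
memory.add("user: fix the auth issue")
now[0] = 5.0
memory.add("assistant: which file?")
now[0] = 12.0
print(memory.turns())
```

With a 10 second TTL, the first turn (added at t=0) has expired by t=12 while the second (added at t=5) is still retained.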

Retrieval Pipeline

Before the Council deliberates, the prompt is enriched through a hybrid retrieval pass that fuses three signals:

  • BM25 lexical scoring over the indexed repository
  • Vector similarity over the semantic memory store
  • Graph traversal over the repository structure (symbol references, imports, call edges)

The result is packaged as retrieved_context and forwarded to the Planner, which embeds it into the system prompt that drives generate_plan. Source: velune/cognition/agents/planner.py, velune/cognition/firewall.py
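The source does not state how the three signals are fused. One common option is reciprocal rank fusion (RRF), sketched below purely as a stand-in; the function name, the constant `k=60`, and the sample rankings are assumptions and are not taken from Velune's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each list contributes 1 / (k + rank) per document, so items that rank
    well in several lists rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Invented rankings standing in for the BM25, vector, and graph signals.
bm25_hits = ["auth.py", "login.py", "db.py"]
vector_hits = ["login.py", "auth.py", "session.py"]
graph_hits = ["auth.py", "session.py"]

retrieved_context = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits])
print(retrieved_context)
```

RRF only needs ranks, not comparable scores, which is why it is often used when lexical and vector scores live on different scales.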

A TaskClassifier upstream of retrieval routes prompts by keyword families — coding (refactor, rewrite, query, schema), reasoning (explain, analyze, compare), summarization (tldr, summary, digest), and quick-question patterns (what is, define) — into a TaskProfile that influences which memory tiers are consulted and how aggressive the budget guard becomes. Source: velune/providers/task_classifier.py
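A keyword-family classifier of the kind described above might look like the following sketch. The keyword lists are the ones quoted in the text; the matching rules (prefix match for quick patterns, hit counting for the rest, a "general" fallback) are assumptions, not the logic in velune/providers/task_classifier.py.

```python
KEYWORDS = {
    "coding": ("refactor", "rewrite", "query", "schema"),
    "reasoning": ("explain", "analyze", "compare"),
    "summarization": ("tldr", "summary", "digest"),
}
QUICK_PATTERNS = ("what is", "define")


def classify(prompt: str) -> str:
    """Return a task family for the prompt (illustrative rules only)."""
    text = prompt.lower().strip()
    # Assumed rule: quick-question patterns only count at the start.
    if text.startswith(QUICK_PATTERNS):
        return "quick_question"
    # Count keyword hits per family and pick the family with the most.
    hits = {
        family: sum(word in text for word in words)
        for family, words in KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "general"


for sample in ("Refactor the schema loader", "What is a DAG?", "tldr of this file"):
    print(sample, "->", classify(sample))
```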

Command Surface & Execution Modes

Two CLI entry points feed the Council:

  • velune ask <prompt> — Interactive question path; routes natural language through the firewall and Council but does not execute sandboxed tools. Supports --council-tier override (instant, standard, full). Source: velune/cli/commands/ask.py
  • velune run <task> — Autonomous path; the Council plans, writes code, and executes it in a sandbox. Flags --dry-run, --force, and --yes control write/execute permissions and confirm prompts. Source: velune/cli/commands/run.py

Both commands hand the async work off to velune.core.event_loop.submit() so the synchronous Typer boundary stays responsive. Source: velune/core/event_loop.py
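One way a synchronous command can hand a coroutine to a shared event loop is shown below. This is only a sketch of the general pattern: the `submit` here is a hypothetical stand-in built on `asyncio.run_coroutine_threadsafe`, and the real velune.core.event_loop.submit may be implemented differently.

```python
import asyncio
import threading

# A single background loop shared by all commands (assumed design).
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def submit(coro, timeout=None):
    """Schedule a coroutine on the background loop and wait for its result."""
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result(timeout)


async def answer(prompt):
    # Placeholder for the async Council call.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


# A synchronous caller, such as a CLI command body, can now do:
print(submit(answer("hi")))
```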

Session-mode toggles (/normal, /optimus, /godly) and cognition-depth toggles (/cognition quick|standard|deep) adjust the council tier and context cap (4k → 128k tokens) without restarting the REPL. The v0.9.3-beta.1 release notes further describe the move to "explicit, on-demand cognition" — the REPL no longer runs automatic repository indexing on launch; cognition is user-driven via slash commands. Source: README.md

Failure Modes

  • Budget exhaustion: PlannerAgent.generate_plan raises ValueError("Wall-clock budget exhausted before Planner could run") if state.is_budget_exhausted() is true. Source: velune/cognition/agents/planner.py
  • Planner timeout: The call is wrapped in an asyncio timeout; raises TimeoutError if planner_timeout_seconds is exceeded. Source: velune/cognition/agents/planner.py
  • Reviewer rejection: A passed=false review loops the diff back to the Planner for revision rather than surfacing it to the user. Source: velune/cognition/agents/reviewer.py
  • Sandbox writes blocked: velune run without --force or --yes may halt on human-confirm thresholds before code is written. Source: velune/cli/commands/run.py
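The first two failure modes can be reproduced with a small sketch. The names `CouncilState`, `is_budget_exhausted`, `planner_timeout_seconds`, and the error message come from the text; the fields `budget_seconds` and `started_at`, the free function `generate_plan`, and the use of `asyncio.wait_for` are assumptions made for illustration.

```python
import asyncio
import time
from dataclasses import dataclass, field


@dataclass
class CouncilState:
    budget_seconds: float            # hypothetical field
    planner_timeout_seconds: float
    started_at: float = field(default_factory=time.monotonic)  # hypothetical

    def is_budget_exhausted(self) -> bool:
        return time.monotonic() - self.started_at >= self.budget_seconds


async def generate_plan(state, call_model):
    """Guard the model call with a budget check and a timeout."""
    if state.is_budget_exhausted():
        raise ValueError("Wall-clock budget exhausted before Planner could run")
    # asyncio.wait_for raises asyncio.TimeoutError when the limit is hit.
    return await asyncio.wait_for(
        call_model(), timeout=state.planner_timeout_seconds
    )


async def fast_model():
    return "plan"


async def slow_model():
    await asyncio.sleep(1)
    return "plan"


print(asyncio.run(generate_plan(CouncilState(60, 1), fast_model)))
try:
    asyncio.run(generate_plan(CouncilState(60, 0.01), slow_model))
except asyncio.TimeoutError:
    print("planner timed out")
```

On Python 3.11 and later, `asyncio.TimeoutError` is the built-in `TimeoutError`; on older versions it is a separate class, so catching `asyncio.TimeoutError` works on both.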

See Also

  • CLI Commands & REPL — Interactive prompt and tab-completion surface
  • Provider Adapters — How ModelProvider and ModelDescriptor back each Council agent
  • Cognitive Firewall — Pre-Council prompt guard
  • Repository Cognition — How the Graph memory tier is built

Providers, MCP & Plugin Extensibility

Related topics: Repository Overview & System Architecture, Council Orchestration, Memory Tiers & Retrieval, CLI Reference, Session Modes & Operations

Velune is a terminal-first AI developer CLI whose value is delivered through three orthogonal extension surfaces: model providers, the Model Context Protocol (MCP), and a plugin loader. This page documents how each surface is structured in the source tree, how they are wired together, and what configuration knobs they expose.

Provider Subsystem

Adapters and Model Catalogs

Velune ships dedicated adapter modules under velune/providers/adapters/, one per upstream inference vendor. Each adapter is responsible for two things: enumerating a static catalog of ModelDescriptor records (capability levels, context length, cost, tags) and translating a generic InferenceRequest into the vendor-specific HTTP payload.

| Adapter | Notable models catalogued | Source |
| --- | --- | --- |
| NVIDIA NIM | Llama 3.1 70B, Mistral Large 2, Nemotron 70B | velune/providers/adapters/nvidia.py |
| Cohere | Command models with chat-history translation | velune/providers/adapters/cohere.py |
| Together AI | Qwen Coder, DeepSeek R1, Mistral 7B | velune/providers/adapters/together.py |
| Groq | Llama 3.x, Mixtral, Gemma 2 (free tier) | velune/providers/adapters/groq.py |
| Google | Gemini 2.0 Flash, Gemini 2.0 Flash Thinking | velune/providers/adapters/google.py |

Capabilities are scored on a CapabilityLevel enum (BASIC → INTERMEDIATE → ADVANCED → EXPERT) across dimensions such as coding, reasoning, planning, summarization, instruction_following, tool_use, and long_context. The Groq catalog, for example, marks gemma2-9b-it with free_tier=True, cost_per_1k_tokens=0.0, and speed_tier="fast" so the router can prefer it when cost matters. Source: velune/providers/adapters/groq.py.

Discovery vs. Static Catalogs

Two parallel directories coexist under velune/providers/:

  • adapters/ — hand-curated, opinionated catalogs with capability ratings already attached.
  • discovery/ — runtime probe logic that infers capabilities from model identifiers. Source: velune/providers/discovery/openai.py shows pattern-matching against substrings such as "gpt-4", "gpt-4o", and "gpt-3.5" to assign capability profiles when the vendor's own metadata is missing.

Discovery is used when the user enables a new model at runtime that the adapter has not pre-registered, keeping the system forward-compatible with new vendor releases without code changes.
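Substring-based discovery has one subtlety worth showing: "gpt-4" is a prefix of "gpt-4o", so the order of checks matters. The sketch below uses the three substrings quoted above; the `infer_profile` function, the profile labels, and the fallback value are placeholders invented for illustration.

```python
# Most specific pattern first, so "gpt-4o" is not shadowed by "gpt-4".
PATTERNS = [
    ("gpt-4o", "profile-gpt-4o"),
    ("gpt-4", "profile-gpt-4"),
    ("gpt-3.5", "profile-gpt-3.5"),
]
FALLBACK_PROFILE = "profile-conservative"  # hypothetical default


def infer_profile(model_id: str) -> str:
    """Return a placeholder profile label inferred from the model identifier."""
    name = model_id.lower()
    for pattern, profile in PATTERNS:
        if pattern in name:
            return profile
    return FALLBACK_PROFILE


for model_id in ("gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo", "brand-new-model"):
    print(model_id, "->", infer_profile(model_id))
```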

Task Classification and Routing

The router does not pick models arbitrarily. Before inference, velune/providers/task_classifier.py tags each prompt with keyword sets (CODING_KEYWORDS, REASONING_KEYWORDS, SUMMARIZATION_KEYWORDS, QUICK_PATTERNS) and returns a TaskProfile carrying task type, complexity, latency-sensitivity, and a long-context flag (total_tokens > 8000). The router then matches the profile against ModelDescriptor.capabilities to select the smallest, cheapest model that still meets the bar. Source: velune/providers/task_classifier.py.
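The "cheapest model that still meets the bar" rule can be sketched as a filter followed by a minimum. The enum levels, the `total_tokens > 8000` threshold, and the `cost_per_1k_tokens` attribute come from the text; the dataclass layouts, the `select_model` function, the default minimum level, and the catalog entries are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class CapabilityLevel(IntEnum):
    BASIC = 1
    INTERMEDIATE = 2
    ADVANCED = 3
    EXPERT = 4


@dataclass
class ModelDescriptor:
    model_id: str
    capabilities: dict  # dimension name -> CapabilityLevel
    cost_per_1k_tokens: float


@dataclass
class TaskProfile:
    task_type: str
    total_tokens: int

    @property
    def long_context(self) -> bool:
        return self.total_tokens > 8000


def select_model(profile, catalog, minimum=CapabilityLevel.INTERMEDIATE):
    """Pick the cheapest model whose capabilities meet the required levels."""
    needed = {profile.task_type: minimum}
    if profile.long_context:
        needed["long_context"] = minimum
    eligible = [
        model for model in catalog
        if all(
            model.capabilities.get(dim, CapabilityLevel.BASIC) >= level
            for dim, level in needed.items()
        )
    ]
    if not eligible:
        raise LookupError(f"no model satisfies {needed}")
    return min(eligible, key=lambda model: model.cost_per_1k_tokens)


L = CapabilityLevel
catalog = [  # invented entries, not real catalog ratings
    ModelDescriptor("tiny-free", {"coding": L.BASIC, "summarization": L.INTERMEDIATE}, 0.0),
    ModelDescriptor("mid-coder", {"coding": L.ADVANCED, "long_context": L.INTERMEDIATE}, 0.2),
    ModelDescriptor("big-expert", {"coding": L.EXPERT, "long_context": L.EXPERT}, 1.5),
]
print(select_model(TaskProfile("coding", 1200), catalog).model_id)
```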

Council Agent Integration

Selected providers back the specialized council roles. The Planner (velune/cognition/agents/planner.py) and Reviewer (velune/cognition/agents/reviewer.py) agents each accept a ModelDescriptor and ModelProvider at construction time and emit strict JSON via a system prompt that forbids Markdown wrapping. Planner requests a DAG-style TaskPlan; Reviewer returns a passed boolean plus critical_issues and a confidence_rating. This uniform contract is what allows the same router to mix-and-match council members across providers.

MCP Integration

Server and Client Surface

The README positions Velune as both an MCP server (velune mcp serve) and an MCP client, allowing it to expose its tools to other agents and consume tools from third-party servers. The CLI entry points live in velune/cli/commands/mcp.py. The connect subcommand accepts an SSE server_url and a name, reads the operator-configured mcp.allowed_hosts allowlist via ConfigLoader, and instantiates VeluneMCPClient(server_url, name, allowed_hosts=...) before printing the discovered tool list. Source: velune/cli/commands/mcp.py.
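An allowlist check of the kind applied before connecting could be as simple as comparing the URL's hostname to the configured list. The helper below is a hypothetical sketch using only the standard library; it is not the check performed inside VeluneMCPClient, and the hostnames are examples.

```python
from urllib.parse import urlparse


def is_host_allowed(server_url: str, allowed_hosts) -> bool:
    """Return True only for http(s) URLs whose hostname is on the allowlist."""
    parsed = urlparse(server_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    # Exact hostname match; a suffix or substring match would be unsafe.
    return parsed.hostname.lower() in {host.lower() for host in allowed_hosts}


allowed = ["tools.example.com"]
print(is_host_allowed("https://tools.example.com/sse", allowed))
print(is_host_allowed("https://tools.example.com.evil.net/sse", allowed))
```

Matching the full hostname exactly avoids the classic mistake where `tools.example.com.evil.net` passes a substring check.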

Trust and Project-Level Config

MCP servers can ship project-level configuration. To prevent silent execution of untrusted code, Velune requires the operator to explicitly opt in per directory. The velune trust add, velune trust forget, and velune trust list subcommands in velune/cli/commands/trust.py maintain a persistent allowlist. Until a directory is trusted, the REPL prints a hint pointing the user to velune trust add. Source: velune/cli/commands/trust.py.
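A persistent per-directory allowlist can be modelled as a small JSON file of resolved paths. The sketch below illustrates the add, forget, and list operations under stated assumptions: the `TrustStore` class, its file format, and its location are hypothetical and do not describe how velune/cli/commands/trust.py stores its data.

```python
import json
import tempfile
from pathlib import Path


class TrustStore:
    """Hypothetical JSON-backed allowlist of trusted project directories."""

    def __init__(self, store_file: Path):
        self.store_file = store_file

    def _load(self) -> set:
        if not self.store_file.exists():
            return set()
        return set(json.loads(self.store_file.read_text()))

    def _save(self, paths: set) -> None:
        self.store_file.parent.mkdir(parents=True, exist_ok=True)
        self.store_file.write_text(json.dumps(sorted(paths)))

    def add(self, directory: Path) -> None:
        self._save(self._load() | {str(Path(directory).resolve())})

    def forget(self, directory: Path) -> None:
        self._save(self._load() - {str(Path(directory).resolve())})

    def entries(self) -> list:
        return sorted(self._load())

    def is_trusted(self, directory: Path) -> bool:
        return str(Path(directory).resolve()) in self._load()


with tempfile.TemporaryDirectory() as tmp:
    store = TrustStore(Path(tmp) / "state" / "trusted.json")
    project = Path(tmp) / "project"
    project.mkdir()
    print(store.is_trusted(project))  # untrusted until added
    store.add(project)
    print(store.is_trusted(project))
```

Resolving paths before storing them means `./project` and its absolute form refer to the same entry.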

Community Acknowledgement

The community audit captured in Issue #9 describes Velune as having "one of the most mature MCP implementations" among open-source AI coding assistants, noting full client/server coverage. Subsequent releases (0.9.x line) have continued hardening the MCP path: 0.9.1 closed Windows path-handling defects and 0.9.3-beta.1 re-architected startup so cognition is on-demand rather than automatic, which directly affects when MCP tools become reachable inside a session.

Plugin Loader

The third extensibility surface is the plugin system. Per the README project layout, the plugins/ directory contains the declarative plugin loader, SKILL.md injection logic, and hook wiring. Plugins are how third parties contribute tools and skills without forking the core CLI. Source: README.md.

The 0.9.0 release notes are explicit that the plugin sandbox remains unimplemented or disabled for standard CLI operations, so plugins currently run with the same privileges as the Velune process itself. Operators should treat plugin installation with the same caution they apply to pip install-style trust decisions.

How the Three Surfaces Compose

flowchart LR
    A[User Prompt] --> B[task_classifier.py]
    B --> C{Provider Router}
    C --> D[Adapter: NVIDIA / Cohere / Together / Groq / Google]
    D --> E[InferenceResponse]
    C --> F[Council Agents: Planner / Reviewer]
    E --> G[Trust-aware REPL]
    F --> G
    G --> H[MCP Server outbound]
    G --> I[MCP Client inbound]
    G --> J[Plugin Loader / SKILL.md]

A user prompt is classified, routed to a provider-backed agent (optionally wrapped by the council), and the resulting action is mediated by the trust-aware REPL before it can reach MCP endpoints, inbound MCP tools, or locally installed plugins.

Configuration and Failure Modes

  • Lean install: Heavy provider-agnostic extras such as [rag], [parsing], [telemetry], [git], [gguf], and [docker] are opt-in. Removing them never breaks chat or tool use; it only degrades the affected feature. Source: README.md.
  • HTTP failures: Adapters surface upstream httpx.HTTPStatusError exceptions directly (see velune/providers/adapters/nvidia.py), so a 429 from a cloud vendor will bubble up unless a retry layer wraps it.
  • Untrusted MCP config: Until velune trust add <path> is invoked, project-level MCP servers are ignored. This is the primary defense for multi-repo workspaces.
  • Missing capabilities: When discovery/ cannot infer a profile for a new model identifier, the router falls back to a conservative capability set, which may cause it to select a more expensive model than necessary.
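Since adapters are described as surfacing HTTP status errors directly, a caller that wants resilience to 429 responses needs its own retry layer. The following is a minimal sketch of exponential backoff, not part of Velune: `with_retries` and the fake error classes are hypothetical. It reads `exc.response.status_code` by duck typing, which is the attribute layout `httpx.HTTPStatusError` uses, so `httpx` itself is not imported here.

```python
import random
import time


def with_retries(call, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry call() on HTTP 429 or 5xx, doubling the delay each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            response = getattr(exc, "response", None)
            status = getattr(response, "status_code", None)
            retryable = status == 429 or (status is not None and status >= 500)
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff with a little jitter.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))


# Stand-ins for an HTTP client error, used only for this demonstration.
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


class FakeHTTPError(Exception):
    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.response = FakeResponse(status_code)


attempts = []


def flaky_inference():
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeHTTPError(429)
    return "ok"


print(with_retries(flaky_inference, sleep=lambda seconds: None))
```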

CLI Reference, Session Modes & Operations

Related topics: Repository Overview & System Architecture, Council Orchestration, Memory Tiers & Retrieval, Providers, MCP & Plugin Extensibility

The Velune command-line interface is a terminal-first entry point for routing natural-language tasks and questions to the Reasoning Council. This page documents the user-facing commands, the three session modes, and the operational flow that connects the CLI to the planner, coder, and reviewer agents.

Overview

Velune exposes its capabilities through a Typer-based CLI that dispatches to async command implementations via the centralized event loop helper velune.core.event_loop.submit. Every command resolves a CLIContext from ctx.obj, then submits a coroutine to the loop. Source: velune/cli/commands/ask.py:43-50, velune/cli/commands/run.py:24-30.

The CLI is partitioned into four primary sub-commands:

| Sub-command | Purpose | File |
| --- | --- | --- |
| velune ask | Interactive prompt routing (no code execution) | velune/cli/commands/ask.py |
| velune run | Autonomous Council deliberation + sandbox execution | velune/cli/commands/run.py |
| velune session | List, resume, delete, or export chat sessions | velune/cli/commands/session.py |
| velune workspace | Initialize and explain the indexed workspace | velune/cli/commands/workspace.py |

Session Modes

Velune supports three session modes that balance speed, quality, and context budget. Source: README.md

| Mode | Slash Command | Council Tier | Model | Context Cap |
| --- | --- | --- | --- | --- |
| Normal | /normal | auto | current | 16 k tokens |
| Optimus | /optimus | instant | smallest | 4 k tokens |
| Godly | /godly | full | largest | 128 k tokens |

Switching modes is done mid-session through slash commands; the prompt badge updates immediately to reflect the active mode. Optimus prioritizes low latency and small context, Godly activates the full multi-agent council and uses the largest available model, and Normal is the default auto-tiered middle ground.
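The three modes amount to a lookup from slash command to a bundle of settings. The sketch below encodes the values listed for each mode; the `SessionMode` dataclass, the `switch_mode` helper, and the reading of "k" as 1,000 tokens are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionMode:
    name: str
    council_tier: str
    model_policy: str
    context_cap_tokens: int  # "k" interpreted as 1,000 (assumption)


MODES = {
    "/normal": SessionMode("Normal", "auto", "current", 16_000),
    "/optimus": SessionMode("Optimus", "instant", "smallest", 4_000),
    "/godly": SessionMode("Godly", "full", "largest", 128_000),
}


def switch_mode(command: str) -> SessionMode:
    """Resolve a slash command to its mode settings."""
    try:
        return MODES[command.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown mode command: {command!r}") from None


active = switch_mode("/godly")
print(f"[{active.name}] tier={active.council_tier} cap={active.context_cap_tokens}")
```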

Core Commands

`velune ask` — Interactive Prompt Routing

ask is the read-only entry point: it routes a natural-language question to the Council but never writes files or executes scripts. The command accepts an optional positional prompt and an optional --council-tier override (values: instant, standard, full). When the prompt is omitted, the command falls back to typer.prompt for interactive capture; in JSON mode it errors out instead. Source: velune/cli/commands/ask.py:18-42.
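The prompt-resolution rule described above (use the argument if given, fail in JSON mode, otherwise ask interactively) can be expressed as a small pure function. This is a sketch only: `resolve_prompt` is hypothetical, and the injectable `ask` callable stands in for `typer.prompt` so the example runs without Typer installed.

```python
def resolve_prompt(prompt, json_mode, ask=input):
    """Return the prompt to route, following the fallback rules for ask."""
    if prompt:
        return prompt
    if json_mode:
        # No interactive fallback when output must be machine-readable.
        raise SystemExit("a prompt argument is required in JSON mode")
    return ask("Prompt: ")


print(resolve_prompt("why is login slow?", json_mode=False))
print(resolve_prompt(None, json_mode=False, ask=lambda label: "typed interactively"))
```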

`velune run` — Autonomous Council Execution

run triggers the full Reasoning Council: the planner decomposes the task, the coder produces diffs, and the reviewer audits them before any modification is applied. The command exposes three options:

  • --dry-run / -d — deliberate but skip writes and execution
  • --force / -f — bypass human confirmation thresholds
  • --yes / -y — skip cost confirmation prompts for scripting

The --yes flag is propagated into cli_context.yes so downstream async helpers can read it. Source: velune/cli/commands/run.py:13-36.

`velune session` — Session Lifecycle

The session sub-command manages persisted chat history. The list subcommand supports --all/-a (show sessions from every workspace) and --limit/-n to cap the number of rows; it can emit JSON when cli_context.json_mode is active. Source: velune/cli/commands/session.py:18-52.

`velune workspace` — Repository Cognition

workspace explain runs the TechnologyDetector and ArchitectureDetector against the local index — without calling any AI provider — and renders a plain-English summary covering framework, routing, auth, state management, and entry points. It accepts --path/-p (default Path.cwd()) and --json for machine-readable output, and errors out with a non-zero exit code when no .velune directory exists. Source: velune/cli/commands/workspace.py:78-105.

Operation Flow

A velune run invocation flows through the Council in three stages, each guarded by the CouncilState budget. First, the planner decomposes the task into a JSON TaskPlan and respects state.remaining_budget_seconds(). Source: velune/cognition/agents/planner.py:55-75. Next, the coder drafts diffs using the model-family edit-format preference order (search_replace, whole_file, udiff) within state.budget.max_tokens_per_agent. Source: velune/cognition/agents/coder.py:120-145. Finally, the reviewer returns a ReviewDecision (passed/critical_issues/suggestions/confidence_rating) that gates downstream writes. Source: velune/cognition/agents/reviewer.py:30-45.

The TaskClassifier upstream of these agents inspects the prompt for keyword clusters (coding, reasoning, summarization, quick-question patterns) and emits a TaskProfile that the router uses to pick a council tier and check whether long context is required. Source: velune/providers/task_classifier.py:60-100.

flowchart LR
    A[velune run/ask] --> B[CLIContext]
    B --> C{TaskClassifier}
    C -->|profile| D[Council Tier]
    D --> E[PlannerAgent]
    E --> F[TaskPlan]
    F --> G[CoderAgent]
    G --> H[ReviewerAgent]
    H --> I[ReviewDecision]
    I -->|passed| J[Sandbox / Write]
    I -->|failed| K[Refine / Replan]

See Also

  • Reasoning Council and agent roles
  • Memory tiers and persistent storage
  • Provider adapters (OpenAI, Anthropic, Google, Groq, Cohere, NVIDIA, Together)
  • MCP server and client integration

Per the latest community discussion, Velune is noted for having one of the most mature MCP implementations among open-source AI coding assistants, and the council-based orchestration model is the central abstraction users interact with through the commands documented above.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 9 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Configuration risk - Configuration risk requires verification.

1. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.host_targets | https://github.com/Surya-Hariharan/Velune-CLI

2. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/Surya-Hariharan/Velune-CLI

3. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: packet_text.keyword_scan | https://github.com/Surya-Hariharan/Velune-CLI

4. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/Surya-Hariharan/Velune-CLI

5. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/Surya-Hariharan/Velune-CLI

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/Surya-Hariharan/Velune-CLI

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/Surya-Hariharan/Velune-CLI/issues/9

8. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/Surya-Hariharan/Velune-CLI

9. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/Surya-Hariharan/Velune-CLI

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 10 (count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Velune-CLI with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence