# https://github.com/EvoAgentX/EvoAgentX Project Manual

Generated at: 2026-06-27 06:44:06 UTC

## Table of Contents

- [Introduction & Getting Started](#page-1)
- [Core Architecture: Agents, Workflows & Tools](#page-2)
- [Self-Evolution: Optimizers & Evaluation](#page-3)
- [Advanced Capabilities: Memory, RAG, HITL & Storage](#page-4)

<a id='page-1'></a>

## Introduction & Getting Started

### Related Pages

Related topics: [Core Architecture: Agents, Workflows & Tools](#page-2), [Self-Evolution: Optimizers & Evaluation](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/EvoAgentX/EvoAgentX/blob/main/README.md)
- [Wonderful_workflow_corpus/README.md](https://github.com/EvoAgentX/EvoAgentX/blob/main/Wonderful_workflow_corpus/README.md)
- [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py)
- [evoagentx/agents/customize_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/customize_agent.py)
- [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py)
- [evoagentx/models/model_configs.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/model_configs.py)
- [evoagentx/utils/utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/utils.py)
- [evoagentx/utils/mipro_utils/signature_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/signature_utils.py)
- [evoagentx/utils/mipro_utils/module_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/module_utils.py)
</details>

# Introduction & Getting Started

EvoAgentX is a self-evolving agent framework that enables developers, researchers, and AI enthusiasts to build, evaluate, and evolve agentic workflows. This page introduces the project's scope, core abstractions, and the fastest path to a running workflow.

## What EvoAgentX Is

EvoAgentX ships a foundation for assembling LLM-powered agents, wiring them into workflows, and then optimizing the resulting pipelines automatically. According to the v0.1.0 release notes, the framework introduces "the foundation of the EvoAgentX ecosystem," including reusable agents, tools, and an optimizer loop for evolving prompt configurations over time.

The project is structured around three pillars:

- **Agents** — modular units that wrap an LLM, a system prompt, and a set of `Action` objects.
- **Workflows** — collections of `workflow.json` and `tools.json` files, runnable via a universal executor.
- **Optimizers** — components such as EvoPrompt that mutate prompts in a `ParamRegistry` and re-evaluate the program.

## Core Abstractions

### The `Agent` Base Class

The central abstraction is defined in [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py). An `Agent` carries a unique `name`, a `description`, an optional `LLMConfig`, an `llm` instance, a `system_prompt`, an optional short/long-term memory pair, and a list of `actions`. Every agent has an auto-generated `agent_id` and a `version` field that is incremented as the agent evolves (Source: [evoagentx/agents/agent.py:18-37]()).

### Custom Agents via `CustomizeAgent`

For most user-facing use cases, EvoAgentX exposes `CustomizeAgent`, which lets you declare `inputs`, `outputs`, `parse_mode`, `parse_func`, `tools`, and a `custom_output_format` without subclassing. Its constructor instantiates a single internal `customize_action` that owns the prompt template and tool list (Source: [evoagentx/agents/customize_agent.py:1-25]()). The agent then serializes itself into a configuration dictionary containing the prompt, input/output field metadata, tool names, and parser settings — useful for snapshotting and restoring optimized states (Source: [evoagentx/agents/customize_agent.py:25-50]()).

### LLM Output Parsing

LLM responses are funneled through `LLMOutputParser`, which supports five `parse_mode` values: `str`, `json`, `xml`, `title` (Markdown-style headings), and `custom`. The `title` mode splits content on headings such as `## field_name` and uses the last fenced code block within each section to extract a typed value (Source: [evoagentx/models/base_model.py:1-30]()). Structured outputs can additionally be constrained by a JSON Schema attached to the parser's `model_config`; validation is performed via `Draft7Validator`, with an opt-in `fix_json_schema_error` flag that attempts to repair payloads rather than raise (Source: [evoagentx/models/base_model.py:30-60]()).
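
To make the `title` mode concrete, the following is a minimal, standard-library-only sketch of heading-based splitting. It is not the `LLMOutputParser` implementation: the function name `parse_title_sections` and the sample text are hypothetical, and the sketch skips the fenced-code-block extraction and type conversion described above.

```python
import re
from typing import Dict


def parse_title_sections(content: str, title_format: str = "## {title}") -> Dict[str, str]:
    """Split text on headings such as '## field_name' into a {field: body} dict.

    Hypothetical helper for illustration only; not part of EvoAgentX.
    """
    # Build a line-anchored regex from the heading format, e.g. "## {title}".
    prefix, suffix = title_format.split("{title}")
    pattern = re.compile(
        r"^" + re.escape(prefix) + r"(?P<title>.+?)" + re.escape(suffix) + r"[ \t]*$",
        re.MULTILINE,
    )
    matches = list(pattern.finditer(content))
    sections: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # A section body runs from the end of its heading to the next heading.
        start = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
        sections[match.group("title").strip()] = content[start:end].strip()
    return sections


sample = "## thought\nCompare both options.\n\n## answer\n42\n"
parsed = parse_title_sections(sample)
print(parsed)
```

Every value in this sketch stays a string; converting `"42"` into an `int` would be a separate step driven by the declared output fields.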

## Quick Start

### 1. Install and Configure

The fastest on-ramp is one of the curated workflows shipped under `Wonderful_workflow_corpus/`. Each workflow directory contains a `workflow.json` (logic) and `tools.json` (tool definitions), and can be executed via a universal entry point (Source: [Wonderful_workflow_corpus/README.md:1-15]()).

Set your provider key first:

```bash
export OPENAI_API_KEY=sk-xxxxxx
# or place it in a .env file
```

### 2. Run a Workflow

```bash
python Wonderful_workflow_corpus/execute_workflow.py \
  --workflow Wonderful_workflow_corpus/arxiv_daily_digest/workflow.json \
  --goal "Please recommend the latest papers on multi-agent systems in NLP." \
  --output arxiv_digest.md
```

The same script is reused for the Recipe Generator, Tetris Game Generator, Travel Recommendation, and Feng Shui Advisor workflows (Source: [Wonderful_workflow_corpus/README.md:20-80]()). For the Stock Analysis pipeline, a dedicated `stock_analysis.py` entry point is provided (Source: [Wonderful_workflow_corpus/README.md:80-100]()).

### 3. Build a Custom Agent

For programmatic use, instantiate a `CustomizeAgent` with your preferred `LLMConfig`. Available configuration classes include `OpenAILLMConfig`, `AzureOpenAIConfig`, `LiteLLMConfig`, `OpenRouterLLMConfig`, and `AliyunLLMConfig` (Source: [evoagentx/models/model_configs.py:1-50]()). As of v0.1.2, `OpenAILLM` and `OpenRouterLLM` support sync/async generation and streaming usage tracking, while `AliyunLLM` and `SiliconFlowLLM` have been refactored onto OpenAI-compatible clients, which simplifies credential management for those providers.

## Architecture at a Glance

```mermaid
flowchart LR
    A[Goal / Input] --> B[CustomizeAgent]
    B --> C[Action + Prompt Template]
    C --> D[LLM via BaseLLM]
    D --> E[LLMOutputParser]
    E --> F[Structured Output]
    F --> G[Tools / Memory]
    G --> H[Optimizers]
    H -->|mutate prompts| C
```

The diagram summarizes the request lifecycle: a goal flows into a `CustomizeAgent`, which dispatches it to an `Action` whose prompt is rendered, sent to the configured `BaseLLM`, and then parsed back into typed fields before optional tool calls or memory updates fire (Source: [evoagentx/agents/customize_agent.py:1-15](), [evoagentx/models/base_model.py:1-30]()). Optimizers close the loop: EvoPrompt mutates prompts held in a `ParamRegistry`, while MIPRO tunes signatures that are constructed dynamically from a `MiproRegistry` (Source: [evoagentx/utils/mipro_utils/signature_utils.py:1-40]()).

## Known Caveats From the Community

A few issues are worth knowing before you start:

- **Optimizer support is evolving.** As of issue #220, the community has requested `alphaevolve`-style optimizers; check release notes for newly added backends before designing around a specific algorithm.
- **Concurrent optimization can race.** Issue #252 reports that EvoPrompt temporarily mutates a shared `ParamRegistry` while evaluating combinations concurrently. If you run heavy optimization locally, consider serializing evaluation.
- **Generated-code tools are strict.** Issue #255 notes that `GeneratedCodeTool` parses the entire stdout as JSON, so any extra `print` statements in generated code will break structured parsing.
- **Documentation gaps remain.** Issue #93 highlights that `docs/api` is incomplete, especially for custom benchmarks and optimizers — when in doubt, prefer reading the source modules directly (e.g., [evoagentx/utils/utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/utils.py) for parameter validation helpers).

## Where to Go Next

After running a sample workflow, the natural next steps are: (1) author a custom `CustomizeAgent` for your domain, (2) instrument it with tools and long-term memory, and (3) layer an optimizer over it to evolve prompts. Each subsequent wiki page dives into one of those areas in depth.

## See Also

- LLM Providers & Configuration
- Agents & Actions
- Workflow Corpus
- Optimizers (EvoPrompt, MIPRO)
- Tools & Memory

---

<a id='page-2'></a>

## Core Architecture: Agents, Workflows & Tools

### Related Pages

Related topics: [Introduction & Getting Started](#page-1), [Self-Evolution: Optimizers & Evaluation](#page-3), [Advanced Capabilities: Memory, RAG, HITL & Storage](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py)
- [evoagentx/agents/agent_manager.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent_manager.py)
- [evoagentx/agents/action_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/action_agent.py)
- [evoagentx/agents/customize_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/customize_agent.py)
- [evoagentx/agents/task_planner.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/task_planner.py)
- [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py)
- [evoagentx/models/model_configs.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/model_configs.py)
- [evoagentx/utils/utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/utils.py)
- [evoagentx/utils/mipro_utils/signature_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/signature_utils.py)
- [Wonderful_workflow_corpus/README.md](https://github.com/EvoAgentX/EvoAgentX/blob/main/Wonderful_workflow_corpus/README.md)
</details>

# Core Architecture: Agents, Workflows & Tools

## Overview

EvoAgentX is a self-evolving agent framework whose runtime core is composed of three layered abstractions: **Agents** (units that decide or act), **Workflows** (graphs that orchestrate agents), and **Tools** (typed external capabilities attached to agents). The `evoagentx/agents/` package defines the agent hierarchy, `evoagentx/models/` supplies the LLM backends and the structured-output parser used by LLM-backed agents, and `Wonderful_workflow_corpus/` ships ready-made `workflow.json` + `tools.json` pairs that demonstrate how those layers compose at runtime.

The framework supports multiple agent variants (LLM-driven, deterministic, human) and a uniform `AgentManager` for registering them into a workflow graph. v0.1.1 highlights confirm the emphasis on prompt templating and structured output parsing with JSON Schema support, and v0.1.2 refactored core LLM providers (`OpenAILLM`, `OpenRouterLLM`, `AliyunLLM`, `SiliconFlowLLM`) — both of which are leveraged by the agent layer.

## Agent Hierarchy

All concrete agents inherit from `Agent` defined in [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py). The base class carries identity, LLM binding, prompt, and memory fields:

| Field | Type | Purpose |
| --- | --- | --- |
| `name`, `description` | `str` | Unique identity used by `AgentManager` |
| `llm_config`, `llm` | `LLMConfig`, `BaseLLM` | Model binding (provider configs live in `evoagentx/models/model_configs.py`) |
| `system_prompt` | `Optional[str]` | Behavior instructions |
| `short_term_memory` | `ShortTermMemory` | Per-workflow scratchpad |
| `use_long_term_memory` | `bool` | Opt-in persistent memory |
| `actions` | `List[Action]` | What this agent can do |
| `is_human` | `bool` | True → no LLM; acts as a human-proxy |
| `version` | `int` | Used by optimizer/evolver layers |

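`init_module()` initializes `self.llm` (skipped for human-proxy agents), sets up long-term memory when `use_long_term_memory` is enabled, and resets `actions = []` so that subclasses can populate the list.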

```mermaid
flowchart TD
    A[Agent<br/>evoagentx/agents/agent.py] --> B[CustomizeAgent<br/>customize_agent.py]
    A --> C[ActionAgent<br/>action_agent.py]
    A --> D[TaskPlanner<br/>task_planner.py]
    B -.add_tools.-> T[Toolkit/Tool]
    C -.execute_func.-> F[User Callable]
    D --> P[TaskPlanning action]
    A --> M[AgentManager<br/>agent_manager.py]
    M --> W[Workflow Graph]
```

### CustomizeAgent (LLM-driven)

Defined in [evoagentx/agents/customize_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/customize_agent.py). This is the primary class for declaratively defining an LLM agent from `inputs`, `outputs`, and a `prompt` template. Its `parse_mode` defaults to `"title"`, with `"str"` and `"json"` among the alternatives (the parser's full set of modes is listed under LLM Integration & Structured Output below). It also accepts a `custom_output_format` and a custom `output_parser`. Tools are attached via `_add_tools(tools: List[Toolkit])`, and `customize_action_name` returns the name of the first action other than the built-in context-extraction (`cext`) action.

### ActionAgent (deterministic / human)

Defined in [evoagentx/agents/action_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/action_agent.py). It requires an `execute_func: Callable` and optionally accepts an `async_execute_func`. `is_human` is derived automatically from `llm_config`: when no config is supplied, the agent acts as a human proxy. The constructor checks that `execute_func` and, if provided, `async_execute_func` are callable before storing them.

### TaskPlanner

Defined in [evoagentx/agents/task_planner.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/task_planner.py). It defaults to `name`/`description`/`system_prompt` from the `TASK_PLANNER` prompt registry and a single `TaskPlanning()` action. `task_planning_action_name` returns the registered action name; this is the agent used to decompose a high-level goal into sub-tasks before they are dispatched.

## AgentManager & Workflow Composition

`AgentManager` in [evoagentx/agents/agent_manager.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent_manager.py) is the registry that turns a workflow graph into running agents:

- `add_agent(agent, llm_config=None, **kwargs)` — skips re-registration if the agent name already exists, then calls `create_agent`, appends to `self.agents`, and sets `agent_states[name] = AgentState.AVAILABLE`.
- `add_agents(agents, llm_config=None, **kwargs)` — convenience wrapper.
- `add_agents_from_workflow(workflow_graph, llm_config=None, ...)` — bulk registration from a declarative graph.
- Each registered agent gets a `threading.Condition` stored in `_state_conditions[agent_name]` so the manager can wake specific agents when their state changes.

This state-machine design is the substrate that workflow nodes use to schedule agents; combined with `ShortTermMemory`, it lets multiple agents cooperate inside one workflow execution.
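
The per-agent condition pattern described above can be sketched with the standard library alone. This is an illustrative stand-in, not the `AgentManager` code: the `StateBoard` class, its method names, and the `RUNNING` state are assumptions introduced for the example (only `AgentState.AVAILABLE` and `_state_conditions` are named in the source).

```python
import threading
from enum import Enum
from typing import Dict


class AgentState(Enum):
    AVAILABLE = "available"
    RUNNING = "running"  # hypothetical second state, used only in this sketch


class StateBoard:
    """Hypothetical registry that tracks one state and one Condition per agent."""

    def __init__(self) -> None:
        self.agent_states: Dict[str, AgentState] = {}
        self._state_conditions: Dict[str, threading.Condition] = {}

    def register(self, name: str) -> None:
        # Skip re-registration when the name is already known.
        if name in self.agent_states:
            return
        self.agent_states[name] = AgentState.AVAILABLE
        self._state_conditions[name] = threading.Condition()

    def set_state(self, name: str, state: AgentState) -> None:
        cond = self._state_conditions[name]
        with cond:
            self.agent_states[name] = state
            cond.notify_all()  # wake only the waiters interested in this agent

    def wait_until(self, name: str, state: AgentState, timeout: float = 1.0) -> bool:
        cond = self._state_conditions[name]
        with cond:
            return cond.wait_for(lambda: self.agent_states[name] is state, timeout=timeout)


board = StateBoard()
board.register("planner")
board.register("planner")  # second call is a no-op
board.set_state("planner", AgentState.RUNNING)

# Another thread frees the agent shortly afterwards; the main thread waits for it.
releaser = threading.Timer(0.05, board.set_state, args=("planner", AgentState.AVAILABLE))
releaser.start()
became_available = board.wait_until("planner", AgentState.AVAILABLE, timeout=2.0)
releaser.join()
print(became_available)
```

Keeping one `Condition` per agent, rather than a single global one, means a state change for one agent does not wake threads waiting on a different agent.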

## LLM Integration & Structured Output

LLM configurations live in [evoagentx/models/model_configs.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/model_configs.py), which exposes subclasses such as `OpenAILLMConfig`, `AzureOpenAIConfig` (with `azure_endpoint`, `azure_key`, `api_version="2024-12-01-preview"`), and `LiteLLMConfig` (for local servers like Ollama via `api_base` and `is_local`). Per-request generation controls include `temperature`, `top_p`, `response_format`, `modalities`, `logprobs`, and `top_logprobs`.

When an LLM responds, `LLMOutputParser` in [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py) converts the raw text into typed attributes. Supported `parse_mode` values are `'str'`, `'json'`, `'xml'`, `'title'` (default heading format `"## {title}"`), and `'custom'` (requires `parse_func(content) -> dict`). v0.1.1 added explicit JSON Schema validation on top of these modes. Tool-call normalization happens in [evoagentx/utils/utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/utils.py) via `format_tool_calls_to_eax_format`, which converts `ChatCompletionMessageToolCall` objects into `{id, function_name, function_args}` dicts that the workflow runtime can dispatch.
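
As a rough illustration of that normalization step, the sketch below maps tool-call objects onto `{id, function_name, function_args}` dicts. It does not reproduce `format_tool_calls_to_eax_format`: the `ToolCall` and `FunctionCall` dataclasses are hypothetical stand-ins for the OpenAI SDK types, and decoding the arguments into a dict (with a raw-string fallback) is an assumption about the target shape.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FunctionCall:
    """Stand-in for the `function` attribute of a chat-completion tool call."""
    name: str
    arguments: str  # JSON-encoded string, as the OpenAI chat API returns it


@dataclass
class ToolCall:
    """Stand-in for ChatCompletionMessageToolCall (illustration only)."""
    id: str
    function: FunctionCall


def normalize_tool_calls(tool_calls: List[ToolCall]) -> List[Dict[str, Any]]:
    """Convert tool-call objects into plain dicts a dispatcher could consume."""
    normalized: List[Dict[str, Any]] = []
    for call in tool_calls:
        try:
            # Assumption: arguments are decoded into a dict; empty means no arguments.
            args = json.loads(call.function.arguments or "{}")
        except json.JSONDecodeError:
            # Keep malformed arguments visible instead of dropping the call.
            args = {"_raw": call.function.arguments}
        normalized.append(
            {"id": call.id, "function_name": call.function.name, "function_args": args}
        )
    return normalized


calls = [
    ToolCall("call_1", FunctionCall("search", '{"query": "agents"}')),
    ToolCall("call_2", FunctionCall("ping", "")),
]
normalized = normalize_tool_calls(calls)
print(normalized)
```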

For prompt-optimization scenarios (MIPRO), [evoagentx/utils/mipro_utils/signature_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/signature_utils.py) builds typed `Signature` Pydantic models from a `MiproRegistry`, validating placeholders and emitting `InputField`/`OutputField` descriptors. Note the known issue from the community: EvoPrompt mutates a shared `ParamRegistry` while evaluating candidate combinations concurrently, which can contaminate state across workers ([Issue #252](https://github.com/EvoAgentX/EvoAgentX/issues/252)).

## Workflows & Tools in Practice

The `Wonderful_workflow_corpus/` directory (see [README](https://github.com/EvoAgentX/EvoAgentX/blob/main/Wonderful_workflow_corpus/README.md)) packages each workflow as a `workflow.json` plus `tools.json` pair and runs them through the universal `execute_workflow.py`:

```bash
python Wonderful_workflow_corpus/execute_workflow.py \
  --workflow Wonderful_workflow_corpus/tetris_game/workflow.json \
  --goal "Generate a playable Tetris game with scoring, level progression, and keyboard controls." \
  --output tetris.html
```

A workflow is a DAG of nodes; each node resolves to an agent registered in `AgentManager`, and tools attached to that agent are invoked through the same tool-call normalization path described above. The shipped corpus spans simple single-agent flows (Arxiv Daily Digest, Recipe Generator, Tetris, Travel Recommendation, Feng Shui Advisor) and the more complex multi-stage Invest / Stock Analysis pipeline. Community reports also surface that generated-code tools (e.g., Alita's `GeneratedCodeTool`) currently parse the entire executor stdout as JSON, which breaks structured outputs when the wrapped code prints any extra logging — see [Issue #255](https://github.com/EvoAgentX/EvoAgentX/issues/255). When authoring custom tools, prefer side-effect-free execution or push log output to stderr so the final `json.dumps(result)` line remains the only stdout payload.
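
The stderr advice can be sketched as follows. The wrapper and tool names are hypothetical and this is not how `GeneratedCodeTool` is implemented; the sketch only shows one way to keep a single JSON line on stdout when the wrapped function prints.

```python
import contextlib
import json
import sys


def noisy_tool(x: int) -> dict:
    """Hypothetical generated tool that logs with print()."""
    print("debug: starting computation")  # would corrupt a JSON-only stdout
    return {"result": x * 2}


def run_with_clean_stdout(func, *args) -> str:
    """Run func with its prints diverted to stderr and return one JSON string."""
    with contextlib.redirect_stdout(sys.stderr):
        result = func(*args)
    return json.dumps(result)


payload = run_with_clean_stdout(noisy_tool, 21)
print(payload)  # the only line written to stdout
```

Because the redirect is scoped to the `with` block, the final `print(payload)` still reaches the real stdout, where a JSON-only parser can read it.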

## See Also

- [Agents API Reference](./agents.md)
- [Workflows & Graphs](./workflows.md)
- [Tools & Toolkits](./tools.md)
- [LLM Providers & Output Parsing](./models.md)
- [EvoAgentX v0.1.1 Release Notes](https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.1)
- [EvoAgentX v0.1.2 Release Notes](https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.2)

---

<a id='page-3'></a>

## Self-Evolution: Optimizers & Evaluation

### Related Pages

Related topics: [Core Architecture: Agents, Workflows & Tools](#page-2), [Advanced Capabilities: Memory, RAG, HITL & Storage](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [evoagentx/optimizers/optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/optimizer.py)
- [evoagentx/optimizers/optimizer_core.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/optimizer_core.py)
- [evoagentx/optimizers/textgrad_optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/textgrad_optimizer.py)
- [evoagentx/optimizers/mipro_optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/mipro_optimizer.py)
- [evoagentx/optimizers/aflow_optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/aflow_optimizer.py)
- [evoagentx/optimizers/evoprompt_optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/evoprompt_optimizer.py)
- [evoagentx/utils/mipro_utils/signature_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/signature_utils.py)
- [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py)
- [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py)
</details>

# Self-Evolution: Optimizers & Evaluation

## Purpose & Scope

Self-Evolution is the core loop that distinguishes EvoAgentX from a static agent framework. It continuously improves agent behavior — primarily **prompts**, **workflows**, and **hyper-parameters** — by running candidate variants against an evaluator and keeping the variants that score highest. The Optimizers & Evaluation subsystem packages this loop into reusable building blocks so that any developer can plug in their own benchmark and let the framework search for better configurations automatically.

The subsystem lives under `evoagentx/optimizers/` and is composed of an abstract base plus four concrete strategies. Each strategy wraps the same outer pattern: propose a candidate → execute the agent with that candidate → score the result on a benchmark → record metrics → repeat until a budget is exhausted. Source: [evoagentx/optimizers/optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/optimizer.py).
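
The outer pattern can be sketched as a small hill-climbing loop. This is a toy under stated assumptions, not the `Optimizer` base class: `propose`, `evaluate`, `optimize`, and the `HINTS` list are hypothetical, and the "benchmark" is a keyword count rather than a real evaluator.

```python
import random
from typing import List, Tuple

# Hypothetical prompt fragments a proposer might append.
HINTS = ["Think step by step.", "Answer concisely.", "Cite sources."]


def propose(prompt: str, rng: random.Random) -> str:
    """Propose a candidate by appending one randomly chosen hint."""
    return f"{prompt} {rng.choice(HINTS)}"


def evaluate(prompt: str) -> float:
    """Toy score: the number of distinct hints present in the prompt."""
    return float(sum(hint in prompt for hint in HINTS))


def optimize(seed_prompt: str, budget: int, rng: random.Random) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Propose, score, record, and keep the best candidate until the budget runs out."""
    best, best_score = seed_prompt, evaluate(seed_prompt)
    history = [(best, best_score)]
    for _ in range(budget):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        history.append((candidate, score))  # record metrics for every trial
        if score > best_score:              # keep only strict improvements
            best, best_score = candidate, score
    return best, best_score, history


best_prompt, best_score, history = optimize("Solve the task.", budget=8, rng=random.Random(0))
print(best_score, len(history))
```

The concrete strategies differ mainly in how the proposal step works (textual feedback, Bayesian search, tree search, evolutionary recombination); the surrounding loop keeps roughly this shape.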

## Architecture Overview

The following diagram illustrates the data flow between an `Optimizer`, the underlying `Agent`, the `LLM`, and the evaluator. The optimizer mutates a candidate configuration, the agent executes a program, and the score returned by the evaluator feeds back into the optimizer's search state.

```mermaid
flowchart LR
    A[Optimizer] -->|candidate config| B[Agent / Workflow]
    B -->|prompts + tools| C[BaseLLM]
    C -->|raw text| B
    B -->|structured output| D[LLMOutputParser]
    D -->|parsed fields| E[Evaluator / Benchmark]
    E -->|score + feedback| A
    A -->|best config| F[(Storage / Registry)]
```

The `Agent` base class owns a list of `Action`s, an `llm_config`, and a `system_prompt`, all of which become optimization targets. Source: [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py). Parsed LLM output flows through `LLMOutputParser`, which supports `str`, `json`, `xml`, `title`, and `custom` modes; the `json` mode also accepts JSON-Schema definitions introduced in v0.1.1. Source: [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py).

## Optimizer Implementations

| Optimizer | Module | Strategy |
|-----------|--------|----------|
| TextGrad | `textgrad_optimizer.py` | Textual gradient descent — uses LLM-generated natural-language feedback to update prompts. |
| MIPRO | `mipro_optimizer.py` | Multiprompt Instruction Proposal Optimizer: Bayesian optimization over instructions and few-shot demos. |
| AFlow | `aflow_optimizer.py` | Monte Carlo tree search over candidate workflow structures. |
| EvoPrompt | `evoprompt_optimizer.py` | Evolutionary prompt combination search. |

All four inherit from a common `Optimizer` abstraction defined in [evoagentx/optimizers/optimizer.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/optimizers/optimizer.py) and share the lifecycle implemented in `optimizer_core.py`.

### MIPRO Internals

MIPRO is the most configuration-rich optimizer. It maintains a `MiproRegistry` that maps each register key to either a raw instruction string or a `PromptTemplate`. The utility `signature_from_registry` walks this registry and builds a DSPy-style `Signature` class whose input/output fields match the registry's declared names and descriptions. Source: [evoagentx/utils/mipro_utils/signature_utils.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/utils/mipro_utils/signature_utils.py).

A lower-level `ParamRegistry` also exists and is shared with EvoPrompt for holding tunable parameters, including the prompts that EvoPrompt mutates. The AST helper `_parse_type_node` converts signature strings such as `"x: List[int] -> y: str"` into real Python typing objects before constructing the Pydantic model.
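
The idea behind AST-based type resolution can be sketched with the standard `ast` module. This is not the `_parse_type_node` implementation: the function names, the whitelist of allowed type names, and the trick of parsing each side as a function parameter list are assumptions made for the example, which targets Python 3.9 or newer.

```python
import ast
from typing import Any, Dict, List, Optional, Tuple

# Only names in this whitelist are resolved; anything else is rejected.
_ALLOWED: Dict[str, Any] = {
    "int": int, "str": str, "float": float, "bool": bool,
    "List": List, "Dict": Dict, "Optional": Optional, "Tuple": Tuple,
}


def resolve_type(node: Optional[ast.AST]) -> Any:
    """Turn an annotation AST node into a typing object (Python 3.9+ AST layout)."""
    if isinstance(node, ast.Name):
        if node.id not in _ALLOWED:
            raise ValueError(f"unsupported type name: {node.id}")
        return _ALLOWED[node.id]
    if isinstance(node, ast.Subscript):
        base = resolve_type(node.value)
        index = node.slice
        if isinstance(index, ast.Tuple):  # e.g. Dict[str, int]
            return base[tuple(resolve_type(elt) for elt in index.elts)]
        return base[resolve_type(index)]  # e.g. List[int]
    raise ValueError("missing or unsupported annotation")


def parse_fields(side: str) -> Dict[str, Any]:
    """Parse 'a: T, b: U' by treating it as a function parameter list."""
    func = ast.parse(f"def _f({side}): pass").body[0]
    return {arg.arg: resolve_type(arg.annotation) for arg in func.args.args}


def parse_signature(signature: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split 'inputs -> outputs' and resolve the field types on both sides."""
    inputs, outputs = signature.split("->")
    return parse_fields(inputs.strip()), parse_fields(outputs.strip())


inputs, outputs = parse_signature("x: List[int] -> y: str")
print(inputs, outputs)
```

Resolving names through a whitelist instead of `eval` keeps arbitrary expressions in a signature string from being executed.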

### Customization Hooks

Developers can configure `CustomizeAgent` directly, or subclass it, to expose their own input/output fields and tool list to the optimizer, which then searches over the rendered prompt template rather than the raw text. Source: [evoagentx/agents/customize_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/customize_agent.py). The optimizer inspects the agent's declared `inputs`, `outputs`, `parse_mode`, and `parse_func` to know how to validate generated outputs before scoring them.

## Evaluation Pipeline

Evaluation is decoupled from generation. The optimizer hands a candidate configuration to the agent, the agent executes the workflow through the configured `BaseLLM`, and the returned text is parsed by `LLMOutputParser`. The parsed dict is then handed to a user-supplied metric function or to one of the built-in benchmark evaluators under `evoagentx/benchmark/`. The evaluator returns a scalar score and optional textual feedback; the optimizer uses both to drive its next proposal.

Because the parse layer is configurable, evaluators can rely on stable structured output even when the underlying model emits prose around the JSON payload — as long as a valid JSON object is present somewhere in the response, `parse_mode="json"` will extract it. Source: [evoagentx/models/base_model.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/base_model.py).
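
One way to pull a JSON object out of surrounding prose is to scan for candidate opening braces and let the standard decoder try each one. The sketch below illustrates that idea only; it is not the extraction logic in `base_model.py`, and `extract_first_json_object` is a hypothetical name.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[dict]:
    """Return the first decodable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        else:
            if isinstance(value, dict):
                return value
        start = text.find("{", start + 1)
    return None


reply = 'Sure! Here is the result: {"answer": 42, "ok": true} Hope that helps.'
extracted = extract_first_json_object(reply)
print(extracted)
```

An evaluator built on a helper like this should still handle the `None` case explicitly, since a response with no decodable object is a scoring signal in its own right.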

## Known Issues & Community Discussion

Several limitations have been raised by users; some have been addressed in releases, while others remain open:

- **Registry contamination in EvoPrompt.** Concurrent evaluation of candidate prompt combinations can mutate the shared `ParamRegistry` between threads, causing scores to be attributed to the wrong candidate. Tracked in [#252](https://github.com/EvoAgentX/EvoAgentX/issues/252).
- **Structured-output loss in generated tools.** When Alita-generated code prints extra stdout, the wrapper's `json.dumps(result)` is no longer the only output and parsing fails. Tracked in [#255](https://github.com/EvoAgentX/EvoAgentX/issues/255).
- **API documentation gaps.** Issue [#93](https://github.com/EvoAgentX/EvoAgentX/issues/93) highlights that documentation for custom benchmarks and custom optimizers is sparse, making it hard to extend the framework. The `signature_utils.py` file shows one extension pattern: subclass `Signature` via `create_model`.
- **Algorithm coverage.** Issue [#220](https://github.com/EvoAgentX/EvoAgentX/issues/220) requests integration of the AlphaEvolve evolutionary algorithm, which is not yet shipped.
- **Async & streaming stability.** v0.1.1 and v0.1.2 release notes record fixes for async generation, streaming token accounting, and tool-call output formatting that directly affect long optimizer runs.

## See Also

- [LLM Provider Layer](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/models/) — provider-specific configuration used by every optimizer run.
- [Agents & Actions](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/) — the units of optimization.
- [Workflow Corpus](https://github.com/EvoAgentX/EvoAgentX/blob/main/Wonderful_workflow_corpus/README.md) — example workflows that can be optimized end-to-end.

---

<a id='page-4'></a>

## Advanced Capabilities: Memory, RAG, HITL & Storage

### Related Pages

Related topics: [Core Architecture: Agents, Workflows & Tools](#page-2), [Self-Evolution: Optimizers & Evaluation](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py)
- [evoagentx/memory/memory.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/memory/memory.py)
- [evoagentx/memory/long_term_memory.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/memory/long_term_memory.py)
- [evoagentx/memory/memory_manager.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/memory/memory_manager.py)
- [evoagentx/agents/long_term_memory_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/long_term_memory_agent.py)
- [evoagentx/rag/rag.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/rag/rag.py)
- [evoagentx/rag/rag_config.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/rag/rag_config.py)
</details>

# Advanced Capabilities: Memory, RAG, HITL & Storage

EvoAgentX exposes a set of "advanced capabilities" that sit above the core agent-execution loop. They are not required for a single LLM call, but they enable persistent state, knowledge grounding, human oversight, and artifact persistence across multi-step workflows. This page documents the four primary capability surfaces — memory subsystems, retrieval-augmented generation (RAG), human-in-the-loop (HITL) interaction, and storage handling — as they are exposed through the `Agent` data model and its companion modules.

## 1. Agent-Level Capability Surface

The `Agent` BaseModel defines the integration points for every advanced capability. Memory, storage, and human-proxying are all first-class fields on the agent, meaning they are constructed alongside the agent rather than being added externally.

[evoagentx/agents/agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/agent.py) defines the following fields:

| Field | Type | Default | Purpose |
|---|---|---|---|
| `short_term_memory` | `ShortTermMemory` | `ShortTermMemory()` | Stores the running conversation for a single workflow |
| `use_long_term_memory` | `bool` | `False` | Opt-in switch for cross-workflow memory |
| `long_term_memory` | `LongTermMemory` | `None` | Persistent memory backend |
| `long_term_memory_manager` | `MemoryManager` | `None` | Controller for long-term memory operations |
| `storage_handler` | `StorageHandler` | `None` | Artifact/file persistence backend |
| `n` | `int` | `None` | Number of recent messages fed back into action execution; `None` uses the entire short-term buffer |
| `is_human` | `bool` | `False` | Marks this agent as a human proxy (HITL surrogate) |

The lifecycle method `Agent.init_module()` is responsible for wiring these up. It calls `init_llm()` (unless `is_human` is set), then conditionally initializes `long_term_memory` only when `use_long_term_memory` is `True`, and finally resets `actions = []` so the subclass can populate them.

Source: [evoagentx/agents/agent.py:init_module]().

```python
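def init_module(self):
    # Human-proxy agents never call a model, so LLM setup is skipped for them.
    if not self.is_human:
        self.init_llm()
    # Long-term memory is opt-in and is built only when the flag is set.
    if self.use_long_term_memory:
        self.init_long_term_memory()
    # Start from an empty action list so subclasses can register their own.
    self.actions = []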
```

This pattern (capability is a field, activation is a flag, initialization is gated) recurs across the agent-level capabilities described on this page.

## 2. Memory Subsystems

### 2.1 Short-Term Memory

`ShortTermMemory` is the default conversation buffer for an `Agent`. It is auto-instantiated via `Field(default_factory=ShortTermMemory)`, meaning every agent receives one even if the developer never references it. The `n` field on the agent controls how much of this buffer is replayed into the next action invocation; leaving `n=None` (the default) replays the full buffer.

Source: [evoagentx/agents/agent.py:short_term_memory]() and the `n` field description: *"number of latest messages used to provide context for action execution. It uses all the messages in short term memory by default."*
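
The windowing behavior of `n` can be sketched in a few lines. The helper name `recent_context` is hypothetical and the handling of non-positive `n` is an assumption; the source only states that `None` means the whole buffer is used.

```python
from typing import List, Optional


def recent_context(messages: List[str], n: Optional[int] = None) -> List[str]:
    """Return the last n messages, or the whole buffer when n is None."""
    if n is None:
        return list(messages)
    if n <= 0:
        # Assumption: a non-positive window yields no context.
        # Note that messages[-0:] would otherwise return the full list.
        return []
    return messages[-n:]


buffer = ["user: hi", "agent: hello", "user: plan a trip", "agent: where to?"]
print(recent_context(buffer, n=2))
```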

### 2.2 Long-Term Memory and Memory Manager

When `use_long_term_memory=True`, `Agent.init_module()` calls `init_long_term_memory()`, which constructs both a `LongTermMemory` store and a `MemoryManager` that mediates read/write/recall operations. The dedicated `LongTermMemoryAgent` ([evoagentx/agents/long_term_memory_agent.py](https://github.com/EvoAgentX/EvoAgentX/blob/main/evoagentx/agents/long_term_memory_agent.py)) wraps an LLM around these stores to perform extraction, consolidation, and retrieval on behalf of a parent agent.

Source: [evoagentx/agents/agent.py:long_term_memory, long_term_memory_manager, use_long_term_memory]().

The `memory.py` and `long_term_memory.py` modules contain the concrete storage and recall implementations, while `memory_manager.py` provides the orchestration interface used by both `Agent` and `LongTermMemoryAgent`. The v0.1.1 release notes list a serialization fix that is relevant here: *"Fixed module serialization and config restoration issues for nested `BaseModule`s."* It matters when long-term-memory state is persisted across sessions.

### 2.3 Community Note on Concurrency

Mutation of shared memory and registry state is a known concern. Issue #252 ("EvoPrompt concurrent evaluation can contaminate registry state") documents that shared mutable registries, the same class of object that `ParamRegistry` uses to drive prompt memory, are not yet safe under concurrent evaluation. Developers who wire long-term-memory managers into parallel evaluation pipelines should serialize registry access or run combinations serially until the upstream fix lands.
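One way to serialize registry access is to guard every read-modify-restore cycle with a single lock. The sketch below is an assumption-laden illustration: `registry`, `registry_lock`, and `evaluate_with_candidate` are hypothetical names, and the code does not use the `ParamRegistry` API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical shared mutable state standing in for a prompt registry.
registry: Dict[str, str] = {"prompt": "baseline"}
registry_lock = threading.Lock()


def evaluate_with_candidate(candidate: str, score_fn: Callable[[str], int]) -> int:
    """Install a candidate, score it, and restore the previous value."""
    with registry_lock:
        previous = registry["prompt"]
        registry["prompt"] = candidate
        try:
            # The score is computed while no other thread can mutate the registry.
            return score_fn(registry["prompt"])
        finally:
            registry["prompt"] = previous


candidates: List[str] = ["short", "a longer prompt", "mid-size"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # len() stands in for a real evaluation metric.
    scores = list(pool.map(lambda c: evaluate_with_candidate(c, len), candidates))
```

The lock makes evaluations effectively sequential, so this trades parallel speed for isolation. Giving each worker its own registry copy is an alternative when the registry can be copied cheaply.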

## 3. Retrieval-Augmented Generation (RAG)

RAG is exposed through the dedicated [evoagentx/rag/rag.py]() entry point with configuration centralized in [evoagentx/rag/rag_config.py](). The split mirrors the rest of EvoAgentX: behavior lives in `rag.py`, declarative settings live in `rag_config.py`. This lets developers version RAG pipelines (index choice, chunk size, retriever k, reranker model) alongside agent code without touching the retrieval logic.
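The split between declarative settings and retrieval behavior can be sketched in a few lines. Everything here is hypothetical: `SketchRAGConfig`, `chunk_text`, and `retrieve` are illustrative names, and the word-overlap scoring is a toy stand-in rather than the retrieval logic in `rag.py`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchRAGConfig:
    """Hypothetical declarative settings; not the EvoAgentX config class."""

    chunk_size: int = 5  # words per chunk
    top_k: int = 2       # number of chunks returned


def chunk_text(text: str, config: SketchRAGConfig) -> List[str]:
    """Split text into fixed-size word chunks."""
    words = text.split()
    step = config.chunk_size
    return [" ".join(words[i:i + step]) for i in range(0, len(words), step)]


def retrieve(query: str, chunks: List[str], config: SketchRAGConfig) -> List[str]:
    """Rank chunks by word overlap with the query (toy scoring)."""
    query_terms = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_terms & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:config.top_k]


config = SketchRAGConfig(chunk_size=5, top_k=1)
text = ("agents keep short term memory buffers while storage "
        "handlers persist durable workflow artifacts")
chunks = chunk_text(text, config)
top_chunks = retrieve("storage handlers", chunks, config)
```

Changing `chunk_size` or `top_k` alters the pipeline without touching `chunk_text` or `retrieve`, which is the versioning benefit the split is meant to provide.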

In typical usage the RAG module is attached to an `Action` or to a custom agent's prompt, providing grounding context that the LLM can cite. The v0.1.2 release notes state that the release *"upgrades EvoAgentX core LLM provider support and improves structured output handling"*, which benefits RAG workflows where citations and chunk metadata must be parsed back into typed objects.

## 4. Human-in-the-Loop and Storage

### 4.1 Human-in-the-Loop

HITL is modeled as a first-class agent type rather than a callback. The `is_human: bool = False` flag on `Agent` marks an instance as a human proxy. The `init_module()` method short-circuits LLM initialization when this flag is set (`if not self.is_human: self.init_llm()`), so a human-proxy agent can be placed anywhere an `Agent` is expected — including the same workflow graph as LLM-backed agents — without spawning a model client.

Source: [evoagentx/agents/agent.py:is_human, init_module]().

This design lets workflows pause for human input by simply routing the next turn to the human-proxy agent, then continuing with LLM agents once the human response is recorded into `short_term_memory`.
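The routing idea can be sketched as a loop that hands a turn either to a human callback or to a model stub, recording each reply in a shared buffer. `SketchTurnAgent`, `run_turns`, and `ask_human` are hypothetical names for illustration; the actual workflow engine is not shown in the source.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SketchTurnAgent:
    """Hypothetical agent record; only the is_human flag mirrors the source."""

    name: str
    is_human: bool = False


def run_turns(
    agents: List[SketchTurnAgent],
    short_term_memory: List[str],
    ask_human: Callable[[List[str]], str],
) -> List[str]:
    """Give each agent one turn and record its reply in the shared buffer."""
    for agent in agents:
        if agent.is_human:
            # Human proxy: no model call, the reply comes from the callback.
            reply = ask_human(short_term_memory)
        else:
            # Stub for an LLM-backed reply.
            reply = f"{agent.name} saw {len(short_term_memory)} message(s)"
        short_term_memory.append(f"{agent.name}: {reply}")
    return short_term_memory


history = run_turns(
    [
        SketchTurnAgent("planner"),
        SketchTurnAgent("reviewer", is_human=True),
        SketchTurnAgent("writer"),
    ],
    [],
    ask_human=lambda _history: "approved",
)
```

In interactive use, `ask_human` could wrap the built-in `input()`; a fixed lambda is used here so that the sketch runs unattended.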

### 4.2 Storage

`storage_handler: Optional[StorageHandler] = None` is the integration point for persistent artifacts — generated files, intermediate datasets, downloaded resources, or workflow outputs. Because it is `None` by default, it is fully opt-in; developers attach a concrete handler (file system, cloud bucket, etc.) only when the workflow needs to persist non-message state. Storage is intentionally decoupled from memory: memory holds conversational recall, storage holds durable artifacts, and the two can be combined or used independently.
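The opt-in behavior can be sketched with a small protocol and a `None` check. `SketchStorageHandler`, `InMemoryStorage`, and `persist_artifact` are hypothetical; the real `StorageHandler` interface is not shown in the source and may differ.

```python
from typing import Dict, Optional, Protocol


class SketchStorageHandler(Protocol):
    """Hypothetical minimal interface; not the EvoAgentX StorageHandler."""

    def save(self, key: str, data: bytes) -> None: ...


class InMemoryStorage:
    """Toy backend that keeps artifacts in a dictionary."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


def persist_artifact(
    handler: Optional[SketchStorageHandler], key: str, data: bytes
) -> bool:
    """Persist an artifact only when a handler has been attached."""
    if handler is None:
        # Storage is opt-in: with no handler, nothing is written.
        return False
    handler.save(key, data)
    return True


storage = InMemoryStorage()
saved = persist_artifact(storage, "report.txt", b"draft")
skipped = persist_artifact(None, "report.txt", b"draft")
```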

## 5. Capability Interaction

The four capabilities compose through the agent's fields. A typical advanced workflow might enable long-term memory for cross-session recall, attach a RAG module to an action for grounded answering, route clarification turns to an `is_human=True` agent, and use `storage_handler` to persist any downloaded evidence. Because each capability is a field with its own initialization guard, disabling one never interferes with the others.

```mermaid
flowchart LR
    A[Agent] --> STM[ShortTermMemory]
    A -->|use_long_term_memory=True| LTM[LongTermMemory]
    A --> MM[MemoryManager]
    A --> RAG[RAG Module]
    A -->|is_human=True| HITL[Human Proxy]
    A --> SH[StorageHandler]
    MM --> LTM
    RAG -->|grounding context| A
    HITL -->|user input| STM
    SH -->|artifacts| A
```

## See Also

- Agent base model and lifecycle: [evoagentx/agents/agent.py]()
- Custom agent construction: [evoagentx/agents/customize_agent.py]()
- LLM provider layer (LiteLLM, OpenAI, OpenRouter, Aliyun, SiliconFlow): [evoagentx/models/litellm_model.py](), [evoagentx/models/base_model.py](), [evoagentx/models/model_configs.py]()
- Release context: [v0.1.0 release notes](https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.0), [v0.1.1 release notes](https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.1), [v0.1.2 release notes](https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.2)
- Known concurrency limitation: [Issue #252 — EvoPrompt concurrent evaluation can contaminate registry state](https://github.com/EvoAgentX/EvoAgentX/issues/252)

---


## Pitfall Log

Project: EvoAgentX/EvoAgentX

Summary: 15 structured pitfall items were found, including 1 high-severity (blocking) item. Top priority: capability evidence risk, which requires verification.

## 1. Capability evidence risk - Capability evidence risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/EvoAgentX/EvoAgentX/issues/220

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: v0.1.0 – Initial Release
- User impact: Upgrade or migration may change expected behavior: v0.1.0 – Initial Release
- Evidence: failure_mode_cluster:github_release | https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.0

## 3. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.host_targets | https://github.com/EvoAgentX/EvoAgentX

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: [Bug] EvoPrompt concurrent evaluation can contaminate registry state
- User impact: Developers may misconfigure credentials, environment, or host setup: [Bug] EvoPrompt concurrent evaluation can contaminate registry state
- Evidence: failure_mode_cluster:github_issue | https://github.com/EvoAgentX/EvoAgentX/issues/252

## 5. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: v0.1.1 Release
- User impact: Upgrade or migration may change expected behavior: v0.1.1 Release
- Evidence: failure_mode_cluster:github_release | https://github.com/EvoAgentX/EvoAgentX/releases/tag/v0.1.1

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/EvoAgentX/EvoAgentX/issues/255

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/EvoAgentX/EvoAgentX/issues/252

## 8. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/EvoAgentX/EvoAgentX

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/EvoAgentX/EvoAgentX

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: `no_demo` (no demo was detected for this project).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/EvoAgentX/EvoAgentX

## 11. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: `no_demo` (no demo was detected for this project).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/EvoAgentX/EvoAgentX

## 12. Capability evidence risk - Capability evidence risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this capability risk before relying on the project: [Bug] Alita generated tools lose structured result when code prints extra stdout
- User impact: Developers may hit a documented source-backed failure mode: [Bug] Alita generated tools lose structured result when code prints extra stdout
- Evidence: failure_mode_cluster:github_issue | https://github.com/EvoAgentX/EvoAgentX/issues/255

## 13. Capability evidence risk - Capability evidence risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this conceptual risk before relying on the project: [Question] Update Optimizer?
- User impact: Developers may hit a documented source-backed failure mode: [Question] Update Optimizer?
- Evidence: failure_mode_cluster:github_issue | https://github.com/EvoAgentX/EvoAgentX/issues/220

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `issue_or_pr_quality=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/EvoAgentX/EvoAgentX

## 15. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `release_recency=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/EvoAgentX/EvoAgentX

