Doramagic Project Pack · Human Manual

testzeus-hercules

Getting Started with TestZeus Hercules

TestZeus Hercules is an open-source AI-powered end-to-end testing framework that leverages large language models (LLMs) to automate browser testing. It provides both interactive and non-interactive modes for executing automated tests against web applications.

Overview

TestZeus Hercules serves as an intelligent testing agent that can:

  • Navigate web pages and interact with UI elements
  • Generate and execute Gherkin-style test scenarios
  • Perform API security scanning using Nuclei
  • Parse accessibility trees for element identification
  • Execute Python scripts in a sandboxed environment

Sources: CONTRIBUTING.md

Architecture

graph TD
    A[User Input] --> B[CLI / Main Entry]
    B --> C[Global Configuration]
    C --> D[Navigation Agent]
    D --> E[Browser Controller]
    E --> F[CDP Stream Renderer]
    F --> G[Accessibility Tree]
    G --> D
    D --> H[Python Sandbox Executor]
    H --> I[Test Results]
    D --> J[API Security Scanner]
    J --> K[Nuclei Integration]

Core Components

| Component | Purpose |
| --- | --- |
| testzeus_hercules/__main__.py | Entry point handling bulk test execution |
| testzeus_hercules/config.py | Command-line argument parsing and configuration |
| testzeus_hercules/telemetry.py | Installation tracking and error reporting |
| testzeus_hercules/core/agents/executor_nav_agent.py | Navigation agent for browser automation |
| frontend/*/index.html | CDP stream rendering interfaces |

Sources: testzeus_hercules/__main__.py:25-45, testzeus_hercules/config.py

Installation

Prerequisites

  • Python 3.x
  • Git
  • Make

Setup Steps

  1. Fork the repository and clone your fork

```bash
git clone git@github.com:YOUR_GIT_USERNAME/testzeus-hercules.git
cd testzeus-hercules
git remote add upstream https://github.com/test-zeus-ai/testzeus-hercules
```

  2. Create a virtual environment

```bash
make virtualenv
source .venv/bin/activate
```

  3. Install in development mode

```bash
make install
```

Sources: CONTRIBUTING.md

Command-Line Interface

TestZeus Hercules provides extensive CLI options for configuration.

Basic Options

| Parameter | Type | Description |
| --- | --- | --- |
| --input-file | str | Path to the input file |
| --output-path | str | Path to the output directory |
| --test-data-path | str | Path to the test data directory |
| --project-base | str | Path to the project base directory |

Sources: testzeus_hercules/config.py
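As a hedged sketch, options like these could be declared with argparse (this is an assumption for illustration; the project's actual parser in config.py may be structured differently):

```python
import argparse


def build_basic_parser() -> argparse.ArgumentParser:
    """Illustrative declaration of the basic path options (not the real parser)."""
    parser = argparse.ArgumentParser(prog="testzeus_hercules")
    parser.add_argument("--input-file", type=str, help="Path to the input file")
    parser.add_argument("--output-path", type=str, help="Path to the output directory")
    parser.add_argument("--test-data-path", type=str, help="Path to the test data directory")
    parser.add_argument("--project-base", type=str, help="Path to the project base directory")
    return parser
```

argparse converts the dashed flag names to underscored attributes, so `--input-file` becomes `args.input_file`.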

LLM Configuration

| Parameter | Type | Description |
| --- | --- | --- |
| --llm-model | str | Name of the LLM model |
| --llm-model-api-key | str | API key for the LLM model |
| --llm-model-base-url | str | Base URL for the LLM API |
| --llm-model-api-type | str | Type of API (openai, anthropic, azure) |
| --llm-temperature | float | Temperature for LLM sampling (0.0-1.0) |
| --agents-llm-config-file | str | Path to agents LLM configuration file |

Sources: testzeus_hercules/config.py

Browser Options

| Parameter | Description |
| --- | --- |
| --browser-channel | Browser channel (e.g., chrome-beta, firefox-nightly) |
| --browser-path | Custom path to browser executable |
| --browser-version | Specific browser version (e.g., '114', '115.0.1', 'latest') |
| --enable-ublock | Enable uBlock Origin extension |
| --disable-ublock | Disable uBlock Origin extension |
| --auto-accept-screen-sharing | Automatically accept screen sharing prompts |

Sources: testzeus_hercules/config.py

Test Execution Options

| Parameter | Description |
| --- | --- |
| --bulk | Execute tests in bulk from the tests directory |
| --reuse-vector-db | Reuse an existing vector DB instead of creating a fresh one |

Sources: testzeus_hercules/config.py, testzeus_hercules/__main__.py:45-60

Portkey Integration

| Parameter | Description |
| --- | --- |
| --enable-portkey | Enable Portkey integration for LLM routing |
| --portkey-api-key | API key for Portkey |
| --portkey-strategy | Routing strategy (fallback or loadbalance) |

Sources: testzeus_hercules/config.py

Sandbox Configuration

| Parameter | Description |
| --- | --- |
| --sandbox-tenant-id | Tenant ID for sandbox isolation |

Sources: testzeus_hercules/config.py

Running TestZeus Hercules

Interactive Mode

Run the interactive CDP stream renderer with user input capabilities:

make run-interactive

This launches the frontend at frontend/interactive/index.html which provides:

  • Real-time screencast display
  • Crosshair cursor for element selection
  • Input capture for typing into the remote page

Sources: frontend/interactive/index.html

Non-Interactive Mode

Run tests without user interaction:

make run

This uses the non-interactive frontend at frontend/non-interactive/index.html which displays:

  • Connection status
  • Screencast output only

Sources: frontend/non-interactive/index.html

Bulk Execution

Execute multiple tests from a tests directory:

python -m testzeus_hercules --bulk

The system checks for a tests directory in the project source root and processes each test folder:

if get_global_conf().should_execute_bulk():
    project_base = get_global_conf().get_project_source_root()
    tests_dir = os.path.join(project_base, "tests")

Sources: testzeus_hercules/__main__.py:45-55
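The directory walk that follows can be pictured as a small helper. This is a self-contained sketch: the function name and iteration details are assumptions, only the should_execute_bulk / get_project_source_root calls appear in the source excerpt above.

```python
import os


def discover_bulk_tests(project_base: str) -> list[str]:
    """Return the per-test folder paths under <project_base>/tests (illustrative)."""
    tests_dir = os.path.join(project_base, "tests")
    if not os.path.isdir(tests_dir):
        return []
    # Each subdirectory of tests/ is treated as one test to execute.
    return sorted(
        os.path.join(tests_dir, entry)
        for entry in os.listdir(tests_dir)
        if os.path.isdir(os.path.join(tests_dir, entry))
    )
```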

Response Parsing

TestZeus Hercules includes a robust response parser for handling LLM outputs:

graph LR
    A[LLM Response] --> B{Is JSON?}
    B -->|Yes| C[Parse JSON]
    B -->|No| D[Extract Plan/Next Step]
    C --> E[Return Dict]
    D --> E

The parser handles:

  • JSON wrapped in ```json code blocks
  • Plain JSON responses
  • Fallback extraction for plan and next_step fields

Sources: testzeus_hercules/utils/response_parser.py
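A minimal, self-contained sketch of that fallback logic (the behaviors are taken from the list above; the function body itself is an assumption, not the project's actual implementation):

```python
import json
import re
from typing import Any


def parse_response(message: str) -> dict[str, Any]:
    """Best-effort parse of an LLM reply into a dict (illustrative sketch)."""
    text = message.strip()

    # 1. Unwrap a ```json ... ``` code fence if one is present.
    fence = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fence:
        text = fence.group(1).strip()

    # 2. Try to parse the remainder as plain JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 3. Fallback: scrape "plan" / "next_step" fields out of free text.
    result: dict[str, Any] = {}
    for key in ("plan", "next_step"):
        m = re.search(rf'"?{key}"?\s*:\s*"?([^"\n]+)"?', message)
        if m:
            result[key] = m.group(1).strip().rstrip(",")
    return result
```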

Telemetry and Installation Tracking

On first run, TestZeus Hercules generates a unique installation ID:

def get_installation_id(file_path: str = "installation_id.txt", is_manual_run: bool = True):
    if os.path.exists(file_path):
        # Load the existing installation ID from disk
        with open(file_path) as f:
            installation_id = f.read().strip()
    else:
        # First run: generate and persist a new installation ID
        installation_id = str(uuid.uuid4())
        with open(file_path, "w") as f:
            f.write(installation_id)
    return installation_id

Sources: testzeus_hercules/telemetry.py

Development Workflow

Code Quality

| Command | Purpose |
| --- | --- |
| make fmt | Format code using black & isort |
| make lint | Run pep8, black, mypy linters |
| make test | Run tests and generate coverage report |
| make watch | Run tests on every change |

Sources: CONTRIBUTING.md

Testing Requirements

  • Code coverage must remain at 100%
  • Add tests for all changes in your PR

make test

Sources: CONTRIBUTING.md

Release Process

  1. Make changes following the contribution guidelines
  2. Commit using conventional commit messages
  3. Run tests to ensure everything works
  4. Execute make release to create a new tag and push

CAUTION: make release modifies local changelog files and commits all unstaged changes.

Sources: CONTRIBUTING.md

Navigation Agent Execution

The executor navigation agent follows specific guidelines:

Execution Principles

  1. Error Review: Review previous step outcomes before proceeding
  2. Script Execution: Use execute_python_sandbox tool with access to page, browser, context, playwright_manager, logger, config
  3. Sequential Execution: Execute one script at a time and await results
  4. Validation: Check for successful execution status before proceeding

Sources: testzeus_hercules/core/agents/executor_nav_agent.py

API Security Scanning

TestZeus Hercules integrates with Nuclei for API security testing:

async def run_nuclei_command(
    is_open_api_spec: bool,
    open_api_spec_path: Optional[str],
    target_url: Optional[str],
    tag: str,
    output_file: Path,
    headers: Optional[List[Tuple[str, str]]] = None,
):

Sources: testzeus_hercules/core/tools/api_sec_calls.py

Helper Scripts

CDP Journey Script

Generate test cases from journey data:

python helper_scripts/cdp_journey_script.py --number_of_testcase 5

This produces Gherkin specifications and test data files from JSON journey definitions.

Sources: helper_scripts/cdp_journey_script.py

API Functional Gherkin Test Generator

Generate Gherkin test cases from OpenAPI specifications:

python helper_scripts/generate_api_functional_gherkin_test.py spec.yaml --output ./features --number_of_testcase 100

Sources: helper_scripts/generate_api_functional_gherkin_test.py

Accessibility Tree Processing

TestZeus Hercules processes DOM elements to generate accessibility trees:

  • Identifies interactive elements (buttons, links, inputs)
  • Detects draggable elements
  • Filters out non-interactive elements
  • Provides detailed element metadata for the AI agent

Sources: testzeus_hercules/utils/get_detailed_accessibility_tree.py
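A toy version of the filtering step above (the tag names come from the bullet list; the helper and its data shape are assumptions — the real implementation inspects the live DOM):

```python
from typing import Dict, List

# Tags the bullet list identifies as interactive (illustrative subset).
INTERACTIVE_TAGS = {"button", "a", "input", "textarea", "select"}


def filter_interactive(elements: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only elements an agent can act on: interactive tags or draggables."""
    return [
        el for el in elements
        if el.get("tag") in INTERACTIVE_TAGS or el.get("draggable") == "true"
    ]
```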

Summary

TestZeus Hercules provides a comprehensive end-to-end testing solution with:

  • AI-powered browser automation via LLM integration
  • Flexible deployment (interactive and non-interactive modes)
  • Extensive CLI configuration options
  • Built-in support for bulk test execution
  • API security scanning capabilities
  • Gherkin test generation from various sources

Sources: CONTRIBUTING.md

System Architecture

Related topics: Agent System, Tool System

Overview

TestZeus Hercules is an open-source AI agent framework designed for end-to-end testing of web applications. The system leverages Large Language Models (LLMs) to orchestrate browser automation through Playwright, enabling natural language-driven test execution without requiring users to write traditional test scripts.

The architecture follows a modular design pattern with clear separation between:

  • Core execution engine
  • Agent-based navigation and task handling
  • Python sandbox environment for script execution
  • Frontend visualization components
  • Configuration and telemetry systems

Sources: testzeus_hercules/__main__.py

Source: https://github.com/test-zeus-ai/testzeus-hercules / Human Manual

Agent System

Related topics: System Architecture, Memory Management, LLM Configuration

Overview

The Agent System is the core orchestration layer of the Hercules testing framework. It implements a multi-agent architecture where specialized agents collaborate to execute end-to-end testing scenarios across web browsers, APIs, databases, and other system components.

Sources: testzeus_hercules/core/agent_registry.py:1-50

Architecture Overview

The system follows a hierarchical agent design where a central planner coordinates specialized navigation agents, each responsible for a specific domain of interaction.

graph TD
    A[HighLevelPlannerAgent] --> B[BrowserNavAgent]
    A --> C[ApiNavAgent]
    A --> D[SqlNavAgent]
    A --> E[McpNavAgent]
    A --> F[SecNavAgent]
    A --> G[TimeKeeperNavAgent]
    B --> H[ExecutorNavAgent]
    C --> H
    D --> H
    E --> H
    F --> H
    G --> H
    
    H --> I[Browser/API/SQL/MCP/Security]

Sources: testzeus_hercules/core/simple_hercules.py:1-100

Agent Types

High-Level Planner Agent

The HighLevelPlannerAgent serves as the central coordinator that receives high-level test instructions and decomposes them into executable steps for specialized agents.

Key Responsibilities:

  • Parsing test instructions and generating execution plans
  • Routing tasks to appropriate specialized agents
  • Aggregating results and handling test completion
  • Managing assertions and validating expected outcomes

Sources: testzeus_hercules/core/agents/high_level_planner_agent.py:1-80

Browser Navigation Agent

The BrowserNavAgent handles all browser-based interactions including page navigation, element interaction, and DOM manipulation.

Capabilities:

  • Web page navigation and URL handling
  • Element clicking and text input
  • Screenshot capture and visual validation
  • Cookie and session management

Sources: testzeus_hercules/core/agents/browser_nav_agent.py:1-100

API Navigation Agent

The ApiNavAgent manages HTTP-based interactions for testing RESTful APIs and web services.

Capabilities:

  • HTTP request construction and execution
  • Response validation and assertion
  • Authentication handling (OAuth, API keys, Bearer tokens)
  • Multi-step API workflows

Sources: testzeus_hercules/core/agents/api_nav_agent.py:1-100

SQL Navigation Agent

The SqlNavAgent handles database interactions for data validation and setup during test execution.

Capabilities:

  • SQL query execution
  • Database connection management
  • Result set validation
  • Test data preparation and teardown

Sources: testzeus_hercules/core/agents/sql_nav_agent.py:1-100

MCP Navigation Agent

The McpNavAgent provides Model Context Protocol integration for interacting with external AI models and tools.

Capabilities:

  • MCP server connection management
  • Tool invocation through MCP protocol
  • Context propagation for AI-assisted testing

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-100

Security Navigation Agent

The SecNavAgent handles security-related testing scenarios including authentication flows, authorization checks, and vulnerability scanning.

Capabilities:

  • Authentication flow testing
  • Session security validation
  • Authorization boundary testing
  • Security header verification

Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-100

Time Keeper Navigation Agent

The TimeKeeperNavAgent manages time-related test scenarios including scheduling, delays, and time-based assertions.

Capabilities:

  • Time-based test scheduling
  • Delay and timeout management
  • Timestamp validation
  • Scheduled task execution

Sources: testzeus_hercules/core/agents/time_keeper_nav_agent.py:1-100

Executor Navigation Agent

The ExecutorNavAgent serves as the execution engine that runs Python scripts and commands within a sandboxed environment.

Key Features:

  • Python script execution in isolated sandbox
  • Dynamic module injection based on tenant configuration
  • Access to browser context, page objects, and configuration
  • Custom injection support for tenant-specific utilities

Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-150

Agent Registry

The AgentRegistry provides a centralized registration and lookup mechanism for all agents in the system.

Registry Operations

| Operation | Description |
| --- | --- |
| register_agent(name, agent) | Register a new agent with a unique name |
| get_agent(name) | Retrieve an agent by name |
| list_agents() | List all registered agents |
| remove_agent(name) | Remove an agent from the registry |

Sources: testzeus_hercules/core/agent_registry.py:50-100
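A minimal sketch of such a registry (the method names come from the table above; the internals are assumptions, not the project's actual code):

```python
from typing import Any, Dict, List, Optional


class AgentRegistry:
    """Illustrative registry mapping unique names to agent instances."""

    def __init__(self) -> None:
        self._agents: Dict[str, Any] = {}

    def register_agent(self, name: str, agent: Any) -> None:
        # Names are unique; re-registration is treated as an error here.
        if name in self._agents:
            raise ValueError(f"Agent {name!r} is already registered")
        self._agents[name] = agent

    def get_agent(self, name: str) -> Optional[Any]:
        return self._agents.get(name)

    def list_agents(self) -> List[str]:
        return list(self._agents)

    def remove_agent(self, name: str) -> None:
        self._agents.pop(name, None)
```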

Agent Creation Flow

The SimpleHercules class orchestrates agent creation with the following workflow:

sequenceDiagram
    participant SH as SimpleHercules
    participant Planner as HighLevelPlannerAgent
    participant Nav as Navigation Agents
    participant Exec as ExecutorNavAgent
    
    SH->>SH: Initialize configuration
    SH->>Planner: Create planner agent
    SH->>Nav: Create navigation agents (Browser, API, SQL, etc.)
    SH->>Exec: Create executor agent
    SH->>SH: Register all agents in registry
    Planner->>Nav: Route tasks based on type
    Nav->>Exec: Execute concrete actions

Sources: testzeus_hercules/core/simple_hercules.py:100-200

Message Flow

Agents communicate through a structured message passing system with the following message types:

| Message Type | Purpose |
| --- | --- |
| PLAN | Initial test plan and steps |
| STEP | Individual test step execution |
| INFO | Informational messages |
| ASSERT | Assertion results |
| COMPLETED | Task completion notification |
| TERMINATED | Agent termination signal |

Sources: testzeus_hercules/core/simple_hercules.py:200-300
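These message types could be modeled as a simple enum (a sketch only; how simple_hercules.py actually represents them may differ):

```python
from enum import Enum


class MessageType(str, Enum):
    """Illustrative message types exchanged between agents."""
    PLAN = "PLAN"              # initial test plan and steps
    STEP = "STEP"              # individual test step execution
    INFO = "INFO"              # informational messages
    ASSERT = "ASSERT"          # assertion results
    COMPLETED = "COMPLETED"    # task completion notification
    TERMINATED = "TERMINATED"  # agent termination signal
```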

Configuration

LLM Model Configuration

Each agent supports individual LLM model configuration:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model name (e.g., gpt-4, claude-3) |
| temperature | float | Sampling temperature (0.0-1.0) |
| max_tokens | int | Maximum response tokens |
| api_key | string | API authentication key |
| base_url | string | Custom API endpoint URL |

Sources: testzeus_hercules/config.py:1-80

Agent-Specific Settings

# Example agent configuration structure
agent_config = {
    "model_config_params": {
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 2000
    },
    "llm_config_params": {
        "timeout": 60,
        "retry_attempts": 3
    },
    "other_settings": {
        "system_prompt": "You are a testing agent...",
        "max_chat_rounds": 10
    }
}

Response Parsing

The system uses parse_response() from the response parser module to extract structured data from agent outputs:

def parse_response(message: str) -> dict[str, Any]:
    """Extract structured data from an agent message.

    Handles JSON extraction from markdown code blocks,
    normalizes newlines and whitespace, and extracts
    the plan and next_step fields.
    """

Sources: testzeus_hercules/utils/response_parser.py:1-60

Sandbox Execution

The ExecutorNavAgent provides a secure Python execution environment with configurable module injection:

Available Injections

| Module | Description |
| --- | --- |
| playwright | Browser automation library |
| requests | HTTP client library |
| beautifulsoup4 | HTML parsing |
| hercules_utils | Project utility functions |
| Custom packages | Configured via SANDBOX_PACKAGES |

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:1-100

Sandbox Context Variables

Scripts executed in the sandbox have automatic access to:

| Variable | Type | Description |
| --- | --- | --- |
| page | Playwright Page | Current browser page |
| browser | Playwright Browser | Browser instance |
| context | Playwright Context | Browser context |
| playwright_manager | Manager | Playwright management |
| logger | Logger | Logging interface |
| config | Config | Global configuration |
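The injection mechanism itself can be sketched with exec() over a prepared namespace. This is purely illustrative: the real executor wraps this in isolation and injects live Playwright objects rather than the stand-in logger used here.

```python
import logging
from typing import Any, Dict


def run_in_sandbox(script: str, context_vars: Dict[str, Any]) -> Dict[str, Any]:
    """Execute a script with injected context variables; return its namespace."""
    namespace: Dict[str, Any] = dict(context_vars)
    # The real sandbox adds isolation and resource limits around this step.
    exec(script, namespace)
    return namespace


# The executor would pass page, browser, context, etc.; here we inject a logger.
ns = run_in_sandbox(
    "logger.info('hello'); result = 2 + 2",
    {"logger": logging.getLogger("sandbox")},
)
```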

Command-Line Interface

The agent system supports configuration via command-line arguments:

| Argument | Description |
| --- | --- |
| --llm-model | Specify LLM model name |
| --llm-temperature | Set sampling temperature |
| --agents-llm-config-file | Path to agent config file |
| --enable-portkey | Enable Portkey routing |
| --browser-channel | Browser channel selection |
| --reuse-vector-db | Reuse existing vector database |

Sources: testzeus_hercules/config.py:80-150

Error Handling

Agents implement robust error handling with:

  1. Termination Message Check: Each agent validates termination conditions via is_xxx_termination_message() functions
  2. Tool Call Monitoring: Tracks pending tool calls to prevent premature termination
  3. Graceful Degradation: Continues execution with alternative approaches on failure
  4. Logging: Comprehensive logging for debugging and audit trails

Sources: testzeus_hercules/core/simple_hercules.py:300-400

Browser Automation

Related topics: Tool System

Overview

Browser Automation in TestZeus Hercules provides a comprehensive framework for controlling web browsers through Playwright, enabling autonomous agents to perform complex web interactions, testing, and data extraction tasks. The system acts as a bridge between LLM-powered agents and real browser instances, translating natural language instructions into precise DOM manipulations.

The automation layer handles multiple browser types (Chromium, Firefox, WebKit), manages browser contexts with device emulation, supports cloud-based testing platforms via CDP (Chrome DevTools Protocol) tunneling, and provides sophisticated DOM interaction tools including accessibility-aware element selection and real-time mutation observation.

Sources: testzeus_hercules/core/playwright_manager.py:1-50

Architecture

System Components

graph TD
    A[SimpleHercules Core] --> B[PlaywrightManager]
    B --> C[Browser Context]
    C --> D[Browser Instance<br/>Chromium / Firefox / WebKit]
    B --> E[Tool Registry]
    E --> F[Navigation Tools]
    E --> G[Interaction Tools]
    E --> H[Extraction Tools]
    B --> I[DOM Mutation Observer]
    B --> J[BrowserLogger]
    J --> K[Interaction Logs]
    K --> L[Accessibility Tree]
    L --> M[AccessibilityInfo]

Core Components

| Component | File | Purpose |
| --- | --- | --- |
| PlaywrightManager | core/playwright_manager.py | Central browser lifecycle management |
| BrowserLogger | core/browser_logger.py | Interaction logging and proof generation |
| DOMHelper | utils/dom_helper.py | DOM state management and waiting |
| AccessibilityTree | utils/get_detailed_accessibility_tree.py | Extract and format accessibility information |
| ToolRegistry | core/tools/tool_registry.py | Dynamic tool registration and routing |

Sources: testzeus_hercules/core/playwright_manager.py, testzeus_hercules/utils/dom_helper.py

Tool System

Related topics: Browser Automation

Overview

The Tool System is the core execution layer of TestZeus Hercules, providing AI agents with capabilities to interact with web pages through a unified, decorator-based interface. The system abstracts browser automation operations (clicking, typing, hovering, dragging, etc.) into discrete, callable tools that agents can invoke during test execution.

Tools serve as the fundamental building blocks that bridge natural language agent instructions with Playwright browser automation. Each tool encapsulates a specific browser interaction pattern, handles error cases gracefully, and returns structured results that agents can parse and respond to.

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-50

Architecture

System Components

graph TD
    subgraph "Agent Layer"
        A["Browser Nav Agent"]
        B["Executor Nav Agent"]
    end
    
    subgraph "Tool Registry"
        C["tool_registry.py"]
        D["Tool Decorator"]
    end
    
    subgraph "Core Browser Tools"
        E["click_using_selector"]
        F["enter_text_using_selector"]
        G["hover"]
        H["drag_and_drop_tool"]
        I["get_interactive_elements"]
    end
    
    subgraph "Extra Tools"
        J["browser_assist_tools"]
        K["accessibility_calls"]
        L["upload_file"]
    end
    
    subgraph "Browser Automation"
        M["Playwright Manager"]
        N["Page Object"]
    end
    
    A --> C
    B --> C
    C --> E
    C --> F
    C --> G
    C --> H
    C --> I
    C --> J
    J --> K
    J --> L
    E --> M
    F --> M
    G --> M
    M --> N

Tool Categories

| Category | Purpose | Location | Example Tools |
| --- | --- | --- | --- |
| Core Browser Tools | Primary page interactions | testzeus_hercules/core/tools/ | click, enter_text, hover, drag_drop |
| Extra Tools | Auxiliary functionality | testzeus_hercules/core/extra_tools/ | accessibility, browser_assist |
| Upload Tools | File handling | testzeus_hercules/core/tools/ | upload_file |

Tool Definition Pattern

The `@tool` Decorator

All tools in the system are defined using the @tool decorator, which registers the function with the tool registry and provides metadata for agent consumption.

from functools import wraps
from typing import Annotated, Any, Dict, List, Optional

def tool(agent_names: List[str], description: str, name: str):
    """Decorator to register a function as a callable tool."""
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            return await func(*args, **kwargs)
        wrapper._tool_config = {
            "agent_names": agent_names,
            "description": description,
            "name": name,
        }
        return wrapper
    return decorator

Tool Registration Metadata

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| agent_names | List[str] | List of agent identifiers that can call this tool | Yes |
| description | str | Human-readable description for the LLM | Yes |
| name | str | Unique identifier for the tool | Yes |

Example usage:

@tool(
    agent_names=["browser_nav_agent"],
    description="Click on an element using selector",
    name="click"
)
async def click_element(selector: Annotated[str, "CSS selector"]) -> dict:
    # Implementation
    pass

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30

Core Browser Tools

Click Tool (`click_using_selector`)

The primary interaction tool for clicking on page elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to click on an element in the DOM.",
    name="click"
)
async def click_using_selector(
    selector: Annotated[
        str,
        "md attribute value of the dom element to interact, md is an ID"
    ],
    click_type: Annotated[
        Optional[str],
        "type of click - left, right, double, mouseover, mouseenter, mouseleave, mouseexit, mousedown, mouseup"
    ] = "left",
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the click operation"]

Execution Flow:

sequenceDiagram
    participant Agent
    participant ClickTool
    participant PlaywrightManager
    participant Page
    participant SelectorLogger

    Agent->>ClickTool: click_using_selector(selector, click_type)
    ClickTool->>PlaywrightManager: find_element(selector)
    PlaywrightManager->>Page: locator(selector).first
    Page-->>PlaywrightManager: ElementHandle
    PlaywrightManager-->>ClickTool: element
    
    alt Element not found
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end
    
    ClickTool->>Page: element.scroll_into_view_if_needed()
    ClickTool->>Page: element.is_visible()
    
    alt Element not visible
        ClickTool-->>Agent: Try another element response
    end
    
    ClickTool->>SelectorLogger: get_alternative_selectors()
    ClickTool->>SelectorLogger: get_element_attributes()
    ClickTool->>Page: element.click(click_type)
    
    alt Click success
        ClickTool->>SelectorLogger: log_selector_interaction(success=True)
        ClickTool-->>Agent: Success response
    else Click fails
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end

Error Handling:

The click tool performs comprehensive error handling:

  1. Element Not Found: Logs selector interaction with success=False and raises ValueError
  2. Element Not Visible: Returns alternative suggestion response
  3. Scroll Failure: Gracefully continues (non-blocking)
  4. Click Execution Failure: Logs failure and returns error dictionary

if element is None:
    await selector_logger.log_selector_interaction(
        tool_name="click",
        selector=selector,
        action=type_of_click,
        selector_type="css" if "md=" in selector else "custom",
        success=False,
        error_message=f'Element with selector: "{selector}" not found',
    )
    raise ValueError(f'Element with selector: "{selector}" not found')

Sources: testzeus_hercules/core/tools/click_using_selector.py:40-80

Enter Text Tool (`enter_text_using_selector`)

Handles text input into form fields and contenteditable elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to enter the given text into an input field or a contenteditable element.",
    name="enter_text"
)
async def enter_text_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    text_to_fill: Annotated[str, "text to enter into the element"],
    submit: Annotated[Optional[bool], "whether to submit after entering text"] = False,
    press_enter_after_input: Annotated[bool, "press Enter key after filling text"] = False,
) -> Annotated[Dict[str, str], "Result dictionary with summary and details"]

Hover Tool (`hover`)

Moves mouse cursor over an element without clicking.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to hover over an element in the DOM.",
    name="hover"
)
async def hover_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the hover operation"]

Drag and Drop Tool (`drag_and_drop_tool`)

Performs HTML5 drag-and-drop operations between elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to drag and drop an element.",
    name="drag_and_drop"
)
async def drag_and_drop(
    source: Annotated[str, "md attribute value of source element"],
    target: Annotated[str, "md attribute value of target element"],
) -> Annotated[Dict[str, str], "Result of the drag and drop operation"]

Get Interactive Elements (`get_interactive_elements`)

Retrieves all interactive elements from the current page DOM.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Get interactive elements from the current page",
    name="get_interactive_elements"
)
async def get_interactive_elements() -> Annotated[str, "JSON string of interactive elements"]

DOM Processing Logic:

The tool executes JavaScript in the browser context to identify interactive elements:

const isInteractive = (element) => {
    // Skip the body element itself
    if (element.tagName.toLowerCase() === 'body') return false;

    // Check input-related elements
    const inputRelatedTags = new Set(['input', 'textarea', 'select', ...]);
    const isInputTag = inputRelatedTags.has(element.tagName.toLowerCase());

    // Check interactive ARIA roles
    const interactiveRoles = new Set(['button', 'link', 'checkbox', ...]);
    const hasInteractiveRole = interactiveRoles.has(element.getAttribute('role'));

    // Check for ARIA attributes
    const hasAriaProps = element.hasAttribute('aria-haspopup') ||
                         element.hasAttribute('aria-expanded') ||
                         element.hasAttribute('aria-checked');

    // Check cursor style
    const style = window.getComputedStyle(element);
    const hasPointerCursor = style.cursor === 'pointer';

    // Check draggable attribute
    const isDraggable = element.draggable;

    return isInputTag || hasInteractiveRole || hasAriaProps ||
           hasPointerCursor || isDraggable;
};

Sources: testzeus_hercules/core/tools/get_interactive_elements.py:1-50

Upload File Tool (`upload_file`)

Handles file upload dialogs by setting file paths on input elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Upload a file to the page",
    name="upload_file"
)
async def upload_file(
    selector: Annotated[str, "md attribute value of file input element"],
    file_path: Annotated[str, "Path to the file to upload"],
) -> Annotated[Dict[str, str], "Result of the upload operation"]

Extra Tools System

Dynamic Tool Loading

The extra tools are loaded dynamically at runtime using Python's pkgutil module. This allows the system to extend functionality without modifying core code.

Loading Mechanism:

# testzeus_hercules/core/extra_tools/__init__.py
import importlib
import pkgutil
from pathlib import Path

from testzeus_hercules.config import get_global_conf

package_path = Path(__file__).parent

if get_global_conf().get_load_extra_tools().lower().strip() != "false":
    for _, module_name, _ in pkgutil.iter_modules([str(package_path)]):
        full_module_name = f"testzeus_hercules.core.extra_tools.{module_name}"
        module = importlib.import_module(full_module_name)
        
        # Export all public attributes to current namespace
        for attribute_name in dir(module):
            if not attribute_name.startswith("_"):
                globals()[attribute_name] = getattr(module, attribute_name)

Configuration:

Extra tools can be disabled via configuration:

python -m testzeus_hercules --load-extra-tools=false

Sources: testzeus_hercules/core/extra_tools/__init__.py:1-25
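The same discovery pattern can be exercised standalone against a throwaway directory. This is a sketch of the mechanism only — `my_tool.py` and `my_extra_tool` are hypothetical names, and `spec_from_file_location` replaces the package import used by the real loader:

```python
import importlib.util
import pkgutil
import tempfile
from pathlib import Path

# Throwaway "extra tools" directory standing in for the real package path
pkg_dir = Path(tempfile.mkdtemp())
(pkg_dir / "my_tool.py").write_text("def my_extra_tool():\n    return 'ok'\n")

# Discover every module in the directory, import it, and collect its
# public attributes -- the same pattern the loader above uses
discovered = {}
for _, module_name, _ in pkgutil.iter_modules([str(pkg_dir)]):
    spec = importlib.util.spec_from_file_location(
        module_name, pkg_dir / f"{module_name}.py"
    )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    for attribute_name in dir(module):
        if not attribute_name.startswith("_"):
            discovered[attribute_name] = getattr(module, attribute_name)

print(discovered["my_extra_tool"]())  # → ok
```

Dropping a new module file into the directory is all it takes for its public functions to become available, which is the extensibility property the loader provides.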

Accessibility Calls (`accessibility_calls`)

Provides accessibility testing using the axe-core library.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Check page accessibility using axe-core",
    name="check_accessibility"
)
async def check_accessibility(page_path: Annotated[str, "Page path or URL"]) -> str

Execution Flow:

  1. Fetches axe-core script from CDN
  2. Injects script into page
  3. Runs axe.run() in browser context
  4. Collects violations and incomplete checks
  5. Logs results in structured format

// Inject axe-core
await page.evaluate(
    `fetch("${AXE_SCRIPT_URL}").then(res => res.text())`
);

// Run accessibility checks
const results = await page.evaluate("async () => { return await axe.run(); }");

Response Format:

{
  "status": "success|failure",
  "message": "Human readable summary",
  "details": ["failure summary 1", "failure summary 2"]
}

Sources: testzeus_hercules/core/tools/accessibility_calls.py:1-80

Tool Return Format

All tools return a standardized dictionary structure:

| Field | Type | Description |
|---|---|---|
| `summary_message` | `str` | Brief status message for agent consumption |
| `detailed_message` | `str` | Extended information, including errors if any |
| `status` | `str` | Operation status (`success`/`failure`) |

Success Response Example:

return {
    "summary_message": f'Successfully clicked element with selector: "{selector}"',
    "detailed_message": f'Element with selector: "{selector}" clicked successfully. Tag: {element_tag_name}',
}

Error Response Example:

return {
    "summary_message": f'Element with selector: "{selector}" is not visible, Try another element',
    "detailed_message": f'Element with selector: "{selector}" is not visible, Try another element',
}

Selector Logging System

Every tool interaction is logged using the SelectorLogger for proof generation and debugging.

Logging Interface

from testzeus_hercules.utils.browser_logger import get_browser_logger

selector_logger = get_browser_logger(get_global_conf().get_proof_path())

# Log successful interaction
await selector_logger.log_selector_interaction(
    tool_name="click",
    selector=selector,
    action="left",
    selector_type="css",
    success=True,
)

# Log failed interaction
await selector_logger.log_selector_interaction(
    tool_name="click",
    selector=selector,
    action="left",
    selector_type="css",
    success=False,
    error_message="Element not found",
)

Captured Data

The logger captures:

  • Tool name and action type
  • Selector used
  • Selector type (CSS vs custom)
  • Success/failure status
  • Alternative selectors for the element
  • Element attributes
  • Error messages on failure

Agent Integration

Browser Nav Agent

The primary agent that consumes browser tools. It receives natural language instructions and translates them into tool calls.

Tool Invocation Pattern:

# From executor_nav_agent.py
async def execute_task(self, instruction: str):
    # The agent decides which tool to call;
    # the tool receives the selector from the instruction
    result = await click_using_selector(
        selector="[md='submit-button']",
        click_type="left"
    )

    # The agent processes the result
    if "error" in result.get("summary_message", "").lower():
        ...  # handle the error or try an alternative selector

Executor Nav Agent

Handles script execution and Python sandbox operations:

Script Execution Context:

| Variable | Type | Description |
|---|---|---|
| `page` | `Page` | Playwright page object |
| `browser` | `Browser` | Playwright browser instance |
| `context` | `BrowserContext` | Browser context |
| `playwright_manager` | `PlaywrightManager` | Manager instance |
| `logger` | `Logger` | Logging utility |
| `config` | `GlobalConf` | Global configuration |

Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-50

Bulk Operations

The tool system supports bulk execution for batch operations:

Bulk Slider Tool

@tool(
    agent_names=["browser_nav_agent"],
    description="used to set slider values in multiple sliders in single attempt.",
    name="bulk_set_slider"
)
async def bulk_set_slider(
    entries: Annotated[
        List[List[str]],
        "List of [selector, value] pairs",
    ],
) -> Annotated[List[Dict[str, str]], "List of results"]
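The bulk pattern above iterates over `[selector, value]` pairs and collects one result dictionary per entry. The following is a minimal, self-contained sketch of that flow — `set_slider` here is a hypothetical stand-in for the real single-slider tool, not the actual source:

```python
import asyncio
from typing import Dict, List

async def set_slider(selector: str, value: str) -> Dict[str, str]:
    # Hypothetical stand-in for the real single-slider tool
    return {"summary_message": f'Set slider "{selector}" to {value}',
            "status": "success"}

async def bulk_set_slider(entries: List[List[str]]) -> List[Dict[str, str]]:
    # Process each [selector, value] pair in order, mirroring the
    # bulk-tool pattern described above
    results = []
    for selector, value in entries:
        results.append(await set_slider(selector, value))
    return results

results = asyncio.run(bulk_set_slider(
    [["[md='volume']", "75"], ["[md='brightness']", "40"]]
))
print(len(results), results[0]["status"])  # → 2 success
```

Each entry yields its own standardized result, so the agent can report partial failures without aborting the whole batch.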

Configuration

Command Line Arguments

The tool system behavior can be configured via CLI:

| Argument | Type | Description |
|---|---|---|
| `--llm-model` | str | LLM model for agent decisions |
| `--llm-temperature` | float | LLM sampling temperature (0.0-1.0) |
| `--agents-llm-config-file` | str | Path to agents LLM config file |
| `--enable-portkey` | flag | Enable Portkey LLM routing |
| `--portkey-api-key` | str | Portkey API key |
| `--browser-channel` | str | Browser channel (chrome-beta, etc.) |
| `--browser-version` | str | Specific browser version |
| `--enable-ublock` | flag | Enable uBlock Origin extension |
| `--load-extra-tools` | str | Load extra tools (default: true) |

Sources: testzeus_hercules/config.py:1-100

Summary

The Tool System provides:

  1. Unified Interface: All browser interactions follow the @tool decorator pattern
  2. Agent Compatibility: Tools specify which agents can invoke them
  3. Error Resilience: Graceful handling of element not found, not visible, and execution failures
  4. Proof Generation: Comprehensive logging of all selector interactions
  5. Extensibility: Dynamic loading of extra tools without core modifications
  6. Standardized Results: Consistent return format across all tools

This architecture enables AI agents to reliably control browser automation while maintaining clean separation of concerns and testable component boundaries.
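The `@tool` decorator pattern used throughout can be illustrated with a minimal registry sketch. This is an assumption-laden illustration of the registration idea, not the framework's actual implementation:

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry keyed by tool name
TOOL_REGISTRY: Dict[str, Dict[str, Any]] = {}

def tool(agent_names: List[str], description: str, name: str) -> Callable:
    """Minimal sketch of a @tool registration decorator."""
    def decorator(func: Callable) -> Callable:
        # Record which agents may invoke the tool, plus its metadata
        TOOL_REGISTRY[name] = {
            "func": func,
            "agents": agent_names,
            "description": description,
        }
        return func
    return decorator

@tool(agent_names=["browser_nav_agent"], description="demo tool", name="noop")
def noop() -> str:
    return "ok"

print(sorted(TOOL_REGISTRY), TOOL_REGISTRY["noop"]["func"]())  # → ['noop'] ok
```

Registering metadata alongside the callable is what lets the framework restrict each tool to the agents named in `agent_names`.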

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30

LLM Configuration

Related topics: Agent System, Memory Management


Overview

The LLM Configuration system in testzeus-hercules provides a flexible, multi-provider framework for configuring Large Language Models across different agent types. This system enables the framework to support various LLM providers (OpenAI, Anthropic, Ollama, Azure) while maintaining provider-specific configurations through a centralized registry mechanism.

The configuration architecture supports:

  • Multiple LLM providers simultaneously
  • Per-agent model selection
  • Dynamic parameter adaptation based on model family
  • External configuration file loading
  • Environment variable integration

Sources: testzeus_hercules/core/agents_llm_config.py:1-20

Architecture

High-Level Component Architecture

graph TD
    A[CLI Arguments / Environment] --> B[Global Config]
    C[agents-llm-config-file.json] --> D[AgentsLLMConfig]
    D --> E[AgentRegistry]
    E --> F[Provider Configs]
    F --> G[planner_agent]
    F --> H[nav_agent]
    F --> I[mem_agent]
    F --> J[helper_agent]
    B --> K[Model Utils]
    K --> L[adapt_llm_params_for_model]
    L --> G
    L --> H
    L --> I
    L --> J

Configuration Flow

sequenceDiagram
    participant CLI as CLI Arguments
    participant Config as Global Config
    participant JSON as LLM Config File
    participant Processor as AgentsLLMConfig
    participant Registry as AgentRegistry
    participant Agent as SimpleHercules
    
    CLI->>Config: Parse --llm-* arguments
    JSON->>Processor: load_from_file()
    Processor->>Registry: register_provider()
    Registry->>Registry: Store configs per provider
    Config->>Agent: Pass model configs
    Agent->>ModelUtils: adapt_llm_params_for_model()
    ModelUtils->>Agent: Adapted LLM params

Sources: testzeus_hercules/core/agents_llm_config.py:25-50

Core Components

AgentRegistry

The AgentRegistry manages configurations for multiple providers and supports switching between them.

class AgentRegistry:
    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, AgentConfig]] = {}
        self._active_provider: Optional[str] = None

| Method | Purpose |
|---|---|
| `register_provider(provider_key, configs)` | Register agent configs for a provider |
| `get_active_provider()` | Retrieve the currently active provider configuration |
| `set_active_provider(provider_key)` | Switch the active provider |
| `get_all_providers()` | List all registered providers |

Sources: testzeus_hercules/core/agents_llm_config.py:20-35
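A minimal, runnable sketch consistent with the `AgentRegistry` interface above (the "first registration becomes active" rule is an assumption, not confirmed behavior of the source):

```python
from typing import Any, Dict, List, Optional

AgentConfig = Dict[str, Any]  # assumption: per-agent config is a plain dict

class AgentRegistry:
    """Minimal sketch matching the interface documented above."""

    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, AgentConfig]] = {}
        self._active_provider: Optional[str] = None

    def register_provider(self, provider_key: str,
                          configs: Dict[str, AgentConfig]) -> None:
        self._providers[provider_key] = configs
        if self._active_provider is None:
            self._active_provider = provider_key  # assumed default

    def set_active_provider(self, provider_key: str) -> None:
        if provider_key not in self._providers:
            raise KeyError(f"Unknown provider: {provider_key}")
        self._active_provider = provider_key

    def get_active_provider(self) -> Dict[str, AgentConfig]:
        return self._providers[self._active_provider]

    def get_all_providers(self) -> List[str]:
        return list(self._providers)

registry = AgentRegistry()
registry.register_provider("openai", {"planner_agent": {"model_name": "gpt-4o"}})
registry.register_provider("anthropic",
                           {"planner_agent": {"model_name": "claude-3-5-haiku-latest"}})
registry.set_active_provider("anthropic")
print(registry.get_active_provider()["planner_agent"]["model_name"])
```

Keeping one registry with an active-provider pointer is what makes runtime provider switching (shown later) a single `set_active_provider` call.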

AgentsLLMConfig

The main configuration processor that handles loading and normalization of agent configurations.

class AgentsLLMConfig:
    def __init__(self) -> None:
        self.registry = AgentRegistry()
    
    def load_from_file(self, file_path: str, provider_key: Optional[str] = None) -> None
    def normalize_agent_config(self, agent_config: Dict[str, Any]) -> AgentConfig

Sources: testzeus_hercules/core/agents_llm_config.py:40-70

Agent Types

The framework defines four specialized agent roles, each with configurable LLM parameters:

| Agent Type | Purpose | Typical Model Requirements |
|---|---|---|
| `planner_agent` | Strategic task planning and decomposition | High reasoning capability |
| `nav_agent` | Browser navigation and UI interaction | Vision-capable, fast response |
| `mem_agent` | Memory and context management | Balanced performance |
| `helper_agent` | Utility functions and data processing | Variable based on task |

Sources: testzeus_hercules/core/simple_hercules.py:1-30

Configuration File Format

JSON Schema Structure

{
  "provider_name": {
    "agent_type": {
      "model_name": "string",
      "model_api_key": "string",
      "model_api_type": "openai|anthropic|azure|ollama",
      "model_client_host": "string (optional)",
      "model_native_tool_calls": true|false,
      "model_hide_tools": "if_any_run|user|never",
      "llm_config_params": {
        "cache_seed": null|integer,
        "temperature": 0.0,
        "seed": 12345,
        "max_tokens": 4096,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0,
        "stop": []
      }
    }
  }
}

Example Configuration

{
  "openai": {
    "planner_agent": {
      "model_name": "gpt-4o",
      "model_api_key": "${OPENAI_API_KEY}",
      "model_api_type": "openai",
      "llm_config_params": {
        "cache_seed": null,
        "temperature": 0.0,
        "seed": 12345
      }
    }
  },
  "anthropic": {
    "nav_agent": {
      "model_name": "claude-3-5-haiku-latest",
      "model_api_key": "",
      "model_api_type": "anthropic",
      "llm_config_params": {
        "cache_seed": null,
        "temperature": 0.0
      }
    }
  }
}

Sources: agents_llm_config-example.json:1-60
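The `${OPENAI_API_KEY}` placeholder in the example suggests environment-variable substitution at load time. The sketch below shows one way such a loader could work — `load_provider` and `_expand_env` are hypothetical names, and the substitution rule is an assumption based on the config format:

```python
import json
import os
import re
import tempfile

def _expand_env(value: str) -> str:
    # Assumption: ${VAR} placeholders resolve from the environment
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

def load_provider(path: str, provider_key: str) -> dict:
    """Load one provider block from an agents-llm-config file (sketch)."""
    with open(path) as f:
        raw = json.load(f)
    agents = raw[provider_key]
    for agent_cfg in agents.values():
        agent_cfg["model_api_key"] = _expand_env(agent_cfg.get("model_api_key", ""))
    return agents

# Demo with a throwaway config file and a fake key in the environment.
os.environ["OPENAI_API_KEY"] = "sk-demo"
config = {
    "openai": {
        "planner_agent": {
            "model_name": "gpt-4o",
            "model_api_key": "${OPENAI_API_KEY}",
            "model_api_type": "openai",
        }
    }
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(config, f)
    path = f.name

agents = load_provider(path, "openai")
print(agents["planner_agent"]["model_api_key"])  # → sk-demo
```

Resolving keys at load time keeps secrets out of the checked-in config file, matching the best practice noted later in this page.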

CLI Configuration Options

The framework provides comprehensive command-line arguments for LLM configuration:

Basic LLM Parameters

| Argument | Type | Description |
|---|---|---|
| `--llm-model` | string | Name of the LLM model |
| `--llm-model-api-key` | string | API key for the LLM model |
| `--llm-model-base-url` | string | Base URL for the LLM API |
| `--llm-model-api-type` | string | API type (openai, anthropic, azure, etc.) |
| `--llm-temperature` | float | Temperature for LLM sampling (0.0-1.0) |

LLM Configuration File Options

| Argument | Type | Description |
|---|---|---|
| `--agents-llm-config-file` | string | Path to the agents LLM configuration file |
| `--agents-llm-config-file-ref-key` | string | Reference key for selecting a provider |

Parameter Configuration

| Variable | Type | Description |
|---|---|---|
| `LLM_MODEL_PRICING` | env | Model pricing for cost tracking |
| `LLM_MODEL_TEMPERATURE` | env | Default temperature setting |
| `LLM_MODEL_CACHE_SEED` | env | Cache seed for reproducible results |
| `LLM_MODEL_SEED` | env | Random seed for generation |
| `LLM_MODEL_MAX_TOKENS` | env | Maximum tokens in response |
| `LLM_MODEL_PRESENCE_PENALTY` | env | Presence penalty (-2.0 to 2.0) |
| `LLM_MODEL_FREQUENCY_PENALTY` | env | Frequency penalty (-2.0 to 2.0) |
| `LLM_MODEL_STOP` | env | Stop sequences |

Sources: testzeus_hercules/config.py:1-100

Portkey Integration

The framework supports Portkey for advanced LLM routing with fallback and load balancing capabilities.

Portkey Configuration Options

| Argument | Type | Description |
|---|---|---|
| `--enable-portkey` | flag | Enable Portkey integration |
| `--portkey-api-key` | string | API key for Portkey |
| `--portkey-strategy` | choice | Routing strategy: `fallback` or `loadbalance` |

Environment Variables

| Variable | Description |
|---|---|
| `ENABLE_PORTKEY` | Enable/disable Portkey |
| `PORTKEY_API_KEY` | Portkey API key |
| `PORTKEY_STRATEGY` | Routing strategy |
| `PORTKEY_CACHE_ENABLED` | Enable response caching |
| `PORTKEY_TARGETS` | Target models for routing |
| `PORTKEY_GUARDRAILS` | Enable safety guardrails |
| `PORTKEY_RETRY_COUNT` | Number of retries on failure |

Sources: testzeus_hercules/config.py:50-80

Model Parameter Adaptation

The model_utils module provides intelligent parameter adaptation based on the model family being used.

adapt_llm_params_for_model

def adapt_llm_params_for_model(model_name: str, llm_config_params: Dict) -> Dict

This function automatically adjusts LLM parameters based on the detected model family:

| Model Family | Adaptation Behavior |
|---|---|
| o1-series | Removes temperature, adjusts max_tokens handling |
| GPT-4o | Standard parameters |
| Claude | Adjusts for the Anthropic API format |
| Ollama | Configures for local inference |

Applied to All Agents

# In SimpleHercules initialization
planner_model = self.planner_agent_config["model_config_params"].get("model") or \
                self.planner_agent_config["model_config_params"].get("model_name")

self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
    planner_model, 
    self.planner_agent_config["llm_config_params"]
)

Sources: testzeus_hercules/core/simple_hercules.py:100-130
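The adaptation rules in the table above can be sketched as a small, self-contained function. This is an illustrative assumption about the o1-series behavior (dropping `temperature`, renaming `max_tokens` to `max_completion_tokens`), not the actual body of `model_utils`:

```python
def adapt_llm_params_for_model(model_name: str, llm_config_params: dict) -> dict:
    """Sketch of the adaptation logic; the real rules live in model_utils."""
    params = dict(llm_config_params)  # never mutate the caller's config
    name = (model_name or "").lower()
    if name.startswith("o1"):
        # Assumption: o1-series models reject the temperature parameter
        # and expect max_completion_tokens instead of max_tokens.
        params.pop("temperature", None)
        if "max_tokens" in params:
            params["max_completion_tokens"] = params.pop("max_tokens")
    return params

adapted = adapt_llm_params_for_model(
    "o1-preview", {"temperature": 0.0, "max_tokens": 4096}
)
print(adapted)  # → {'max_completion_tokens': 4096}
```

Centralizing such quirks in one function keeps per-agent configs provider-agnostic.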

LLM Helper Utilities

The llm_helper module provides utility functions for LLM interactions:

| Function | Purpose |
|---|---|
| `convert_model_config_to_autogen_format()` | Convert config to AutoGen format |
| `create_multimodal_agent()` | Create agents with vision capabilities |
| `extract_target_helper()` | Extract target information from responses |
| `format_plan_steps()` | Format planning step outputs |
| `parse_agent_response()` | Parse agent response structure |
| `process_chat_message_content()` | Process chat message content |
| `parse_response()` | General response parsing |

Sources: testzeus_hercules/utils/llm_helper.py:1-30

Environment Variable Configuration

Full Configuration Matrix

| Environment Variable | Type | Default | Purpose |
|---|---|---|---|
| `LLM_MODEL_PRICING` | dict | - | Model pricing information |
| `LLM_MODEL_TEMPERATURE` | float | 0.0 | Default sampling temperature |
| `LLM_MODEL_CACHE_SEED` | int | null | Caching seed |
| `LLM_MODEL_SEED` | int | - | Random seed |
| `LLM_MODEL_MAX_TOKENS` | int | 4096 | Max response tokens |
| `LLM_MODEL_PRESENCE_PENALTY` | float | 0.0 | Presence penalty |
| `LLM_MODEL_FREQUENCY_PENALTY` | float | 0.0 | Frequency penalty |
| `LLM_MODEL_STOP` | list | [] | Stop sequences |
| `TOKEN_VERBOSE` | bool | false | Enable verbose token logging |
| `HF_HOME` | path | - | HuggingFace cache location |
| `TOKENIZERS_PARALLELISM` | bool | false | Parallel tokenizer configuration |

Sources: testzeus_hercules/config.py:100-150

Multi-Provider Configuration

Provider Switching

The system supports runtime provider switching through the configuration file:

graph LR
    A[Config File] --> B["provider: openai"]
    A --> C["provider: anthropic"]
    B --> D["planner_agent: GPT-4"]
    B --> E["nav_agent: GPT-4o-mini"]
    C --> F["planner_agent: Claude-3"]
    C --> G["nav_agent: Claude-3-Haiku"]

Selecting Active Provider

python -m testzeus_hercules \
    --agents-llm-config-file ./config.json \
    --agents-llm-config-file-ref-key "anthropic"

Sources: testzeus_hercules/core/agents_llm_config.py:60-80

Integration with SimpleHercules

The SimpleHercules class integrates all LLM configurations:

class SimpleHercules:
    def __init__(
        self,
        planner_agent_config: Dict[str, Any],
        nav_agent_config: Dict[str, Any],
        mem_agent_config: Dict[str, Any],
        helper_agent_config: Dict[str, Any],
        planner_max_chat_round: int = 50,
        browser_nav_max_chat_round: int = 100,
    ):
        # Configuration processing
        self.planner_agent_config = planner_agent_config
        self.nav_agent_config = nav_agent_config
        self.mem_agent_config = mem_agent_config
        self.helper_agent_config = helper_agent_config
        
        # Parameter adaptation per agent
        from testzeus_hercules.utils.model_utils import adapt_llm_params_for_model
        
        self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
            planner_model, 
            self.planner_agent_config["llm_config_params"]
        )

Sources: testzeus_hercules/core/simple_hercules.py:50-100

Best Practices

1. Configuration File Organization

  • Group configurations by provider (openai, anthropic, etc.)
  • Use environment variables for sensitive API keys
  • Maintain consistent agent naming across providers

2. Model Selection Guidelines

| Use Case | Recommended Models |
|---|---|
| Complex planning | GPT-4, Claude-3-Opus |
| Fast navigation | GPT-4o-mini, Claude-3-Haiku |
| Vision tasks | GPT-4o, Claude-3-Sonnet |
| Local inference | Ollama models |

3. Parameter Tuning

  • Use temperature: 0.0 for deterministic outputs
  • Set appropriate max_tokens based on expected response length
  • Enable model_native_tool_calls for better function calling

4. Cost Optimization

  • Use LLM_MODEL_PRICING for tracking
  • Enable Portkey caching with PORTKEY_CACHE_ENABLED
  • Consider fallback strategies for reliability

Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| Model not recognized | Check that `model_name` matches the provider format |
| Temperature ignored | Some models (o1-series) ignore the temperature parameter |
| API key errors | Use `${ENV_VAR}` syntax or the actual key in the config |
| Provider not found | Verify the provider key matches the config file structure |

Debug Configuration

export TOKEN_VERBOSE=true
export ENABLE_BROWSER_LOGS=true
python -m testzeus_hercules --llm-model gpt-4o --llm-temperature 0.7


Sources: testzeus_hercules/core/agents_llm_config.py:1-20

Memory Management

Related topics: Agent System, LLM Configuration


Overview

The Memory Management system in testzeus-hercules provides persistent and contextual memory capabilities for AI agents executing browser automation tests. The system consists of three primary components: Static LTM (Long Term Memory), Dynamic LTM, and State Handler, each serving distinct purposes in managing test execution context and data persistence.

The architecture follows a multi-layered approach where Static LTM loads pre-configured test data, Dynamic LTM manages vector-based retrieval augmented memory, and State Handler provides runtime state tracking for agent coordination.

graph TD
    A[Testzeus-Hercules] --> B[Static LTM]
    A --> C[Dynamic LTM]
    A --> D[State Handler]
    
    B --> E[Test Data Files]
    B --> F[Stored Data]
    B --> G[Run Data]
    
    C --> H[Vector Store]
    C --> I[RetrieveUserProxyAgent]
    
    D --> J[_state_string]
    D --> K[_state_dict]
    
    L[Agents] --> M[Memory Access]
    M --> B
    M --> C
    M --> D

Static LTM (Long Term Memory)

Purpose and Scope

Static LTM is responsible for loading and consolidating pre-configured test data at application initialization. It operates as a singleton pattern, ensuring that all test data is loaded once and made available throughout the test execution lifecycle.

Sources: testzeus_hercules/core/memory/static_ltm.py:1-47

Implementation

The StaticLTM class extends the singleton pattern to ensure only one instance exists:

class StaticLTM:
    _instance = None

    def __new__(cls) -> "StaticLTM":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance

Sources: testzeus_hercules/core/memory/static_ltm.py:17-22
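The singleton guarantee can be demonstrated standalone. Here the `_initialize` body is a stand-in for the real consolidation step, since only the `__new__` logic is taken from the source above:

```python
class StaticLTM:
    _instance = None

    def __new__(cls) -> "StaticLTM":
        # Construct and initialize only on the very first instantiation
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance

    def _initialize(self) -> None:
        # Stand-in for the real data-consolidation step
        self.consolidated_data = "base test data"

a, b = StaticLTM(), StaticLTM()
print(a is b)  # → True
print(b.consolidated_data)  # → base test data
```

Because every call returns the same object, all agents observe one consolidated copy of the test data for the whole run.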

Data Consolidation

During initialization, Static LTM consolidates three types of data sources:

| Data Source | Description | Method |
|---|---|---|
| Base Test Data | Loaded from `test_data.txt` via `load_data()` | `StaticDataLoader` |
| Stored Data | User-defined test artifacts | `get_stored_data()` |
| Run Data | Previous test execution context | `get_run_data()` |

Sources: testzeus_hercules/core/memory/static_ltm.py:26-34

The consolidated data is stored in self.consolidated_data and accessed via get_user_ltm():

def get_user_ltm(self) -> Optional[str]:
    return self.consolidated_data

Sources: testzeus_hercules/core/memory/static_ltm.py:40-47

Usage Pattern

Agents access Static LTM through a module-level function:

def get_user_ltm() -> Optional[str]:
    return StaticLTM().get_user_ltm()

Sources: testzeus_hercules/core/memory/static_ltm.py:50-54

Dynamic LTM

Purpose and Scope

Dynamic LTM provides runtime memory management with vector-based retrieval capabilities. It enables agents to store, retrieve, and utilize contextual information during test execution using a RetrieveUserProxyAgent backed by ChromaDB for vector storage.

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:1-40

Core Components

SilentRetrieveUserProxyAgent

A specialized agent that extends RetrieveUserProxyAgent with suppressed output to prevent console noise during agent conversations:

class SilentRetrieveUserProxyAgent(RetrieveUserProxyAgent):
    @suppress_prints
    def initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return super().initiate_chat(*args, **kwargs)

    @suppress_prints
    async def a_initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return await super().a_initiate_chat(*args, **kwargs)

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:37-46

Print Suppression Decorator

The @suppress_prints decorator redirects stdout to a StringIO buffer during function execution:

def suppress_prints(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        silent_stdout = io.StringIO()
        original_stdout, sys.stdout = sys.stdout, silent_stdout
        try:
            return func(*args, **kwargs)
        finally:
            sys.stdout = original_stdout
    return wrapper

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:17-29
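The decorator's effect is easy to verify in isolation. The `chatty` function below is a hypothetical example, not part of the framework:

```python
import functools
import io
import sys

def suppress_prints(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Swap stdout for an in-memory buffer for the duration of the call
        silent_stdout = io.StringIO()
        original_stdout, sys.stdout = sys.stdout, silent_stdout
        try:
            return func(*args, **kwargs)
        finally:
            sys.stdout = original_stdout
    return wrapper

@suppress_prints
def chatty() -> str:
    print("this noise is swallowed")  # never reaches the console
    return "result"

value = chatty()
print(value)  # → result
```

The return value passes through unchanged; only the `print` output inside the wrapped call is discarded, which is exactly what keeps agent conversations quiet.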

Integration with Configuration

Dynamic LTM respects global configuration to enable or disable its functionality:

from testzeus_hercules.config import get_global_conf

def save_content(self, content: str) -> None:
    config = get_global_conf()
    if not config.should_use_dynamic_ltm():
        return  # Skip when disabled

Sources: testzeus_hercules/core/simple_hercules.py:89-94

External Dependencies

Dynamic LTM utilizes the unstructured library for document parsing:

from unstructured.documents.elements import NarrativeText, Text, Title
from unstructured.partition.auto import partition

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:13-14

State Handler

Purpose and Scope

State Handler provides lightweight runtime state management for coordinating data between agents during test execution. It maintains module-level dictionaries for storing string-based state and structured data.

Sources: testzeus_hercules/core/memory/state_handler.py:1-70

Module-Level State Storage

# Module-level state storage
_state_string: Dict[str, str] = defaultdict(str)
_state_dict: Dict[str, Any] = defaultdict(deque)

Sources: testzeus_hercules/core/memory/state_handler.py:13-14

store_data Tool

The store_data function is registered as a tool for browser, API, and SQL navigation agents to persist information:

@tool(
    agent_names=["browser_nav_agent", "api_nav_agent", "sql_nav_agent"],
    description="Tool to store information.",
    name="store_data",
)
def store_data(
    text: Annotated[str, "The confirmation of stored value."],
) -> Annotated[Dict[str, Union[str, None]], "A dictionary containing a 'message' key..."]:
    global _state_string
    try:
        DynamicLTM().save_content(text)
        _state_string[get_global_conf().get_default_test_id()] += text
        return {"message": "Text appended successfully."}
    except Exception as e:
        return {"error": str(e)}

Sources: testzeus_hercules/core/memory/state_handler.py:23-47

Key Behaviors

| Behavior | Description |
|---|---|
| Test ID Isolation | State is keyed by `get_global_conf().get_default_test_id()` |
| Dual Storage | Data propagates to both `_state_string` and Dynamic LTM |
| Error Resilience | Returns an error dictionary instead of raising exceptions |

Sources: testzeus_hercules/core/memory/state_handler.py:30-46

Memory Architecture Diagram

graph LR
    subgraph "Initialization Phase"
        A[Load Config] --> B[StaticLTM Singleton]
        B --> C[Load test_data.txt]
        C --> D[Consolidate Data]
    end
    
    subgraph "Runtime Phase"
        E[Agents Execute Tests]
        E --> F[store_data Tool Call]
        F --> G[State Handler]
        G --> H[DynamicLTM.save_content]
        H --> I[Vector Store Update]
        
        J[Test Query] --> K[RetrieveUserProxyAgent]
        K --> L[Vector Similarity Search]
        L --> M[Context Injection]
        M --> E
    end

Integration with SimpleHercules

The SimpleHercules class coordinates all memory components:

class SimpleHercules:
    def _save_to_memory(self, content: str) -> None:
        """Helper method to save content to memory."""
        config = get_global_conf()
        if not config.should_use_dynamic_ltm():
            return

        if self.memory:
            self.memory.save_content(content)
        else:
            logger.warning("Memory system not initialized")

Sources: testzeus_hercules/core/simple_hercules.py:85-97

Memory Initialization Flow

sequenceDiagram
    participant SH as SimpleHercules
    participant DLT as DynamicLTM
    participant CFG as Config
    participant LOG as Logger

    SH->>CFG: should_use_dynamic_ltm()
    CFG-->>SH: boolean
    SH->>DLT: save_content(content)
    alt LTM Enabled
        DLT->>DLT: Vector store update
        DLT-->>SH: success
    else LTM Disabled
        DLT-->>SH: skipped
    end

Configuration

Global Configuration Methods

| Method | Purpose |
|---|---|
| `should_use_dynamic_ltm()` | Check whether Dynamic LTM is enabled |
| `get_hf_home()` | Get the HuggingFace cache directory for the vector store |
| `get_default_test_id()` | Get the current test execution identifier |

The configuration is managed through testzeus_hercules/config.py, which provides command-line arguments for memory-related settings including:

  • --reuse-vector-db: Reuse existing vector DB instead of creating fresh one
  • --sandbox-tenant-id: Python sandbox tenant configuration

Sources: testzeus_hercules/config.py:45-58

Data Flow Summary

| Layer | Storage Type | Access Pattern | Persistence |
|---|---|---|---|
| Static LTM | In-memory string | Singleton `get_user_ltm()` | Session-scoped |
| Dynamic LTM | Vector (ChromaDB) | `RetrieveUserProxyAgent` | Persistent |
| State Handler | In-memory dict | Module-level `_state_string` | Test-execution-scoped |

Error Handling

All memory components implement robust error handling:

try:
    DynamicLTM().save_content(text)
    _state_string[get_global_conf().get_default_test_id()] += text
    return {"message": "Text appended successfully."}
except Exception as e:
    traceback.print_exc()
    logger.error(f"An error occurred while appending to state: {e}")
    return {"error": str(e)}

Sources: testzeus_hercules/core/memory/state_handler.py:30-42

Related Agent Components

| Component | File Path | Role |
|---|---|---|
| PlannerAgent | `core/agents/high_level_planner_agent.py` | Consumes memory for test planning |
| ExecutorNavAgent | `core/agents/executor_nav_agent.py` | Executes test steps with memory context |
| BaseNavAgent | `core/agents/base_nav_agent.py` | Agent base class with memory integration |

Summary

The Memory Management system in testzeus-hercules implements a comprehensive multi-tier approach:

  1. Static LTM provides pre-loaded test data consolidation via singleton pattern
  2. Dynamic LTM offers vector-based retrieval augmented memory for contextual queries
  3. State Handler enables runtime state sharing between agents through the store_data tool

This architecture ensures agents have access to both static test fixtures and dynamic execution context, enabling sophisticated AI-driven browser automation testing.

Sources: testzeus_hercules/core/memory/static_ltm.py:1-47

API Testing

Related topics: Security Testing, Tool System


Overview

API Testing in testzeus-hercules enables automated end-to-end testing of REST APIs through AI-driven agents. The system leverages LLM-powered agents to parse OpenAPI specifications, generate Gherkin test scenarios, execute API calls, and validate responses against expected outcomes. This module integrates with the broader Hercules testing framework to provide comprehensive API validation capabilities alongside browser-based UI testing.

The API Testing feature accepts OpenAPI specification files (YAML or JSON format) and automatically generates executable Gherkin test cases that can be run against live API endpoints. The generated tests follow behavior-driven development (BDD) conventions, making them readable for both technical and non-technical stakeholders.

Architecture

The API Testing module consists of several interconnected components that work together to provide end-to-end API testing capabilities.

Component Overview

graph TD
    A[OpenAPI Specification] --> B[generate_api_functional_gherkin_test.py]
    B --> C[Gherkin Test Cases]
    C --> D[API Navigation Agent]
    D --> E[API Calls Tool]
    E --> F[SQL Calls Tool]
    E --> G[Python Sandbox Executor]
    F --> H[Database Validation]
    G --> I[Custom Logic Validation]
    D --> J[Response Parser]
    J --> K[Test Results]

Core Components

| Component | File Path | Purpose |
|---|---|---|
| API Navigation Agent | `testzeus_hercules/core/agents/api_nav_agent.py` | Orchestrates API test execution using LLM-driven decision making |
| API Calls Tool | `testzeus_hercules/core/tools/api_calls.py` | Executes HTTP requests to API endpoints |
| SQL Calls Tool | `testzeus_hercules/core/tools/sql_calls.py` | Validates API data against database state |
| Python Sandbox | `testzeus_hercules/core/tools/execute_python_sandbox.py` | Executes custom validation logic |
| Gherkin Generator | `helper_scripts/generate_api_functional_gherkin_test.py` | Generates test cases from OpenAPI specs |
| Response Parser | `testzeus_hercules/utils/response_parser.py` | Parses and validates API responses |

Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80

Test Generation Workflow

OpenAPI Specification Processing

The test generation process begins with parsing OpenAPI specification files. The system accepts both YAML and JSON formatted OpenAPI specs through the generate_api_functional_gherkin_test.py helper script.

parser.add_argument(
    "input_files",
    metavar="input_files",
    type=str,
    nargs="+",
    help="One or more OpenAPI spec files (YAML or JSON).",
)

Sources: helper_scripts/generate_api_functional_gherkin_test.py:15-22

Gherkin Test Case Generation

The LLM generates test cases based on the OpenAPI specification content. The generation uses a specialized prompt that instructs the model to produce Gherkin-format scenarios covering functional test cases.

def generate_test_cases(prompt: str, model: str) -> str:
    """Generates test cases using the OpenAI API."""
    client = OpenAI()
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return completion.choices[0].message.content

Sources: helper_scripts/generate_api_functional_gherkin_test.py:82-90

Generation Parameters

| Parameter | CLI Flag | Default | Description |
|---|---|---|---|
| Model | --model | o1-preview | LLM model for test generation |
| Output Folder | --output | (required) | Destination for generated feature files |
| Number of Test Cases | --number_of_testcase | 100 | Maximum test cases to generate per endpoint |

API Navigation Agent

The API Navigation Agent (api_nav_agent.py) serves as the orchestrator for executing API tests. It receives parsed test scenarios and coordinates execution across multiple tools to validate API behavior.

sequenceDiagram
    participant Test as Test Scenario
    participant Agent as API Nav Agent
    participant API as API Calls Tool
    participant SQL as SQL Calls Tool
    participant Sandbox as Python Sandbox
    
    Test->>Agent: Execute scenario
    Agent->>API: Send HTTP request
    API-->>Agent: Response data
    Agent->>SQL: Validate database state
    SQL-->>Agent: Validation result
    Agent->>Sandbox: Run custom assertions
    Sandbox-->>Agent: Assertion results
    Agent->>Test: Pass/Fail outcome

Sources: testzeus_hercules/core/agents/api_nav_agent.py

Execution Environment

Python Sandbox

API tests execute within a secured Python sandbox environment that provides controlled access to necessary resources while maintaining isolation.

def _get_config_driven_injections(config: Any) -> Dict[str, Any]:
    """
    Get injections defined in configuration.
    Allows dynamic configuration of available modules.
    """
    injections = {}
    
    # Read from config: SANDBOX_PACKAGES="requests,pandas,numpy"
    sandbox_packages = config.get_config().get("SANDBOX_PACKAGES", "").split(",")
    
    for package_name in sandbox_packages:
        package_name = package_name.strip()
        if package_name:
            try:
                injections[package_name] = __import__(package_name)
            except ImportError:
                logger.warning(f"Could not import configured package: {package_name}")
    
    return injections

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:80-100
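The parsing above can be exercised with a small stand-in. `StubConfig` and `build_injections` are illustrative names only; they mirror the `SANDBOX_PACKAGES` handling shown in the excerpt rather than the project's real configuration class.

```python
import logging

logger = logging.getLogger(__name__)


class StubConfig:
    """Hypothetical stand-in for the real configuration object."""

    def __init__(self, values: dict):
        self._values = values

    def get_config(self) -> dict:
        return self._values


def build_injections(config) -> dict:
    """Turn a comma-separated SANDBOX_PACKAGES string into imported modules."""
    injections = {}
    for name in config.get_config().get("SANDBOX_PACKAGES", "").split(","):
        name = name.strip()
        if not name:
            continue
        try:
            injections[name] = __import__(name)
        except ImportError:
            logger.warning("Could not import configured package: %s", name)
    return injections
```

Packages that fail to import are logged and skipped rather than aborting the sandbox setup.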

Sandbox Access Variables

Scripts executing within the sandbox have automatic access to the following variables:

| Variable | Type | Description |
|---|---|---|
| page | Playwright Page | Current browser page context |
| browser | Browser instance | Active browser session |
| context | Browser Context | Isolated browsing context |
| playwright_manager | PlaywrightManager | Manages Playwright lifecycle |
| logger | Logger | Logging utility |
| config | Configuration | Global configuration object |

Additional tenant-specific modules can be injected based on the SANDBOX_TENANT_ID environment variable, and custom injections are available via the SANDBOX_CUSTOM_INJECTIONS environment variable.

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:40-55

Response Handling

JSON Response Parsing

The response parser handles API responses with multiple fallback strategies for extracting structured data:

def parse_response(message: str) -> dict[str, Any]:
    # Check if message is wrapped in ```json ``` blocks
    if "```json" in message:
        start_idx = message.find("```json") + 7
        end_idx = message.find("```", start_idx + 7)
        message = message[start_idx:end_idx]
    else:
        if message.startswith("```"):
            message = message[3:]
        if message.endswith("```"):
            message = message[:-3]
        if message.startswith("json"):
            message = message[4:]

    message = message.strip()
    message = message.replace("\\n", "\n")
    
    json_response: dict[str, Any] = json.loads(message)

Sources: testzeus_hercules/utils/response_parser.py:9-35
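A simplified re-implementation of the fence-stripping logic above shows the happy path. `extract_json_payload` is an illustrative name, not the project's function, and it omits the real parser's error recovery.

```python
import json
from typing import Any


def extract_json_payload(message: str) -> dict[str, Any]:
    """Strip Markdown code fences before parsing (simplified sketch)."""
    if "```json" in message:
        # Take everything between the opening ```json and the closing fence.
        start = message.find("```json") + len("```json")
        end = message.find("```", start)
        message = message[start:end]
    else:
        message = message.strip()
        if message.startswith("```"):
            message = message[3:]
        if message.endswith("```"):
            message = message[:-3]
        if message.startswith("json"):
            message = message[len("json"):]
    return json.loads(message.strip())
```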

Error Recovery

When JSON parsing fails, the response parser attempts to extract plan and next_step fields from unstructured responses, ensuring graceful degradation when APIs return non-standard response formats.
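One way such recovery could look is a best-effort regex pass. The `plan` and `next_step` field names come from the text above; the extraction logic itself is an assumption, not the parser's actual fallback.

```python
import re


def recover_fields(message: str) -> dict:
    """Best-effort recovery of "plan" and "next_step" from non-JSON text.

    Sketch only: the real parser's fallback logic may differ.
    """
    result = {}
    for field in ("plan", "next_step"):
        # Match `"plan": "..."` or bare `plan: ...` style fragments.
        m = re.search(rf'"?{field}"?\s*:\s*"?([^"\n]+)"?', message)
        if m:
            result[field] = m.group(1).strip().rstrip(",")
    return result
```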

Configuration

LLM Configuration

API Testing relies on LLM configuration for test generation and agent decision-making. Configuration can be provided via command-line arguments or through a dedicated configuration file.

parser.add_argument(
    "--llm-model",
    type=str,
    help="Name of the LLM model.",
    required=False,
)
parser.add_argument(
    "--llm-temperature",
    type=float,
    help="Temperature for LLM sampling (0.0-1.0).",
    required=False,
)

Sources: testzeus_hercules/config.py:35-45

Agents LLM Config File

For multi-agent configurations, specify the configuration file path:

--agents-llm-config-file /path/to/agents_llm_config.json
--agents-llm-config-file-ref-key <key_name>

Portkey Integration

Enable Portkey for LLM routing with fallback or load balancing strategies:

--enable-portkey
--portkey-api-key <api_key>
--portkey-strategy fallback|loadbalance

Sources: testzeus_hercules/config.py:60-75

Usage Examples

Generate Gherkin Tests from OpenAPI Spec

python helper_scripts/generate_api_functional_gherkin_test.py \
    spec/openapi.yaml \
    --output tests/api/ \
    --model gpt-4 \
    --number_of_testcase 50

Run API Tests

Tests can be executed through the main Hercules CLI or integrated into CI/CD pipelines. The agent configuration file supports specifying different models for different agents:

{
    "openai": {
        "planner_agent": {
            "model_name": "gpt-4",
            "model_api_type": "openai"
        }
    }
}

Sources: agents_llm_config-example.json

Integration with Browser Testing

The API Testing module integrates seamlessly with browser-based testing capabilities. The API Navigation Agent can coordinate with the Browser Navigation Agent to perform scenarios that span both API validation and UI verification.

When executing multi-step workflows, the system can:

  1. Call API endpoints to set up test data
  2. Launch browsers to verify UI state reflects API changes
  3. Execute SQL queries to validate data persistence
  4. Run custom Python assertions for complex business logic
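The four steps above can be sketched end to end with stand-ins: an in-memory SQLite database plays the role of both the API-created state and the persistence layer, and every function name here is hypothetical.

```python
import sqlite3


def setup_test_data(db: sqlite3.Connection) -> int:
    """Stand-in for an API call that creates a record (step 1)."""
    cur = db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    db.commit()
    return cur.lastrowid


def validate_persistence(db: sqlite3.Connection, user_id: int) -> bool:
    """SQL check that the record was persisted (step 3)."""
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    return row is not None and row[0] == "alice"


def custom_assertion(name: str) -> bool:
    """Custom business-logic validation (step 4)."""
    return name.islower() and name.isalpha()


def run_workflow() -> bool:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    user_id = setup_test_data(db)
    # Step 2 (UI verification) is omitted; it would require a live browser.
    return validate_persistence(db, user_id) and custom_assertion("alice")
```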

Security Considerations

API Key Management

API keys should be provided through environment variables or secure configuration management, never hardcoded in test files:

export OPENAI_API_KEY=<your_api_key>
export PORTKEY_API_KEY=<your_portkey_key>

Sandbox Isolation

The Python sandbox provides execution isolation for custom test logic. Configure allowed packages through the SANDBOX_PACKAGES configuration parameter to limit access to only required libraries.

Best Practices

  1. Organize OpenAPI specs by version - Maintain separate specification files for different API versions
  2. Use meaningful test case names - Generated tests should clearly describe the scenario being validated
  3. Combine with database validation - Use SQL Calls Tool to verify data consistency
  4. Leverage response parsing - Use the response parser for handling complex API response formats
  5. Configure appropriate LLM models - Use faster models for generation and more capable models for complex validation logic

Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80

Security Testing

Related topics: API Testing, Tool System


Security Testing in TestZeus Hercules is an automated framework designed to validate API security by generating and executing Gherkin-based test scenarios. The system leverages LLM-powered agents to analyze OpenAPI specifications and produce comprehensive security validation tests that check for vulnerabilities, configuration weaknesses, and proper handling of sensitive data.

Overview

The Security Testing module provides an end-to-end solution for validating API security without requiring manual test case authoring. It integrates with the broader Hercules testing framework to execute security validation scenarios alongside functional and navigation tests.

Core Components

| Component | File Path | Purpose |
|---|---|---|
| Security Navigation Agent | testzeus_hercules/core/agents/sec_nav_agent.py | Orchestrates security test execution using LLM-driven agents |
| API Security Calls | testzeus_hercules/core/tools/api_sec_calls.py | Provides low-level HTTP client operations for security validation |
| Gherkin Test Generator | helper_scripts/generate_api_security_gherkin_test.py | Generates security-focused Gherkin test cases from OpenAPI specs |

Architecture

graph TD
    A[OpenAPI Spec Files] --> B[generate_api_security_gherkin_test.py]
    B --> C[LLM API - OpenAI]
    C --> D[Security Gherkin Test Cases]
    D --> E[Hercules Test Executor]
    E --> F[sec_nav_agent.py]
    F --> G[api_sec_calls.py]
    G --> H[Target API Endpoints]
    H --> I[Security Validation Results]
    
    J[Configuration] --> F
    K[LLM Config] --> B

Security Navigation Agent

The sec_nav_agent.py module follows the same BrowserNavAgent pattern, specialized for security testing scenarios. It reuses the agent architecture of the browser navigation agent but focuses on API security validation.

Agent Configuration

The security agent inherits the core agent configuration structure from the Hercules framework, utilizing the same LLM integration patterns as the main navigation agent defined in simple_hercules.py.

Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-50

Execution Flow

sequenceDiagram
    participant TestRunner
    participant SecNavAgent
    participant APISecCalls
    participant TargetAPI
    participant ResponseParser
    
    TestRunner->>SecNavAgent: Execute security test scenario
    SecNavAgent->>APISecCalls: Send HTTP request with security payload
    APISecCalls->>TargetAPI: Validated HTTP request
    TargetAPI->>APISecCalls: Response with headers/body
    APISecCalls->>ResponseParser: Parse response data
    ResponseParser->>SecNavAgent: Structured security results
    SecNavAgent->>TestRunner: Security validation report

API Security Calls Module

The api_sec_calls.py module provides the foundational HTTP client capabilities for executing security tests. It supports various HTTP methods and authentication schemes required for comprehensive API security testing.

Supported Security Test Operations

| Operation | Description | Authentication Support |
|---|---|---|
| GET Security Headers | Validate presence and correctness of security headers | Bearer, API Key, Basic |
| POST Injection Tests | Execute payload injection for XSS, SQLi validation | Bearer, API Key |
| Authentication Bypass | Test unauthorized access to protected endpoints | Token validation |
| Rate Limiting | Verify rate limiting mechanisms | None required |
| CORS Validation | Check cross-origin resource sharing policies | None required |

Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100
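A "GET Security Headers" style check could be sketched as below. The specific header list is an assumption for illustration, not the module's actual policy.

```python
# Headers commonly required by security policies (illustrative list).
REQUIRED_SECURITY_HEADERS = (
    "X-Frame-Options",
    "Content-Security-Policy",
    "Strict-Transport-Security",
)


def missing_security_headers(headers: dict) -> list:
    """Return the required security headers absent from a response.

    Comparison is case-insensitive, since HTTP header names are.
    """
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_SECURITY_HEADERS if h.lower() not in present]
```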

Request Configuration

# Security test request structure
{
    "method": "GET|POST|PUT|DELETE|PATCH",
    "url": "https://api.target.com/endpoint",
    "headers": {
        "Authorization": "Bearer {token}",
        "Content-Type": "application/json"
    },
    "params": {},  # Query parameters
    "data": {},    # Request body for POST/PUT/PATCH
    "timeout": 30,
    "verify_ssl": true
}
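A small validator can enforce that request structure before dispatch. `normalize_request` is a hypothetical helper; the defaults mirror the structure shown above.

```python
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH"}


def normalize_request(config: dict) -> dict:
    """Validate a security-test request dict and apply defaults."""
    if "url" not in config:
        raise ValueError("request config requires a 'url'")
    method = config.get("method", "GET").upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    return {
        "method": method,
        "url": config["url"],
        "headers": config.get("headers", {}),
        "params": config.get("params", {}),
        "data": config.get("data", {}),
        "timeout": config.get("timeout", 30),       # seconds
        "verify_ssl": config.get("verify_ssl", True),
    }
```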

Gherkin Test Generation

The generate_api_security_gherkin_test.py helper script uses LLM to automatically generate security-focused Gherkin test cases from OpenAPI specification files.

Input Processing

The generator accepts OpenAPI specifications in both YAML and JSON formats, parsing the specification to identify:

  • Endpoints and their HTTP methods
  • Security schemes defined in the spec
  • Request/response schemas
  • Authentication requirements

Sources: helper_scripts/generate_api_security_gherkin_test.py:1-60

Generation Prompt Strategy

The LLM prompt instructs the model to focus on generating tests that validate:

  1. Vulnerability Detection: Tests that check for common vulnerabilities
  2. Configuration Weaknesses: Validation of security configurations
  3. Sensitive Data Handling: Verification of proper data protection
  4. Authentication/Authorization: Access control testing
  5. Input Validation: Sanitization and validation checks

Output Format

Generated test cases follow this structure:

Feature: API Security Validation - {Endpoint_Name}
    Scenario: Validate security headers on {method} {path}
        Given the API endpoint "{path}" requires authentication
        When I send a {method} request without authorization
        Then the response should have status code 401
        And the response should include "WWW-Authenticate" header
    
    Scenario: Test for SQL injection vulnerability on {path}
        Given the API endpoint "{path}" accepts query parameters
        When I send a GET request with malicious payload in parameter "id"
        Then the response should have status code 400 or 422
        And no SQL error should be present in response body

Sources: helper_scripts/generate_api_security_gherkin_test.py:80-120

Command Line Interface

Running Security Tests

# Generate security tests from OpenAPI spec
python -m helper_scripts.generate_api_security_gherkin_test \
    --input spec.yaml \
    --output ./tests/security \
    --model gpt-4o \
    --number_of_testcase 50

# Execute security tests with Hercules
testzeus-hercules --input-file ./tests/security/api_security.feature

Generation Script Arguments

| Argument | Type | Default | Description |
|---|---|---|---|
| input_files | list[str] | required | One or more OpenAPI spec files (YAML or JSON) |
| --output | str | required | Output folder for generated feature files |
| --model | str | o1-preview | OpenAI model for test generation |
| --number_of_testcase | int | 100 | Number of test cases to generate |

Sources: helper_scripts/generate_api_security_gherkin_test.py:30-55

Integration with Hercules Framework

Agent Initialization

The security agent is initialized through the same mechanism as other Hercules agents, using configuration from the LLM configuration file specified via CLI:

testzeus-hercules \
    --input-file security_tests.feature \
    --agents-llm-config-file ./config/security_agent.yaml \
    --llm-model gpt-4o

Execution Context

Security tests execute within the same context as functional tests, providing:

  • Shared browser/playwright session management
  • Consistent logging and telemetry
  • Unified reporting and result aggregation
  • Access to shared utilities and helpers

Security Test Scenarios

Common Test Categories

| Category | Test Focus | Example Validation |
|---|---|---|
| Authentication | Token validation, session management | Invalid token returns 401 |
| Authorization | Access control, privilege escalation | User cannot access admin endpoints |
| Input Validation | Payload sanitization, type checking | Malformed input returns 400 |
| Headers | Security header presence | X-Frame-Options, CSP headers present |
| Rate Limiting | Request throttling | Excessive requests return 429 |
| CORS | Cross-origin policy | Invalid origins rejected |

Best Practices

Test Data Management

  • Use dedicated security test environments
  • Isolate security tests from production data
  • Implement proper cleanup for test artifacts
  • Rotate API keys/tokens used in tests

Test Coverage

  • Aim for comprehensive endpoint coverage
  • Include both positive and negative test cases
  • Validate all security headers defined in your policy
  • Test authentication bypass scenarios

Configuration

Environment Variables

| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | Required for LLM-powered test generation |
| SECURITY_TEST_API_KEY | API key for testing authenticated endpoints |
| SECURITY_TEST_BASE_URL | Override target API base URL |
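Resolution of these variables might look like the sketch below. The variable names come from the table; the precedence logic and the treatment of OPENAI_API_KEY as hard-required are assumptions.

```python
import os


def load_security_test_env(spec_default_url: str) -> dict:
    """Resolve security-test environment variables (illustrative sketch)."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required for test generation")
    return {
        "openai_api_key": api_key,
        "security_test_api_key": os.environ.get("SECURITY_TEST_API_KEY"),
        # The env override takes precedence over the spec's server URL.
        "base_url": os.environ.get("SECURITY_TEST_BASE_URL", spec_default_url),
    }
```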

LLM Configuration

The security agent uses the same LLM configuration structure as other agents, specified through:

# security_agent_config.yaml
llm_config:
  model: gpt-4o
  temperature: 0.7
  max_tokens: 4096

other_settings:
  system_prompt: "You are a security testing expert..."
  max_consecutive_auto_reply: 10

Reporting

Security test results are integrated into the standard Hercules reporting format:

  • Test pass/fail status per scenario
  • Detailed assertion results
  • HTTP request/response logging
  • Security-specific metrics (headers present, vulnerabilities detected)

Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100

MCP Integration

Related topics: Agent System, Tool System


Overview

The MCP (Model Context Protocol) Integration in TestZeus Hercules enables the testing agent to discover, catalog, and execute tools exposed by external MCP servers. This integration allows Hercules to extend its capabilities by leveraging tools from multiple Model Context Protocol-compliant servers during end-to-end testing workflows.

MCP serves as a standardized communication layer that allows the testing framework to:

  • Enumerate and connect to configured MCP servers
  • List available tools and resource namespaces from each connected server
  • Execute remote tool calls with correct parameters
  • Retrieve resources by URI when required for test execution

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10

Architecture

System Components

The MCP integration is built on three primary components:

| Component | File | Purpose |
|---|---|---|
| McpNavAgent | core/agents/mcp_nav_agent.py | Main navigation agent that orchestrates MCP server interactions |
| MCPHelper | utils/mcp_helper.py | Utility class providing MCP client functionality |
| MCP Tools | core/tools/mcp_tools.py | Tool implementations for MCP operations |

Component Relationship

graph TD
    A[TestZeus Hercules Core] --> B[McpNavAgent]
    B --> C[MCPHelper]
    C --> D[MCP Servers]
    
    B --> E[get_configured_mcp_servers]
    B --> F[check_mcp_server_connection]
    B --> G[execute_mcp_tool]
    B --> H[read_mcp_resource]
    
    E --> I[Server Discovery]
    F --> J[Connection Status]
    G --> K[Tool Execution]
    H --> L[Resource Retrieval]

McpNavAgent

The McpNavAgent is the central agent responsible for all MCP-related operations. It inherits from BaseNavAgent and implements the Model Context Protocol interaction patterns.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:6-9

Agent Configuration

| Property | Value | Description |
|---|---|---|
| agent_name | mcp_nav_agent | Unique identifier for the agent |
| Inherits | BaseNavAgent | Base navigation agent functionality |

Core Functions

The MCP Navigation Agent implements the following core functions:

  1. Server Discovery - Enumerate configured MCP servers and their connection status
  2. Capability Cataloging - List tools and resource namespaces for each connected server
  3. Tool Execution - Call tools with correct parameters and handle responses
  4. Resource Retrieval - Read resources by URI when required
  5. Result Summarization - Capture server, tool, arguments, outputs; include timings and status

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:14-28

Operational Rules

Rule 1: Previous Step Validation

Before any new action, explicitly review the previous step and its outcome. Do not proceed if the prior critical step failed; address it first.

graph TD
    A[Execute Action] --> B{Previous Step Succeeded?}
    B -->|No| C[Address Failure First]
    B -->|Yes| D[Continue to Next Action]
    C --> D

Rule 2: Server Scan First

The agent must call get_configured_mcp_servers and for each server, call check_mcp_server_connection before taking any other action.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:31-35

Agent Prompt

The agent uses a specialized system prompt that defines its role and behavioral guidelines:

### MCP Navigation Agent

You are an MCP (Model Context Protocol) Navigation Agent that assists the Testing Agent by discovering MCP servers, cataloging their exposed tools/resources, and executing the right tool calls to complete the task. Always begin by scanning all configured servers before taking any action.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:11-21

MCPHelper Utility

The MCPHelper class provides the underlying functionality for MCP server interactions. It is exported through the mcp_helper.py module and integrates with the agent system through set_mcp_agents.

Sources: testzeus_hercules/utils/mcp_helper.py

Key Functions

| Function | Purpose |
|---|---|
| MCPHelper | Main helper class for MCP operations |
| set_mcp_agents | Configures MCP agents within the testing framework |

Configuration

Configuration File Format

MCP servers are configured using a JSON file. The example file mcp_servers.example.json demonstrates the expected format:

{
  "mcpServers": {
    "server_name": {
      "command": "command_to_run",
      "args": ["arg1", "arg2"],
      "env": {
        "KEY": "value"
      }
    }
  }
}
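Loading and sanity-checking a document in this shape could look like the following sketch. `parse_mcp_servers` is a hypothetical helper; the required `command` key and the defaults for `args`/`env` are inferred from the example format above.

```python
import json


def parse_mcp_servers(raw_json: str) -> dict:
    """Parse an mcp_servers config document and validate its shape."""
    doc = json.loads(raw_json)
    servers = doc.get("mcpServers")
    if not isinstance(servers, dict):
        raise ValueError("config must contain an 'mcpServers' object")
    for name, entry in servers.items():
        if "command" not in entry:
            raise ValueError(f"server '{name}' is missing 'command'")
        # args and env are optional in the example format.
        entry.setdefault("args", [])
        entry.setdefault("env", {})
    return servers
```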

Command Line Arguments

The MCP integration can be configured through command line arguments:

| Argument | Type | Description |
|---|---|---|
| --agents-llm-config-file | string | Path to the agents LLM configuration file |
| --agents-llm-config-file-ref-key | string | Reference key for the agents LLM configuration file |

Sources: testzeus_hercules/config.py:27-39

Workflow

Standard MCP Interaction Flow

graph TD
    A[Start Test Execution] --> B[Initialize McpNavAgent]
    B --> C[Call get_configured_mcp_servers]
    C --> D[For Each Server]
    D --> E[Call check_mcp_server_connection]
    E --> F{Server Connected?}
    F -->|No| G[Log Error / Skip Server]
    F -->|Yes| H[Catalog Tools & Resources]
    H --> I[Task Requires MCP Tool?]
    I -->|Yes| J[Call execute_mcp_tool]
    I -->|No| K[Continue with Other Tasks]
    J --> L[Process Tool Response]
    L --> M[Return Results to Testing Agent]
    G --> D
    K --> N[Complete Test]
    M --> N

Tool Execution Workflow

When executing MCP tools, the agent follows this sequence:

  1. Identify the target MCP server
  2. Verify server connection status
  3. Determine the correct tool and parameters
  4. Execute the tool call via MCP protocol
  5. Capture response including timing and status
  6. Return formatted results to the testing agent
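Step 5's capture of timing and status can be sketched as a generic wrapper. `timed_tool_call` is an illustrative name, not the module's real API.

```python
import time
from typing import Any, Callable


def timed_tool_call(tool: Callable[..., Any], **arguments) -> dict:
    """Invoke a tool and record its output, status, and duration."""
    start = time.perf_counter()
    try:
        output = tool(**arguments)
        status = "success"
    except Exception as exc:  # report the failure instead of crashing the agent
        output = str(exc)
        status = "error"
    return {
        "arguments": arguments,
        "output": output,
        "status": status,
        "duration_s": round(time.perf_counter() - start, 4),
    }
```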

Integration with Testing Framework

Agent Hierarchy

graph BT
    A[BrowserNavAgent] --> B[BaseNavAgent]
    C[ApiNavAgent] --> B
    D[SqlNavAgent] --> B
    E[McpNavAgent] --> B
    F[SecNavAgent] --> B
    
    B --> G[TestZeus Hercules Core]

All navigation agents, including McpNavAgent, inherit from BaseNavAgent, ensuring consistent behavior and integration with the core testing framework.

Sources: testzeus_hercules/core/agents/__init__.py
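The hierarchy above can be illustrated with a skeletal class sketch. This is illustrative only: the real BaseNavAgent API is not shown on this page, so the attributes and method here are assumptions.

```python
class BaseNavAgent:
    """Illustrative stand-in for the shared navigation-agent base class."""

    agent_name: str = "base_nav_agent"

    def describe(self) -> str:
        return f"{self.agent_name} ({type(self).__name__})"


class McpNavAgent(BaseNavAgent):
    agent_name = "mcp_nav_agent"


class ApiNavAgent(BaseNavAgent):
    agent_name = "api_nav_agent"
```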

Available Navigation Agents

| Agent | Purpose |
|---|---|
| BrowserNavAgent | Web browser interaction and navigation |
| ApiNavAgent | API testing and validation |
| SqlNavAgent | Database query execution |
| McpNavAgent | MCP server tool execution |
| SecNavAgent | Security testing operations |

Best Practices

Initialization

  1. Always ensure MCP servers are properly configured before test execution
  2. Verify server connectivity before attempting tool calls
  3. Use the configured servers list as the authoritative source of available MCP servers

Error Handling

  1. Check previous step outcomes before proceeding
  2. Log connection failures with server identification
  3. Handle tool execution errors with proper parameter validation
  4. Provide clear error messages when MCP operations fail

Task Focus

  • Execute only actions required by the primary testing task
  • Use extra information from MCP responses cautiously
  • Avoid unnecessary server scans after initial discovery

Security Considerations

The MCP integration supports sensitive operations requiring careful configuration:

  • API keys should be provided through secure environment variables
  • Server configurations should be validated before use
  • Tool execution permissions should be properly scoped
  • Resource access should follow least-privilege principles

Summary

The MCP Integration module provides TestZeus Hercules with the ability to extend its testing capabilities through external MCP servers. By implementing a dedicated McpNavAgent that follows standardized MCP protocols, the framework can seamlessly discover servers, catalog their capabilities, and execute tools as needed during end-to-end testing scenarios.

Key benefits include:

  • Extensibility: Add new testing capabilities without modifying core framework code
  • Standardization: Uses the Model Context Protocol for consistent server communication
  • Resource Management: Access remote resources via standardized URI-based retrieval
  • Comprehensive Logging: Captures server status, tool execution times, and results

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10

Doramagic Pitfall Log


Doramagic extracted 13 source-linked risk signals. Review them before installing or handing real data to the project.

1. Configuration risk: 0.1.1

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: 0.1.1. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.1

2. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | README/documentation is current enough for a first validation pass.

3. Maintenance risk: 0.0.40

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.0.40. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.0.40

4. Maintenance risk: 0.1.0

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.0. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.0

5. Maintenance risk: 0.1.2

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.2. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.2

6. Maintenance risk: 0.1.6

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.6. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.6

7. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | last_activity_observed missing

8. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium

9. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium

10. Security or permission risk: 0.1.4

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: 0.1.4. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.4

11. Security or permission risk: 0.2.2

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: 0.2.2. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.2.2

12. Maintenance risk: issue_or_pr_quality=unknown

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | issue_or_pr_quality=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

  • Sources: 8 project-level external discussion links are exposed on this manual page.
  • Use: review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using testzeus-hercules with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence