Doramagic Project Pack · Human Manual

testzeus-hercules

Getting Started with TestZeus Hercules

TestZeus Hercules is an open-source AI-powered end-to-end testing framework that leverages large language models (LLMs) to automate browser testing. It provides both interactive and non-interactive modes for executing automated tests against web applications.

Overview

TestZeus Hercules serves as an intelligent testing agent that can:

  • Navigate web pages and interact with UI elements
  • Generate and execute Gherkin-style test scenarios
  • Perform API security scanning using Nuclei
  • Parse accessibility trees for element identification
  • Execute Python scripts in a sandboxed environment

Sources: CONTRIBUTING.md

Architecture

graph TD
    A[User Input] --> B[CLI / Main Entry]
    B --> C[Global Configuration]
    C --> D[Navigation Agent]
    D --> E[Browser Controller]
    E --> F[CDP Stream Renderer]
    F --> G[Accessibility Tree]
    G --> D
    D --> H[Python Sandbox Executor]
    H --> I[Test Results]
    D --> J[API Security Scanner]
    J --> K[Nuclei Integration]

Core Components

| Component | Purpose |
| --- | --- |
| testzeus_hercules/__main__.py | Entry point handling bulk test execution |
| testzeus_hercules/config.py | Command-line argument parsing and configuration |
| testzeus_hercules/telemetry.py | Installation tracking and error reporting |
| testzeus_hercules/core/agents/executor_nav_agent.py | Navigation agent for browser automation |
| frontend/*/index.html | CDP stream rendering interfaces |

Sources: testzeus_hercules/__main__.py:25-45, testzeus_hercules/config.py

Installation

Prerequisites

  • Python 3.x
  • Git
  • Make

Setup Steps

  1. Fork the repository and clone your fork

```bash
git clone git@github.com:YOUR_GIT_USERNAME/testzeus-hercules.git
cd testzeus-hercules
git remote add upstream https://github.com/test-zeus-ai/testzeus-hercules
```

  2. Create a virtual environment

```bash
make virtualenv
source .venv/bin/activate
```

  3. Install in development mode

```bash
make install
```

Sources: CONTRIBUTING.md

Command-Line Interface

TestZeus Hercules provides extensive CLI options for configuration.

Basic Options

| Parameter | Type | Description |
| --- | --- | --- |
| --input-file | str | Path to the input file |
| --output-path | str | Path to the output directory |
| --test-data-path | str | Path to the test data directory |
| --project-base | str | Path to the project base directory |

Sources: testzeus_hercules/config.py
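As a hedged sketch, options like these could be declared with argparse (this is an assumption for illustration; the project's actual parser in config.py may be structured differently):

```python
import argparse


def build_basic_parser() -> argparse.ArgumentParser:
    """Illustrative declaration of the basic path options (not the real parser)."""
    parser = argparse.ArgumentParser(prog="testzeus_hercules")
    parser.add_argument("--input-file", type=str, help="Path to the input file")
    parser.add_argument("--output-path", type=str, help="Path to the output directory")
    parser.add_argument("--test-data-path", type=str, help="Path to the test data directory")
    parser.add_argument("--project-base", type=str, help="Path to the project base directory")
    return parser
```

argparse converts the dashed flag names to underscored attributes, so `--input-file` becomes `args.input_file`.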

LLM Configuration

| Parameter | Type | Description |
| --- | --- | --- |
| --llm-model | str | Name of the LLM model |
| --llm-model-api-key | str | API key for the LLM model |
| --llm-model-base-url | str | Base URL for the LLM API |
| --llm-model-api-type | str | Type of API (openai, anthropic, azure) |
| --llm-temperature | float | Temperature for LLM sampling (0.0-1.0) |
| --agents-llm-config-file | str | Path to agents LLM configuration file |

Sources: testzeus_hercules/config.py

Browser Options

| Parameter | Description |
| --- | --- |
| --browser-channel | Browser channel (e.g., chrome-beta, firefox-nightly) |
| --browser-path | Custom path to browser executable |
| --browser-version | Specific browser version (e.g., '114', '115.0.1', 'latest') |
| --enable-ublock | Enable uBlock Origin extension |
| --disable-ublock | Disable uBlock Origin extension |
| --auto-accept-screen-sharing | Automatically accept screen sharing prompts |

Sources: testzeus_hercules/config.py

Test Execution Options

| Parameter | Description |
| --- | --- |
| --bulk | Execute tests in bulk from the tests directory |
| --reuse-vector-db | Reuse an existing vector DB instead of creating a fresh one |

Sources: testzeus_hercules/config.py, testzeus_hercules/__main__.py:45-60

Portkey Integration

| Parameter | Description |
| --- | --- |
| --enable-portkey | Enable Portkey integration for LLM routing |
| --portkey-api-key | API key for Portkey |
| --portkey-strategy | Routing strategy (fallback or loadbalance) |

Sources: testzeus_hercules/config.py

Sandbox Configuration

| Parameter | Description |
| --- | --- |
| --sandbox-tenant-id | Tenant ID for sandbox isolation |

Sources: testzeus_hercules/config.py

Running TestZeus Hercules

Interactive Mode

Run the interactive CDP stream renderer with user input capabilities:

make run-interactive

This launches the frontend at frontend/interactive/index.html which provides:

  • Real-time screencast display
  • Crosshair cursor for element selection
  • Input capture for typing into the remote page

Sources: frontend/interactive/index.html

Non-Interactive Mode

Run tests without user interaction:

make run

This uses the non-interactive frontend at frontend/non-interactive/index.html which displays:

  • Connection status
  • Screencast output only

Sources: frontend/non-interactive/index.html

Bulk Execution

Execute multiple tests from a tests directory:

python -m testzeus_hercules --bulk

The system checks for a tests directory in the project source root and processes each test folder:

if get_global_conf().should_execute_bulk():
    project_base = get_global_conf().get_project_source_root()
    tests_dir = os.path.join(project_base, "tests")

Sources: testzeus_hercules/__main__.py:45-55
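The directory walk that follows can be pictured as a small helper. This is a self-contained sketch: the function name and iteration details are assumptions, only the should_execute_bulk / get_project_source_root calls appear in the source excerpt above.

```python
import os


def discover_bulk_tests(project_base: str) -> list[str]:
    """Return the per-test folder paths under <project_base>/tests (illustrative)."""
    tests_dir = os.path.join(project_base, "tests")
    if not os.path.isdir(tests_dir):
        return []
    # Each subdirectory of tests/ is treated as one test to execute.
    return sorted(
        os.path.join(tests_dir, entry)
        for entry in os.listdir(tests_dir)
        if os.path.isdir(os.path.join(tests_dir, entry))
    )
```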

Response Parsing

TestZeus Hercules includes a robust response parser for handling LLM outputs:

graph LR
    A[LLM Response] --> B{Is JSON?}
    B -->|Yes| C[Parse JSON]
    B -->|No| D[Extract Plan/Next Step]
    C --> E[Return Dict]
    D --> E

The parser handles:

  • JSON wrapped in ```json code blocks
  • Plain JSON responses
  • Fallback extraction for plan and next_step fields

Sources: testzeus_hercules/utils/response_parser.py
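A minimal, self-contained sketch of that fallback logic (the behaviors are taken from the list above; the function body itself is an assumption, not the project's actual implementation):

```python
import json
import re
from typing import Any


def parse_response(message: str) -> dict[str, Any]:
    """Best-effort parse of an LLM reply into a dict (illustrative sketch)."""
    text = message.strip()

    # 1. Unwrap a ```json ... ``` code fence if one is present.
    fence = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fence:
        text = fence.group(1).strip()

    # 2. Try to parse the remainder as plain JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 3. Fallback: scrape "plan" / "next_step" fields out of free text.
    result: dict[str, Any] = {}
    for key in ("plan", "next_step"):
        m = re.search(rf'"?{key}"?\s*:\s*"?([^"\n]+)"?', message)
        if m:
            result[key] = m.group(1).strip().rstrip(",")
    return result
```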

Telemetry and Installation Tracking

On first run, TestZeus Hercules generates a unique installation ID:

def get_installation_id(file_path: str = "installation_id.txt", is_manual_run: bool = True):
    if os.path.exists(file_path):
        # Load the existing installation ID from disk
        with open(file_path) as f:
            installation_id = f.read().strip()
    else:
        # First run: generate and persist a new installation ID
        installation_id = str(uuid.uuid4())
        with open(file_path, "w") as f:
            f.write(installation_id)
    return installation_id

Sources: testzeus_hercules/telemetry.py

Development Workflow

Code Quality

| Command | Purpose |
| --- | --- |
| make fmt | Format code using black & isort |
| make lint | Run pep8, black, mypy linters |
| make test | Run tests and generate coverage report |
| make watch | Run tests on every change |

Sources: CONTRIBUTING.md

Testing Requirements

  • Code coverage must remain at 100%
  • Add tests for all changes in your PR

make test

Sources: CONTRIBUTING.md

Release Process

  1. Make changes following the contribution guidelines
  2. Commit using conventional commit messages
  3. Run tests to ensure everything works
  4. Execute make release to create a new tag and push

CAUTION: make release modifies local changelog files and commits all unstaged changes.

Sources: CONTRIBUTING.md

Navigation Agent Execution

The executor navigation agent follows specific guidelines:

Execution Principles

  1. Error Review: Review previous step outcomes before proceeding
  2. Script Execution: Use execute_python_sandbox tool with access to page, browser, context, playwright_manager, logger, config
  3. Sequential Execution: Execute one script at a time and await results
  4. Validation: Check for successful execution status before proceeding

Sources: testzeus_hercules/core/agents/executor_nav_agent.py

API Security Scanning

TestZeus Hercules integrates with Nuclei for API security testing:

async def run_nuclei_command(
    is_open_api_spec: bool,
    open_api_spec_path: Optional[str],
    target_url: Optional[str],
    tag: str,
    output_file: Path,
    headers: Optional[List[Tuple[str, str]]] = None,
):

Sources: testzeus_hercules/core/tools/api_sec_calls.py

Helper Scripts

CDP Journey Script

Generate test cases from journey data:

python helper_scripts/cdp_journey_script.py --number_of_testcase 5

This produces Gherkin specifications and test data files from JSON journey definitions.

Sources: helper_scripts/cdp_journey_script.py

API Functional Gherkin Test Generator

Generate Gherkin test cases from OpenAPI specifications:

python helper_scripts/generate_api_functional_gherkin_test.py spec.yaml --output ./features --number_of_testcase 100

Sources: helper_scripts/generate_api_functional_gherkin_test.py

Accessibility Tree Processing

TestZeus Hercules processes DOM elements to generate accessibility trees:

  • Identifies interactive elements (buttons, links, inputs)
  • Detects draggable elements
  • Filters out non-interactive elements
  • Provides detailed element metadata for the AI agent

Sources: testzeus_hercules/utils/get_detailed_accessibility_tree.py
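A toy version of the filtering step above (the tag names come from the bullet list; the helper and its data shape are assumptions — the real implementation inspects the live DOM):

```python
from typing import Dict, List

# Tags the bullet list identifies as interactive (illustrative subset).
INTERACTIVE_TAGS = {"button", "a", "input", "textarea", "select"}


def filter_interactive(elements: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only elements an agent can act on: interactive tags or draggables."""
    return [
        el for el in elements
        if el.get("tag") in INTERACTIVE_TAGS or el.get("draggable") == "true"
    ]
```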

Summary

TestZeus Hercules provides a comprehensive end-to-end testing solution with:

  • AI-powered browser automation via LLM integration
  • Flexible deployment (interactive and non-interactive modes)
  • Extensive CLI configuration options
  • Built-in support for bulk test execution
  • API security scanning capabilities
  • Gherkin test generation from various sources

Sources: CONTRIBUTING.md

System Architecture

Related topics: Agent System, Tool System

Overview

TestZeus Hercules is an open-source AI agent framework designed for end-to-end testing of web applications. The system leverages Large Language Models (LLMs) to orchestrate browser automation through Playwright, enabling natural language-driven test execution without requiring users to write traditional test scripts.

The architecture follows a modular design pattern with clear separation between:

  • Core execution engine
  • Agent-based navigation and task handling
  • Python sandbox environment for script execution
  • Frontend visualization components
  • Configuration and telemetry systems

Sources: testzeus_hercules/__main__.py

Source: https://github.com/test-zeus-ai/testzeus-hercules / Human Manual

Agent System

Related topics: System Architecture, Memory Management, LLM Configuration

Overview

The Agent System is the core orchestration layer of the Hercules testing framework. It implements a multi-agent architecture where specialized agents collaborate to execute end-to-end testing scenarios across web browsers, APIs, databases, and other system components.

Sources: testzeus_hercules/core/agent_registry.py:1-50

Architecture Overview

The system follows a hierarchical agent design where a central planner coordinates specialized navigation agents, each responsible for a specific domain of interaction.

graph TD
    A[HighLevelPlannerAgent] --> B[BrowserNavAgent]
    A --> C[ApiNavAgent]
    A --> D[SqlNavAgent]
    A --> E[McpNavAgent]
    A --> F[SecNavAgent]
    A --> G[TimeKeeperNavAgent]
    B --> H[ExecutorNavAgent]
    C --> H
    D --> H
    E --> H
    F --> H
    G --> H
    
    H --> I[Browser/API/SQL/MCP/Security]

Sources: testzeus_hercules/core/simple_hercules.py:1-100

Agent Types

High-Level Planner Agent

The HighLevelPlannerAgent serves as the central coordinator that receives high-level test instructions and decomposes them into executable steps for specialized agents.

Key Responsibilities:

  • Parsing test instructions and generating execution plans
  • Routing tasks to appropriate specialized agents
  • Aggregating results and handling test completion
  • Managing assertions and validating expected outcomes

Sources: testzeus_hercules/core/agents/high_level_planner_agent.py:1-80

Browser Navigation Agent

The BrowserNavAgent handles all browser-based interactions including page navigation, element interaction, and DOM manipulation.

Capabilities:

  • Web page navigation and URL handling
  • Element clicking and text input
  • Screenshot capture and visual validation
  • Cookie and session management

Sources: testzeus_hercules/core/agents/browser_nav_agent.py:1-100

API Navigation Agent

The ApiNavAgent manages HTTP-based interactions for testing RESTful APIs and web services.

Capabilities:

  • HTTP request construction and execution
  • Response validation and assertion
  • Authentication handling (OAuth, API keys, Bearer tokens)
  • Multi-step API workflows

Sources: testzeus_hercules/core/agents/api_nav_agent.py:1-100

SQL Navigation Agent

The SqlNavAgent handles database interactions for data validation and setup during test execution.

Capabilities:

  • SQL query execution
  • Database connection management
  • Result set validation
  • Test data preparation and teardown

Sources: testzeus_hercules/core/agents/sql_nav_agent.py:1-100

MCP Navigation Agent

The McpNavAgent provides Model Context Protocol integration for interacting with external AI models and tools.

Capabilities:

  • MCP server connection management
  • Tool invocation through MCP protocol
  • Context propagation for AI-assisted testing

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-100

Security Navigation Agent

The SecNavAgent handles security-related testing scenarios including authentication flows, authorization checks, and vulnerability scanning.

Capabilities:

  • Authentication flow testing
  • Session security validation
  • Authorization boundary testing
  • Security header verification

Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-100

Time Keeper Navigation Agent

The TimeKeeperNavAgent manages time-related test scenarios including scheduling, delays, and time-based assertions.

Capabilities:

  • Time-based test scheduling
  • Delay and timeout management
  • Timestamp validation
  • Scheduled task execution

Sources: testzeus_hercules/core/agents/time_keeper_nav_agent.py:1-100

Executor Navigation Agent

The ExecutorNavAgent serves as the execution engine that runs Python scripts and commands within a sandboxed environment.

Key Features:

  • Python script execution in isolated sandbox
  • Dynamic module injection based on tenant configuration
  • Access to browser context, page objects, and configuration
  • Custom injection support for tenant-specific utilities

Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-150

Agent Registry

The AgentRegistry provides a centralized registration and lookup mechanism for all agents in the system.

Registry Operations

| Operation | Description |
| --- | --- |
| register_agent(name, agent) | Register a new agent with a unique name |
| get_agent(name) | Retrieve an agent by name |
| list_agents() | List all registered agents |
| remove_agent(name) | Remove an agent from the registry |

Sources: testzeus_hercules/core/agent_registry.py:50-100
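A minimal sketch of such a registry (the method names come from the table above; the internals are assumptions, not the project's actual code):

```python
from typing import Any, Dict, List, Optional


class AgentRegistry:
    """Illustrative registry mapping unique names to agent instances."""

    def __init__(self) -> None:
        self._agents: Dict[str, Any] = {}

    def register_agent(self, name: str, agent: Any) -> None:
        # Names are unique; re-registration is treated as an error here.
        if name in self._agents:
            raise ValueError(f"Agent {name!r} is already registered")
        self._agents[name] = agent

    def get_agent(self, name: str) -> Optional[Any]:
        return self._agents.get(name)

    def list_agents(self) -> List[str]:
        return list(self._agents)

    def remove_agent(self, name: str) -> None:
        self._agents.pop(name, None)
```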

Agent Creation Flow

The SimpleHercules class orchestrates agent creation with the following workflow:

sequenceDiagram
    participant SH as SimpleHercules
    participant Planner as HighLevelPlannerAgent
    participant Nav as Navigation Agents
    participant Exec as ExecutorNavAgent
    
    SH->>SH: Initialize configuration
    SH->>Planner: Create planner agent
    SH->>Nav: Create navigation agents (Browser, API, SQL, etc.)
    SH->>Exec: Create executor agent
    SH->>SH: Register all agents in registry
    Planner->>Nav: Route tasks based on type
    Nav->>Exec: Execute concrete actions

Sources: testzeus_hercules/core/simple_hercules.py:100-200

Message Flow

Agents communicate through a structured message passing system with the following message types:

| Message Type | Purpose |
| --- | --- |
| PLAN | Initial test plan and steps |
| STEP | Individual test step execution |
| INFO | Informational messages |
| ASSERT | Assertion results |
| COMPLETED | Task completion notification |
| TERMINATED | Agent termination signal |

Sources: testzeus_hercules/core/simple_hercules.py:200-300
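These message types could be modeled as a simple enum (a sketch only; how simple_hercules.py actually represents them may differ):

```python
from enum import Enum


class MessageType(str, Enum):
    """Illustrative message types exchanged between agents."""
    PLAN = "PLAN"              # initial test plan and steps
    STEP = "STEP"              # individual test step execution
    INFO = "INFO"              # informational messages
    ASSERT = "ASSERT"          # assertion results
    COMPLETED = "COMPLETED"    # task completion notification
    TERMINATED = "TERMINATED"  # agent termination signal
```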

Configuration

LLM Model Configuration

Each agent supports individual LLM model configuration:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model name (e.g., gpt-4, claude-3) |
| temperature | float | Sampling temperature (0.0-1.0) |
| max_tokens | int | Maximum response tokens |
| api_key | string | API authentication key |
| base_url | string | Custom API endpoint URL |

Sources: testzeus_hercules/config.py:1-80

Agent-Specific Settings

# Example agent configuration structure
agent_config = {
    "model_config_params": {
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 2000
    },
    "llm_config_params": {
        "timeout": 60,
        "retry_attempts": 3
    },
    "other_settings": {
        "system_prompt": "You are a testing agent...",
        "max_chat_rounds": 10
    }
}

Response Parsing

The system uses parse_response() from the response parser module to extract structured data from agent outputs:

def parse_response(message: str) -> dict[str, Any]:
    """Extract structured data from an agent message.

    Handles JSON extraction from markdown code blocks,
    normalizes newlines and whitespace, and extracts
    the plan and next_step fields.
    """

Sources: testzeus_hercules/utils/response_parser.py:1-60

Sandbox Execution

The ExecutorNavAgent provides a secure Python execution environment with configurable module injection:

Available Injections

| Module | Description |
| --- | --- |
| playwright | Browser automation library |
| requests | HTTP client library |
| beautifulsoup4 | HTML parsing |
| hercules_utils | Project utility functions |
| Custom packages | Configured via SANDBOX_PACKAGES |

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:1-100

Sandbox Context Variables

Scripts executed in the sandbox have automatic access to:

| Variable | Type | Description |
| --- | --- | --- |
| page | Playwright Page | Current browser page |
| browser | Playwright Browser | Browser instance |
| context | Playwright Context | Browser context |
| playwright_manager | Manager | Playwright management |
| logger | Logger | Logging interface |
| config | Config | Global configuration |
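The injection mechanism itself can be sketched with exec() over a prepared namespace. This is purely illustrative: the real executor wraps this in isolation and injects live Playwright objects rather than the stand-in logger used here.

```python
import logging
from typing import Any, Dict


def run_in_sandbox(script: str, context_vars: Dict[str, Any]) -> Dict[str, Any]:
    """Execute a script with injected context variables; return its namespace."""
    namespace: Dict[str, Any] = dict(context_vars)
    # The real sandbox adds isolation and resource limits around this step.
    exec(script, namespace)
    return namespace


# The executor would pass page, browser, context, etc.; here we inject a logger.
ns = run_in_sandbox(
    "logger.info('hello'); result = 2 + 2",
    {"logger": logging.getLogger("sandbox")},
)
```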

Command-Line Interface

The agent system supports configuration via command-line arguments:

| Argument | Description |
| --- | --- |
| --llm-model | Specify LLM model name |
| --llm-temperature | Set sampling temperature |
| --agents-llm-config-file | Path to agent config file |
| --enable-portkey | Enable Portkey routing |
| --browser-channel | Browser channel selection |
| --reuse-vector-db | Reuse existing vector database |

Sources: testzeus_hercules/config.py:80-150

Error Handling

Agents implement robust error handling with:

  1. Termination Message Check: Each agent validates termination conditions via is_xxx_termination_message() functions
  2. Tool Call Monitoring: Tracks pending tool calls to prevent premature termination
  3. Graceful Degradation: Continues execution with alternative approaches on failure
  4. Logging: Comprehensive logging for debugging and audit trails

Sources: testzeus_hercules/core/simple_hercules.py:300-400

Browser Automation

Related topics: Tool System

Overview

Browser Automation in TestZeus Hercules provides a comprehensive framework for controlling web browsers through Playwright, enabling autonomous agents to perform complex web interactions, testing, and data extraction tasks. The system acts as a bridge between LLM-powered agents and real browser instances, translating natural language instructions into precise DOM manipulations.

The automation layer handles multiple browser types (Chromium, Firefox, WebKit), manages browser contexts with device emulation, supports cloud-based testing platforms via CDP (Chrome DevTools Protocol) tunneling, and provides sophisticated DOM interaction tools including accessibility-aware element selection and real-time mutation observation.

Sources: testzeus_hercules/core/playwright_manager.py:1-50

Architecture

System Components

graph TD
    A[SimpleHercules Core] --> B[PlaywrightManager]
    B --> C[Browser Context]
    C --> D[Browser Instance<br/>Chromium / Firefox / WebKit]
    B --> E[Tool Registry]
    E --> F[Navigation Tools]
    E --> G[Interaction Tools]
    E --> H[Extraction Tools]
    B --> I[DOM Mutation Observer]
    B --> J[BrowserLogger]
    J --> K[Interaction Logs]
    K --> L[Accessibility Tree]
    L --> M[AccessibilityInfo]

Core Components

| Component | File | Purpose |
| --- | --- | --- |
| PlaywrightManager | core/playwright_manager.py | Central browser lifecycle management |
| BrowserLogger | core/browser_logger.py | Interaction logging and proof generation |
| DOMHelper | utils/dom_helper.py | DOM state management and waiting |
| AccessibilityTree | utils/get_detailed_accessibility_tree.py | Extract and format accessibility information |
| ToolRegistry | core/tools/tool_registry.py | Dynamic tool registration and routing |

Sources: testzeus_hercules/core/playwright_manager.py, testzeus_hercules/utils/dom_helper.py

Tool System

Related topics: Browser Automation

Overview

The Tool System is the core execution layer of TestZeus Hercules, providing AI agents with capabilities to interact with web pages through a unified, decorator-based interface. The system abstracts browser automation operations (clicking, typing, hovering, dragging, etc.) into discrete, callable tools that agents can invoke during test execution.

Tools serve as the fundamental building blocks that bridge natural language agent instructions with Playwright browser automation. Each tool encapsulates a specific browser interaction pattern, handles error cases gracefully, and returns structured results that agents can parse and respond to.

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-50

Architecture

System Components

graph TD
    subgraph "Agent Layer"
        A["Browser Nav Agent"]
        B["Executor Nav Agent"]
    end
    
    subgraph "Tool Registry"
        C["tool_registry.py"]
        D["Tool Decorator"]
    end
    
    subgraph "Core Browser Tools"
        E["click_using_selector"]
        F["enter_text_using_selector"]
        G["hover"]
        H["drag_and_drop_tool"]
        I["get_interactive_elements"]
    end
    
    subgraph "Extra Tools"
        J["browser_assist_tools"]
        K["accessibility_calls"]
        L["upload_file"]
    end
    
    subgraph "Browser Automation"
        M["Playwright Manager"]
        N["Page Object"]
    end
    
    A --> C
    B --> C
    C --> E
    C --> F
    C --> G
    C --> H
    C --> I
    C --> J
    J --> K
    J --> L
    E --> M
    F --> M
    G --> M
    M --> N

Tool Categories

| Category | Purpose | Location | Example Tools |
| --- | --- | --- | --- |
| Core Browser Tools | Primary page interactions | testzeus_hercules/core/tools/ | click, enter_text, hover, drag_drop |
| Extra Tools | Auxiliary functionality | testzeus_hercules/core/extra_tools/ | accessibility, browser_assist |
| Upload Tools | File handling | testzeus_hercules/core/tools/ | upload_file |

Tool Definition Pattern

The `@tool` Decorator

All tools in the system are defined using the @tool decorator, which registers the function with the tool registry and provides metadata for agent consumption.

from functools import wraps
from typing import Annotated, Any, Dict, List, Optional

def tool(agent_names: List[str], description: str, name: str):
    """Decorator to register a function as a callable tool."""
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            return await func(*args, **kwargs)
        wrapper._tool_config = {
            "agent_names": agent_names,
            "description": description,
            "name": name,
        }
        return wrapper
    return decorator

Tool Registration Metadata

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| agent_names | List[str] | List of agent identifiers that can call this tool | Yes |
| description | str | Human-readable description for the LLM | Yes |
| name | str | Unique identifier for the tool | Yes |

Example usage:

@tool(
    agent_names=["browser_nav_agent"],
    description="Click on an element using selector",
    name="click"
)
async def click_element(selector: Annotated[str, "CSS selector"]) -> dict:
    # Implementation
    pass

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30

Core Browser Tools

Click Tool (`click_using_selector`)

The primary interaction tool for clicking on page elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to click on an element in the DOM.",
    name="click"
)
async def click_using_selector(
    selector: Annotated[
        str,
        "md attribute value of the dom element to interact, md is an ID"
    ],
    click_type: Annotated[
        Optional[str],
        "type of click - left, right, double, mouseover, mouseenter, mouseleave, mouseexit, mousedown, mouseup"
    ] = "left",
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the click operation"]

Execution Flow:

sequenceDiagram
    participant Agent
    participant ClickTool
    participant PlaywrightManager
    participant Page
    participant SelectorLogger

    Agent->>ClickTool: click_using_selector(selector, click_type)
    ClickTool->>PlaywrightManager: find_element(selector)
    PlaywrightManager->>Page: locator(selector).first
    Page-->>PlaywrightManager: ElementHandle
    PlaywrightManager-->>ClickTool: element
    
    alt Element not found
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end
    
    ClickTool->>Page: element.scroll_into_view_if_needed()
    ClickTool->>Page: element.is_visible()
    
    alt Element not visible
        ClickTool-->>Agent: Try another element response
    end
    
    ClickTool->>SelectorLogger: get_alternative_selectors()
    ClickTool->>SelectorLogger: get_element_attributes()
    ClickTool->>Page: element.click(click_type)
    
    alt Click success
        ClickTool->>SelectorLogger: log_selector_interaction(success=True)
        ClickTool-->>Agent: Success response
    else Click fails
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end

Error Handling:

The click tool performs comprehensive error handling:

  1. Element Not Found: Logs selector interaction with success=False and raises ValueError
  2. Element Not Visible: Returns alternative suggestion response
  3. Scroll Failure: Gracefully continues (non-blocking)
  4. Click Execution Failure: Logs failure and returns error dictionary

if element is None:
    await selector_logger.log_selector_interaction(
        tool_name="click",
        selector=selector,
        action=type_of_click,
        selector_type="css" if "md=" in selector else "custom",
        success=False,
        error_message=f'Element with selector: "{selector}" not found',
    )
    raise ValueError(f'Element with selector: "{selector}" not found')

Sources: testzeus_hercules/core/tools/click_using_selector.py:40-80

Enter Text Tool (`enter_text_using_selector`)

Handles text input into form fields and contenteditable elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to enter the given text into an input field or a contenteditable element.",
    name="enter_text"
)
async def enter_text_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    text_to_fill: Annotated[str, "text to enter into the element"],
    submit: Annotated[Optional[bool], "whether to submit after entering text"] = False,
    press_enter_after_input: Annotated[bool, "press Enter key after filling text"] = False,
) -> Annotated[Dict[str, str], "Result dictionary with summary and details"]

Hover Tool (`hover`)

Moves mouse cursor over an element without clicking.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to hover over an element in the DOM.",
    name="hover"
)
async def hover_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the hover operation"]

Drag and Drop Tool (`drag_and_drop_tool`)

Performs HTML5 drag-and-drop operations between elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="used to drag and drop an element.",
    name="drag_and_drop"
)
async def drag_and_drop(
    source: Annotated[str, "md attribute value of source element"],
    target: Annotated[str, "md attribute value of target element"],
) -> Annotated[Dict[str, str], "Result of the drag and drop operation"]

Get Interactive Elements (`get_interactive_elements`)

Retrieves all interactive elements from the current page DOM.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Get interactive elements from the current page",
    name="get_interactive_elements"
)
async def get_interactive_elements() -> Annotated[str, "JSON string of interactive elements"]

DOM Processing Logic:

The tool executes JavaScript in the browser context to identify interactive elements:

const isInteractive = (element) => {
    // Skip the body element itself
    if (element.tagName.toLowerCase() === 'body') return false;

    // Check input-related elements
    const inputRelatedTags = new Set(['input', 'textarea', 'select', ...]);
    const isInputTag = inputRelatedTags.has(element.tagName.toLowerCase());

    // Check interactive ARIA roles
    const interactiveRoles = new Set(['button', 'link', 'checkbox', ...]);
    const hasInteractiveRole = interactiveRoles.has(element.getAttribute('role'));

    // Check for ARIA attributes
    const hasAriaProps = element.hasAttribute('aria-haspopup') ||
                         element.hasAttribute('aria-expanded') ||
                         element.hasAttribute('aria-checked');

    // Check cursor style
    const style = window.getComputedStyle(element);
    const hasPointerCursor = style.cursor === 'pointer';

    // Check draggable attribute
    const isDraggable = element.draggable;

    return isInputTag || hasInteractiveRole || hasAriaProps ||
           hasPointerCursor || isDraggable;
};

Sources: testzeus_hercules/core/tools/get_interactive_elements.py:1-50

Upload File Tool (`upload_file`)

Handles file upload dialogs by setting file paths on input elements.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Upload a file to the page",
    name="upload_file"
)
async def upload_file(
    selector: Annotated[str, "md attribute value of file input element"],
    file_path: Annotated[str, "Path to the file to upload"],
) -> Annotated[Dict[str, str], "Result of the upload operation"]

Extra Tools System

Dynamic Tool Loading

The extra tools are loaded dynamically at runtime using Python's pkgutil module. This allows the system to extend functionality without modifying core code.

Loading Mechanism:

# testzeus_hercules/core/extra_tools/__init__.py
import importlib
import pkgutil
from pathlib import Path

from testzeus_hercules.config import get_global_conf

package_path = Path(__file__).parent

if get_global_conf().get_load_extra_tools().lower().strip() != "false":
    for _, module_name, _ in pkgutil.iter_modules([str(package_path)]):
        full_module_name = f"testzeus_hercules.core.extra_tools.{module_name}"
        module = importlib.import_module(full_module_name)
        
        # Export all public attributes to current namespace
        for attribute_name in dir(module):
            if not attribute_name.startswith("_"):
                globals()[attribute_name] = getattr(module, attribute_name)

Configuration:

Extra tools can be disabled via configuration:

python -m testzeus_hercules --load-extra-tools=false

Sources: testzeus_hercules/core/extra_tools/__init__.py:1-25
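The same discovery pattern can be exercised standalone against a throwaway directory. This is a sketch of the mechanism only — `my_tool.py` and `my_extra_tool` are hypothetical names, and `spec_from_file_location` replaces the package import used by the real loader:

```python
import importlib.util
import pkgutil
import tempfile
from pathlib import Path

# Throwaway "extra tools" directory standing in for the real package path
pkg_dir = Path(tempfile.mkdtemp())
(pkg_dir / "my_tool.py").write_text("def my_extra_tool():\n    return 'ok'\n")

# Discover every module in the directory, import it, and collect its
# public attributes -- the same pattern the loader above uses
discovered = {}
for _, module_name, _ in pkgutil.iter_modules([str(pkg_dir)]):
    spec = importlib.util.spec_from_file_location(
        module_name, pkg_dir / f"{module_name}.py"
    )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    for attribute_name in dir(module):
        if not attribute_name.startswith("_"):
            discovered[attribute_name] = getattr(module, attribute_name)

print(discovered["my_extra_tool"]())  # → ok
```

Dropping a new module file into the directory is all it takes for its public functions to become available, which is the extensibility property the loader provides.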

Accessibility Calls (`accessibility_calls`)

Provides accessibility testing using the axe-core library.

Function Signature:

@tool(
    agent_names=["browser_nav_agent"],
    description="Check page accessibility using axe-core",
    name="check_accessibility"
)
async def check_accessibility(page_path: Annotated[str, "Page path or URL"]) -> str

Execution Flow:

  1. Fetches axe-core script from CDN
  2. Injects script into page
  3. Runs axe.run() in browser context
  4. Collects violations and incomplete checks
  5. Logs results in structured format

// Inject axe-core
await page.evaluate(
    `fetch("${AXE_SCRIPT_URL}").then(res => res.text())`
);

// Run accessibility checks
const results = await page.evaluate("async () => { return await axe.run(); }");

Response Format:

{
  "status": "success|failure",
  "message": "Human readable summary",
  "details": ["failure summary 1", "failure summary 2"]
}

Sources: testzeus_hercules/core/tools/accessibility_calls.py:1-80

Tool Return Format

All tools return a standardized dictionary structure:

| Field | Type | Description |
|---|---|---|
| `summary_message` | `str` | Brief status message for agent consumption |
| `detailed_message` | `str` | Extended information, including errors if any |
| `status` | `str` | Operation status (`success`/`failure`) |

Success Response Example:

return {
    "summary_message": f'Successfully clicked element with selector: "{selector}"',
    "detailed_message": f'Element with selector: "{selector}" clicked successfully. Tag: {element_tag_name}',
}

Error Response Example:

return {
    "summary_message": f'Element with selector: "{selector}" is not visible, Try another element',
    "detailed_message": f'Element with selector: "{selector}" is not visible, Try another element',
}

Selector Logging System

Every tool interaction is logged using the SelectorLogger for proof generation and debugging.

Logging Interface

from testzeus_hercules.utils.browser_logger import get_browser_logger

selector_logger = get_browser_logger(get_global_conf().get_proof_path())

# Log successful interaction
await selector_logger.log_selector_interaction(
    tool_name="click",
    selector=selector,
    action="left",
    selector_type="css",
    success=True,
)

# Log failed interaction
await selector_logger.log_selector_interaction(
    tool_name="click",
    selector=selector,
    action="left",
    selector_type="css",
    success=False,
    error_message="Element not found",
)

Captured Data

The logger captures:

  • Tool name and action type
  • Selector used
  • Selector type (CSS vs custom)
  • Success/failure status
  • Alternative selectors for the element
  • Element attributes
  • Error messages on failure

Agent Integration

Browser Nav Agent

The primary agent that consumes browser tools. It receives natural language instructions and translates them into tool calls.

Tool Invocation Pattern:

# From executor_nav_agent.py
async def execute_task(self, instruction: str):
    # The agent decides which tool to call;
    # the tool receives the selector from the instruction
    result = await click_using_selector(
        selector="[md='submit-button']",
        click_type="left"
    )

    # The agent processes the result
    if "error" in result.get("summary_message", "").lower():
        ...  # handle the error or try an alternative selector

Executor Nav Agent

Handles script execution and Python sandbox operations:

Script Execution Context:

| Variable | Type | Description |
|---|---|---|
| `page` | `Page` | Playwright page object |
| `browser` | `Browser` | Playwright browser instance |
| `context` | `BrowserContext` | Browser context |
| `playwright_manager` | `PlaywrightManager` | Manager instance |
| `logger` | `Logger` | Logging utility |
| `config` | `GlobalConf` | Global configuration |

Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-50

Bulk Operations

The tool system supports bulk execution for batch operations:

Bulk Slider Tool

@tool(
    agent_names=["browser_nav_agent"],
    description="used to set slider values in multiple sliders in single attempt.",
    name="bulk_set_slider"
)
async def bulk_set_slider(
    entries: Annotated[
        List[List[str]],
        "List of [selector, value] pairs",
    ],
) -> Annotated[List[Dict[str, str]], "List of results"]
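The bulk pattern above iterates over `[selector, value]` pairs and collects one result dictionary per entry. The following is a minimal, self-contained sketch of that flow — `set_slider` here is a hypothetical stand-in for the real single-slider tool, not the actual source:

```python
import asyncio
from typing import Dict, List

async def set_slider(selector: str, value: str) -> Dict[str, str]:
    # Hypothetical stand-in for the real single-slider tool
    return {"summary_message": f'Set slider "{selector}" to {value}',
            "status": "success"}

async def bulk_set_slider(entries: List[List[str]]) -> List[Dict[str, str]]:
    # Process each [selector, value] pair in order, mirroring the
    # bulk-tool pattern described above
    results = []
    for selector, value in entries:
        results.append(await set_slider(selector, value))
    return results

results = asyncio.run(bulk_set_slider(
    [["[md='volume']", "75"], ["[md='brightness']", "40"]]
))
print(len(results), results[0]["status"])  # → 2 success
```

Each entry yields its own standardized result, so the agent can report partial failures without aborting the whole batch.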

Configuration

Command Line Arguments

The tool system behavior can be configured via CLI:

| Argument | Type | Description |
|---|---|---|
| `--llm-model` | str | LLM model for agent decisions |
| `--llm-temperature` | float | LLM sampling temperature (0.0-1.0) |
| `--agents-llm-config-file` | str | Path to agents LLM config file |
| `--enable-portkey` | flag | Enable Portkey LLM routing |
| `--portkey-api-key` | str | Portkey API key |
| `--browser-channel` | str | Browser channel (chrome-beta, etc.) |
| `--browser-version` | str | Specific browser version |
| `--enable-ublock` | flag | Enable uBlock Origin extension |
| `--load-extra-tools` | str | Load extra tools (default: true) |

Sources: testzeus_hercules/config.py:1-100

Summary

The Tool System provides:

  1. Unified Interface: All browser interactions follow the @tool decorator pattern
  2. Agent Compatibility: Tools specify which agents can invoke them
  3. Error Resilience: Graceful handling of element not found, not visible, and execution failures
  4. Proof Generation: Comprehensive logging of all selector interactions
  5. Extensibility: Dynamic loading of extra tools without core modifications
  6. Standardized Results: Consistent return format across all tools

This architecture enables AI agents to reliably control browser automation while maintaining clean separation of concerns and testable component boundaries.
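The `@tool` decorator pattern used throughout can be illustrated with a minimal registry sketch. This is an assumption-laden illustration of the registration idea, not the framework's actual implementation:

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry keyed by tool name
TOOL_REGISTRY: Dict[str, Dict[str, Any]] = {}

def tool(agent_names: List[str], description: str, name: str) -> Callable:
    """Minimal sketch of a @tool registration decorator."""
    def decorator(func: Callable) -> Callable:
        # Record which agents may invoke the tool, plus its metadata
        TOOL_REGISTRY[name] = {
            "func": func,
            "agents": agent_names,
            "description": description,
        }
        return func
    return decorator

@tool(agent_names=["browser_nav_agent"], description="demo tool", name="noop")
def noop() -> str:
    return "ok"

print(sorted(TOOL_REGISTRY), TOOL_REGISTRY["noop"]["func"]())  # → ['noop'] ok
```

Registering metadata alongside the callable is what lets the framework restrict each tool to the agents named in `agent_names`.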

Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30

LLM Configuration

Related topics: Agent System, Memory Management


Overview

The LLM Configuration system in testzeus-hercules provides a flexible, multi-provider framework for configuring Large Language Models across different agent types. This system enables the framework to support various LLM providers (OpenAI, Anthropic, Ollama, Azure) while maintaining provider-specific configurations through a centralized registry mechanism.

The configuration architecture supports:

  • Multiple LLM providers simultaneously
  • Per-agent model selection
  • Dynamic parameter adaptation based on model family
  • External configuration file loading
  • Environment variable integration

Sources: testzeus_hercules/core/agents_llm_config.py:1-20

Architecture

High-Level Component Architecture

graph TD
    A[CLI Arguments / Environment] --> B[Global Config]
    C[agents-llm-config-file.json] --> D[AgentsLLMConfig]
    D --> E[AgentRegistry]
    E --> F[Provider Configs]
    F --> G[planner_agent]
    F --> H[nav_agent]
    F --> I[mem_agent]
    F --> J[helper_agent]
    B --> K[Model Utils]
    K --> L[adapt_llm_params_for_model]
    L --> G
    L --> H
    L --> I
    L --> J

Configuration Flow

sequenceDiagram
    participant CLI as CLI Arguments
    participant Config as Global Config
    participant JSON as LLM Config File
    participant Processor as AgentsLLMConfig
    participant Registry as AgentRegistry
    participant Agent as SimpleHercules
    
    CLI->>Config: Parse --llm-* arguments
    JSON->>Processor: load_from_file()
    Processor->>Registry: register_provider()
    Registry->>Registry: Store configs per provider
    Config->>Agent: Pass model configs
    Agent->>ModelUtils: adapt_llm_params_for_model()
    ModelUtils->>Agent: Adapted LLM params

Sources: testzeus_hercules/core/agents_llm_config.py:25-50

Core Components

AgentRegistry

The AgentRegistry manages configurations for multiple providers and supports switching between them.

class AgentRegistry:
    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, AgentConfig]] = {}
        self._active_provider: Optional[str] = None

| Method | Purpose |
|---|---|
| `register_provider(provider_key, configs)` | Register agent configs for a provider |
| `get_active_provider()` | Retrieve the currently active provider configuration |
| `set_active_provider(provider_key)` | Switch the active provider |
| `get_all_providers()` | List all registered providers |

Sources: testzeus_hercules/core/agents_llm_config.py:20-35
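A minimal, runnable sketch consistent with the `AgentRegistry` interface above (the "first registration becomes active" rule is an assumption, not confirmed behavior of the source):

```python
from typing import Any, Dict, List, Optional

AgentConfig = Dict[str, Any]  # assumption: per-agent config is a plain dict

class AgentRegistry:
    """Minimal sketch matching the interface documented above."""

    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, AgentConfig]] = {}
        self._active_provider: Optional[str] = None

    def register_provider(self, provider_key: str,
                          configs: Dict[str, AgentConfig]) -> None:
        self._providers[provider_key] = configs
        if self._active_provider is None:
            self._active_provider = provider_key  # assumed default

    def set_active_provider(self, provider_key: str) -> None:
        if provider_key not in self._providers:
            raise KeyError(f"Unknown provider: {provider_key}")
        self._active_provider = provider_key

    def get_active_provider(self) -> Dict[str, AgentConfig]:
        return self._providers[self._active_provider]

    def get_all_providers(self) -> List[str]:
        return list(self._providers)

registry = AgentRegistry()
registry.register_provider("openai", {"planner_agent": {"model_name": "gpt-4o"}})
registry.register_provider("anthropic",
                           {"planner_agent": {"model_name": "claude-3-5-haiku-latest"}})
registry.set_active_provider("anthropic")
print(registry.get_active_provider()["planner_agent"]["model_name"])
```

Keeping one registry with an active-provider pointer is what makes runtime provider switching (shown later) a single `set_active_provider` call.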

AgentsLLMConfig

The main configuration processor that handles loading and normalization of agent configurations.

class AgentsLLMConfig:
    def __init__(self) -> None:
        self.registry = AgentRegistry()
    
    def load_from_file(self, file_path: str, provider_key: Optional[str] = None) -> None
    def normalize_agent_config(self, agent_config: Dict[str, Any]) -> AgentConfig

Sources: testzeus_hercules/core/agents_llm_config.py:40-70

Agent Types

The framework defines four specialized agent roles, each with configurable LLM parameters:

| Agent Type | Purpose | Typical Model Requirements |
|---|---|---|
| `planner_agent` | Strategic task planning and decomposition | High reasoning capability |
| `nav_agent` | Browser navigation and UI interaction | Vision-capable, fast response |
| `mem_agent` | Memory and context management | Balanced performance |
| `helper_agent` | Utility functions and data processing | Variable based on task |

Sources: testzeus_hercules/core/simple_hercules.py:1-30

Configuration File Format

JSON Schema Structure

{
  "provider_name": {
    "agent_type": {
      "model_name": "string",
      "model_api_key": "string",
      "model_api_type": "openai|anthropic|azure|ollama",
      "model_client_host": "string (optional)",
      "model_native_tool_calls": true|false,
      "model_hide_tools": "if_any_run|user|never",
      "llm_config_params": {
        "cache_seed": null|integer,
        "temperature": 0.0,
        "seed": 12345,
        "max_tokens": 4096,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0,
        "stop": []
      }
    }
  }
}

Example Configuration

{
  "openai": {
    "planner_agent": {
      "model_name": "gpt-4o",
      "model_api_key": "${OPENAI_API_KEY}",
      "model_api_type": "openai",
      "llm_config_params": {
        "cache_seed": null,
        "temperature": 0.0,
        "seed": 12345
      }
    }
  },
  "anthropic": {
    "nav_agent": {
      "model_name": "claude-3-5-haiku-latest",
      "model_api_key": "",
      "model_api_type": "anthropic",
      "llm_config_params": {
        "cache_seed": null,
        "temperature": 0.0
      }
    }
  }
}

Sources: agents_llm_config-example.json:1-60
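The `${OPENAI_API_KEY}` placeholder in the example suggests environment-variable substitution at load time. The sketch below shows one way such a loader could work — `load_provider` and `_expand_env` are hypothetical names, and the substitution rule is an assumption based on the config format:

```python
import json
import os
import re
import tempfile

def _expand_env(value: str) -> str:
    # Assumption: ${VAR} placeholders resolve from the environment
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

def load_provider(path: str, provider_key: str) -> dict:
    """Load one provider block from an agents-llm-config file (sketch)."""
    with open(path) as f:
        raw = json.load(f)
    agents = raw[provider_key]
    for agent_cfg in agents.values():
        agent_cfg["model_api_key"] = _expand_env(agent_cfg.get("model_api_key", ""))
    return agents

# Demo with a throwaway config file and a fake key in the environment.
os.environ["OPENAI_API_KEY"] = "sk-demo"
config = {
    "openai": {
        "planner_agent": {
            "model_name": "gpt-4o",
            "model_api_key": "${OPENAI_API_KEY}",
            "model_api_type": "openai",
        }
    }
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(config, f)
    path = f.name

agents = load_provider(path, "openai")
print(agents["planner_agent"]["model_api_key"])  # → sk-demo
```

Resolving keys at load time keeps secrets out of the checked-in config file, matching the best practice noted later in this page.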

CLI Configuration Options

The framework provides comprehensive command-line arguments for LLM configuration:

Basic LLM Parameters

| Argument | Type | Description |
|---|---|---|
| `--llm-model` | string | Name of the LLM model |
| `--llm-model-api-key` | string | API key for the LLM model |
| `--llm-model-base-url` | string | Base URL for the LLM API |
| `--llm-model-api-type` | string | API type (openai, anthropic, azure, etc.) |
| `--llm-temperature` | float | Temperature for LLM sampling (0.0-1.0) |

LLM Configuration File Options

| Argument | Type | Description |
|---|---|---|
| `--agents-llm-config-file` | string | Path to the agents LLM configuration file |
| `--agents-llm-config-file-ref-key` | string | Reference key for selecting a provider |

Parameter Configuration

| Variable | Type | Description |
|---|---|---|
| `LLM_MODEL_PRICING` | env | Model pricing for cost tracking |
| `LLM_MODEL_TEMPERATURE` | env | Default temperature setting |
| `LLM_MODEL_CACHE_SEED` | env | Cache seed for reproducible results |
| `LLM_MODEL_SEED` | env | Random seed for generation |
| `LLM_MODEL_MAX_TOKENS` | env | Maximum tokens in response |
| `LLM_MODEL_PRESENCE_PENALTY` | env | Presence penalty (-2.0 to 2.0) |
| `LLM_MODEL_FREQUENCY_PENALTY` | env | Frequency penalty (-2.0 to 2.0) |
| `LLM_MODEL_STOP` | env | Stop sequences |

Sources: testzeus_hercules/config.py:1-100

Portkey Integration

The framework supports Portkey for advanced LLM routing with fallback and load balancing capabilities.

Portkey Configuration Options

| Argument | Type | Description |
|---|---|---|
| `--enable-portkey` | flag | Enable Portkey integration |
| `--portkey-api-key` | string | API key for Portkey |
| `--portkey-strategy` | choice | Routing strategy: `fallback` or `loadbalance` |

Environment Variables

| Variable | Description |
|---|---|
| `ENABLE_PORTKEY` | Enable/disable Portkey |
| `PORTKEY_API_KEY` | Portkey API key |
| `PORTKEY_STRATEGY` | Routing strategy |
| `PORTKEY_CACHE_ENABLED` | Enable response caching |
| `PORTKEY_TARGETS` | Target models for routing |
| `PORTKEY_GUARDRAILS` | Enable safety guardrails |
| `PORTKEY_RETRY_COUNT` | Number of retries on failure |

Sources: testzeus_hercules/config.py:50-80

Model Parameter Adaptation

The model_utils module provides intelligent parameter adaptation based on the model family being used.

adapt_llm_params_for_model

def adapt_llm_params_for_model(model_name: str, llm_config_params: Dict) -> Dict

This function automatically adjusts LLM parameters based on the detected model family:

| Model Family | Adaptation Behavior |
|---|---|
| o1-series | Removes temperature, adjusts max_tokens handling |
| GPT-4o | Standard parameters |
| Claude | Adjusts for the Anthropic API format |
| Ollama | Configures for local inference |

Applied to All Agents

# In SimpleHercules initialization
planner_model = self.planner_agent_config["model_config_params"].get("model") or \
                self.planner_agent_config["model_config_params"].get("model_name")

self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
    planner_model, 
    self.planner_agent_config["llm_config_params"]
)

Sources: testzeus_hercules/core/simple_hercules.py:100-130
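The adaptation rules in the table above can be sketched as a small, self-contained function. This is an illustrative assumption about the o1-series behavior (dropping `temperature`, renaming `max_tokens` to `max_completion_tokens`), not the actual body of `model_utils`:

```python
def adapt_llm_params_for_model(model_name: str, llm_config_params: dict) -> dict:
    """Sketch of the adaptation logic; the real rules live in model_utils."""
    params = dict(llm_config_params)  # never mutate the caller's config
    name = (model_name or "").lower()
    if name.startswith("o1"):
        # Assumption: o1-series models reject the temperature parameter
        # and expect max_completion_tokens instead of max_tokens.
        params.pop("temperature", None)
        if "max_tokens" in params:
            params["max_completion_tokens"] = params.pop("max_tokens")
    return params

adapted = adapt_llm_params_for_model(
    "o1-preview", {"temperature": 0.0, "max_tokens": 4096}
)
print(adapted)  # → {'max_completion_tokens': 4096}
```

Centralizing such quirks in one function keeps per-agent configs provider-agnostic.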

LLM Helper Utilities

The llm_helper module provides utility functions for LLM interactions:

| Function | Purpose |
|---|---|
| `convert_model_config_to_autogen_format()` | Convert config to AutoGen format |
| `create_multimodal_agent()` | Create agents with vision capabilities |
| `extract_target_helper()` | Extract target information from responses |
| `format_plan_steps()` | Format planning step outputs |
| `parse_agent_response()` | Parse agent response structure |
| `process_chat_message_content()` | Process chat message content |
| `parse_response()` | General response parsing |

Sources: testzeus_hercules/utils/llm_helper.py:1-30

Environment Variable Configuration

Full Configuration Matrix

| Environment Variable | Type | Default | Purpose |
|---|---|---|---|
| `LLM_MODEL_PRICING` | dict | - | Model pricing information |
| `LLM_MODEL_TEMPERATURE` | float | 0.0 | Default sampling temperature |
| `LLM_MODEL_CACHE_SEED` | int | null | Caching seed |
| `LLM_MODEL_SEED` | int | - | Random seed |
| `LLM_MODEL_MAX_TOKENS` | int | 4096 | Max response tokens |
| `LLM_MODEL_PRESENCE_PENALTY` | float | 0.0 | Presence penalty |
| `LLM_MODEL_FREQUENCY_PENALTY` | float | 0.0 | Frequency penalty |
| `LLM_MODEL_STOP` | list | [] | Stop sequences |
| `TOKEN_VERBOSE` | bool | false | Enable verbose token logging |
| `HF_HOME` | path | - | HuggingFace cache location |
| `TOKENIZERS_PARALLELISM` | bool | false | Parallel tokenizer configuration |

Sources: testzeus_hercules/config.py:100-150

Multi-Provider Configuration

Provider Switching

The system supports runtime provider switching through the configuration file:

graph LR
    A[Config File] --> B["provider: openai"]
    A --> C["provider: anthropic"]
    B --> D["planner_agent: GPT-4"]
    B --> E["nav_agent: GPT-4o-mini"]
    C --> F["planner_agent: Claude-3"]
    C --> G["nav_agent: Claude-3-Haiku"]

Selecting Active Provider

python -m testzeus_hercules \
    --agents-llm-config-file ./config.json \
    --agents-llm-config-file-ref-key "anthropic"

Sources: testzeus_hercules/core/agents_llm_config.py:60-80

Integration with SimpleHercules

The SimpleHercules class integrates all LLM configurations:

class SimpleHercules:
    def __init__(
        self,
        planner_agent_config: Dict[str, Any],
        nav_agent_config: Dict[str, Any],
        mem_agent_config: Dict[str, Any],
        helper_agent_config: Dict[str, Any],
        planner_max_chat_round: int = 50,
        browser_nav_max_chat_round: int = 100,
    ):
        # Configuration processing
        self.planner_agent_config = planner_agent_config
        self.nav_agent_config = nav_agent_config
        self.mem_agent_config = mem_agent_config
        self.helper_agent_config = helper_agent_config
        
        # Parameter adaptation per agent
        from testzeus_hercules.utils.model_utils import adapt_llm_params_for_model
        
        self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
            planner_model, 
            self.planner_agent_config["llm_config_params"]
        )

Sources: testzeus_hercules/core/simple_hercules.py:50-100

Best Practices

1. Configuration File Organization

  • Group configurations by provider (openai, anthropic, etc.)
  • Use environment variables for sensitive API keys
  • Maintain consistent agent naming across providers

2. Model Selection Guidelines

| Use Case | Recommended Models |
|---|---|
| Complex planning | GPT-4, Claude-3-Opus |
| Fast navigation | GPT-4o-mini, Claude-3-Haiku |
| Vision tasks | GPT-4o, Claude-3-Sonnet |
| Local inference | Ollama models |

3. Parameter Tuning

  • Use temperature: 0.0 for deterministic outputs
  • Set appropriate max_tokens based on expected response length
  • Enable model_native_tool_calls for better function calling

4. Cost Optimization

  • Use LLM_MODEL_PRICING for tracking
  • Enable Portkey caching with PORTKEY_CACHE_ENABLED
  • Consider fallback strategies for reliability

Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| Model not recognized | Check that `model_name` matches the provider format |
| Temperature ignored | Some models (o1-series) ignore the temperature parameter |
| API key errors | Use `${ENV_VAR}` syntax or the actual key in the config |
| Provider not found | Verify the provider key matches the config file structure |

Debug Configuration

export TOKEN_VERBOSE=true
export ENABLE_BROWSER_LOGS=true
python -m testzeus_hercules --llm-model gpt-4o --llm-temperature 0.7


Sources: testzeus_hercules/core/agents_llm_config.py:1-20

Memory Management

Related topics: Agent System, LLM Configuration


Overview

The Memory Management system in testzeus-hercules provides persistent and contextual memory capabilities for AI agents executing browser automation tests. The system consists of three primary components: Static LTM (Long Term Memory), Dynamic LTM, and State Handler, each serving distinct purposes in managing test execution context and data persistence.

The architecture follows a multi-layered approach where Static LTM loads pre-configured test data, Dynamic LTM manages vector-based retrieval augmented memory, and State Handler provides runtime state tracking for agent coordination.

graph TD
    A[Testzeus-Hercules] --> B[Static LTM]
    A --> C[Dynamic LTM]
    A --> D[State Handler]
    
    B --> E[Test Data Files]
    B --> F[Stored Data]
    B --> G[Run Data]
    
    C --> H[Vector Store]
    C --> I[RetrieveUserProxyAgent]
    
    D --> J[_state_string]
    D --> K[_state_dict]
    
    L[Agents] --> M[Memory Access]
    M --> B
    M --> C
    M --> D

Static LTM (Long Term Memory)

Purpose and Scope

Static LTM is responsible for loading and consolidating pre-configured test data at application initialization. It operates as a singleton pattern, ensuring that all test data is loaded once and made available throughout the test execution lifecycle.

Sources: testzeus_hercules/core/memory/static_ltm.py:1-47

Implementation

The StaticLTM class extends the singleton pattern to ensure only one instance exists:

class StaticLTM:
    _instance = None

    def __new__(cls) -> "StaticLTM":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance

Sources: testzeus_hercules/core/memory/static_ltm.py:17-22
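The singleton guarantee can be demonstrated standalone. Here the `_initialize` body is a stand-in for the real consolidation step, since only the `__new__` logic is taken from the source above:

```python
class StaticLTM:
    _instance = None

    def __new__(cls) -> "StaticLTM":
        # Construct and initialize only on the very first instantiation
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance

    def _initialize(self) -> None:
        # Stand-in for the real data-consolidation step
        self.consolidated_data = "base test data"

a, b = StaticLTM(), StaticLTM()
print(a is b)  # → True
print(b.consolidated_data)  # → base test data
```

Because every call returns the same object, all agents observe one consolidated copy of the test data for the whole run.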

Data Consolidation

During initialization, Static LTM consolidates three types of data sources:

| Data Source | Description | Method |
|---|---|---|
| Base Test Data | Loaded from `test_data.txt` via `load_data()` | `StaticDataLoader` |
| Stored Data | User-defined test artifacts | `get_stored_data()` |
| Run Data | Previous test execution context | `get_run_data()` |

Sources: testzeus_hercules/core/memory/static_ltm.py:26-34

The consolidated data is stored in self.consolidated_data and accessed via get_user_ltm():

def get_user_ltm(self) -> Optional[str]:
    return self.consolidated_data

Sources: testzeus_hercules/core/memory/static_ltm.py:40-47

Usage Pattern

Agents access Static LTM through a module-level function:

def get_user_ltm() -> Optional[str]:
    return StaticLTM().get_user_ltm()

Sources: testzeus_hercules/core/memory/static_ltm.py:50-54

Dynamic LTM

Purpose and Scope

Dynamic LTM provides runtime memory management with vector-based retrieval capabilities. It enables agents to store, retrieve, and utilize contextual information during test execution using a RetrieveUserProxyAgent backed by ChromaDB for vector storage.

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:1-40

Core Components

SilentRetrieveUserProxyAgent

A specialized agent that extends RetrieveUserProxyAgent with suppressed output to prevent console noise during agent conversations:

class SilentRetrieveUserProxyAgent(RetrieveUserProxyAgent):
    @suppress_prints
    def initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return super().initiate_chat(*args, **kwargs)

    @suppress_prints
    async def a_initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return await super().a_initiate_chat(*args, **kwargs)

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:37-46

Print Suppression Decorator

The @suppress_prints decorator redirects stdout to a StringIO buffer during function execution:

def suppress_prints(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        silent_stdout = io.StringIO()
        original_stdout, sys.stdout = sys.stdout, silent_stdout
        try:
            return func(*args, **kwargs)
        finally:
            sys.stdout = original_stdout
    return wrapper

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:17-29
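The decorator's effect is easy to verify in isolation. The `chatty` function below is a hypothetical example, not part of the framework:

```python
import functools
import io
import sys

def suppress_prints(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Swap stdout for an in-memory buffer for the duration of the call
        silent_stdout = io.StringIO()
        original_stdout, sys.stdout = sys.stdout, silent_stdout
        try:
            return func(*args, **kwargs)
        finally:
            sys.stdout = original_stdout
    return wrapper

@suppress_prints
def chatty() -> str:
    print("this noise is swallowed")  # never reaches the console
    return "result"

value = chatty()
print(value)  # → result
```

The return value passes through unchanged; only the `print` output inside the wrapped call is discarded, which is exactly what keeps agent conversations quiet.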

Integration with Configuration

Dynamic LTM respects global configuration to enable or disable its functionality:

from testzeus_hercules.config import get_global_conf

def save_content(self, content: str) -> None:
    config = get_global_conf()
    if not config.should_use_dynamic_ltm():
        return  # Skip when disabled

Sources: testzeus_hercules/core/simple_hercules.py:89-94

External Dependencies

Dynamic LTM utilizes the unstructured library for document parsing:

from unstructured.documents.elements import NarrativeText, Text, Title
from unstructured.partition.auto import partition

Sources: testzeus_hercules/core/memory/dynamic_ltm.py:13-14

State Handler

Purpose and Scope

State Handler provides lightweight runtime state management for coordinating data between agents during test execution. It maintains module-level dictionaries for storing string-based state and structured data.

Sources: testzeus_hercules/core/memory/state_handler.py:1-70

Module-Level State Storage

# Module-level state storage
_state_string: Dict[str, str] = defaultdict(str)
_state_dict: Dict[str, Any] = defaultdict(deque)

Sources: testzeus_hercules/core/memory/state_handler.py:13-14

store_data Tool

The store_data function is registered as a tool for browser, API, and SQL navigation agents to persist information:

@tool(
    agent_names=["browser_nav_agent", "api_nav_agent", "sql_nav_agent"],
    description="Tool to store information.",
    name="store_data",
)
def store_data(
    text: Annotated[str, "The confirmation of stored value."],
) -> Annotated[Dict[str, Union[str, None]], "A dictionary containing a 'message' key..."]:
    global _state_string
    try:
        DynamicLTM().save_content(text)
        _state_string[get_global_conf().get_default_test_id()] += text
        return {"message": "Text appended successfully."}
    except Exception as e:
        return {"error": str(e)}

Sources: testzeus_hercules/core/memory/state_handler.py:23-47

Key Behaviors

| Behavior | Description |
|---|---|
| Test ID Isolation | State is keyed by `get_global_conf().get_default_test_id()` |
| Dual Storage | Data propagates to both `_state_string` and Dynamic LTM |
| Error Resilience | Returns an error dictionary instead of raising exceptions |

Sources: testzeus_hercules/core/memory/state_handler.py:30-46

Memory Architecture Diagram

graph LR
    subgraph "Initialization Phase"
        A[Load Config] --> B[StaticLTM Singleton]
        B --> C[Load test_data.txt]
        C --> D[Consolidate Data]
    end
    
    subgraph "Runtime Phase"
        E[Agents Execute Tests]
        E --> F[store_data Tool Call]
        F --> G[State Handler]
        G --> H[DynamicLTM.save_content]
        H --> I[Vector Store Update]
        
        J[Test Query] --> K[RetrieveUserProxyAgent]
        K --> L[Vector Similarity Search]
        L --> M[Context Injection]
        M --> E
    end

Integration with SimpleHercules

The SimpleHercules class coordinates all memory components:

class SimpleHercules:
    def _save_to_memory(self, content: str) -> None:
        """Helper method to save content to memory."""
        config = get_global_conf()
        if not config.should_use_dynamic_ltm():
            return

        if self.memory:
            self.memory.save_content(content)
        else:
            logger.warning("Memory system not initialized")

Sources: testzeus_hercules/core/simple_hercules.py:85-97

Memory Initialization Flow

sequenceDiagram
    participant SH as SimpleHercules
    participant DLT as DynamicLTM
    participant CFG as Config
    participant LOG as Logger

    SH->>CFG: should_use_dynamic_ltm()
    CFG-->>SH: boolean
    SH->>DLT: save_content(content)
    alt LTM Enabled
        DLT->>DLT: Vector store update
        DLT-->>SH: success
    else LTM Disabled
        DLT-->>SH: skipped
    end

Configuration

Global Configuration Methods

| Method | Purpose |
|---|---|
| `should_use_dynamic_ltm()` | Check whether Dynamic LTM is enabled |
| `get_hf_home()` | Get the HuggingFace cache directory for the vector store |
| `get_default_test_id()` | Get the current test execution identifier |

The configuration is managed through testzeus_hercules/config.py, which provides command-line arguments for memory-related settings including:

  • --reuse-vector-db: Reuse existing vector DB instead of creating fresh one
  • --sandbox-tenant-id: Python sandbox tenant configuration

Sources: testzeus_hercules/config.py:45-58

Data Flow Summary

| Layer | Storage Type | Access Pattern | Persistence |
|---|---|---|---|
| Static LTM | In-memory string | Singleton `get_user_ltm()` | Session-scoped |
| Dynamic LTM | Vector (ChromaDB) | `RetrieveUserProxyAgent` | Persistent |
| State Handler | In-memory dict | Module-level `_state_string` | Test-execution-scoped |

Error Handling

All memory components implement robust error handling:

try:
    DynamicLTM().save_content(text)
    _state_string[get_global_conf().get_default_test_id()] += text
    return {"message": "Text appended successfully."}
except Exception as e:
    traceback.print_exc()
    logger.error(f"An error occurred while appending to state: {e}")
    return {"error": str(e)}

Sources: testzeus_hercules/core/memory/state_handler.py:30-42

Related Agent Components

| Component | File Path | Role |
|---|---|---|
| PlannerAgent | `core/agents/high_level_planner_agent.py` | Consumes memory for test planning |
| ExecutorNavAgent | `core/agents/executor_nav_agent.py` | Executes test steps with memory context |
| BaseNavAgent | `core/agents/base_nav_agent.py` | Agent base class with memory integration |

Summary

The Memory Management system in testzeus-hercules implements a comprehensive multi-tier approach:

  1. Static LTM provides pre-loaded test data consolidation via singleton pattern
  2. Dynamic LTM offers vector-based retrieval augmented memory for contextual queries
  3. State Handler enables runtime state sharing between agents through the store_data tool

This architecture ensures agents have access to both static test fixtures and dynamic execution context, enabling sophisticated AI-driven browser automation testing.

Sources: testzeus_hercules/core/memory/static_ltm.py:1-47

API Testing

Related topics: Security Testing, Tool System


Overview

API Testing in testzeus-hercules enables automated end-to-end testing of REST APIs through AI-driven agents. The system leverages LLM-powered agents to parse OpenAPI specifications, generate Gherkin test scenarios, execute API calls, and validate responses against expected outcomes. This module integrates with the broader Hercules testing framework to provide comprehensive API validation capabilities alongside browser-based UI testing.

The API Testing feature accepts OpenAPI specification files (YAML or JSON format) and automatically generates executable Gherkin test cases that can be run against live API endpoints. The generated tests follow behavior-driven development (BDD) conventions, making them readable for both technical and non-technical stakeholders.

Architecture

The API Testing module consists of several interconnected components that work together to provide end-to-end API testing capabilities.

Component Overview

graph TD
    A[OpenAPI Specification] --> B[generate_api_functional_gherkin_test.py]
    B --> C[Gherkin Test Cases]
    C --> D[API Navigation Agent]
    D --> E[API Calls Tool]
    E --> F[SQL Calls Tool]
    E --> G[Python Sandbox Executor]
    F --> H[Database Validation]
    G --> I[Custom Logic Validation]
    D --> J[Response Parser]
    J --> K[Test Results]

Core Components

| Component | File Path | Purpose |
|---|---|---|
| API Navigation Agent | `testzeus_hercules/core/agents/api_nav_agent.py` | Orchestrates API test execution using LLM-driven decision making |
| API Calls Tool | `testzeus_hercules/core/tools/api_calls.py` | Executes HTTP requests to API endpoints |
| SQL Calls Tool | `testzeus_hercules/core/tools/sql_calls.py` | Validates API data against database state |
| Python Sandbox | `testzeus_hercules/core/tools/execute_python_sandbox.py` | Executes custom validation logic |
| Gherkin Generator | `helper_scripts/generate_api_functional_gherkin_test.py` | Generates test cases from OpenAPI specs |
| Response Parser | `testzeus_hercules/utils/response_parser.py` | Parses and validates API responses |

Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80

Test Generation Workflow

OpenAPI Specification Processing

The test generation process begins with parsing OpenAPI specification files. The system accepts both YAML and JSON formatted OpenAPI specs through the generate_api_functional_gherkin_test.py helper script.

parser.add_argument(
    "input_files",
    metavar="input_files",
    type=str,
    nargs="+",
    help="One or more OpenAPI spec files (YAML or JSON).",
)

Sources: helper_scripts/generate_api_functional_gherkin_test.py:15-22

Gherkin Test Case Generation

The LLM generates test cases based on the OpenAPI specification content. The generation uses a specialized prompt that instructs the model to produce Gherkin-format scenarios covering functional test cases.

def generate_test_cases(prompt: str, model: str) -> str:
    """Generates test cases using the OpenAI API."""
    client = OpenAI()
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return completion.choices[0].message.content

Sources: helper_scripts/generate_api_functional_gherkin_test.py:82-90

Generation Parameters

| Parameter | CLI Flag | Default | Description |
|---|---|---|---|
| Model | --model | o1-preview | LLM model for test generation |
| Output Folder | --output | (required) | Destination for generated feature files |
| Number of Test Cases | --number_of_testcase | 100 | Maximum test cases to generate per endpoint |

API Navigation Agent

The API Navigation Agent (api_nav_agent.py) serves as the orchestrator for executing API tests. It receives parsed test scenarios and coordinates execution across multiple tools to validate API behavior.

sequenceDiagram
    participant Test as Test Scenario
    participant Agent as API Nav Agent
    participant API as API Calls Tool
    participant SQL as SQL Calls Tool
    participant Sandbox as Python Sandbox
    
    Test->>Agent: Execute scenario
    Agent->>API: Send HTTP request
    API-->>Agent: Response data
    Agent->>SQL: Validate database state
    SQL-->>Agent: Validation result
    Agent->>Sandbox: Run custom assertions
    Sandbox-->>Agent: Assertion results
    Agent->>Test: Pass/Fail outcome

Sources: testzeus_hercules/core/agents/api_nav_agent.py

Execution Environment

Python Sandbox

API tests execute within a secured Python sandbox environment that provides controlled access to necessary resources while maintaining isolation.

def _get_config_driven_injections(config: Any) -> Dict[str, Any]:
    """
    Get injections defined in configuration.
    Allows dynamic configuration of available modules.
    """
    injections = {}
    
    # Read from config: SANDBOX_PACKAGES="requests,pandas,numpy"
    sandbox_packages = config.get_config().get("SANDBOX_PACKAGES", "").split(",")
    
    for package_name in sandbox_packages:
        package_name = package_name.strip()
        if package_name:
            try:
                injections[package_name] = __import__(package_name)
            except ImportError:
                logger.warning(f"Could not import configured package: {package_name}")
    
    return injections

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:80-100
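The parsing above can be exercised with a small stand-in. `StubConfig` and `build_injections` are illustrative names only; they mirror the `SANDBOX_PACKAGES` handling shown in the excerpt rather than the project's real configuration class.

```python
import logging

logger = logging.getLogger(__name__)


class StubConfig:
    """Hypothetical stand-in for the real configuration object."""

    def __init__(self, values: dict):
        self._values = values

    def get_config(self) -> dict:
        return self._values


def build_injections(config) -> dict:
    """Turn a comma-separated SANDBOX_PACKAGES string into imported modules."""
    injections = {}
    for name in config.get_config().get("SANDBOX_PACKAGES", "").split(","):
        name = name.strip()
        if not name:
            continue
        try:
            injections[name] = __import__(name)
        except ImportError:
            logger.warning("Could not import configured package: %s", name)
    return injections
```

Packages that fail to import are logged and skipped rather than aborting the sandbox setup.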

Sandbox Access Variables

Scripts executing within the sandbox have automatic access to the following variables:

| Variable | Type | Description |
|---|---|---|
| page | Playwright Page | Current browser page context |
| browser | Browser instance | Active browser session |
| context | Browser Context | Isolated browsing context |
| playwright_manager | PlaywrightManager | Manages Playwright lifecycle |
| logger | Logger | Logging utility |
| config | Configuration | Global configuration object |

Additional tenant-specific modules can be injected based on the SANDBOX_TENANT_ID environment variable, and custom injections are available via the SANDBOX_CUSTOM_INJECTIONS environment variable.

Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:40-55

Response Handling

JSON Response Parsing

The response parser handles API responses with multiple fallback strategies for extracting structured data:

def parse_response(message: str) -> dict[str, Any]:
    # Check if message is wrapped in ```json ``` blocks
    if "```json" in message:
        start_idx = message.find("```json") + 7
        end_idx = message.find("```", start_idx + 7)
        message = message[start_idx:end_idx]
    else:
        if message.startswith("```"):
            message = message[3:]
        if message.endswith("```"):
            message = message[:-3]
        if message.startswith("json"):
            message = message[4:]

    message = message.strip()
    message = message.replace("\\n", "\n")
    
    json_response: dict[str, Any] = json.loads(message)

Sources: testzeus_hercules/utils/response_parser.py:9-35
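A simplified re-implementation of the fence-stripping logic above shows the happy path. `extract_json_payload` is an illustrative name, not the project's function, and it omits the real parser's error recovery.

```python
import json
from typing import Any


def extract_json_payload(message: str) -> dict[str, Any]:
    """Strip Markdown code fences before parsing (simplified sketch)."""
    if "```json" in message:
        # Take everything between the opening ```json and the closing fence.
        start = message.find("```json") + len("```json")
        end = message.find("```", start)
        message = message[start:end]
    else:
        message = message.strip()
        if message.startswith("```"):
            message = message[3:]
        if message.endswith("```"):
            message = message[:-3]
        if message.startswith("json"):
            message = message[len("json"):]
    return json.loads(message.strip())
```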

Error Recovery

When JSON parsing fails, the response parser attempts to extract plan and next_step fields from unstructured responses, ensuring graceful degradation when APIs return non-standard response formats.
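One way such recovery could look is a best-effort regex pass. The `plan` and `next_step` field names come from the text above; the extraction logic itself is an assumption, not the parser's actual fallback.

```python
import re


def recover_fields(message: str) -> dict:
    """Best-effort recovery of "plan" and "next_step" from non-JSON text.

    Sketch only: the real parser's fallback logic may differ.
    """
    result = {}
    for field in ("plan", "next_step"):
        # Match `"plan": "..."` or bare `plan: ...` style fragments.
        m = re.search(rf'"?{field}"?\s*:\s*"?([^"\n]+)"?', message)
        if m:
            result[field] = m.group(1).strip().rstrip(",")
    return result
```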

Configuration

LLM Configuration

API Testing relies on LLM configuration for test generation and agent decision-making. Configuration can be provided via command-line arguments or through a dedicated configuration file.

parser.add_argument(
    "--llm-model",
    type=str,
    help="Name of the LLM model.",
    required=False,
)
parser.add_argument(
    "--llm-temperature",
    type=float,
    help="Temperature for LLM sampling (0.0-1.0).",
    required=False,
)

Sources: testzeus_hercules/config.py:35-45

Agents LLM Config File

For multi-agent configurations, specify the configuration file path:

--agents-llm-config-file /path/to/agents_llm_config.json
--agents-llm-config-file-ref-key <key_name>

Portkey Integration

Enable Portkey for LLM routing with fallback or load balancing strategies:

--enable-portkey
--portkey-api-key <api_key>
--portkey-strategy fallback|loadbalance

Sources: testzeus_hercules/config.py:60-75

Usage Examples

Generate Gherkin Tests from OpenAPI Spec

python helper_scripts/generate_api_functional_gherkin_test.py \
    spec/openapi.yaml \
    --output tests/api/ \
    --model gpt-4 \
    --number_of_testcase 50

Run API Tests

Tests can be executed through the main Hercules CLI or integrated into CI/CD pipelines. The agent configuration file supports specifying different models for different agents:

{
    "openai": {
        "planner_agent": {
            "model_name": "gpt-4",
            "model_api_type": "openai"
        }
    }
}

Sources: agents_llm_config-example.json

Integration with Browser Testing

The API Testing module integrates seamlessly with browser-based testing capabilities. The API Navigation Agent can coordinate with the Browser Navigation Agent to perform scenarios that span both API validation and UI verification.

When executing multi-step workflows, the system can:

  1. Call API endpoints to set up test data
  2. Launch browsers to verify UI state reflects API changes
  3. Execute SQL queries to validate data persistence
  4. Run custom Python assertions for complex business logic
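The four steps above can be sketched end to end with stand-ins: an in-memory SQLite database plays the role of both the API-created state and the persistence layer, and every function name here is hypothetical.

```python
import sqlite3


def setup_test_data(db: sqlite3.Connection) -> int:
    """Stand-in for an API call that creates a record (step 1)."""
    cur = db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    db.commit()
    return cur.lastrowid


def validate_persistence(db: sqlite3.Connection, user_id: int) -> bool:
    """SQL check that the record was persisted (step 3)."""
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    return row is not None and row[0] == "alice"


def custom_assertion(name: str) -> bool:
    """Custom business-logic validation (step 4)."""
    return name.islower() and name.isalpha()


def run_workflow() -> bool:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    user_id = setup_test_data(db)
    # Step 2 (UI verification) is omitted; it would require a live browser.
    return validate_persistence(db, user_id) and custom_assertion("alice")
```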

Security Considerations

API Key Management

API keys should be provided through environment variables or secure configuration management, never hardcoded in test files:

export OPENAI_API_KEY=<your_api_key>
export PORTKEY_API_KEY=<your_portkey_key>

Sandbox Isolation

The Python sandbox provides execution isolation for custom test logic. Configure allowed packages through the SANDBOX_PACKAGES configuration parameter to limit access to only required libraries.

Best Practices

  1. Organize OpenAPI specs by version - Maintain separate specification files for different API versions
  2. Use meaningful test case names - Generated tests should clearly describe the scenario being validated
  3. Combine with database validation - Use SQL Calls Tool to verify data consistency
  4. Leverage response parsing - Use the response parser for handling complex API response formats
  5. Configure appropriate LLM models - Use faster models for generation and more capable models for complex validation logic

Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80

Security Testing

Related topics: API Testing, Tool System


Security Testing in TestZeus Hercules is an automated framework designed to validate API security by generating and executing Gherkin-based test scenarios. The system leverages LLM-powered agents to analyze OpenAPI specifications and produce comprehensive security validation tests that check for vulnerabilities, configuration weaknesses, and proper handling of sensitive data.

Overview

The Security Testing module provides an end-to-end solution for validating API security without requiring manual test case authoring. It integrates with the broader Hercules testing framework to execute security validation scenarios alongside functional and navigation tests.

Core Components

| Component | File Path | Purpose |
|---|---|---|
| Security Navigation Agent | testzeus_hercules/core/agents/sec_nav_agent.py | Orchestrates security test execution using LLM-driven agents |
| API Security Calls | testzeus_hercules/core/tools/api_sec_calls.py | Provides low-level HTTP client operations for security validation |
| Gherkin Test Generator | helper_scripts/generate_api_security_gherkin_test.py | Generates security-focused Gherkin test cases from OpenAPI specs |

Architecture

graph TD
    A[OpenAPI Spec Files] --> B[generate_api_security_gherkin_test.py]
    B --> C[LLM API - OpenAI]
    C --> D[Security Gherkin Test Cases]
    D --> E[Hercules Test Executor]
    E --> F[sec_nav_agent.py]
    F --> G[api_sec_calls.py]
    G --> H[Target API Endpoints]
    H --> I[Security Validation Results]
    
    J[Configuration] --> F
    K[LLM Config] --> B

Security Navigation Agent

The sec_nav_agent.py module follows the same BrowserNavAgent pattern, specialized for security testing scenarios. It reuses the agent architecture of the browser navigation agent but focuses on API security validation.

Agent Configuration

The security agent inherits the core agent configuration structure from the Hercules framework, utilizing the same LLM integration patterns as the main navigation agent defined in simple_hercules.py.

Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-50

Execution Flow

sequenceDiagram
    participant TestRunner
    participant SecNavAgent
    participant APISecCalls
    participant TargetAPI
    participant ResponseParser
    
    TestRunner->>SecNavAgent: Execute security test scenario
    SecNavAgent->>APISecCalls: Send HTTP request with security payload
    APISecCalls->>TargetAPI: Validated HTTP request
    TargetAPI->>APISecCalls: Response with headers/body
    APISecCalls->>ResponseParser: Parse response data
    ResponseParser->>SecNavAgent: Structured security results
    SecNavAgent->>TestRunner: Security validation report

API Security Calls Module

The api_sec_calls.py module provides the foundational HTTP client capabilities for executing security tests. It supports various HTTP methods and authentication schemes required for comprehensive API security testing.

Supported Security Test Operations

| Operation | Description | Authentication Support |
|---|---|---|
| GET Security Headers | Validate presence and correctness of security headers | Bearer, API Key, Basic |
| POST Injection Tests | Execute payload injection for XSS, SQLi validation | Bearer, API Key |
| Authentication Bypass | Test unauthorized access to protected endpoints | Token validation |
| Rate Limiting | Verify rate limiting mechanisms | None required |
| CORS Validation | Check cross-origin resource sharing policies | None required |

Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100
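A "GET Security Headers" style check could be sketched as below. The specific header list is an assumption for illustration, not the module's actual policy.

```python
# Headers commonly required by security policies (illustrative list).
REQUIRED_SECURITY_HEADERS = (
    "X-Frame-Options",
    "Content-Security-Policy",
    "Strict-Transport-Security",
)


def missing_security_headers(headers: dict) -> list:
    """Return the required security headers absent from a response.

    Comparison is case-insensitive, since HTTP header names are.
    """
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_SECURITY_HEADERS if h.lower() not in present]
```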

Request Configuration

# Security test request structure
{
    "method": "GET|POST|PUT|DELETE|PATCH",
    "url": "https://api.target.com/endpoint",
    "headers": {
        "Authorization": "Bearer {token}",
        "Content-Type": "application/json"
    },
    "params": {},  # Query parameters
    "data": {},    # Request body for POST/PUT/PATCH
    "timeout": 30,
    "verify_ssl": true
}
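A small validator can enforce that request structure before dispatch. `normalize_request` is a hypothetical helper; the defaults mirror the structure shown above.

```python
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH"}


def normalize_request(config: dict) -> dict:
    """Validate a security-test request dict and apply defaults."""
    if "url" not in config:
        raise ValueError("request config requires a 'url'")
    method = config.get("method", "GET").upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    return {
        "method": method,
        "url": config["url"],
        "headers": config.get("headers", {}),
        "params": config.get("params", {}),
        "data": config.get("data", {}),
        "timeout": config.get("timeout", 30),       # seconds
        "verify_ssl": config.get("verify_ssl", True),
    }
```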

Gherkin Test Generation

The generate_api_security_gherkin_test.py helper script uses LLM to automatically generate security-focused Gherkin test cases from OpenAPI specification files.

Input Processing

The generator accepts OpenAPI specifications in both YAML and JSON formats, parsing the specification to identify:

  • Endpoints and their HTTP methods
  • Security schemes defined in the spec
  • Request/response schemas
  • Authentication requirements

Sources: helper_scripts/generate_api_security_gherkin_test.py:1-60

Generation Prompt Strategy

The LLM prompt instructs the model to focus on generating tests that validate:

  1. Vulnerability Detection: Tests that check for common vulnerabilities
  2. Configuration Weaknesses: Validation of security configurations
  3. Sensitive Data Handling: Verification of proper data protection
  4. Authentication/Authorization: Access control testing
  5. Input Validation: Sanitization and validation checks

Output Format

Generated test cases follow this structure:

Feature: API Security Validation - {Endpoint_Name}
    Scenario: Validate security headers on {method} {path}
        Given the API endpoint "{path}" requires authentication
        When I send a {method} request without authorization
        Then the response should have status code 401
        And the response should include "WWW-Authenticate" header
    
    Scenario: Test for SQL injection vulnerability on {path}
        Given the API endpoint "{path}" accepts query parameters
        When I send a GET request with malicious payload in parameter "id"
        Then the response should have status code 400 or 422
        And no SQL error should be present in response body

Sources: helper_scripts/generate_api_security_gherkin_test.py:80-120

Command Line Interface

Running Security Tests

# Generate security tests from OpenAPI spec
python -m helper_scripts.generate_api_security_gherkin_test \
    --input spec.yaml \
    --output ./tests/security \
    --model gpt-4o \
    --number_of_testcase 50

# Execute security tests with Hercules
testzeus-hercules --input-file ./tests/security/api_security.feature

Generation Script Arguments

| Argument | Type | Default | Description |
|---|---|---|---|
| input_files | list[str] | required | One or more OpenAPI spec files (YAML or JSON) |
| --output | str | required | Output folder for generated feature files |
| --model | str | o1-preview | OpenAI model for test generation |
| --number_of_testcase | int | 100 | Number of test cases to generate |

Sources: helper_scripts/generate_api_security_gherkin_test.py:30-55

Integration with Hercules Framework

Agent Initialization

The security agent is initialized through the same mechanism as other Hercules agents, using configuration from the LLM configuration file specified via CLI:

testzeus-hercules \
    --input-file security_tests.feature \
    --agents-llm-config-file ./config/security_agent.yaml \
    --llm-model gpt-4o

Execution Context

Security tests execute within the same context as functional tests, providing:

  • Shared browser/playwright session management
  • Consistent logging and telemetry
  • Unified reporting and result aggregation
  • Access to shared utilities and helpers

Security Test Scenarios

Common Test Categories

| Category | Test Focus | Example Validation |
|---|---|---|
| Authentication | Token validation, session management | Invalid token returns 401 |
| Authorization | Access control, privilege escalation | User cannot access admin endpoints |
| Input Validation | Payload sanitization, type checking | Malformed input returns 400 |
| Headers | Security header presence | X-Frame-Options, CSP headers present |
| Rate Limiting | Request throttling | Excessive requests return 429 |
| CORS | Cross-origin policy | Invalid origins rejected |

Best Practices

Test Data Management

  • Use dedicated security test environments
  • Isolate security tests from production data
  • Implement proper cleanup for test artifacts
  • Rotate API keys/tokens used in tests

Test Coverage

  • Aim for comprehensive endpoint coverage
  • Include both positive and negative test cases
  • Validate all security headers defined in your policy
  • Test authentication bypass scenarios

Configuration

Environment Variables

| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | Required for LLM-powered test generation |
| SECURITY_TEST_API_KEY | API key for testing authenticated endpoints |
| SECURITY_TEST_BASE_URL | Override target API base URL |
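Resolution of these variables might look like the sketch below. The variable names come from the table; the precedence logic and the treatment of OPENAI_API_KEY as hard-required are assumptions.

```python
import os


def load_security_test_env(spec_default_url: str) -> dict:
    """Resolve security-test environment variables (illustrative sketch)."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required for test generation")
    return {
        "openai_api_key": api_key,
        "security_test_api_key": os.environ.get("SECURITY_TEST_API_KEY"),
        # The env override takes precedence over the spec's server URL.
        "base_url": os.environ.get("SECURITY_TEST_BASE_URL", spec_default_url),
    }
```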

LLM Configuration

The security agent uses the same LLM configuration structure as other agents, specified through:

# security_agent_config.yaml
llm_config:
  model: gpt-4o
  temperature: 0.7
  max_tokens: 4096

other_settings:
  system_prompt: "You are a security testing expert..."
  max_consecutive_auto_reply: 10

Reporting

Security test results are integrated into the standard Hercules reporting format:

  • Test pass/fail status per scenario
  • Detailed assertion results
  • HTTP request/response logging
  • Security-specific metrics (headers present, vulnerabilities detected)

Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100

MCP Integration

Related topics: Agent System, Tool System


Overview

The MCP (Model Context Protocol) Integration in TestZeus Hercules enables the testing agent to discover, catalog, and execute tools exposed by external MCP servers. This integration allows Hercules to extend its capabilities by leveraging tools from multiple Model Context Protocol-compliant servers during end-to-end testing workflows.

MCP serves as a standardized communication layer that allows the testing framework to:

  • Enumerate and connect to configured MCP servers
  • List available tools and resource namespaces from each connected server
  • Execute remote tool calls with correct parameters
  • Retrieve resources by URI when required for test execution

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10

Architecture

System Components

The MCP integration is built on three primary components:

| Component | File | Purpose |
|---|---|---|
| McpNavAgent | core/agents/mcp_nav_agent.py | Main navigation agent that orchestrates MCP server interactions |
| MCPHelper | utils/mcp_helper.py | Utility class providing MCP client functionality |
| MCP Tools | core/tools/mcp_tools.py | Tool implementations for MCP operations |

Component Relationship

graph TD
    A[TestZeus Hercules Core] --> B[McpNavAgent]
    B --> C[MCPHelper]
    C --> D[MCP Servers]
    
    B --> E[get_configured_mcp_servers]
    B --> F[check_mcp_server_connection]
    B --> G[execute_mcp_tool]
    B --> H[read_mcp_resource]
    
    E --> I[Server Discovery]
    F --> J[Connection Status]
    G --> K[Tool Execution]
    H --> L[Resource Retrieval]

McpNavAgent

The McpNavAgent is the central agent responsible for all MCP-related operations. It inherits from BaseNavAgent and implements the Model Context Protocol interaction patterns.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:6-9

Agent Configuration

| Property | Value | Description |
|---|---|---|
| agent_name | mcp_nav_agent | Unique identifier for the agent |
| Inherits | BaseNavAgent | Base navigation agent functionality |

Core Functions

The MCP Navigation Agent implements the following core functions:

  1. Server Discovery - Enumerate configured MCP servers and their connection status
  2. Capability Cataloging - List tools and resource namespaces for each connected server
  3. Tool Execution - Call tools with correct parameters and handle responses
  4. Resource Retrieval - Read resources by URI when required
  5. Result Summarization - Capture server, tool, arguments, outputs; include timings and status

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:14-28

Operational Rules

Rule 1: Previous Step Validation

Before any new action, explicitly review the previous step and its outcome. Do not proceed if the prior critical step failed; address it first.

graph TD
    A[Execute Action] --> B{Previous Step Succeeded?}
    B -->|No| C[Address Failure First]
    B -->|Yes| D[Continue to Next Action]
    C --> D

Rule 2: Server Scan First

The agent must call get_configured_mcp_servers and for each server, call check_mcp_server_connection before taking any other action.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:31-35

Agent Prompt

The agent uses a specialized system prompt that defines its role and behavioral guidelines:

### MCP Navigation Agent

You are an MCP (Model Context Protocol) Navigation Agent that assists the Testing Agent by discovering MCP servers, cataloging their exposed tools/resources, and executing the right tool calls to complete the task. Always begin by scanning all configured servers before taking any action.

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:11-21

MCPHelper Utility

The MCPHelper class provides the underlying functionality for MCP server interactions. It is exported through the mcp_helper.py module and integrates with the agent system through set_mcp_agents.

Sources: testzeus_hercules/utils/mcp_helper.py

Key Functions

| Function | Purpose |
|---|---|
| MCPHelper | Main helper class for MCP operations |
| set_mcp_agents | Configures MCP agents within the testing framework |

Configuration

Configuration File Format

MCP servers are configured using a JSON file. The example file mcp_servers.example.json demonstrates the expected format:

{
  "mcpServers": {
    "server_name": {
      "command": "command_to_run",
      "args": ["arg1", "arg2"],
      "env": {
        "KEY": "value"
      }
    }
  }
}
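Loading and sanity-checking a document in this shape could look like the following sketch. `parse_mcp_servers` is a hypothetical helper; the required `command` key and the defaults for `args`/`env` are inferred from the example format above.

```python
import json


def parse_mcp_servers(raw_json: str) -> dict:
    """Parse an mcp_servers config document and validate its shape."""
    doc = json.loads(raw_json)
    servers = doc.get("mcpServers")
    if not isinstance(servers, dict):
        raise ValueError("config must contain an 'mcpServers' object")
    for name, entry in servers.items():
        if "command" not in entry:
            raise ValueError(f"server '{name}' is missing 'command'")
        # args and env are optional in the example format.
        entry.setdefault("args", [])
        entry.setdefault("env", {})
    return servers
```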

Command Line Arguments

The MCP integration can be configured through command line arguments:

| Argument | Type | Description |
|---|---|---|
| --agents-llm-config-file | string | Path to the agents LLM configuration file |
| --agents-llm-config-file-ref-key | string | Reference key for the agents LLM configuration file |

Sources: testzeus_hercules/config.py:27-39

Workflow

Standard MCP Interaction Flow

graph TD
    A[Start Test Execution] --> B[Initialize McpNavAgent]
    B --> C[Call get_configured_mcp_servers]
    C --> D[For Each Server]
    D --> E[Call check_mcp_server_connection]
    E --> F{Server Connected?}
    F -->|No| G[Log Error / Skip Server]
    F -->|Yes| H[Catalog Tools & Resources]
    H --> I[Task Requires MCP Tool?]
    I -->|Yes| J[Call execute_mcp_tool]
    I -->|No| K[Continue with Other Tasks]
    J --> L[Process Tool Response]
    L --> M[Return Results to Testing Agent]
    G --> D
    K --> N[Complete Test]
    M --> N

Tool Execution Workflow

When executing MCP tools, the agent follows this sequence:

  1. Identify the target MCP server
  2. Verify server connection status
  3. Determine the correct tool and parameters
  4. Execute the tool call via MCP protocol
  5. Capture response including timing and status
  6. Return formatted results to the testing agent
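Step 5's capture of timing and status can be sketched as a generic wrapper. `timed_tool_call` is an illustrative name, not the module's real API.

```python
import time
from typing import Any, Callable


def timed_tool_call(tool: Callable[..., Any], **arguments) -> dict:
    """Invoke a tool and record its output, status, and duration."""
    start = time.perf_counter()
    try:
        output = tool(**arguments)
        status = "success"
    except Exception as exc:  # report the failure instead of crashing the agent
        output = str(exc)
        status = "error"
    return {
        "arguments": arguments,
        "output": output,
        "status": status,
        "duration_s": round(time.perf_counter() - start, 4),
    }
```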

Integration with Testing Framework

Agent Hierarchy

graph BT
    A[BrowserNavAgent] --> B[BaseNavAgent]
    C[ApiNavAgent] --> B
    D[SqlNavAgent] --> B
    E[McpNavAgent] --> B
    F[SecNavAgent] --> B
    
    B --> G[TestZeus Hercules Core]

All navigation agents, including McpNavAgent, inherit from BaseNavAgent, ensuring consistent behavior and integration with the core testing framework.

Sources: testzeus_hercules/core/agents/__init__.py
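The hierarchy above can be illustrated with a skeletal class sketch. This is illustrative only: the real BaseNavAgent API is not shown on this page, so the attributes and method here are assumptions.

```python
class BaseNavAgent:
    """Illustrative stand-in for the shared navigation-agent base class."""

    agent_name: str = "base_nav_agent"

    def describe(self) -> str:
        return f"{self.agent_name} ({type(self).__name__})"


class McpNavAgent(BaseNavAgent):
    agent_name = "mcp_nav_agent"


class ApiNavAgent(BaseNavAgent):
    agent_name = "api_nav_agent"
```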

Available Navigation Agents

| Agent | Purpose |
|---|---|
| BrowserNavAgent | Web browser interaction and navigation |
| ApiNavAgent | API testing and validation |
| SqlNavAgent | Database query execution |
| McpNavAgent | MCP server tool execution |
| SecNavAgent | Security testing operations |

Best Practices

Initialization

  1. Always ensure MCP servers are properly configured before test execution
  2. Verify server connectivity before attempting tool calls
  3. Use the configured servers list as the authoritative source of available MCP servers

Error Handling

  1. Check previous step outcomes before proceeding
  2. Log connection failures with server identification
  3. Handle tool execution errors with proper parameter validation
  4. Provide clear error messages when MCP operations fail

Task Focus

  • Execute only actions required by the primary testing task
  • Use extra information from MCP responses cautiously
  • Avoid unnecessary server scans after initial discovery

Security Considerations

The MCP integration supports sensitive operations requiring careful configuration:

  • API keys should be provided through secure environment variables
  • Server configurations should be validated before use
  • Tool execution permissions should be properly scoped
  • Resource access should follow least-privilege principles

Summary

The MCP Integration module provides TestZeus Hercules with the ability to extend its testing capabilities through external MCP servers. By implementing a dedicated McpNavAgent that follows standardized MCP protocols, the framework can seamlessly discover servers, catalog their capabilities, and execute tools as needed during end-to-end testing scenarios.

Key benefits include:

  • Extensibility: Add new testing capabilities without modifying core framework code
  • Standardization: Uses the Model Context Protocol for consistent server communication
  • Resource Management: Access remote resources via standardized URI-based retrieval
  • Comprehensive Logging: Captures server status, tool execution times, and results

Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10

Doramagic Pitfall Log


Doramagic extracted 13 source-linked risk signals. Review them before installing or handing real data to the project.

1. Configuration risk: 0.1.1

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: 0.1.1. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.1

2. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | README/documentation is current enough for a first validation pass.

3. Maintenance risk: 0.0.40

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.0.40. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.0.40

4. Maintenance risk: 0.1.0

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.0. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.0

5. Maintenance risk: 0.1.2

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.2. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.2

6. Maintenance risk: 0.1.6

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: 0.1.6. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.6

7. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | last_activity_observed missing

8. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium

9. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium

10. Security or permission risk: 0.1.4

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: 0.1.4. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.4

11. Security or permission risk: 0.2.2

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: 0.2.2. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.2.2

12. Maintenance risk: issue_or_pr_quality=unknown

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | issue_or_pr_quality=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

  • Sources: 8 project-level external discussion links are exposed on this manual page.
  • Use: review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using testzeus-hercules with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence