Doramagic Project Pack · Human Manual
Getting Started with TestZeus Hercules
TestZeus Hercules is an open-source AI-powered end-to-end testing framework that leverages large language models (LLMs) to automate browser testing. It provides both interactive and non-interactive modes for executing automated tests against web applications.
Overview
TestZeus Hercules serves as an intelligent testing agent that can:
- Navigate web pages and interact with UI elements
- Generate and execute Gherkin-style test scenarios
- Perform API security scanning using Nuclei
- Parse accessibility trees for element identification
- Execute Python scripts in a sandboxed environment
Sources: CONTRIBUTING.md
Architecture
```mermaid
graph TD
    A[User Input] --> B[CLI / Main Entry]
    B --> C[Global Configuration]
    C --> D[Navigation Agent]
    D --> E[Browser Controller]
    E --> F[CDP Stream Renderer]
    F --> G[Accessibility Tree]
    G --> D
    D --> H[Python Sandbox Executor]
    H --> I[Test Results]
    D --> J[API Security Scanner]
    J --> K[Nuclei Integration]
```

Core Components
| Component | Purpose |
|---|---|
| `testzeus_hercules/__main__.py` | Entry point handling bulk test execution |
| `testzeus_hercules/config.py` | Command-line argument parsing and configuration |
| `testzeus_hercules/telemetry.py` | Installation tracking and error reporting |
| `testzeus_hercules/core/agents/executor_nav_agent.py` | Navigation agent for browser automation |
| `frontend/*/index.html` | CDP stream rendering interfaces |
Sources: testzeus_hercules/__main__.py:25-45, testzeus_hercules/config.py
Installation
Prerequisites
- Python 3.x
- Git
- Make
Setup Steps
- Fork the repository
- Create a virtual environment:

```bash
make virtualenv
source .venv/bin/activate
```

- Install in development mode:

```bash
make install
```
Sources: CONTRIBUTING.md
Command-Line Interface
TestZeus Hercules provides extensive CLI options for configuration.
Basic Options
| Parameter | Type | Description |
|---|---|---|
| `--input-file` | str | Path to the input file |
| `--output-path` | str | Path to the output directory |
| `--test-data-path` | str | Path to the test data directory |
| `--project-base` | str | Path to the project base directory |
Sources: testzeus_hercules/config.py
LLM Configuration
| Parameter | Type | Description |
|---|---|---|
| `--llm-model` | str | Name of the LLM model |
| `--llm-model-api-key` | str | API key for the LLM model |
| `--llm-model-base-url` | str | Base URL for the LLM API |
| `--llm-model-api-type` | str | Type of API (openai, anthropic, azure) |
| `--llm-temperature` | float | Temperature for LLM sampling (0.0-1.0) |
| `--agents-llm-config-file` | str | Path to agents LLM configuration file |
Sources: testzeus_hercules/config.py
Browser Options
| Parameter | Description |
|---|---|
| `--browser-channel` | Browser channel (e.g., chrome-beta, firefox-nightly) |
| `--browser-path` | Custom path to browser executable |
| `--browser-version` | Specific browser version (e.g., '114', '115.0.1', 'latest') |
| `--enable-ublock` | Enable uBlock Origin extension |
| `--disable-ublock` | Disable uBlock Origin extension |
| `--auto-accept-screen-sharing` | Automatically accept screen-sharing prompts |
Sources: testzeus_hercules/config.py
Test Execution Options
| Parameter | Description |
|---|---|
| `--bulk` | Execute tests in bulk from the tests directory |
| `--reuse-vector-db` | Reuse the existing vector DB instead of creating a fresh one |
Sources: testzeus_hercules/config.py, testzeus_hercules/__main__.py:45-60
Portkey Integration
| Parameter | Description |
|---|---|
| `--enable-portkey` | Enable Portkey integration for LLM routing |
| `--portkey-api-key` | API key for Portkey |
| `--portkey-strategy` | Routing strategy (fallback or loadbalance) |
Sources: testzeus_hercules/config.py
Sandbox Configuration
| Parameter | Description |
|---|---|
| `--sandbox-tenant-id` | Tenant ID for sandbox isolation |
Sources: testzeus_hercules/config.py
Running TestZeus Hercules
Interactive Mode
Run the interactive CDP stream renderer with user input capabilities:
```bash
make run-interactive
```
This launches the frontend at frontend/interactive/index.html which provides:
- Real-time screencast display
- Crosshair cursor for element selection
- Input capture for typing into the remote page
Sources: frontend/interactive/index.html
Non-Interactive Mode
Run tests without user interaction:
```bash
make run
```
This uses the non-interactive frontend at frontend/non-interactive/index.html which displays:
- Connection status
- Screencast output only
Sources: frontend/non-interactive/index.html
Bulk Execution
Execute multiple tests from a tests directory:
```bash
python -m testzeus_hercules --bulk
```
The system checks for a tests directory in the project source root and processes each test folder:
```python
if get_global_conf().should_execute_bulk():
    project_base = get_global_conf().get_project_source_root()
    tests_dir = os.path.join(project_base, "tests")
```
Sources: testzeus_hercules/__main__.py:45-55
Response Parsing
TestZeus Hercules includes a robust response parser for handling LLM outputs:
```mermaid
graph LR
    A[LLM Response] --> B{Is JSON?}
    B -->|Yes| C[Parse JSON]
    B -->|No| D[Extract Plan/Next Step]
    C --> E[Return Dict]
    D --> E
```

The parser handles:
- JSON wrapped in ```json code blocks
- Plain JSON responses
- Fallback extraction for `plan` and `next_step` fields
Sources: testzeus_hercules/utils/response_parser.py
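As a rough illustration of the three strategies above — this is a hedged sketch, not the project's `response_parser` module, and the name `parse_llm_response` is invented:

```python
import json
import re
from typing import Any

def parse_llm_response(message: str) -> dict[str, Any]:
    """Sketch of the parsing order described above (not the real implementation)."""
    # 1. JSON wrapped in ```json code blocks
    fenced = re.search(r"```json\s*(.*?)\s*```", message, re.DOTALL)
    candidate = fenced.group(1) if fenced else message.strip()
    # 2. Plain JSON responses
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        pass
    # 3. Fallback: pull plan / next_step fields out heuristically
    result: dict[str, Any] = {}
    for key in ("plan", "next_step"):
        match = re.search(rf"{key}\s*[:=]\s*(.+)", message)
        if match:
            result[key] = match.group(1).strip()
    return result
```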
Telemetry and Installation Tracking
On first run, TestZeus Hercules generates a unique installation ID:
```python
def get_installation_id(file_path: str = "installation_id.txt", is_manual_run: bool = True):
    if os.path.exists(file_path):
        # Load existing installation data
        ...
    else:
        # Generate new installation ID
        installation_id = str(uuid.uuid4())
```
Sources: testzeus_hercules/telemetry.py
Development Workflow
Code Quality
| Command | Purpose |
|---|---|
| `make fmt` | Format code using black & isort |
| `make lint` | Run pep8, black, and mypy linters |
| `make test` | Run tests and generate a coverage report |
| `make watch` | Run tests on every change |
Sources: CONTRIBUTING.md
Testing Requirements
- Code coverage must show 100% coverage
- Add tests for all changes in your PR

```bash
make test
```
Sources: CONTRIBUTING.md
Release Process
- Make changes following the contribution guidelines
- Commit using conventional commit messages
- Run tests to ensure everything works
- Execute `make release` to create a new tag and push

CAUTION: `make release` modifies local changelog files and commits all unstaged changes.
Sources: CONTRIBUTING.md
Navigation Agent Execution
The executor navigation agent follows specific guidelines:
Execution Principles
- Error Review: Review previous step outcomes before proceeding
- Script Execution: Use the `execute_python_sandbox` tool, which has access to `page`, `browser`, `context`, `playwright_manager`, `logger`, and `config`
- Sequential Execution: Execute one script at a time and await results
- Validation: Check for successful execution status before proceeding
Sources: testzeus_hercules/core/agents/executor_nav_agent.py
API Security Scanning
TestZeus Hercules integrates with Nuclei for API security testing:
```python
async def run_nuclei_command(
    is_open_api_spec: bool,
    open_api_spec_path: Optional[str],
    target_url: Optional[str],
    tag: str,
    output_file: Path,
    headers: Optional[List[Tuple[str, str]]] = None,
): ...
```
Sources: testzeus_hercules/core/tools/api_sec_calls.py
Helper Scripts
CDP Journey Script
Generate test cases from journey data:
```bash
python helper_scripts/cdp_journey_script.py --number_of_testcase 5
```
This produces Gherkin specifications and test data files from JSON journey definitions.
Sources: helper_scripts/cdp_journey_script.py
API Functional Gherkin Test Generator
Generate Gherkin test cases from OpenAPI specifications:
```bash
python helper_scripts/generate_api_functional_gherkin_test.py spec.yaml --output ./features --number_of_testcase 100
```
Sources: helper_scripts/generate_api_functional_gherkin_test.py
Accessibility Tree Processing
TestZeus Hercules processes DOM elements to generate accessibility trees:
- Identifies interactive elements (buttons, links, inputs)
- Detects draggable elements
- Filters out non-interactive elements
- Provides detailed element metadata for the AI agent
Sources: testzeus_hercules/utils/get_detailed_accessibility_tree.py
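The filtering step can be illustrated with a small sketch; the node shape and the `INTERACTIVE_TAGS` set are assumptions for illustration, not the project's real criteria:

```python
from typing import Any

# Assumed tag set for the sketch; the real tree processor uses richer heuristics.
INTERACTIVE_TAGS = {"button", "a", "input", "select", "textarea"}

def filter_interactive(nodes: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only nodes an agent can act on, per the criteria listed above (a sketch)."""
    kept = []
    for node in nodes:
        tag = str(node.get("tag", "")).lower()
        is_interactive = (
            tag in INTERACTIVE_TAGS                    # buttons, links, inputs
            or node.get("draggable", False)            # draggable elements
            or node.get("role") in {"button", "link"}  # ARIA roles
        )
        if is_interactive:
            kept.append(node)
    return kept
```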
Summary
TestZeus Hercules provides a comprehensive end-to-end testing solution with:
- AI-powered browser automation via LLM integration
- Flexible deployment (interactive and non-interactive modes)
- Extensive CLI configuration options
- Built-in support for bulk test execution
- API security scanning capabilities
- Gherkin test generation from various sources
Sources: CONTRIBUTING.md
System Architecture
Related topics: Agent System, Tool System
Overview
TestZeus Hercules is an open-source AI agent framework designed for end-to-end testing of web applications. The system leverages Large Language Models (LLMs) to orchestrate browser automation through Playwright, enabling natural language-driven test execution without requiring users to write traditional test scripts.
The architecture follows a modular design pattern with clear separation between:
- Core execution engine
- Agent-based navigation and task handling
- Python sandbox environment for script execution
- Frontend visualization components
- Configuration and telemetry systems
Sources: testzeus_hercules/__main__.py
Source: https://github.com/test-zeus-ai/testzeus-hercules / Human Manual
Agent System
Related topics: System Architecture, Memory Management, LLM Configuration
Overview
The Agent System is the core orchestration layer of the Hercules testing framework. It implements a multi-agent architecture where specialized agents collaborate to execute end-to-end testing scenarios across web browsers, APIs, databases, and other system components.
Sources: testzeus_hercules/core/agent_registry.py:1-50
Architecture Overview
The system follows a hierarchical agent design where a central planner coordinates specialized navigation agents, each responsible for a specific domain of interaction.
```mermaid
graph TD
    A[HighLevelPlannerAgent] --> B[BrowserNavAgent]
    A --> C[ApiNavAgent]
    A --> D[SqlNavAgent]
    A --> E[McpNavAgent]
    A --> F[SecNavAgent]
    A --> G[TimeKeeperNavAgent]
    B --> H[ExecutorNavAgent]
    C --> H
    D --> H
    E --> H
    F --> H
    G --> H
    H --> I[Browser/API/SQL/MCP/Security]
```

Sources: testzeus_hercules/core/simple_hercules.py:1-100
Agent Types
High-Level Planner Agent
The HighLevelPlannerAgent serves as the central coordinator that receives high-level test instructions and decomposes them into executable steps for specialized agents.
Key Responsibilities:
- Parsing test instructions and generating execution plans
- Routing tasks to appropriate specialized agents
- Aggregating results and handling test completion
- Managing assertions and validating expected outcomes
Sources: testzeus_hercules/core/agents/high_level_planner_agent.py:1-80
Browser Navigation Agent
The BrowserNavAgent handles all browser-based interactions including page navigation, element interaction, and DOM manipulation.
Capabilities:
- Web page navigation and URL handling
- Element clicking and text input
- Screenshot capture and visual validation
- Cookie and session management
Sources: testzeus_hercules/core/agents/browser_nav_agent.py:1-100
API Navigation Agent
The ApiNavAgent manages HTTP-based interactions for testing RESTful APIs and web services.
Capabilities:
- HTTP request construction and execution
- Response validation and assertion
- Authentication handling (OAuth, API keys, Bearer tokens)
- Multi-step API workflows
Sources: testzeus_hercules/core/agents/api_nav_agent.py:1-100
SQL Navigation Agent
The SqlNavAgent handles database interactions for data validation and setup during test execution.
Capabilities:
- SQL query execution
- Database connection management
- Result set validation
- Test data preparation and teardown
Sources: testzeus_hercules/core/agents/sql_nav_agent.py:1-100
MCP Navigation Agent
The McpNavAgent provides Model Context Protocol integration for interacting with external AI models and tools.
Capabilities:
- MCP server connection management
- Tool invocation through MCP protocol
- Context propagation for AI-assisted testing
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-100
Security Navigation Agent
The SecNavAgent handles security-related testing scenarios including authentication flows, authorization checks, and vulnerability scanning.
Capabilities:
- Authentication flow testing
- Session security validation
- Authorization boundary testing
- Security header verification
Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-100
Time Keeper Navigation Agent
The TimeKeeperNavAgent manages time-related test scenarios including scheduling, delays, and time-based assertions.
Capabilities:
- Time-based test scheduling
- Delay and timeout management
- Timestamp validation
- Scheduled task execution
Sources: testzeus_hercules/core/agents/time_keeper_nav_agent.py:1-100
Executor Navigation Agent
The ExecutorNavAgent serves as the execution engine that runs Python scripts and commands within a sandboxed environment.
Key Features:
- Python script execution in isolated sandbox
- Dynamic module injection based on tenant configuration
- Access to browser context, page objects, and configuration
- Custom injection support for tenant-specific utilities
Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-150
Agent Registry
The AgentRegistry provides a centralized registration and lookup mechanism for all agents in the system.
Registry Operations
| Operation | Description |
|---|---|
| `register_agent(name, agent)` | Register a new agent with a unique name |
| `get_agent(name)` | Retrieve an agent by name |
| `list_agents()` | List all registered agents |
| `remove_agent(name)` | Remove an agent from the registry |
Sources: testzeus_hercules/core/agent_registry.py:50-100
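A minimal dictionary-backed sketch of these four operations — field names and error behaviour here are assumptions, not the real `AgentRegistry` class:

```python
from typing import Any

class AgentRegistry:
    """Sketch of the registry operations tabled above (not the project's class)."""

    def __init__(self) -> None:
        self._agents: dict[str, Any] = {}

    def register_agent(self, name: str, agent: Any) -> None:
        # Enforce the "unique name" rule from the table.
        if name in self._agents:
            raise ValueError(f"agent {name!r} already registered")
        self._agents[name] = agent

    def get_agent(self, name: str) -> Any:
        return self._agents[name]

    def list_agents(self) -> list[str]:
        return sorted(self._agents)

    def remove_agent(self, name: str) -> None:
        self._agents.pop(name, None)
```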
Agent Creation Flow
The SimpleHercules class orchestrates agent creation with the following workflow:
```mermaid
sequenceDiagram
    participant SH as SimpleHercules
    participant Planner as HighLevelPlannerAgent
    participant Nav as Navigation Agents
    participant Exec as ExecutorNavAgent
    SH->>SH: Initialize configuration
    SH->>Planner: Create planner agent
    SH->>Nav: Create navigation agents (Browser, API, SQL, etc.)
    SH->>Exec: Create executor agent
    SH->>SH: Register all agents in registry
    Planner->>Nav: Route tasks based on type
    Nav->>Exec: Execute concrete actions
```

Sources: testzeus_hercules/core/simple_hercules.py:100-200
Message Flow
Agents communicate through a structured message passing system with the following message types:
| Message Type | Purpose |
|---|---|
| PLAN | Initial test plan and steps |
| STEP | Individual test step execution |
| INFO | Informational messages |
| ASSERT | Assertion results |
| COMPLETED | Task completion notification |
| TERMINATED | Agent termination signal |
Sources: testzeus_hercules/core/simple_hercules.py:200-300
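The message types above could be modelled as a small enum; the sketch below, including the `is_terminal` helper, is illustrative only and not the project's actual code:

```python
from enum import Enum

class MessageType(str, Enum):
    """Message types from the table above, modelled as an enum (a sketch)."""
    PLAN = "PLAN"
    STEP = "STEP"
    INFO = "INFO"
    ASSERT = "ASSERT"
    COMPLETED = "COMPLETED"
    TERMINATED = "TERMINATED"

def is_terminal(msg_type: MessageType) -> bool:
    # COMPLETED and TERMINATED both end an agent's turn.
    return msg_type in (MessageType.COMPLETED, MessageType.TERMINATED)
```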
Configuration
LLM Model Configuration
Each agent supports individual LLM model configuration:
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model name (e.g., gpt-4, claude-3) |
| `temperature` | float | Sampling temperature (0.0-1.0) |
| `max_tokens` | int | Maximum response tokens |
| `api_key` | string | API authentication key |
| `base_url` | string | Custom API endpoint URL |
Sources: testzeus_hercules/config.py:1-80
Agent-Specific Settings
```python
# Example agent configuration structure
agent_config = {
    "model_config_params": {
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 2000,
    },
    "llm_config_params": {
        "timeout": 60,
        "retry_attempts": 3,
    },
    "other_settings": {
        "system_prompt": "You are a testing agent...",
        "max_chat_rounds": 10,
    },
}
```
Response Parsing
The system uses parse_response() from the response parser module to extract structured data from agent outputs:
```python
def parse_response(message: str) -> dict[str, Any]:
    # Handles JSON extraction from markdown code blocks
    # Normalizes newlines and whitespace
    # Extracts plan and next_step fields
    ...
```
Sources: testzeus_hercules/utils/response_parser.py:1-60
Sandbox Execution
The ExecutorNavAgent provides a secure Python execution environment with configurable module injection:
Available Injections
| Module | Description |
|---|---|
| `playwright` | Browser automation library |
| `requests` | HTTP client library |
| `beautifulsoup4` | HTML parsing |
| `hercules_utils` | Project utility functions |
| Custom packages | Configured via `SANDBOX_PACKAGES` |
Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:1-100
Sandbox Context Variables
Scripts executed in the sandbox have automatic access to:
| Variable | Type | Description |
|---|---|---|
| `page` | Playwright Page | Current browser page |
| `browser` | Playwright Browser | Browser instance |
| `context` | Playwright Context | Browser context |
| `playwright_manager` | Manager | Playwright management |
| `logger` | Logger | Logging interface |
| `config` | Config | Global configuration |
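One way to picture the injection is a helper that assembles the script's global namespace from exactly these variables. This is a sketch under that assumption; the real sandbox wiring may differ, and `build_sandbox_globals` is an invented name:

```python
from typing import Any

def build_sandbox_globals(page: Any, browser: Any, context: Any,
                          playwright_manager: Any, logger: Any,
                          config: Any) -> dict[str, Any]:
    """Expose exactly the context variables tabled above to a sandboxed script (a sketch)."""
    return {
        "page": page,                              # current Playwright page
        "browser": browser,                        # browser instance
        "context": context,                        # browser context
        "playwright_manager": playwright_manager,  # Playwright management
        "logger": logger,                          # logging interface
        "config": config,                          # global configuration
    }

# A sandboxed script could then be run with, e.g.:
#   exec(script_source, build_sandbox_globals(page, browser, context, manager, logger, config))
```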
Command-Line Interface
The agent system supports configuration via command-line arguments:
| Argument | Description |
|---|---|
| `--llm-model` | Specify the LLM model name |
| `--llm-temperature` | Set sampling temperature |
| `--agents-llm-config-file` | Path to agent config file |
| `--enable-portkey` | Enable Portkey routing |
| `--browser-channel` | Browser channel selection |
| `--reuse-vector-db` | Reuse an existing vector database |
Sources: testzeus_hercules/config.py:80-150
Error Handling
Agents implement robust error handling with:
- Termination Message Check: Each agent validates termination conditions via `is_xxx_termination_message()` functions
- Tool Call Monitoring: Tracks pending tool calls to prevent premature termination
- Graceful Degradation: Continues execution with alternative approaches on failure
- Logging: Comprehensive logging for debugging and audit trails
Browser Automation
Related topics: Tool System
Overview
Browser Automation in TestZeus Hercules provides a comprehensive framework for controlling web browsers through Playwright, enabling autonomous agents to perform complex web interactions, testing, and data extraction tasks. The system acts as a bridge between LLM-powered agents and real browser instances, translating natural language instructions into precise DOM manipulations.
The automation layer handles multiple browser types (Chromium, Firefox, WebKit), manages browser contexts with device emulation, supports cloud-based testing platforms via CDP (Chrome DevTools Protocol) tunneling, and provides sophisticated DOM interaction tools including accessibility-aware element selection and real-time mutation observation.
Sources: testzeus_hercules/core/playwright_manager.py:1-50
Architecture
System Components
```mermaid
graph TD
    A[SimpleHercules Core] --> B[PlaywrightManager]
    B --> C[Browser Context]
    C --> D[Browser Instance<br/>Chromium / Firefox / WebKit]
    B --> E[Tool Registry]
    E --> F[Navigation Tools]
    E --> G[Interaction Tools]
    E --> H[Extraction Tools]
    B --> I[DOM Mutation Observer]
    B --> J[BrowserLogger]
    J --> K[Interaction Logs]
    K --> L[Accessibility Tree]
    L --> M[AccessibilityInfo]
```

Core Components
| Component | File | Purpose |
|---|---|---|
| PlaywrightManager | core/playwright_manager.py | Central browser lifecycle management |
| BrowserLogger | core/browser_logger.py | Interaction logging and proof generation |
| DOMHelper | utils/dom_helper.py | DOM state management and waiting |
| AccessibilityTree | utils/get_detailed_accessibility_tree.py | Extract and format accessibility information |
| ToolRegistry | core/tools/tool_registry.py | Dynamic tool registration and routing |
Sources: testzeus_hercules/core/playwright_manager.py, testzeus_hercules/utils/dom_helper.py
Tool System
Related topics: Browser Automation
Overview
The Tool System is the core execution layer of TestZeus Hercules, providing AI agents with capabilities to interact with web pages through a unified, decorator-based interface. The system abstracts browser automation operations (clicking, typing, hovering, dragging, etc.) into discrete, callable tools that agents can invoke during test execution.
Tools serve as the fundamental building blocks that bridge natural language agent instructions with Playwright browser automation. Each tool encapsulates a specific browser interaction pattern, handles error cases gracefully, and returns structured results that agents can parse and respond to. Sources: testzeus_hercules/core/tools/click_using_selector.py:1-50
Architecture
System Components
```mermaid
graph TD
    subgraph "Agent Layer"
        A["Browser Nav Agent"]
        B["Executor Nav Agent"]
    end
    subgraph "Tool Registry"
        C["tool_registry.py"]
        D["Tool Decorator"]
    end
    subgraph "Core Browser Tools"
        E["click_using_selector"]
        F["enter_text_using_selector"]
        G["hover"]
        H["drag_and_drop_tool"]
        I["get_interactive_elements"]
    end
    subgraph "Extra Tools"
        J["browser_assist_tools"]
        K["accessibility_calls"]
        L["upload_file"]
    end
    subgraph "Browser Automation"
        M["Playwright Manager"]
        N["Page Object"]
    end
    A --> C
    B --> C
    C --> E
    C --> F
    C --> G
    C --> H
    C --> I
    C --> J
    J --> K
    J --> L
    E --> M
    F --> M
    G --> M
    M --> N
```

Tool Categories
| Category | Purpose | Location | Example Tools |
|---|---|---|---|
| Core Browser Tools | Primary page interactions | testzeus_hercules/core/tools/ | click, enter_text, hover, drag_drop |
| Extra Tools | Auxiliary functionality | testzeus_hercules/core/extra_tools/ | accessibility, browser_assist |
| Upload Tools | File handling | testzeus_hercules/core/tools/ | upload_file |
Tool Definition Pattern
The `@tool` Decorator
All tools in the system are defined using the @tool decorator, which registers the function with the tool registry and provides metadata for agent consumption.
```python
from functools import wraps
from typing import Annotated, Any, Dict, List, Optional

def tool(agent_names: List[str], description: str, name: str):
    """Decorator to register a function as a callable tool."""
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            return await func(*args, **kwargs)
        wrapper._tool_config = {
            "agent_names": agent_names,
            "description": description,
            "name": name,
        }
        return wrapper
    return decorator
```
Tool Registration Metadata
| Parameter | Type | Description | Required |
|---|---|---|---|
| `agent_names` | List[str] | List of agent identifiers that can call this tool | Yes |
| `description` | str | Human-readable description for the LLM | Yes |
| `name` | str | Unique identifier for the tool | Yes |
Example usage:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="Click on an element using selector",
    name="click"
)
async def click_element(selector: Annotated[str, "CSS selector"]) -> dict:
    # Implementation
    pass
```
Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30
Core Browser Tools
Click Tool (`click_using_selector`)
The primary interaction tool for clicking on page elements.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="used to click on an element in the DOM.",
    name="click"
)
async def click_using_selector(
    selector: Annotated[
        str,
        "md attribute value of the dom element to interact, md is an ID"
    ],
    click_type: Annotated[
        Optional[str],
        "type of click - left, right, double, mouseover, mouseenter, mouseleave, mouseexit, mousedown, mouseup"
    ] = "left",
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the click operation"]: ...
```
Execution Flow:
```mermaid
sequenceDiagram
    participant Agent
    participant ClickTool
    participant PlaywrightManager
    participant Page
    participant SelectorLogger
    Agent->>ClickTool: click_using_selector(selector, click_type)
    ClickTool->>PlaywrightManager: find_element(selector)
    PlaywrightManager->>Page: locator(selector).first
    Page-->>PlaywrightManager: ElementHandle
    PlaywrightManager-->>ClickTool: element
    alt Element not found
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end
    ClickTool->>Page: element.scroll_into_view_if_needed()
    ClickTool->>Page: element.is_visible()
    alt Element not visible
        ClickTool-->>Agent: Try another element response
    end
    ClickTool->>SelectorLogger: get_alternative_selectors()
    ClickTool->>SelectorLogger: get_element_attributes()
    ClickTool->>Page: element.click(click_type)
    alt Click success
        ClickTool->>SelectorLogger: log_selector_interaction(success=True)
        ClickTool-->>Agent: Success response
    else Click fails
        ClickTool->>SelectorLogger: log_selector_interaction(success=False)
        ClickTool-->>Agent: Error response
    end
```

Error Handling:
The click tool performs comprehensive error handling:
- Element Not Found: Logs the selector interaction with `success=False` and raises `ValueError`
- Element Not Visible: Returns an alternative-suggestion response
- Scroll Failure: Gracefully continues (non-blocking)
- Click Execution Failure: Logs failure and returns error dictionary
```python
if element is None:
    await selector_logger.log_selector_interaction(
        tool_name="click",
        selector=selector,
        action=type_of_click,
        selector_type="css" if "md=" in selector else "custom",
        success=False,
        error_message=f'Element with selector: "{selector}" not found',
    )
    raise ValueError(f'Element with selector: "{selector}" not found')
```
Sources: testzeus_hercules/core/tools/click_using_selector.py:40-80
Enter Text Tool (`enter_text_using_selector`)
Handles text input into form fields and contenteditable elements.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="used to enter the given text into an input field or a contenteditable element.",
    name="enter_text"
)
async def enter_text_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    text_to_fill: Annotated[str, "text to enter into the element"],
    submit: Annotated[Optional[bool], "whether to submit after entering text"] = False,
    press_enter_after_input: Annotated[bool, "press Enter key after filling text"] = False,
) -> Annotated[Dict[str, str], "Result dictionary with summary and details"]: ...
```
Hover Tool (`hover`)
Moves mouse cursor over an element without clicking.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="used to hover over an element in the DOM.",
    name="hover"
)
async def hover_using_selector(
    selector: Annotated[str, "md attribute value of the dom element to interact"],
    timeout: Annotated[int, "Timeout in milliseconds"] = 30000,
) -> Annotated[Dict[str, Any], "Result of the hover operation"]: ...
```
Drag and Drop Tool (`drag_and_drop_tool`)
Performs HTML5 drag-and-drop operations between elements.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="used to drag and drop an element.",
    name="drag_and_drop"
)
async def drag_and_drop(
    source: Annotated[str, "md attribute value of source element"],
    target: Annotated[str, "md attribute value of target element"],
) -> Annotated[Dict[str, str], "Result of the drag and drop operation"]: ...
```
Get Interactive Elements (`get_interactive_elements`)
Retrieves all interactive elements from the current page DOM.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="Get interactive elements from the current page",
    name="get_interactive_elements"
)
async def get_interactive_elements() -> Annotated[str, "JSON string of interactive elements"]: ...
```
DOM Processing Logic:
The tool executes JavaScript in the browser context to identify interactive elements:
```javascript
const isInteractive = (element) => {
    // Check input-related elements
    const inputRelatedTags = new Set(['input', 'textarea', 'select', ...]);
    const interactiveRoles = new Set(['button', 'link', 'checkbox', ...]);
    // Check for ARIA attributes
    const hasAriaProps = element.hasAttribute('aria-haspopup') ||
        element.hasAttribute('aria-expanded') ||
        element.hasAttribute('aria-checked');
    // Check cursor style
    const style = window.getComputedStyle(element);
    const hasPointerCursor = style.cursor === 'pointer';
    // Check draggable attribute
    const isDraggable = element.draggable;
    // Skip body and its direct children
    if (element.tagName.toLowerCase() === 'body') return false;
    return hasAriaProps || hasClickHandler || isDraggable;
};
```
Sources: testzeus_hercules/core/tools/get_interactive_elements.py:1-50
Upload File Tool (`upload_file`)
Handles file upload dialogs by setting file paths on input elements.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="Upload a file to the page",
    name="upload_file"
)
async def upload_file(
    selector: Annotated[str, "md attribute value of file input element"],
    file_path: Annotated[str, "Path to the file to upload"],
) -> Annotated[Dict[str, str], "Result of the upload operation"]: ...
```
Extra Tools System
Dynamic Tool Loading
The extra tools are loaded dynamically at runtime using Python's pkgutil module. This allows the system to extend functionality without modifying core code.
Loading Mechanism:
```python
# testzeus_hercules/core/extra_tools/__init__.py
import importlib
import pkgutil
from pathlib import Path
from testzeus_hercules.config import get_global_conf

package_path = Path(__file__).parent
if get_global_conf().get_load_extra_tools().lower().strip() != "false":
    for _, module_name, _ in pkgutil.iter_modules([str(package_path)]):
        full_module_name = f"testzeus_hercules.core.extra_tools.{module_name}"
        module = importlib.import_module(full_module_name)
        # Export all public attributes to the current namespace
        for attribute_name in dir(module):
            if not attribute_name.startswith("_"):
                globals()[attribute_name] = getattr(module, attribute_name)
```
Configuration:
Extra tools can be disabled via configuration:
```bash
python -m testzeus_hercules --load-extra-tools=false
```
Sources: testzeus_hercules/core/extra_tools/__init__.py:1-25
Accessibility Calls (`accessibility_calls`)
Provides accessibility testing using the axe-core library.
Function Signature:
```python
@tool(
    agent_names=["browser_nav_agent"],
    description="Check page accessibility using axe-core",
    name="check_accessibility"
)
async def check_accessibility(page_path: Annotated[str, "Page path or URL"]) -> str: ...
```
Execution Flow:
- Fetches axe-core script from CDN
- Injects script into page
- Runs `axe.run()` in the browser context
- Collects violations and incomplete checks
- Logs results in structured format
```javascript
// Inject axe-core
await page.evaluate(
    `fetch("${AXE_SCRIPT_URL}").then(res => res.text())`
);
// Run accessibility checks
const results = await page.evaluate("async () => { return await axe.run(); }");
```
Response Format:
```json
{
  "status": "success|failure",
  "message": "Human readable summary",
  "details": ["failure summary 1", "failure summary 2"]
}
```
Sources: testzeus_hercules/core/tools/accessibility_calls.py:1-80
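The raw axe-core result object can be condensed into the response format above. The following sketch is illustrative only (`summarize_axe_results` is not a function in the codebase); it assumes axe-core's standard result keys (`violations`, `incomplete`, `description`, `nodes`):

```python
# Hypothetical helper: condense raw axe-core results into the documented
# {status, message, details} response shape.
from typing import Any, Dict, List


def summarize_axe_results(results: Dict[str, Any]) -> Dict[str, Any]:
    violations: List[Dict[str, Any]] = results.get("violations", [])
    incomplete: List[Dict[str, Any]] = results.get("incomplete", [])
    details = [
        f"{v.get('id')}: {v.get('description')} ({len(v.get('nodes', []))} nodes)"
        for v in violations
    ]
    status = "success" if not violations else "failure"
    message = (
        "No accessibility violations found"
        if status == "success"
        else f"{len(violations)} violations, {len(incomplete)} incomplete checks"
    )
    return {"status": status, "message": message, "details": details}
```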
Tool Return Format
All tools return a standardized dictionary structure:
| Field | Type | Description |
|---|---|---|
| `summary_message` | str | Brief status message for agent consumption |
| `detailed_message` | str | Extended information, including errors if any |
| `status` | str | Operation status (`success`/`failure`) |
Success Response Example:
return {
"summary_message": f'Successfully clicked element with selector: "{selector}"',
"detailed_message": f'Element with selector: "{selector}" clicked successfully. Tag: {element_tag_name}',
}
Error Response Example:
return {
"summary_message": f'Element with selector: "{selector}" is not visible, Try another element',
"detailed_message": f'Element with selector: "{selector}" is not visible, Try another element',
}
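A small helper makes it easy to emit this shape consistently. This is an illustrative sketch, not a helper from the codebase:

```python
# Hypothetical helper that builds the standardized tool-return dictionary
# described in the table above.
from typing import Dict


def tool_response(summary: str, detail: str = "", success: bool = True) -> Dict[str, str]:
    return {
        "summary_message": summary,
        "detailed_message": detail or summary,
        "status": "success" if success else "failure",
    }
```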
Selector Logging System
Every tool interaction is logged using the SelectorLogger for proof generation and debugging.
Logging Interface
from testzeus_hercules.utils.browser_logger import get_browser_logger
selector_logger = get_browser_logger(get_global_conf().get_proof_path())
# Log successful interaction
await selector_logger.log_selector_interaction(
tool_name="click",
selector=selector,
action="left",
selector_type="css",
success=True,
)
# Log failed interaction
await selector_logger.log_selector_interaction(
tool_name="click",
selector=selector,
action="left",
selector_type="css",
success=False,
error_message="Element not found",
)
Captured Data
The logger captures:
- Tool name and action type
- Selector used
- Selector type (CSS vs custom)
- Success/failure status
- Alternative selectors for the element
- Element attributes
- Error messages on failure
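Conceptually, each interaction becomes one structured record. The sketch below (a hypothetical `MiniSelectorLogger`, not the real `SelectorLogger`) appends JSON-lines records with the fields listed above; the file name is an assumption:

```python
# Illustrative JSON-lines logger with the captured fields listed above.
# The real implementation lives in testzeus_hercules/utils/browser_logger.py.
import json
import time
from pathlib import Path
from typing import Optional


class MiniSelectorLogger:
    def __init__(self, proof_path: str) -> None:
        # File name is an assumption for illustration purposes.
        self.log_file = Path(proof_path) / "selector_interactions.jsonl"
        self.log_file.parent.mkdir(parents=True, exist_ok=True)

    def log(self, tool_name: str, selector: str, action: str,
            selector_type: str, success: bool,
            error_message: Optional[str] = None) -> dict:
        record = {
            "timestamp": time.time(),
            "tool_name": tool_name,
            "selector": selector,
            "action": action,
            "selector_type": selector_type,
            "success": success,
            "error_message": error_message,
        }
        with self.log_file.open("a") as f:
            f.write(json.dumps(record) + "\n")
        return record
```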
Agent Integration
Browser Nav Agent
The primary agent that consumes browser tools. It receives natural language instructions and translates them into tool calls.
Tool Invocation Pattern:
# From executor_nav_agent.py
async def execute_task(self, instruction: str):
    # The agent decides which tool to call;
    # the tool receives its selector from the instruction
    result = await click_using_selector(
        selector="[md='submit-button']",
        click_type="left"
    )
    # The agent processes the result
    if "error" in result.get("summary_message", "").lower():
        ...  # handle the error or try an alternative selector
Executor Nav Agent
Handles script execution and Python sandbox operations:
Script Execution Context:
| Variable | Type | Description |
|---|---|---|
| `page` | Page | Playwright page object |
| `browser` | Browser | Playwright browser instance |
| `context` | BrowserContext | Browser context |
| `playwright_manager` | PlaywrightManager | Manager instance |
| `logger` | Logger | Logging utility |
| `config` | GlobalConf | Global configuration |
Sources: testzeus_hercules/core/agents/executor_nav_agent.py:1-50
Bulk Operations
The tool system supports bulk execution for batch operations:
Bulk Slider Tool
@tool(
agent_names=["browser_nav_agent"],
description="used to set slider values in multiple sliders in single attempt.",
name="bulk_set_slider"
)
async def bulk_set_slider(
entries: Annotated[
List[List[str]],
"List of [selector, value] pairs",
],
) -> Annotated[List[Dict[str, str]], "List of results"]
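The bulk pattern simply delegates each entry to the single-entry tool and collects the per-entry results. A minimal sketch, with `set_slider` standing in for the real single-slider tool:

```python
# Sketch of the bulk-execution pattern: each [selector, value] entry is
# delegated to a single-entry tool and the results are collected in order.
from typing import Dict, List


async def set_slider(selector: str, value: str) -> Dict[str, str]:
    # Stand-in for the real single-slider tool.
    return {"summary_message": f'Set slider "{selector}" to {value}'}


async def bulk_set_slider(entries: List[List[str]]) -> List[Dict[str, str]]:
    results = []
    for selector, value in entries:
        results.append(await set_slider(selector, value))
    return results
```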
Configuration
Command Line Arguments
The tool system behavior can be configured via CLI:
| Argument | Type | Description |
|---|---|---|
| `--llm-model` | str | LLM model for agent decisions |
| `--llm-temperature` | float | LLM sampling temperature (0.0-1.0) |
| `--agents-llm-config-file` | str | Path to agents LLM config file |
| `--enable-portkey` | flag | Enable Portkey LLM routing |
| `--portkey-api-key` | str | Portkey API key |
| `--browser-channel` | str | Browser channel (chrome-beta, etc.) |
| `--browser-version` | str | Specific browser version |
| `--enable-ublock` | flag | Enable uBlock Origin extension |
| `--load-extra-tools` | str | Load extra tools (default: true) |
Sources: testzeus_hercules/config.py:1-100
Summary
The Tool System provides:
- Unified Interface: All browser interactions follow the `@tool` decorator pattern
- Agent Compatibility: Tools specify which agents can invoke them
- Error Resilience: Graceful handling of element not found, not visible, and execution failures
- Proof Generation: Comprehensive logging of all selector interactions
- Extensibility: Dynamic loading of extra tools without core modifications
- Standardized Results: Consistent return format across all tools
This architecture enables AI agents to reliably control browser automation while maintaining clean separation of concerns and testable component boundaries.
Sources: testzeus_hercules/core/tools/click_using_selector.py:1-30
LLM Configuration
Related topics: Agent System, Memory Management
Overview
The LLM Configuration system in testzeus-hercules provides a flexible, multi-provider framework for configuring Large Language Models across different agent types. This system enables the framework to support various LLM providers (OpenAI, Anthropic, Ollama, Azure) while maintaining provider-specific configurations through a centralized registry mechanism.
The configuration architecture supports:
- Multiple LLM providers simultaneously
- Per-agent model selection
- Dynamic parameter adaptation based on model family
- External configuration file loading
- Environment variable integration
Sources: testzeus_hercules/core/agents_llm_config.py:1-20
Architecture
High-Level Component Architecture
graph TD
A[CLI Arguments / Environment] --> B[Global Config]
C[agents-llm-config-file.json] --> D[AgentsLLMConfig]
D --> E[AgentRegistry]
E --> F[Provider Configs]
F --> G[planner_agent]
F --> H[nav_agent]
F --> I[mem_agent]
F --> J[helper_agent]
B --> K[Model Utils]
K --> L[adapt_llm_params_for_model]
L --> G
L --> H
L --> I
L --> J

Configuration Flow
sequenceDiagram
participant CLI as CLI Arguments
participant Config as Global Config
participant JSON as LLM Config File
participant Processor as AgentsLLMConfig
participant Registry as AgentRegistry
participant Agent as SimpleHercules
CLI->>Config: Parse --llm-* arguments
JSON->>Processor: load_from_file()
Processor->>Registry: register_provider()
Registry->>Registry: Store configs per provider
Config->>Agent: Pass model configs
Agent->>ModelUtils: adapt_llm_params_for_model()
ModelUtils->>Agent: Adapted LLM params

Sources: testzeus_hercules/core/agents_llm_config.py:25-50
Core Components
AgentRegistry
The AgentRegistry manages configurations for multiple providers and supports switching between them.
class AgentRegistry:
    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, AgentConfig]] = {}
        self._active_provider: Optional[str] = None
| Method | Purpose |
|---|---|
| `register_provider(provider_key, configs)` | Register agent configs for a provider |
| `get_active_provider()` | Retrieve the currently active provider configuration |
| `set_active_provider(provider_key)` | Switch the active provider |
| `get_all_providers()` | List all registered providers |
Sources: testzeus_hercules/core/agents_llm_config.py:20-35
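Putting the table together, a minimal sketch of the registry interface might look as follows (plain dicts stand in for `AgentConfig`; the first registered provider becoming active is an assumption, not taken from the source):

```python
# Minimal sketch of the AgentRegistry interface summarized above.
from typing import Any, Dict, List, Optional


class AgentRegistry:
    def __init__(self) -> None:
        self._providers: Dict[str, Dict[str, Any]] = {}
        self._active_provider: Optional[str] = None

    def register_provider(self, provider_key: str, configs: Dict[str, Any]) -> None:
        self._providers[provider_key] = configs
        if self._active_provider is None:
            self._active_provider = provider_key  # assumption: first wins

    def get_active_provider(self) -> Dict[str, Any]:
        if self._active_provider is None:
            raise ValueError("No provider registered")
        return self._providers[self._active_provider]

    def set_active_provider(self, provider_key: str) -> None:
        if provider_key not in self._providers:
            raise KeyError(f"Unknown provider: {provider_key}")
        self._active_provider = provider_key

    def get_all_providers(self) -> List[str]:
        return list(self._providers)
```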
AgentsLLMConfig
The main configuration processor that handles loading and normalization of agent configurations.
class AgentsLLMConfig:
    def __init__(self) -> None:
        self.registry = AgentRegistry()

    def load_from_file(self, file_path: str, provider_key: Optional[str] = None) -> None
    def normalize_agent_config(self, agent_config: Dict[str, Any]) -> AgentConfig
Sources: testzeus_hercules/core/agents_llm_config.py:40-70
Agent Types
The framework defines four specialized agent roles, each with configurable LLM parameters:
| Agent Type | Purpose | Typical Model Requirements |
|---|---|---|
| `planner_agent` | Strategic task planning and decomposition | High reasoning capability |
| `nav_agent` | Browser navigation and UI interaction | Vision-capable, fast response |
| `mem_agent` | Memory and context management | Balanced performance |
| `helper_agent` | Utility functions and data processing | Variable based on task |
Sources: testzeus_hercules/core/simple_hercules.py:1-30
Configuration File Format
JSON Schema Structure
{
    "provider_name": {
        "agent_type": {
            "model_name": "string",
            "model_api_key": "string",
            "model_api_type": "openai|anthropic|azure|ollama",
            "model_client_host": "string (optional)",
            "model_native_tool_calls": true|false,
            "model_hide_tools": "if_any_run|user|never",
            "llm_config_params": {
                "cache_seed": null|integer,
                "temperature": 0.0,
                "seed": 12345,
                "max_tokens": 4096,
                "presence_penalty": 0.0,
                "frequency_penalty": 0.0,
                "stop": []
            }
        }
    }
}
Example Configuration
{
    "openai": {
        "planner_agent": {
            "model_name": "gpt-4o",
            "model_api_key": "${OPENAI_API_KEY}",
            "model_api_type": "openai",
            "llm_config_params": {
                "cache_seed": null,
                "temperature": 0.0,
                "seed": 12345
            }
        }
    },
    "anthropic": {
        "nav_agent": {
            "model_name": "claude-3-5-haiku-latest",
            "model_api_key": "",
            "model_api_type": "anthropic",
            "llm_config_params": {
                "cache_seed": null,
                "temperature": 0.0
            }
        }
    }
}
Sources: agents_llm_config-example.json:1-60
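The `${OPENAI_API_KEY}` placeholder in the example suggests environment-variable substitution at load time. A sketch of such an expansion step, assuming the `${VAR}` convention shown above (this helper is illustrative, not from the codebase):

```python
# Recursively expand ${VAR} placeholders in a loaded config structure
# from the process environment; unset variables become empty strings.
import os
import re
from typing import Any


def expand_env_placeholders(value: Any) -> Any:
    if isinstance(value, str):
        return re.sub(
            r"\$\{(\w+)\}",
            lambda m: os.environ.get(m.group(1), ""),
            value,
        )
    if isinstance(value, dict):
        return {k: expand_env_placeholders(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env_placeholders(v) for v in value]
    return value
```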
CLI Configuration Options
The framework provides comprehensive command-line arguments for LLM configuration:
Basic LLM Parameters
| Argument | Type | Description |
|---|---|---|
| `--llm-model` | string | Name of the LLM model |
| `--llm-model-api-key` | string | API key for the LLM model |
| `--llm-model-base-url` | string | Base URL for the LLM API |
| `--llm-model-api-type` | string | API type (openai, anthropic, azure, etc.) |
| `--llm-temperature` | float | Temperature for LLM sampling (0.0-1.0) |
LLM Configuration File Options
| Argument | Type | Description |
|---|---|---|
| `--agents-llm-config-file` | string | Path to the agents LLM configuration file |
| `--agents-llm-config-file-ref-key` | string | Reference key for selecting the provider |
Parameter Configuration
| Variable | Type | Description |
|---|---|---|
| `LLM_MODEL_PRICING` | env | Model pricing for cost tracking |
| `LLM_MODEL_TEMPERATURE` | env | Default temperature setting |
| `LLM_MODEL_CACHE_SEED` | env | Cache seed for reproducible results |
| `LLM_MODEL_SEED` | env | Random seed for generation |
| `LLM_MODEL_MAX_TOKENS` | env | Maximum tokens in the response |
| `LLM_MODEL_PRESENCE_PENALTY` | env | Presence penalty (-2.0 to 2.0) |
| `LLM_MODEL_FREQUENCY_PENALTY` | env | Frequency penalty (-2.0 to 2.0) |
| `LLM_MODEL_STOP` | env | Stop sequences |
Sources: testzeus_hercules/config.py:1-100
Portkey Integration
The framework supports Portkey for advanced LLM routing with fallback and load balancing capabilities.
Portkey Configuration Options
| Argument | Type | Description |
|---|---|---|
| `--enable-portkey` | flag | Enable Portkey integration |
| `--portkey-api-key` | string | API key for Portkey |
| `--portkey-strategy` | choice | Routing strategy: fallback or loadbalance |
Environment Variables
| Variable | Description |
|---|---|
| `ENABLE_PORTKEY` | Enable/disable Portkey |
| `PORTKEY_API_KEY` | Portkey API key |
| `PORTKEY_STRATEGY` | Routing strategy |
| `PORTKEY_CACHE_ENABLED` | Enable response caching |
| `PORTKEY_TARGETS` | Target models for routing |
| `PORTKEY_GUARDRAILS` | Enable safety guardrails |
| `PORTKEY_RETRY_COUNT` | Number of retries on failure |
Sources: testzeus_hercules/config.py:50-80
Model Parameter Adaptation
The model_utils module provides intelligent parameter adaptation based on the model family being used.
adapt_llm_params_for_model
def adapt_llm_params_for_model(model_name: str, llm_config_params: Dict) -> Dict
This function automatically adjusts LLM parameters based on the detected model family:
| Model Family | Adaptation Behavior |
|---|---|
| o1-series | Removes temperature, adjusts max_tokens handling |
| GPT-4o | Standard parameters |
| Claude | Adjusts for Anthropic API format |
| Ollama | Configures for local inference |
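A simplified illustration of this adaptation for the o1-series row follows; the rename of `max_tokens` to `max_completion_tokens` is an assumption based on OpenAI's o1 API behavior, not taken from the source:

```python
# Simplified sketch of per-model-family parameter adaptation: parameters
# a model family does not accept are dropped or renamed. The exact rules
# live in testzeus_hercules/utils/model_utils.py.
from typing import Any, Dict


def adapt_llm_params_for_model(model_name: str, params: Dict[str, Any]) -> Dict[str, Any]:
    adapted = dict(params)
    if model_name.startswith("o1"):
        # o1-series models reject the temperature parameter and
        # (assumption) use max_completion_tokens instead of max_tokens.
        adapted.pop("temperature", None)
        if "max_tokens" in adapted:
            adapted["max_completion_tokens"] = adapted.pop("max_tokens")
    return adapted
```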
Applied to All Agents
# In SimpleHercules initialization
planner_model = self.planner_agent_config["model_config_params"].get("model") or \
self.planner_agent_config["model_config_params"].get("model_name")
self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
planner_model,
self.planner_agent_config["llm_config_params"]
)
Sources: testzeus_hercules/core/simple_hercules.py:100-130
LLM Helper Utilities
The llm_helper module provides utility functions for LLM interactions:
| Function | Purpose |
|---|---|
convert_model_config_to_autogen_format() | Convert config to AutoGen format |
create_multimodal_agent() | Create agents with vision capabilities |
extract_target_helper() | Extract target information from responses |
format_plan_steps() | Format planning step outputs |
parse_agent_response() | Parse agent response structure |
process_chat_message_content() | Process chat message content |
parse_response() | General response parsing |
Sources: testzeus_hercules/utils/llm_helper.py:1-30
Environment Variable Configuration
Full Configuration Matrix
| Environment Variable | Type | Default | Purpose |
|---|---|---|---|
| `LLM_MODEL_PRICING` | dict | - | Model pricing information |
| `LLM_MODEL_TEMPERATURE` | float | 0.0 | Default sampling temperature |
| `LLM_MODEL_CACHE_SEED` | int | null | Caching seed |
| `LLM_MODEL_SEED` | int | - | Random seed |
| `LLM_MODEL_MAX_TOKENS` | int | 4096 | Max response tokens |
| `LLM_MODEL_PRESENCE_PENALTY` | float | 0.0 | Presence penalty |
| `LLM_MODEL_FREQUENCY_PENALTY` | float | 0.0 | Frequency penalty |
| `LLM_MODEL_STOP` | list | [] | Stop sequences |
| `TOKEN_VERBOSE` | bool | false | Enable verbose token logging |
| `HF_HOME` | path | - | HuggingFace cache location |
| `TOKENIZERS_PARALLELISM` | bool | false | Parallel tokenizer configuration |
Sources: testzeus_hercules/config.py:100-150
Multi-Provider Configuration
Provider Switching
The system supports runtime provider switching through the configuration file:
graph LR
A[Config File] --> B["provider: openai"]
A --> C["provider: anthropic"]
B --> D["planner_agent: GPT-4"]
B --> E["nav_agent: GPT-4o-mini"]
C --> F["planner_agent: Claude-3"]
C --> G["nav_agent: Claude-3-Haiku"]

Selecting Active Provider
python -m testzeus_hercules \
--agents-llm-config-file ./config.json \
--agents-llm-config-file-ref-key "anthropic"
Sources: testzeus_hercules/core/agents_llm_config.py:60-80
Integration with SimpleHercules
The SimpleHercules class integrates all LLM configurations:
class SimpleHercules:
    def __init__(
        self,
        planner_agent_config: Dict[str, Any],
        nav_agent_config: Dict[str, Any],
        mem_agent_config: Dict[str, Any],
        helper_agent_config: Dict[str, Any],
        planner_max_chat_round: int = 50,
        browser_nav_max_chat_round: int = 100,
    ):
        # Configuration processing
        self.planner_agent_config = planner_agent_config
        self.nav_agent_config = nav_agent_config
        self.mem_agent_config = mem_agent_config
        self.helper_agent_config = helper_agent_config

        # Parameter adaptation per agent
        from testzeus_hercules.utils.model_utils import adapt_llm_params_for_model

        self.planner_agent_config["llm_config_params"] = adapt_llm_params_for_model(
            planner_model,
            self.planner_agent_config["llm_config_params"],
        )
Sources: testzeus_hercules/core/simple_hercules.py:50-100
Best Practices
1. Configuration File Organization
- Group configurations by provider (openai, anthropic, etc.)
- Use environment variables for sensitive API keys
- Maintain consistent agent naming across providers
2. Model Selection Guidelines
| Use Case | Recommended Models |
|---|---|
| Complex planning | GPT-4, Claude-3-Opus |
| Fast navigation | GPT-4o-mini, Claude-3-Haiku |
| Vision tasks | GPT-4o, Claude-3-Sonnet |
| Local inference | Ollama models |
3. Parameter Tuning
- Use `temperature: 0.0` for deterministic outputs
- Set an appropriate `max_tokens` based on the expected response length
- Enable `model_native_tool_calls` for better function calling
4. Cost Optimization
- Use `LLM_MODEL_PRICING` for cost tracking
- Enable Portkey caching with `PORTKEY_CACHE_ENABLED`
- Consider fallback strategies for reliability
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Model not recognized | Check model_name matches provider format |
| Temperature ignored | Some models (o1-series) ignore temperature parameter |
| API key errors | Ensure ${ENV_VAR} syntax or actual key in config |
| Provider not found | Verify provider key matches config file structure |
Debug Configuration
export TOKEN_VERBOSE=true
export ENABLE_BROWSER_LOGS=true
python -m testzeus_hercules --llm-model gpt-4o --llm-temperature 0.7
Sources: testzeus_hercules/core/agents_llm_config.py:1-20
Memory Management
Related topics: Agent System, LLM Configuration
Overview
The Memory Management system in testzeus-hercules provides persistent and contextual memory capabilities for AI agents executing browser automation tests. The system consists of three primary components: Static LTM (Long Term Memory), Dynamic LTM, and State Handler, each serving distinct purposes in managing test execution context and data persistence.
The architecture follows a multi-layered approach where Static LTM loads pre-configured test data, Dynamic LTM manages vector-based retrieval augmented memory, and State Handler provides runtime state tracking for agent coordination.
graph TD
A[Testzeus-Hercules] --> B[Static LTM]
A --> C[Dynamic LTM]
A --> D[State Handler]
B --> E[Test Data Files]
B --> F[Stored Data]
B --> G[Run Data]
C --> H[Vector Store]
C --> I[RetrieveUserProxyAgent]
D --> J[_state_string]
D --> K[_state_dict]
L[Agents] --> M[Memory Access]
M --> B
M --> C
M --> D

Static LTM (Long Term Memory)
Purpose and Scope
Static LTM is responsible for loading and consolidating pre-configured test data at application initialization. It operates as a singleton pattern, ensuring that all test data is loaded once and made available throughout the test execution lifecycle.
Sources: testzeus_hercules/core/memory/static_ltm.py:1-47
Implementation
The StaticLTM class extends the singleton pattern to ensure only one instance exists:
class StaticLTM:
    _instance = None

    def __new__(cls) -> "StaticLTM":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance
Sources: testzeus_hercules/core/memory/static_ltm.py:17-22
Data Consolidation
During initialization, Static LTM consolidates three types of data sources:
| Data Source | Description | Method |
|---|---|---|
| Base Test Data | Loaded from test_data.txt via load_data() | StaticDataLoader |
| Stored Data | User-defined test artifacts | get_stored_data() |
| Run Data | Previous test execution context | get_run_data() |
Sources: testzeus_hercules/core/memory/static_ltm.py:26-34
The consolidated data is stored in self.consolidated_data and accessed via get_user_ltm():
def get_user_ltm(self) -> Optional[str]:
    return self.consolidated_data
Sources: testzeus_hercules/core/memory/static_ltm.py:40-47
Usage Pattern
Agents access Static LTM through a module-level function:
def get_user_ltm() -> Optional[str]:
    return StaticLTM().get_user_ltm()
Sources: testzeus_hercules/core/memory/static_ltm.py:50-54
Dynamic LTM
Purpose and Scope
Dynamic LTM provides runtime memory management with vector-based retrieval capabilities. It enables agents to store, retrieve, and utilize contextual information during test execution using a RetrieveUserProxyAgent backed by ChromaDB for vector storage.
Sources: testzeus_hercules/core/memory/dynamic_ltm.py:1-40
Core Components
#### SilentRetrieveUserProxyAgent
A specialized agent that extends RetrieveUserProxyAgent with suppressed output to prevent console noise during agent conversations:
class SilentRetrieveUserProxyAgent(RetrieveUserProxyAgent):
    @suppress_prints
    def initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return super().initiate_chat(*args, **kwargs)

    @suppress_prints
    async def a_initiate_chat(self, *args: Any, **kwargs: Any) -> Any:
        return await super().a_initiate_chat(*args, **kwargs)
Sources: testzeus_hercules/core/memory/dynamic_ltm.py:37-46
#### Print Suppression Decorator
The @suppress_prints decorator redirects stdout to a StringIO buffer during function execution:
def suppress_prints(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        silent_stdout = io.StringIO()
        original_stdout, sys.stdout = sys.stdout, silent_stdout
        try:
            return func(*args, **kwargs)
        finally:
            sys.stdout = original_stdout
    return wrapper
Sources: testzeus_hercules/core/memory/dynamic_ltm.py:17-29
Integration with Configuration
Dynamic LTM respects global configuration to enable or disable its functionality:
from testzeus_hercules.config import get_global_conf

def save_content(self, content: str) -> None:
    config = get_global_conf()
    if not config.should_use_dynamic_ltm():
        return  # Skip when disabled
Sources: testzeus_hercules/core/simple_hercules.py:89-94
External Dependencies
Dynamic LTM utilizes the unstructured library for document parsing:
from unstructured.documents.elements import NarrativeText, Text, Title
from unstructured.partition.auto import partition
Sources: testzeus_hercules/core/memory/dynamic_ltm.py:13-14
State Handler
Purpose and Scope
State Handler provides lightweight runtime state management for coordinating data between agents during test execution. It maintains module-level dictionaries for storing string-based state and structured data.
Sources: testzeus_hercules/core/memory/state_handler.py:1-70
Module-Level State Storage
# Module-level state string
_state_string: Dict[str, str] = defaultdict(str)
_state_dict: Dict[str, Any] = defaultdict(deque)
Sources: testzeus_hercules/core/memory/state_handler.py:13-14
store_data Tool
The store_data function is registered as a tool for browser, API, and SQL navigation agents to persist information:
@tool(
    agent_names=["browser_nav_agent", "api_nav_agent", "sql_nav_agent"],
    description="Tool to store information.",
    name="store_data",
)
def store_data(
    text: Annotated[str, "The confirmation of stored value."],
) -> Annotated[Dict[str, Union[str, None]], "A dictionary containing a 'message' key..."]:
    global _state_string
    try:
        DynamicLTM().save_content(text)
        _state_string[get_global_conf().get_default_test_id()] += text
        return {"message": "Text appended successfully."}
    except Exception as e:
        return {"error": str(e)}
Sources: testzeus_hercules/core/memory/state_handler.py:23-47
Key Behaviors
| Behavior | Description |
|---|---|
| Test ID Isolation | State is keyed by get_global_conf().get_default_test_id() |
| Dual Storage | Data propagates to both _state_string and Dynamic LTM |
| Error Resilience | Returns error dictionary instead of raising exceptions |
Sources: testzeus_hercules/core/memory/state_handler.py:30-46
Memory Architecture Diagram
graph LR
subgraph "Initialization Phase"
A[Load Config] --> B[StaticLTM Singleton]
B --> C[Load test_data.txt]
C --> D[Consolidate Data]
end
subgraph "Runtime Phase"
E[Agents Execute Tests]
E --> F[store_data Tool Call]
F --> G[State Handler]
G --> H[DynamicLTM.save_content]
H --> I[Vector Store Update]
J[Test Query] --> K[RetrieveUserProxyAgent]
K --> L[Vector Similarity Search]
L --> M[Context Injection]
M --> E
end

Integration with SimpleHercules
The SimpleHercules class coordinates all memory components:
class SimpleHercules:
    def _save_to_memory(self, content: str) -> None:
        """Helper method to save content to memory."""
        config = get_global_conf()
        if not config.should_use_dynamic_ltm():
            return
        if self.memory:
            self.memory.save_content(content)
        else:
            logger.warning("Memory system not initialized")
Sources: testzeus_hercules/core/simple_hercules.py:85-97
Memory Initialization Flow
sequenceDiagram
participant SH as SimpleHercules
participant DLT as DynamicLTM
participant CFG as Config
participant LOG as Logger
SH->>CFG: should_use_dynamic_ltm()
CFG-->>SH: boolean
SH->>DLT: save_content(content)
alt LTM Enabled
DLT->>DLT: Vector store update
DLT-->>SH: success
else LTM Disabled
DLT-->>SH: skipped
end

Configuration
Global Configuration Methods
| Method | Purpose |
|---|---|
| `should_use_dynamic_ltm()` | Check whether Dynamic LTM is enabled |
| `get_hf_home()` | Get the HuggingFace cache directory for the vector store |
| `get_default_test_id()` | Get the current test execution identifier |
Related Configuration File
The configuration is managed through testzeus_hercules/config.py, which provides command-line arguments for memory-related settings including:
- `--reuse-vector-db`: Reuse the existing vector DB instead of creating a fresh one
- `--sandbox-tenant-id`: Python sandbox tenant configuration
Sources: testzeus_hercules/config.py:45-58
Data Flow Summary
| Layer | Storage Type | Access Pattern | Persistence |
|---|---|---|---|
| Static LTM | In-memory string | Singleton get_user_ltm() | Session-scoped |
| Dynamic LTM | Vector (ChromaDB) | RetrieveUserProxyAgent | Persistent |
| State Handler | In-memory dict | Module-level _state_string | Test execution-scoped |
Error Handling
All memory components implement robust error handling:
try:
    DynamicLTM().save_content(text)
    _state_string[get_global_conf().get_default_test_id()] += text
    return {"message": "Text appended successfully."}
except Exception as e:
    traceback.print_exc()
    logger.error(f"An error occurred while appending to state: {e}")
    return {"error": str(e)}
Sources: testzeus_hercules/core/memory/state_handler.py:30-42
Related Components
| Component | File Path | Role |
|---|---|---|
| PlannerAgent | core/agents/high_level_planner_agent.py | Consumes memory for test planning |
| ExecutorNavAgent | core/agents/executor_nav_agent.py | Executes test steps with memory context |
| BaseNavAgent | core/agents/base_nav_agent.py | Agent base class with memory integration |
Summary
The Memory Management system in testzeus-hercules implements a comprehensive multi-tier approach:
- Static LTM provides pre-loaded test data consolidation via a singleton pattern
- Dynamic LTM offers vector-based retrieval-augmented memory for contextual queries
- State Handler enables runtime state sharing between agents through the `store_data` tool
This architecture ensures agents have access to both static test fixtures and dynamic execution context, enabling sophisticated AI-driven browser automation testing.
Sources: testzeus_hercules/core/memory/static_ltm.py:1-47
API Testing
Related topics: Security Testing, Tool System
Overview
API Testing in testzeus-hercules enables automated end-to-end testing of REST APIs through AI-driven agents. The system leverages LLM-powered agents to parse OpenAPI specifications, generate Gherkin test scenarios, execute API calls, and validate responses against expected outcomes. This module integrates with the broader Hercules testing framework to provide comprehensive API validation capabilities alongside browser-based UI testing.
The API Testing feature accepts OpenAPI specification files (YAML or JSON format) and automatically generates executable Gherkin test cases that can be run against live API endpoints. The generated tests follow behavior-driven development (BDD) conventions, making them readable for both technical and non-technical stakeholders.
Architecture
The API Testing module consists of several interconnected components that work together to provide end-to-end API testing capabilities.
Component Overview
graph TD
A[OpenAPI Specification] --> B[generate_api_functional_gherkin_test.py]
B --> C[Gherkin Test Cases]
C --> D[API Navigation Agent]
D --> E[API Calls Tool]
E --> F[SQL Calls Tool]
E --> G[Python Sandbox Executor]
F --> H[Database Validation]
G --> I[Custom Logic Validation]
D --> J[Response Parser]
J --> K[Test Results]

Core Components
| Component | File Path | Purpose |
|---|---|---|
| API Navigation Agent | testzeus_hercules/core/agents/api_nav_agent.py | Orchestrates API test execution using LLM-driven decision making |
| API Calls Tool | testzeus_hercules/core/tools/api_calls.py | Executes HTTP requests to API endpoints |
| SQL Calls Tool | testzeus_hercules/core/tools/sql_calls.py | Validates API data against database state |
| Python Sandbox | testzeus_hercules/core/tools/execute_python_sandbox.py | Executes custom validation logic |
| Gherkin Generator | helper_scripts/generate_api_functional_gherkin_test.py | Generates test cases from OpenAPI specs |
| Response Parser | testzeus_hercules/utils/response_parser.py | Parses and validates API responses |
Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80
Test Generation Workflow
OpenAPI Specification Processing
The test generation process begins with parsing OpenAPI specification files. The system accepts both YAML and JSON formatted OpenAPI specs through the generate_api_functional_gherkin_test.py helper script.
parser.add_argument(
    "input_files",
    metavar="input_files",
    type=str,
    nargs="+",
    help="One or more OpenAPI spec files (YAML or JSON).",
)
Sources: helper_scripts/generate_api_functional_gherkin_test.py:15-22
Gherkin Test Case Generation
The LLM generates test cases based on the OpenAPI specification content. The generation uses a specialized prompt that instructs the model to produce Gherkin-format scenarios covering functional test cases.
def generate_test_cases(prompt: str, model: str) -> str:
    """Generates test cases using the OpenAI API."""
    client = OpenAI()
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return completion.choices[0].message.content
Sources: helper_scripts/generate_api_functional_gherkin_test.py:82-90
Generation Parameters
| Parameter | CLI Flag | Default | Description |
|---|---|---|---|
| Model | --model | o1-preview | LLM model for test generation |
| Output Folder | --output | (required) | Destination for generated feature files |
| Number of Test Cases | --number_of_testcase | 100 | Maximum test cases to generate per endpoint |
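The first step of the workflow, enumerating the operations a prompt is built from, can be sketched like this (a simplified stand-in for the real script's spec parsing; only the standard OpenAPI `paths` structure is assumed, and the helper name is hypothetical):

```python
# Enumerate "<METHOD> <path>" operations from a loaded OpenAPI spec dict.
# A prompt generator would iterate these to produce Gherkin scenarios.
from typing import Dict, List

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: Dict) -> List[str]:
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            if method.lower() in HTTP_METHODS:
                ops.append(f"{method.upper()} {path}")
    return ops
```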
API Navigation Agent
The API Navigation Agent (api_nav_agent.py) serves as the orchestrator for executing API tests. It receives parsed test scenarios and coordinates execution across multiple tools to validate API behavior.
sequenceDiagram
participant Test as Test Scenario
participant Agent as API Nav Agent
participant API as API Calls Tool
participant SQL as SQL Calls Tool
participant Sandbox as Python Sandbox
Test->>Agent: Execute scenario
Agent->>API: Send HTTP request
API-->>Agent: Response data
Agent->>SQL: Validate database state
SQL-->>Agent: Validation result
Agent->>Sandbox: Run custom assertions
Sandbox-->>Agent: Assertion results
Agent->>Test: Pass/Fail outcomeSources: testzeus_hercules/core/agents/api_nav_agent.py
Execution Environment
Python Sandbox
API tests execute within a secured Python sandbox environment that provides controlled access to necessary resources while maintaining isolation.
def _get_config_driven_injections(config: Any) -> Dict[str, Any]:
    """
    Get injections defined in configuration.
    Allows dynamic configuration of available modules.
    """
    injections = {}
    # Read from config: SANDBOX_PACKAGES="requests,pandas,numpy"
    sandbox_packages = config.get_config().get("SANDBOX_PACKAGES", "").split(",")
    for package_name in sandbox_packages:
        package_name = package_name.strip()
        if package_name:
            try:
                injections[package_name] = __import__(package_name)
            except ImportError:
                logger.warning(f"Could not import configured package: {package_name}")
    return injections
Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:80-100
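The same parse-and-import pattern can be sketched in isolation. This is an illustrative re-implementation, not the shipped function: it turns a comma-separated package list into a name-to-module mapping and skips entries that fail to import.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

def build_injections(packages_csv: str) -> dict:
    """Map each configured package name to its imported module,
    skipping blanks and logging anything that cannot be imported."""
    injections = {}
    for name in packages_csv.split(","):
        name = name.strip()
        if not name:
            continue
        try:
            injections[name] = importlib.import_module(name)
        except ImportError:
            logger.warning("Could not import configured package: %s", name)
    return injections

mods = build_injections("json, math, not_a_real_pkg")
print(sorted(mods))  # ['json', 'math']
```

A missing package degrades to a warning rather than an error, which matches the behavior of the excerpt above.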
Sandbox Access Variables
Scripts executing within the sandbox have automatic access to the following variables:
| Variable | Type | Description |
|---|---|---|
| page | Playwright Page | Current browser page context |
| browser | Browser instance | Active browser session |
| context | Browser Context | Isolated browsing context |
| playwright_manager | PlaywrightManager | Manages Playwright lifecycle |
| logger | Logger | Logging utility |
| config | Configuration | Global configuration object |
Additional tenant-specific modules can be injected based on the SANDBOX_TENANT_ID environment variable, and custom injections are available via the SANDBOX_CUSTOM_INJECTIONS environment variable.
Sources: testzeus_hercules/core/tools/execute_python_sandbox.py:40-55
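The injection mechanism can be pictured as executing the user script inside a prepared namespace. The sketch below is a simplified stand-in: plain `exec` provides no real isolation, and the `config` and `logger` objects here are stubs, not the actual Playwright handles the sandbox injects.

```python
import logging

def run_in_sandbox(script: str, injections: dict) -> dict:
    """Execute a script with the injected variables available as
    globals, returning the resulting namespace for inspection.
    NOTE: exec alone gives no isolation; this only illustrates
    how injected names become visible to the script."""
    namespace = dict(injections)
    exec(script, namespace)
    return namespace

script = "result = config['BASE_URL'] + '/health'"
ns = run_in_sandbox(script, {
    "config": {"BASE_URL": "https://example.test"},  # stub, not the real config object
    "logger": logging.getLogger("sandbox"),
})
print(ns["result"])  # https://example.test/health
```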
Response Handling
JSON Response Parsing
The response parser handles API responses with multiple fallback strategies for extracting structured data:
import json
from typing import Any

def parse_response(message: str) -> dict[str, Any]:
    # Check if the message is wrapped in ```json ... ``` blocks
    if "```json" in message:
        start_idx = message.find("```json") + 7
        end_idx = message.find("```", start_idx + 7)
        message = message[start_idx:end_idx]
    else:
        if message.startswith("```"):
            message = message[3:]
        if message.endswith("```"):
            message = message[:-3]
        if message.startswith("json"):
            message = message[4:]
    message = message.strip()
    message = message.replace("\\n", "\n")
    json_response: dict[str, Any] = json.loads(message)
    return json_response
Sources: testzeus_hercules/utils/response_parser.py:9-35
Error Recovery
When JSON parsing fails, the response parser attempts to extract plan and next_step fields from unstructured responses, ensuring graceful degradation when APIs return non-standard response formats.
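The fallback behavior can be sketched as follows. This is an illustrative simplification, not the shipped parser: strict JSON parsing is attempted first, and on failure a regex scrapes `plan` and `next_step` out of the raw text.

```python
import json
import re

def parse_with_fallback(message: str) -> dict:
    """Try strict JSON first; on failure, extract the plan and
    next_step fields from unstructured text (illustrative only)."""
    try:
        return json.loads(message)
    except json.JSONDecodeError:
        recovered = {}
        for key in ("plan", "next_step"):
            match = re.search(rf'"{key}"\s*:\s*"([^"]*)"', message)
            if match:
                recovered[key] = match.group(1)
        return recovered

print(parse_with_fallback('noise "plan": "login", "next_step": "click" tail'))
# {'plan': 'login', 'next_step': 'click'}
```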
Configuration
LLM Configuration
API Testing relies on LLM configuration for test generation and agent decision-making. Configuration can be provided via command-line arguments or through a dedicated configuration file.
parser.add_argument(
    "--llm-model",
    type=str,
    help="Name of the LLM model.",
    required=False,
)
parser.add_argument(
    "--llm-temperature",
    type=float,
    help="Temperature for LLM sampling (0.0-1.0).",
    required=False,
)
Sources: testzeus_hercules/config.py:35-45
Agents LLM Config File
For multi-agent configurations, specify the configuration file path:
--agents-llm-config-file /path/to/agents_llm_config.json
--agents-llm-config-file-ref-key <key_name>
Portkey Integration
Enable Portkey for LLM routing with fallback or load balancing strategies:
--enable-portkey
--portkey-api-key <api_key>
--portkey-strategy fallback|loadbalance
Sources: testzeus_hercules/config.py:60-75
Usage Examples
Generate Gherkin Tests from OpenAPI Spec
python helper_scripts/generate_api_functional_gherkin_test.py \
spec/openapi.yaml \
--output tests/api/ \
--model gpt-4 \
--number_of_testcase 50
Run API Tests
Tests can be executed through the main Hercules CLI or integrated into CI/CD pipelines. The agent configuration file supports specifying different models for different agents:
{
  "openai": {
    "planner_agent": {
      "model_name": "gpt-4",
      "model_api_type": "openai"
    }
  }
}
Sources: agents_llm_config-example.json
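Resolving a per-agent model from a file of this shape is straightforward. The snippet below is a hedged sketch (the lookup helper is hypothetical, not a Hercules API) with the config inlined for self-containment.

```python
import json

# Inlined example of the agents LLM config shown above.
raw = '{"openai": {"planner_agent": {"model_name": "gpt-4", "model_api_type": "openai"}}}'
config = json.loads(raw)

def model_for(agent: str, provider: str = "openai") -> str:
    """Look up the configured model name for a given agent."""
    return config[provider][agent]["model_name"]

print(model_for("planner_agent"))  # gpt-4
```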
Integration with Browser Testing
The API Testing module integrates seamlessly with browser-based testing capabilities. The API Navigation Agent can coordinate with the Browser Navigation Agent to perform scenarios that span both API validation and UI verification.
When executing multi-step workflows, the system can:
- Call API endpoints to set up test data
- Launch browsers to verify UI state reflects API changes
- Execute SQL queries to validate data persistence
- Run custom Python assertions for complex business logic
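The steps above can be sketched as a single flow. All helper names and objects here (`api_client`, `db`, the endpoints) are hypothetical stand-ins, not Hercules APIs: seed data via an API call, check database state, then run a custom assertion.

```python
def setup_and_verify(api_client, db):
    """Seed a record over the API, then confirm it persisted."""
    created = api_client.post("/users", {"name": "alice"})          # step 1: API setup
    row = db.query("SELECT name FROM users WHERE id = ?", created["id"])  # step 3: SQL check
    assert row == created["name"], "database state does not match API"    # step 4: custom assertion
    return created

class StubApi:
    """Stand-in HTTP client that echoes the created resource."""
    def post(self, path, body):
        return {"id": 1, **body}

class StubDb:
    """Stand-in database returning the seeded value."""
    def query(self, sql, *params):
        return "alice"

user = setup_and_verify(StubApi(), StubDb())
print(user)  # {'id': 1, 'name': 'alice'}
```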
Security Considerations
API Key Management
API keys should be provided through environment variables or secure configuration management, never hardcoded in test files:
export OPENAI_API_KEY=<your_api_key>
export PORTKEY_API_KEY=<your_portkey_key>
Sandbox Isolation
The Python sandbox provides execution isolation for custom test logic. Configure allowed packages through the SANDBOX_PACKAGES configuration parameter to limit access to only required libraries.
Best Practices
- Organize OpenAPI specs by version - Maintain separate specification files for different API versions
- Use meaningful test case names - Generated tests should clearly describe the scenario being validated
- Combine with database validation - Use SQL Calls Tool to verify data consistency
- Leverage response parsing - Use the response parser for handling complex API response formats
- Configure appropriate LLM models - Use faster models for generation and more capable models for complex validation logic
Related Documentation
- CONTRIBUTING.md - Contribution guidelines for the project
- Makefile - Build and test automation targets
- Browser Testing - Companion documentation for UI testing
Sources: helper_scripts/generate_api_functional_gherkin_test.py:1-80
Security Testing
Related topics: API Testing, Tool System
Security Testing in TestZeus Hercules is an automated framework designed to validate API security by generating and executing Gherkin-based test scenarios. The system leverages LLM-powered agents to analyze OpenAPI specifications and produce comprehensive security validation tests that check for vulnerabilities, configuration weaknesses, and proper handling of sensitive data.
Overview
The Security Testing module provides an end-to-end solution for validating API security without requiring manual test case authoring. It integrates with the broader Hercules testing framework to execute security validation scenarios alongside functional and navigation tests.
Core Components
| Component | File Path | Purpose |
|---|---|---|
| Security Navigation Agent | testzeus_hercules/core/agents/sec_nav_agent.py | Orchestrates security test execution using LLM-driven agents |
| API Security Calls | testzeus_hercules/core/tools/api_sec_calls.py | Provides low-level HTTP client operations for security validation |
| Gherkin Test Generator | helper_scripts/generate_api_security_gherkin_test.py | Generates security-focused Gherkin test cases from OpenAPI specs |
Architecture
graph TD
A[OpenAPI Spec Files] --> B[generate_api_security_gherkin_test.py]
B --> C[LLM API - OpenAI]
C --> D[Security Gherkin Test Cases]
D --> E[Hercules Test Executor]
E --> F[sec_nav_agent.py]
F --> G[api_sec_calls.py]
G --> H[Target API Endpoints]
H --> I[Security Validation Results]
J[Configuration] --> F
K[LLM Config] --> B
Security Navigation Agent
The sec_nav_agent.py module implements the BrowserNavAgent pattern specialized for security testing scenarios. It follows the same agent architecture used by the browser navigation agent but focuses on API security validation.
Agent Configuration
The security agent inherits the core agent configuration structure from the Hercules framework, utilizing the same LLM integration patterns as the main navigation agent defined in simple_hercules.py.
Sources: testzeus_hercules/core/agents/sec_nav_agent.py:1-50
Execution Flow
sequenceDiagram
participant TestRunner
participant SecNavAgent
participant APISecCalls
participant TargetAPI
participant ResponseParser
TestRunner->>SecNavAgent: Execute security test scenario
SecNavAgent->>APISecCalls: Send HTTP request with security payload
APISecCalls->>TargetAPI: Validated HTTP request
TargetAPI->>APISecCalls: Response with headers/body
APISecCalls->>ResponseParser: Parse response data
ResponseParser->>SecNavAgent: Structured security results
SecNavAgent->>TestRunner: Security validation report
API Security Calls Module
The api_sec_calls.py module provides the foundational HTTP client capabilities for executing security tests. It supports various HTTP methods and authentication schemes required for comprehensive API security testing.
Supported Security Test Operations
| Operation | Description | Authentication Support |
|---|---|---|
| GET Security Headers | Validate presence and correctness of security headers | Bearer, API Key, Basic |
| POST Injection Tests | Execute payload injection for XSS, SQLi validation | Bearer, API Key |
| Authentication Bypass | Test unauthorized access to protected endpoints | Token validation |
| Rate Limiting | Verify rate limiting mechanisms | None required |
| CORS Validation | Check cross-origin resource sharing policies | None required |
Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100
Request Configuration
# Security test request structure
{
    "method": "GET|POST|PUT|DELETE|PATCH",
    "url": "https://api.target.com/endpoint",
    "headers": {
        "Authorization": "Bearer {token}",
        "Content-Type": "application/json"
    },
    "params": {},    # Query parameters
    "data": {},      # Request body for POST/PUT/PATCH
    "timeout": 30,
    "verify_ssl": true
}
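Building and validating such a request before handing it to an HTTP client can be sketched as below. The helper name and defaults are assumptions for illustration, not the module's actual API.

```python
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH"}

def make_security_request(method, url, token=None, timeout=30, verify_ssl=True):
    """Assemble a request dict matching the documented structure,
    rejecting unsupported HTTP methods up front."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return {
        "method": method,
        "url": url,
        "headers": headers,
        "params": {},
        "data": {},
        "timeout": timeout,
        "verify_ssl": verify_ssl,
    }

req = make_security_request("GET", "https://api.target.com/endpoint", token="t0k3n")
print(req["headers"]["Authorization"])  # Bearer t0k3n
```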
Gherkin Test Generation
The generate_api_security_gherkin_test.py helper script uses an LLM to automatically generate security-focused Gherkin test cases from OpenAPI specification files.
Input Processing
The generator accepts OpenAPI specifications in both YAML and JSON formats, parsing the specification to identify:
- Endpoints and their HTTP methods
- Security schemes defined in the spec
- Request/response schemas
- Authentication requirements
Sources: helper_scripts/generate_api_security_gherkin_test.py:1-60
Generation Prompt Strategy
The LLM prompt instructs the model to focus on generating tests that validate:
- Vulnerability Detection: Tests that check for common vulnerabilities
- Configuration Weaknesses: Validation of security configurations
- Sensitive Data Handling: Verification of proper data protection
- Authentication/Authorization: Access control testing
- Input Validation: Sanitization and validation checks
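Assembling a prompt around these focus areas might look like the sketch below. The category list mirrors the bullets above, but the template wording is an assumption, not the shipped prompt.

```python
# Focus areas taken from the generation strategy described above.
CATEGORIES = [
    "Vulnerability Detection",
    "Configuration Weaknesses",
    "Sensitive Data Handling",
    "Authentication/Authorization",
    "Input Validation",
]

def build_security_prompt(endpoint: str, method: str) -> str:
    """Compose an illustrative generation prompt for one endpoint."""
    focus = "\n".join(f"- {c}" for c in CATEGORIES)
    return (
        f"Generate Gherkin security tests for {method} {endpoint}.\n"
        f"Focus on:\n{focus}"
    )

prompt = build_security_prompt("/users", "POST")
print(prompt.splitlines()[0])  # Generate Gherkin security tests for POST /users.
```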
Output Format
Generated test cases follow this structure:
Feature: API Security Validation - {Endpoint_Name}

  Scenario: Validate security headers on {method} {path}
    Given the API endpoint "{path}" requires authentication
    When I send a {method} request without authorization
    Then the response should have status code 401
    And the response should include "WWW-Authenticate" header

  Scenario: Test for SQL injection vulnerability on {path}
    Given the API endpoint "{path}" accepts query parameters
    When I send a GET request with malicious payload in parameter "id"
    Then the response should have status code 400 or 422
    And no SQL error should be present in response body
Sources: helper_scripts/generate_api_security_gherkin_test.py:80-120
Command Line Interface
Running Security Tests
# Generate security tests from OpenAPI spec
python -m helper_scripts.generate_api_security_gherkin_test \
--input spec.yaml \
--output ./tests/security \
--model gpt-4o \
--number_of_testcase 50
# Execute security tests with Hercules
testzeus-hercules --input-file ./tests/security/api_security.feature
Generation Script Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| input_files | list[str] | required | One or more OpenAPI spec files (YAML or JSON) |
| --output | str | required | Output folder for generated feature files |
| --model | str | o1-preview | OpenAI model for test generation |
| --number_of_testcase | int | 100 | Number of test cases to generate |
Sources: helper_scripts/generate_api_security_gherkin_test.py:30-55
Integration with Hercules Framework
Agent Initialization
The security agent is initialized through the same mechanism as other Hercules agents, using configuration from the LLM configuration file specified via CLI:
testzeus-hercules \
--input-file security_tests.feature \
--agents-llm-config-file ./config/security_agent.yaml \
--llm-model gpt-4o
Execution Context
Security tests execute within the same context as functional tests, providing:
- Shared browser/playwright session management
- Consistent logging and telemetry
- Unified reporting and result aggregation
- Access to shared utilities and helpers
Security Test Scenarios
Common Test Categories
| Category | Test Focus | Example Validation |
|---|---|---|
| Authentication | Token validation, session management | Invalid token returns 401 |
| Authorization | Access control, privilege escalation | User cannot access admin endpoints |
| Input Validation | Payload sanitization, type checking | Malformed input returns 400 |
| Headers | Security header presence | X-Frame-Options, CSP headers present |
| Rate Limiting | Request throttling | Excessive requests return 429 |
| CORS | Cross-origin policy | Invalid origins rejected |
Best Practices
Test Data Management
- Use dedicated security test environments
- Isolate security tests from production data
- Implement proper cleanup for test artifacts
- Rotate API keys/tokens used in tests
Test Coverage
- Aim for comprehensive endpoint coverage
- Include both positive and negative test cases
- Validate all security headers defined in your policy
- Test authentication bypass scenarios
Configuration
Environment Variables
| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | Required for LLM-powered test generation |
| SECURITY_TEST_API_KEY | API key for testing authenticated endpoints |
| SECURITY_TEST_BASE_URL | Override target API base URL |
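Resolving these variables at startup can be sketched as below. The helper function and the localhost fallback are illustrative assumptions; only the variable names come from the table above, and the demo values are set inline for self-containment.

```python
import os

# Demo-only values so the sketch runs standalone.
os.environ["OPENAI_API_KEY"] = "demo-key"
os.environ["SECURITY_TEST_BASE_URL"] = "https://staging.example.test"

def load_security_env() -> dict:
    """Read the documented environment variables, failing fast
    when the required LLM key is absent."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is required for LLM test generation")
    return {
        "api_key": os.environ.get("SECURITY_TEST_API_KEY"),  # optional
        "base_url": os.environ.get("SECURITY_TEST_BASE_URL", "http://localhost:8000"),
    }

settings = load_security_env()
print(settings["base_url"])  # https://staging.example.test
```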
LLM Configuration
The security agent uses the same LLM configuration structure as other agents, specified through:
# security_agent_config.yaml
llm_config:
  model: gpt-4o
  temperature: 0.7
  max_tokens: 4096
other_settings:
  system_prompt: "You are a security testing expert..."
  max_consecutive_auto_reply: 10
Reporting
Security test results are integrated into the standard Hercules reporting format:
- Test pass/fail status per scenario
- Detailed assertion results
- HTTP request/response logging
- Security-specific metrics (headers present, vulnerabilities detected)
Related Documentation
- Browser Navigation Agent - Core navigation patterns
- Simple Hercules - Main framework architecture
- API Functional Testing - Functional test generation
Sources: testzeus_hercules/core/tools/api_sec_calls.py:1-100
MCP Integration
Related topics: Agent System, Tool System
Overview
The MCP (Model Context Protocol) Integration in TestZeus Hercules enables the testing agent to discover, catalog, and execute tools exposed by external MCP servers. This integration allows Hercules to extend its capabilities by leveraging tools from multiple Model Context Protocol-compliant servers during end-to-end testing workflows.
MCP serves as a standardized communication layer that allows the testing framework to:
- Enumerate and connect to configured MCP servers
- List available tools and resource namespaces from each connected server
- Execute remote tool calls with correct parameters
- Retrieve resources by URI when required for test execution
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10
Architecture
System Components
The MCP integration is built on three primary components:
| Component | File | Purpose |
|---|---|---|
| McpNavAgent | core/agents/mcp_nav_agent.py | Main navigation agent that orchestrates MCP server interactions |
| MCPHelper | utils/mcp_helper.py | Utility class providing MCP client functionality |
| MCP Tools | core/tools/mcp_tools.py | Tool implementations for MCP operations |
Component Relationship
graph TD
A[TestZeus Hercules Core] --> B[McpNavAgent]
B --> C[MCPHelper]
C --> D[MCP Servers]
B --> E[get_configured_mcp_servers]
B --> F[check_mcp_server_connection]
B --> G[execute_mcp_tool]
B --> H[read_mcp_resource]
E --> I[Server Discovery]
F --> J[Connection Status]
G --> K[Tool Execution]
H --> L[Resource Retrieval]
McpNavAgent
The McpNavAgent is the central agent responsible for all MCP-related operations. It inherits from BaseNavAgent and implements the Model Context Protocol interaction patterns.
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:6-9
Agent Configuration
| Property | Value | Description |
|---|---|---|
| agent_name | mcp_nav_agent | Unique identifier for the agent |
| Inherits | BaseNavAgent | Base navigation agent functionality |
Core Functions
The MCP Navigation Agent implements the following core functions:
- Server Discovery - Enumerate configured MCP servers and their connection status
- Capability Cataloging - List tools and resource namespaces for each connected server
- Tool Execution - Call tools with correct parameters and handle responses
- Resource Retrieval - Read resources by URI when required
- Result Summarization - Capture server, tool, arguments, outputs; include timings and status
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:14-28
Operational Rules
Rule 1: Previous Step Validation
Before any new action, explicitly review the previous step and its outcome. Do not proceed if the prior critical step failed; address it first.
graph TD
A[Execute Action] --> B{Previous Step Succeeded?}
B -->|No| C[Address Failure First]
B -->|Yes| D[Continue to Next Action]
C --> D
Rule 2: Server Scan First
The agent must call get_configured_mcp_servers and for each server, call check_mcp_server_connection before taking any other action.
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:31-35
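Rule 1 can be condensed into a small guard. The step-record shape below is a hypothetical illustration of the rule, not code from the agent.

```python
def can_proceed(history: list) -> bool:
    """Return False while the most recent critical step is still
    failed; otherwise allow the next action (Rule 1 sketch)."""
    if not history:
        return True
    prev = history[-1]
    return not (prev.get("critical") and prev.get("status") == "failed")

history = [{"step": "check_mcp_server_connection", "critical": True, "status": "failed"}]
print(can_proceed(history))  # False
```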
Agent Prompt
The agent uses a specialized system prompt that defines its role and behavioral guidelines:
### MCP Navigation Agent
You are an MCP (Model Context Protocol) Navigation Agent that assists the Testing Agent by discovering MCP servers, cataloging their exposed tools/resources, and executing the right tool calls to complete the task. Always begin by scanning all configured servers before taking any action.
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:11-21
MCPHelper Utility
The MCPHelper class provides the underlying functionality for MCP server interactions. It is exported through the mcp_helper.py module and integrates with the agent system through set_mcp_agents.
Sources: testzeus_hercules/utils/mcp_helper.py
Key Functions
| Function | Purpose |
|---|---|
| MCPHelper | Main helper class for MCP operations |
| set_mcp_agents | Configures MCP agents within the testing framework |
Configuration
Configuration File Format
MCP servers are configured using a JSON file. The example file mcp_servers.example.json demonstrates the expected format:
{
  "mcpServers": {
    "server_name": {
      "command": "command_to_run",
      "args": ["arg1", "arg2"],
      "env": {
        "KEY": "value"
      }
    }
  }
}
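Loading a config of this shape is a plain JSON parse. The server entry below ("filesystem", `npx`) is a made-up example for illustration; only the `mcpServers` key comes from the documented format.

```python
import json

# Config inlined for self-containment; normally read from mcp_servers.json.
raw = """
{
  "mcpServers": {
    "filesystem": {"command": "npx", "args": ["-y", "mcp-fs"], "env": {}}
  }
}
"""
config = json.loads(raw)
servers = config["mcpServers"]
for name, spec in servers.items():
    print(name, spec["command"])  # filesystem npx
```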
Command Line Arguments
The MCP integration can be configured through command line arguments:
| Argument | Type | Description |
|---|---|---|
| --agents-llm-config-file | string | Path to the agents LLM configuration file |
| --agents-llm-config-file-ref-key | string | Reference key for the agents LLM configuration file |
Sources: testzeus_hercules/config.py:27-39
Workflow
Standard MCP Interaction Flow
graph TD
A[Start Test Execution] --> B[Initialize McpNavAgent]
B --> C[Call get_configured_mcp_servers]
C --> D[For Each Server]
D --> E[Call check_mcp_server_connection]
E --> F{Server Connected?}
F -->|No| G[Log Error / Skip Server]
F -->|Yes| H[Catalog Tools & Resources]
H --> I[Task Requires MCP Tool?]
I -->|Yes| J[Call execute_mcp_tool]
I -->|No| K[Continue with Other Tasks]
J --> L[Process Tool Response]
L --> M[Return Results to Testing Agent]
G --> D
K --> N[Complete Test]
M --> N
Tool Execution Workflow
When executing MCP tools, the agent follows this sequence:
- Identify the target MCP server
- Verify server connection status
- Determine the correct tool and parameters
- Execute the tool call via MCP protocol
- Capture response including timing and status
- Return formatted results to the testing agent
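The six-step sequence above can be sketched as a single function. The client object, server name, and tool name here are hypothetical stand-ins, not the framework's actual interfaces.

```python
import time

def execute_mcp_tool(client, server, tool, args):
    """Run one MCP tool call: verify connection, execute, and
    capture output plus timing and status (workflow sketch)."""
    if not client.is_connected(server):           # step 2: verify connection
        return {"status": "error", "reason": "not connected"}
    start = time.monotonic()
    output = client.call(server, tool, args)      # step 4: execute the call
    elapsed = time.monotonic() - start            # step 5: capture timing
    return {"status": "ok", "server": server, "tool": tool,
            "args": args, "output": output, "seconds": elapsed}

class StubClient:
    """Stand-in MCP client that echoes its arguments."""
    def is_connected(self, server):
        return True
    def call(self, server, tool, args):
        return {"echo": args}

result = execute_mcp_tool(StubClient(), "filesystem", "read_file", {"path": "/tmp/x"})
print(result["status"], result["output"])  # ok {'echo': {'path': '/tmp/x'}}
```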
Integration with Testing Framework
Agent Hierarchy
graph BT
A[BrowserNavAgent] --> B[BaseNavAgent]
C[ApiNavAgent] --> B
D[SqlNavAgent] --> B
E[McpNavAgent] --> B
F[SecNavAgent] --> B
B --> G[TestZeus Hercules Core]
All navigation agents, including McpNavAgent, inherit from BaseNavAgent, ensuring consistent behavior and integration with the core testing framework.
Sources: testzeus_hercules/core/agents/__init__.py
Available Navigation Agents
| Agent | Purpose |
|---|---|
| BrowserNavAgent | Web browser interaction and navigation |
| ApiNavAgent | API testing and validation |
| SqlNavAgent | Database query execution |
| McpNavAgent | MCP server tool execution |
| SecNavAgent | Security testing operations |
Best Practices
Initialization
- Always ensure MCP servers are properly configured before test execution
- Verify server connectivity before attempting tool calls
- Use the configured servers list as the authoritative source of available MCP servers
Error Handling
- Check previous step outcomes before proceeding
- Log connection failures with server identification
- Handle tool execution errors with proper parameter validation
- Provide clear error messages when MCP operations fail
Task Focus
- Execute only actions required by the primary testing task
- Use extra information from MCP responses cautiously
- Avoid unnecessary server scans after initial discovery
Security Considerations
The MCP integration supports sensitive operations requiring careful configuration:
- API keys should be provided through secure environment variables
- Server configurations should be validated before use
- Tool execution permissions should be properly scoped
- Resource access should follow least-privilege principles
Summary
The MCP Integration module provides TestZeus Hercules with the ability to extend its testing capabilities through external MCP servers. By implementing a dedicated McpNavAgent that follows standardized MCP protocols, the framework can seamlessly discover servers, catalog their capabilities, and execute tools as needed during end-to-end testing scenarios.
Key benefits include:
- Extensibility: Add new testing capabilities without modifying core framework code
- Standardization: Uses the Model Context Protocol for consistent server communication
- Resource Management: Access remote resources via standardized URI-based retrieval
- Comprehensive Logging: Captures server status, tool execution times, and results
Sources: testzeus_hercules/core/agents/mcp_nav_agent.py:1-10
Doramagic Pitfall Log
Doramagic extracted 13 source-linked risk signals. Review them before installing or handing real data to the project.
1. Configuration risk: 0.1.1
- Severity: medium
- Finding: Configuration risk is backed by a source signal: 0.1.1. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.1
2. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | README/documentation is current enough for a first validation pass.
3. Maintenance risk: 0.0.40
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: 0.0.40. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.0.40
4. Maintenance risk: 0.1.0
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: 0.1.0. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.0
5. Maintenance risk: 0.1.2
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: 0.1.2. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.2
6. Maintenance risk: 0.1.6
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: 0.1.6. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.6
7. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | last_activity_observed missing
8. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium
9. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | no_demo; severity=medium
10. Security or permission risk: 0.1.4
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: 0.1.4. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.1.4
11. Security or permission risk: 0.2.2
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: 0.2.2. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/test-zeus-ai/testzeus-hercules/releases/tag/0.2.2
12. Maintenance risk: issue_or_pr_quality=unknown
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:888701643 | https://github.com/test-zeus-ai/testzeus-hercules | issue_or_pr_quality=unknown
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using testzeus-hercules with real data or production workflows.
- 0.2.2 - github / github_release
- 0.1.6 - github / github_release
- 0.1.4 - github / github_release
- 0.1.2 - github / github_release
- 0.1.1 - github / github_release
- 0.1.0 - github / github_release
- 0.0.40 - github / github_release
- README/documentation is current enough for a first validation pass. - GitHub / issue
Source: Project Pack community evidence and pitfall evidence