Doramagic Project Pack · Human Manual
sidebutton
SideButton is a browser automation and workflow orchestration platform that enables AI agents (such as Claude Desktop, Cursor) to interact with web applications through a unified MCP (Mode...
Introduction to SideButton
Related topics: Getting Started, System Architecture, MCP Server Integration
SideButton is an open-source AI agent platform that combines an MCP (Model Context Protocol) server with browser automation tools, a YAML-based workflow engine, and extensible knowledge packs for domain-specific expertise. It enables autonomous AI agents to interact with web applications through standardized browser controls, CLI operations, and pre-built workflow automations.
Sources: AGENTS.md
High-Level Architecture
SideButton follows a modular monorepo architecture with four primary packages working together to provide a complete automation platform.
graph TB
subgraph "Client Layer"
EXT["Chrome Extension"]
CLI["CLI Tool"]
MCP["MCP Clients<br/>(Claude, Cursor)"]
end
subgraph "packages/server"
API["REST API & Dashboard"]
MCP_SRV["MCP Endpoint"]
WS["WebSocket Bridge"]
end
subgraph "packages/core"
PARSER["Workflow Parser"]
EXEC["Step Executor"]
end
subgraph "packages/dashboard"
UI["Svelte Web UI"]
end
EXT --> WS
CLI --> API
MCP --> MCP_SRV
API --> EXEC
MCP_SRV --> EXEC
WS --> EXEC
EXEC --> PARSER
UI --> API
Sources: AGENTS.md, CONTRIBUTING.md
Package Structure
The repository is organized as a monorepo using pnpm workspaces. Each package has a focused responsibility.
| Package | Purpose | Role |
|---|---|---|
| `packages/core` | Workflow engine: parser, executor, and step implementations | Workflow execution runtime |
| `packages/server` | HTTP server, MCP endpoint, CLI, and WebSocket bridge | API layer and server runtime |
| `packages/dashboard` | Svelte web UI served at localhost:9876 | User interface |
| `packages/sidebutton` | CLI entry point for `npx sidebutton@latest` | Command-line interface |
| `extension/` | Chrome extension (Manifest V3) | Browser automation |
Sources: AGENTS.md, CONTRIBUTING.md
Core Concepts
Workflows
Workflows are YAML files that define sequences of steps for automation tasks. They can include browser interactions, shell commands, LLM calls, and control flow logic.
name: example_workflow
steps:
  - type: navigate
    url: https://example.com
  - type: click
    selector: "#submit-button"
  - type: extract
    selector: ".result"
    as: result
Sources: packages/core/README.md
Step Types
SideButton provides multiple categories of steps for different automation needs.
| Category | Steps | Purpose |
|---|---|---|
| Browser | navigate, click, type, scroll, hover, wait, extract, extractAll, exists, key | Web page interaction |
| Shell | shell.run, terminal.open, terminal.run | Command execution |
| LLM | llm.classify, llm.generate | AI-powered operations |
| Control | control.if, control.retry, control.stop, workflow.call | Flow control |
| Data | data.first | Data manipulation |
Sources: packages/core/README.md
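The table above can be expressed as a small lookup, which is handy when linting workflow files or grouping steps in a UI. The sketch below is a hypothetical illustration, not SideButton's actual types: only the step names come from the table, while `STEP_CATEGORIES` and `categoryOf` are names invented here.

```typescript
// Hypothetical lookup built from the step table; not SideButton's real API.
const STEP_CATEGORIES = {
  Browser: ["navigate", "click", "type", "scroll", "hover", "wait", "extract", "extractAll", "exists", "key"],
  Shell: ["shell.run", "terminal.open", "terminal.run"],
  LLM: ["llm.classify", "llm.generate"],
  Control: ["control.if", "control.retry", "control.stop", "workflow.call"],
  Data: ["data.first"],
} as const;

type StepCategory = keyof typeof STEP_CATEGORIES;

// Returns the category that lists the given step type, or undefined if it is unknown.
function categoryOf(stepType: string): StepCategory | undefined {
  for (const category of Object.keys(STEP_CATEGORIES) as StepCategory[]) {
    const steps: readonly string[] = STEP_CATEGORIES[category];
    if (steps.indexOf(stepType) !== -1) return category;
  }
  return undefined;
}
```

For example, `categoryOf("llm.classify")` yields `"LLM"`, and an unrecognized name yields `undefined`.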
Knowledge Packs
Knowledge packs (also called skill packs) teach autonomous AI agents how specific web applications work. They bundle markdown files containing selectors, data models, state definitions, and agentic workflows per web app.
Key capabilities of knowledge packs:
- Selectors: CSS/XPath selectors for UI elements
- Data Models: Structured data representations
- Agentic Workflows: Pre-defined sequences for common tasks
- Role Playbooks: Instructions for AI agent behavior
Sources: packages/sidebutton/README.md, AGENTS.md
MCP Integration
SideButton provides MCP (Model Context Protocol) server functionality for integration with AI coding assistants. The MCP tools allow AI agents to control browser automation and execute workflows programmatically.
Supported Clients
| Client | Transport | Configuration |
|---|---|---|
| Claude Desktop | stdio | npx sidebutton --stdio |
| Claude Code | SSE | http://localhost:9876/mcp |
| Cursor | HTTP | http://localhost:9876/mcp |
Sources: packages/server/README.md, packages/sidebutton/README.md
MCP Tools
| Tool | Description |
|---|---|
| `run_workflow` | Execute a workflow by ID |
| `list_workflows` | List available workflows |
| `get_workflow` | Get workflow YAML definition |
| `get_run_log` | Get execution log for a run |
| `list_run_logs` | List recent workflow executions |
| `get_browser_status` | Check browser extension connection |
| `capture_page` | Capture selectors from current page |
| `navigate` | Navigate browser to URL |
| `snapshot` | Get page accessibility snapshot |
| `click` | Click an element |
| `type` | Type text into an element |
| `scroll` | Scroll the page |
| `screenshot` | Capture page screenshot |
| `hover` | Hover over element |
| `extract` | Extract text from element |
| `extract_all` | Extract all matching elements |
Sources: packages/server/README.md, README.md
CLI Commands
The SideButton CLI provides commands for managing the server, workflows, and knowledge packs.
sidebutton # Start server (default port 9876)
sidebutton --stdio # Start with stdio transport (Claude Desktop)
sidebutton -p 8080 # Custom port
sidebutton list # List available workflows
sidebutton run <id> # Run a workflow by ID
sidebutton status # Check server status
Knowledge Pack Management
| Command | Description |
|---|---|
| `sidebutton registry add <path\|url>` | Register and install all knowledge packs |
| `sidebutton registry update [name]` | Update installed packs from registry |
| `sidebutton registry remove <name>` | Uninstall packs and remove registry |
| `sidebutton registry list` | Show registries and pack counts |
| `sidebutton search [query]` | Search packs across registries |
| `sidebutton install <path\|url\|name>` | One-off knowledge pack install |
| `sidebutton uninstall <domain>` | Remove an installed knowledge pack |
| `sidebutton init [domain]` | Scaffold a new knowledge pack |
| `sidebutton validate [path]` | Lint and validate a knowledge pack |
Sources: packages/sidebutton/README.md, packages/server/README.md
Quick Start
Published Package (No Clone Required)
npx sidebutton@latest # starts server + dashboard on port 9876
Sources: AGENTS.md
Local Development Setup
# Clone the repo
git clone https://github.com/sidebutton/sidebutton.git
cd sidebutton
# Install dependencies
pnpm install
# Build all packages
pnpm build
# Start the server
pnpm start
# Open http://localhost:9876
Sources: CONTRIBUTING.md
Development Prerequisites
| Requirement | Version |
|---|---|
| Node.js | 20+ |
| pnpm | 9.15+ |
| Chrome | Latest (for browser automation) |
Sources: CONTRIBUTING.md
Development Workflow
Running Components
Start everything in watch mode with hot reload:
pnpm dev
Run components individually:
| Command | Description |
|---|---|
| `pnpm dev:server` | Server with auto-restart on :9876 |
| `pnpm dev:dashboard` | Dashboard with HMR on :5173 |
| `pnpm build` | Build all packages |
| `pnpm test` | Run all tests |
Sources: CONTRIBUTING.md
Provider Preference
When multiple integration methods exist, SideButton follows this preference order:
graph LR
A["API Provider"] --> B["CLI Tool"] --> C["Browser Automation"]
style A fill:#90EE90
style B fill:#FFD700
style C fill:#FFA07A
- API is fastest and most reliable
- CLI provides programmatic access
- Browser automation is the universal fallback
Sources: packages/server/defaults/roles/software-engineer.md
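The preference order amounts to "take the first available option from a fixed list". A minimal sketch, assuming each integration can report which methods it supports; the `Provider` type and `pickProvider` function are hypothetical names used only for illustration.

```typescript
// Preference order from the list above: API first, then CLI, then browser automation.
type Provider = "api" | "cli" | "browser";
const PREFERENCE: readonly Provider[] = ["api", "cli", "browser"];

// Picks the most preferred provider among those available for an integration.
// Browser automation is treated as the universal fallback when nothing else is listed.
function pickProvider(available: readonly Provider[]): Provider {
  for (const candidate of PREFERENCE) {
    if (available.indexOf(candidate) !== -1) return candidate;
  }
  return "browser";
}
```

With this shape, a service that offers both a CLI and a web UI resolves to `"cli"`, and one with no known integration falls back to `"browser"`.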
Data Directories
| Directory | What it is |
|---|---|
| `packages/core/` | Workflow engine: parser, executor, step implementations |
| `packages/server/` | HTTP server, MCP endpoint, CLI, WebSocket bridge |
| `packages/dashboard/` | Svelte web UI served at localhost:9876 |
| `extension/` | Chrome extension for browser automation |
| `workflows/` | Public workflow library (YAML files) |
| `actions/` | User-created workflows (gitignored) |
Sources: CONTRIBUTING.md
Related Packages
| Package | NPM Link |
|---|---|
| `@sidebutton/core` | npmjs.com |
| `@sidebutton/server` | npmjs.com |
Sources: packages/core/README.md, packages/server/README.md
External Resources
| Resource | URL |
|---|---|
| Documentation | docs.sidebutton.com |
| GitHub Repository | github.com/sidebutton/sidebutton |
| Website | sidebutton.com |
| Knowledge Packs | sidebutton.com/skills |
License
SideButton is licensed under Apache-2.0.
Sources: CONTRIBUTING.md, packages/core/README.md, packages/server/README.md
Getting Started
Related topics: Introduction to SideButton
SideButton is a Model Context Protocol (MCP) server that provides browser automation, workflow execution, and knowledge pack management capabilities. It enables AI assistants like Claude Desktop and Cursor to interact with web browsers, execute predefined workflows, and leverage domain-specific knowledge packs.
Prerequisites
Before installing SideButton, ensure your environment meets the following requirements:
| Requirement | Version/Details |
|---|---|
| Node.js | v18 or higher |
| Package Manager | pnpm (recommended) or npm |
| Browser | Chrome/Chromium (for browser automation features) |
| OS | macOS, Windows, Linux |
Sources: README.md:1-50
Installation
Package Manager Installation
Install the SideButton CLI globally using your preferred package manager:
# Using npm
npm install -g sidebutton
# Using pnpm
pnpm add -g sidebutton
# Using yarn
yarn global add sidebutton
Verify the installation:
sidebutton --version
Development Setup (From Source)
For contributing or running the latest development version:
# Clone the repository
git clone https://github.com/sidebutton/sidebutton.git
cd sidebutton
# Install dependencies
pnpm install
# Build all packages
pnpm build
# Run CLI directly
pnpm cli --version
Sources: CONTRIBUTING.md:1-20
Quick Start
Starting the Server
The default command starts the SideButton server on port 9876:
sidebutton
To use a custom port:
sidebutton -p 8080
The server provides:
- REST API endpoint
- MCP (Model Context Protocol) endpoint
- WebSocket connection for browser extension
- Dashboard UI at http://localhost:9876
Sources: packages/sidebutton/README.md:1-30
Architecture Overview
graph TD
A[Claude Desktop / Cursor] -->|MCP Protocol| B[SideButton Server]
A -->|stdio| B
B -->|REST API| C[Dashboard UI]
B -->|WebSocket| D[Chrome Extension]
B -->|Execute| E[Workflow Engine]
E -->|Browser Actions| F[Chrome Browser]
E -->|CLI Tools| G[Shell/CLI]
E -->|LLM Calls| H[OpenAI/Anthropic/Ollama]
MCP Integration
SideButton can be integrated with various AI coding assistants through the MCP protocol.
Claude Desktop
Add SideButton to your Claude Desktop configuration file at ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "sidebutton": {
      "command": "npx",
      "args": ["sidebutton", "--stdio"]
    }
  }
}
After adding the configuration, restart Claude Desktop to load the new MCP server.
Sources: README.md:50-70
Cursor
Add SideButton to your Cursor MCP configuration file at ~/.cursor/mcp.json:
{
  "mcpServers": {
    "sidebutton": {
      "url": "http://localhost:9876/mcp"
    }
  }
}
Ensure the SideButton server is running before using Cursor with this configuration.
Sources: packages/server/README.md:20-35
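The two client configurations differ only in transport: Claude Desktop launches the server itself over stdio, while Cursor connects to an already running HTTP endpoint. The sketch below generates both shapes so the port can be changed in one place. The type and function names are hypothetical; only the JSON structure, the `--stdio` flag, and the default port 9876 come from the snippets above.

```typescript
// Shapes mirror the Claude Desktop and Cursor JSON snippets; names are illustrative only.
type StdioEntry = { command: string; args: string[] };
type UrlEntry = { url: string };
type McpConfig = { mcpServers: Record<string, StdioEntry | UrlEntry> };

// Claude Desktop starts the server as a child process and talks to it over stdio.
function claudeDesktopConfig(): McpConfig {
  return { mcpServers: { sidebutton: { command: "npx", args: ["sidebutton", "--stdio"] } } };
}

// Cursor connects to a server that is already running; 9876 is the documented default port.
function cursorConfig(port: number = 9876): McpConfig {
  return { mcpServers: { sidebutton: { url: `http://localhost:${port}/mcp` } } };
}
```

Serializing `cursorConfig(8080)` with `JSON.stringify` gives a configuration that matches a server started with `sidebutton -p 8080`.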
Available MCP Tools
| Tool | Description |
|---|---|
| `run_workflow` | Execute a workflow by ID |
| `list_workflows` | List all available workflows |
| `get_workflow` | Get workflow YAML definition |
| `get_run_log` | Get execution log for a run |
| `list_run_logs` | List recent workflow executions |
| `get_browser_status` | Check browser extension connection |
| `capture_page` | Capture selectors from current page |
| `navigate` | Navigate browser to URL |
| `snapshot` | Get page accessibility snapshot |
| `click` | Click an element |
| `type` | Type text into an element |
| `scroll` | Scroll the page |
| `screenshot` | Capture page screenshot |
| `hover` | Hover over element |
| `extract` | Extract text from element |
| `extract_all` | Extract all matching elements |
Sources: packages/server/README.md:40-60
CLI Commands
Basic Commands
# Start the server (default port 9876)
sidebutton
# Start with stdio transport for Claude Desktop
sidebutton --stdio
# Start on custom port
sidebutton -p 8080
# List available workflows
sidebutton list
# Run a specific workflow
sidebutton run <workflow-id>
# Check server status
sidebutton status
Knowledge Packs Management
# Add a registry
sidebutton registry add <path|url>
# Update installed packs
sidebutton registry update [name]
# Remove a registry
sidebutton registry remove <name>
# List all registries
sidebutton registry list
# Search packs across registries
sidebutton search [query]
# Install a knowledge pack
sidebutton install <path|url|name>
# Uninstall a knowledge pack
sidebutton uninstall <domain>
Knowledge Pack Development
# Scaffold a new knowledge pack
sidebutton init [domain]
# Validate a knowledge pack
sidebutton validate [path]
# Publish to registry
sidebutton publish
Sources: packages/sidebutton/README.md:60-100
Dashboard
The SideButton dashboard provides a web-based UI for managing workflows and viewing execution history.
Access the dashboard at: http://localhost:9876
Dashboard Features
- View and manage shortcuts
- Browse available workflows
- View execution logs
- Add workflows to dashboard
- Monitor browser extension status
Chrome Extension
Install the SideButton Chrome extension from the Chrome Web Store.
Extension Features
- 40+ browser commands for navigation, clicking, typing, extraction
- Real DOM access via CSS selectors
- Recording mode to capture manual actions as workflows
- Embed action buttons into web pages
- WebSocket connection for stable reconnection
Sources: README.md:80-100
Workflow Execution
Running a Workflow via CLI
# List all available workflows
sidebutton list
# Execute a workflow by ID
sidebutton run <workflow-id>
# With parameters
sidebutton run <workflow-id> --param value
Running a Workflow via MCP
When connected to an MCP client like Claude Desktop:
# Use the run_workflow tool
run_workflow({ id: "workflow-id", params: { key: "value" } })
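Under MCP, a tool invocation travels as a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what a client sends for the call above. The envelope follows the MCP specification; the argument names `id` and `params` are taken from the example call and should be treated as assumptions about the tool's input schema, and `buildRunWorkflowCall` is a hypothetical helper.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocation ("tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds the message an MCP client would send to invoke run_workflow.
// Argument names `id` and `params` are assumptions based on the example call.
function buildRunWorkflowCall(
  requestId: number,
  workflowId: string,
  params: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: requestId,
    method: "tools/call",
    params: { name: "run_workflow", arguments: { id: workflowId, params } },
  };
}
```

MCP clients such as Claude Desktop construct this message themselves; writing it out is mainly useful when debugging the endpoint by hand.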
Workflow Step Types
| Category | Steps |
|---|---|
| Browser | navigate, click, type, scroll, hover, wait, extract, extractAll, exists, key |
| Shell | shell.run, terminal.open, terminal.run |
| LLM | llm.classify, llm.generate |
| Control | control.if, control.retry, control.stop, workflow.call |
| Data | data.first |
Sources: packages/core/README.md:20-40
Next Steps
- Explore Workflow Configuration for creating custom automations
- Set up Knowledge Packs for domain-specific capabilities
- Configure Provider Integrations for GitHub, Jira, and other tools
- Review Testing Guide for quality assurance workflows
Sources: [README.md:1-50](https://github.com/sidebutton/sidebutton/blob/main/README.md)
System Architecture
Related topics: Package Overview, MCP Server Integration, Workflow Engine
Overview
SideButton is a browser automation and workflow orchestration platform that enables AI agents (such as Claude Desktop, Cursor) to interact with web applications through a unified MCP (Model Context Protocol) interface. The system combines browser automation, CLI tools, LLM capabilities, and external integrations into a coherent workflow execution engine.
The architecture follows a modular monorepo design with four primary packages:
| Package | Purpose |
|---|---|
| `packages/sidebutton` | CLI entry point and CLI transport for MCP |
| `packages/server` | Fastify-based MCP server with REST API and dashboard |
| `packages/core` | Workflow definition parsing and execution engine |
| `packages/dashboard` | Svelte-based web UI for workflow management |
Sources: README.md:1-50
Package Overview
Related topics: System Architecture, Workflow Engine, Chrome Extension
SideButton is a browser automation platform organized as a monorepo with four primary packages. The system enables workflow-driven automation through YAML definitions, MCP (Model Context Protocol) integration, a REST API, and a Chrome extension. This document provides a comprehensive overview of each package's architecture, responsibilities, and interdependencies.
Architecture Overview
SideButton follows a layered architecture pattern with clear separation of concerns across packages. The core workflow engine handles execution logic, the server package provides API endpoints and MCP connectivity, the CLI package offers command-line interaction, and the dashboard provides a web-based user interface.
graph TD
User[User] --> CLI[CLI Package]
User --> Dashboard[Dashboard Package]
User --> MCP[MCP Client]
User --> REST[REST API Client]
CLI --> Server[Server Package]
Dashboard --> Server
MCP --> Server
REST --> Server
Server --> Core[Core Package]
Server --> BrowserExtension[Chrome Extension]
Core --> WorkflowEngine[Workflow Engine]
Core --> Providers[Provider Integrations]
Package Structure
The repository contains four main packages under the packages/ directory:
| Package | Description |
|---|---|
| `@sidebutton/core` | Core workflow engine and execution runtime |
| `@sidebutton/server` | MCP server, REST API, and embedded dashboard |
| `@sidebutton/sidebutton` | Command-line interface |
| `dashboard` | Frontend Svelte application for the web UI |
Core Package (`@sidebutton/core`)
The core package contains the fundamental workflow orchestration engine. It handles parsing, validation, and execution of YAML-based workflow definitions.
Purpose and Scope
The core package is responsible for the runtime execution of automations. It provides the foundational primitives that the server package wraps with API endpoints. Workflows are defined in YAML and executed through a step-by-step interpreter that supports multiple action types.
Core Exports
The package exposes three primary functions for workflow management:
// packages/core/src/index.ts
export { parseWorkflow, validateWorkflow, executeWorkflow }
| Function | Purpose |
|---|---|
| `parseWorkflow` | Parse YAML workflow definition into internal representation |
| `validateWorkflow` | Validate workflow structure and step types |
| `executeWorkflow` | Execute a workflow with provided context and parameters |
Step Types
The core package supports multiple categories of step types for workflow construction:
| Category | Steps |
|---|---|
| Browser | navigate, click, type, scroll, hover, wait, extract, extractAll, exists, key |
| Shell | shell.run, terminal.open, terminal.run |
| LLM | llm.classify, llm.generate |
| Control | control.if, control.retry, control.stop, workflow.call |
| Data | data.first |
Provider Integrations
The core package includes provider implementations for external service integration. GitHub integration is implemented in packages/core/src/providers/github.ts and provides the following capabilities:
- `listPRs` - List pull requests
- `getPR` - Get pull request details
- `createPR` - Create a pull request
- `listIssues` - List repository issues
- `getIssue` - Get issue details
Sources: packages/core/src/providers/github.ts
Server Package (`@sidebutton/server`)
The server package serves as the central hub for all external interactions with the workflow engine. It wraps the core package with MCP protocol support, REST API endpoints, and embeds the dashboard application.
MCP Server
The MCP server implementation exposes workflow execution capabilities to MCP-compatible clients including Claude Desktop and Cursor. The server runs on port 9876 by default and provides the following tools:
| MCP Tool | Description |
|---|---|
| `run_workflow` | Execute a workflow by ID |
| `list_workflows` | List available workflows |
| `get_workflow` | Get workflow YAML definition |
| `get_run_log` | Get execution log |
| `list_run_logs` | List recent executions |
| `get_browser_status` | Check extension connection |
| `capture_page` | Capture page selectors |
| `navigate` | Navigate browser to URL |
| `snapshot` | Get accessibility tree |
| `click` | Click element |
| `type` | Type text |
| `scroll` | Scroll page |
| `extract` | Extract text |
| `screenshot` | Capture screenshot |
| `hover` | Hover over element |
Sources: packages/server/README.md
REST API
The server exposes 60+ JSON endpoints for external integrations. The API supports the same workflow operations available through MCP, enabling programmatic access from any HTTP client.
# Run a workflow
curl -X POST http://localhost:9876/api/workflows/check_ticket/run \
-H "Content-Type: application/json" \
-d '{"params": {"ticket_id": "PROJ-123"}}'
# List workflows
curl http://localhost:9876/api/workflows
# Get run log
curl http://localhost:9876/api/runs/latest
Sources: README.md
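The same calls can be made from code. The sketch below builds a plain request description for the "run a workflow" endpoint shown in the first curl example, leaving the actual sending to `fetch` or any other HTTP client. `HttpRequest` and `runWorkflowRequest` are hypothetical names; the path and body shape are copied from the curl command.

```typescript
// Plain description of an HTTP request; hand these fields to fetch() or another HTTP client.
interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Mirrors the first curl call: POST /api/workflows/<id>/run with a JSON body of {"params": ...}.
function runWorkflowRequest(
  workflowId: string,
  params: Record<string, unknown>,
  baseUrl: string = "http://localhost:9876"
): HttpRequest {
  return {
    url: `${baseUrl}/api/workflows/${encodeURIComponent(workflowId)}/run`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ params }),
  };
}
```

Keeping request construction separate from sending makes the URL and payload easy to check without a running server.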
Embedded Dashboard
The server embeds a Svelte-based dashboard application served from packages/dashboard/. The dashboard provides:
- Workflow browsing and execution
- Run log viewing
- Shortcut management
- Action library
- Workflow recording
Workflow Engine Extensions
The server extends the core workflow engine with 34+ step types, providing additional capabilities beyond the core package:
| Extended Category | Additional Steps |
|---|---|
| Browser | fill, press_key, scroll_into_view, evaluate, select_option |
| Extended | check_writing_quality, capture_page |
Knowledge Packs
The server supports knowledge packs (also called skill packs) that provide domain-specific knowledge for AI-driven tasks. Knowledge packs include:
- Selectors — CSS selectors for UI elements
- Data models — entity types, fields, relationships, valid states
- State machines — valid transitions per state
- Role playbooks — role-specific procedures (QA, SE, PM, SD)
- Common tasks — step-by-step procedures, gotchas, edge cases
Sources: README.md
CLI Package (`@sidebutton/sidebutton`)
The CLI package provides command-line interaction with the SideButton platform. It serves as the primary interface for local development and scripting workflows.
Installation
npm install -g sidebutton
Core Commands
| Command | Description |
|---|---|
| `sidebutton` | Start server (default port 9876) |
| `sidebutton --stdio` | Start with stdio transport (Claude Desktop) |
| `sidebutton -p 8080` | Start on custom port |
Workflow Management
| Command | Description |
|---|---|
| `sidebutton list` | List available workflows |
| `sidebutton run <id>` | Run a workflow |
| `sidebutton status` | Check server status |
Knowledge Pack Commands
# Registry management
sidebutton registry add <path|url> # Install from registry
sidebutton registry update [name] # Update installed packs
sidebutton registry remove <name> # Remove registry and packs
sidebutton registry list # Show registries
# Search and install
sidebutton search [query] # Search packs across registries
sidebutton install <path|url|name> # Install a single knowledge pack
sidebutton uninstall <domain> # Remove a knowledge pack
# Development
sidebutton init [domain] # Scaffold a new knowledge pack
sidebutton validate [path] # Lint and validate a knowledge pack
sidebutton publish # Publish to registry
Sources: packages/sidebutton/README.md
Publishing Workflows
The CLI supports publishing skill packs to remote registries via the publish command:
sidebutton publish
This command sends the manifest and all associated files to the configured remote registry at ${REMOTE_BASE_URL}/api/skill-packs/publish. Authentication is required via bearer token.
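As a rough illustration of the request involved, the sketch below assembles the publish URL and bearer-token header. Only the path `/api/skill-packs/publish` and the use of a bearer token come from the description above; the HTTP method, the JSON payload, and the name `buildPublishRequest` are assumptions, and the real CLI may package files differently.

```typescript
// Hypothetical request shape for publishing; method and payload format are assumptions.
interface PublishRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Joins the registry base URL with the documented path and attaches the bearer token.
function buildPublishRequest(remoteBaseUrl: string, token: string, manifest: unknown): PublishRequest {
  const base = remoteBaseUrl.replace(/\/+$/, ""); // drop trailing slashes to avoid "//api"
  return {
    url: `${base}/api/skill-packs/publish`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(manifest),
  };
}
```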
Dashboard Package
The dashboard is a Svelte-based frontend application that provides the visual interface for managing workflows and viewing execution logs.
Entry Point
The dashboard application is mounted in packages/dashboard/index.html:
<div id="app"></div>
<script type="module" src="/src/main.ts"></script>
Key Pages
| Page | Route | Purpose |
|---|---|---|
| Dashboard Home | / | Display workflow shortcuts |
| Actions | /actions | Browse and search available workflows |
| Action Detail | /actions/:id | View workflow details and run |
| Workflows | /workflows | Library of workflows |
| Workflow Detail | /workflows/:id | Read-only workflow view |
| Recordings | /recordings | View recorded automations |
| Run Logs | /run-logs | View execution history |
Chrome Extension
While not a separate npm package, the Chrome extension is an integral part of the SideButton ecosystem. It provides browser automation capabilities with:
- 40+ browser commands (navigate, click, type, extract, scroll, wait, snapshot)
- Real DOM access via CSS selectors
- Recording mode for capturing manual actions as workflows
- Embed buttons for injecting action buttons into web pages
- WebSocket connection with stable reconnection
Sources: README.md
Dependency Graph
The packages have the following dependency relationships:
graph LR
CLI["@sidebutton/sidebutton"] --> Server["@sidebutton/server"]
Dashboard["dashboard"] --> Server
Server --> Core["@sidebutton/core"]
Server --> BrowserExt["Chrome Extension"]
style Core fill:#e1f5fe
style Server fill:#fff3e0
style CLI fill:#e8f5e9
style Dashboard fill:#f3e5f5
| Consumer | Dependency | Relationship |
|---|---|---|
| `@sidebutton/sidebutton` | `@sidebutton/server` | CLI wraps server functionality |
| `dashboard` | `@sidebutton/server` | Dashboard is embedded in the server |
| `@sidebutton/server` | `@sidebutton/core` | Server uses core for workflow execution |
Technology Stack
| Layer | Technology |
|---|---|
| Runtime | Node.js |
| Core Engine | TypeScript |
| Server | Fastify |
| API Protocol | MCP (Model Context Protocol) |
| Dashboard | Svelte, Vite |
| Browser Automation | Chrome Extension (Manifest V3) |
| Package Manager | pnpm (monorepo) |
Quick Start
Running the Server
# Start the server
sidebutton
# Or with custom port
sidebutton -p 8080
Running a Workflow
# List available workflows
sidebutton list
# Run a specific workflow
sidebutton run <workflow-id>
Integrating with Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "sidebutton": {
      "command": "npx",
      "args": ["sidebutton", "--stdio"]
    }
  }
}
Sources: [packages/core/src/providers/github.ts](packages/core/src/providers/github.ts)
Workflow Engine
Related topics: Step Types Reference, Workflow Examples, Package Overview
Overview
The SideButton Workflow Engine is a YAML-first orchestration system that enables automation of complex tasks through a declarative step-based approach. It provides 34+ built-in step types spanning browser automation, shell execution, LLM integration, and programmatic control flow operations.
The engine executes workflows defined in YAML format, supporting variable interpolation, conditional branching, retry logic, and cross-workflow chaining. Workflows can be triggered via MCP (Model Context Protocol), the REST API, or the dashboard interface.
Sources: README.md
Architecture
Core Components
graph TD
A[Workflow YAML] --> B[Parser]
B --> C[AST / Step Definitions]
C --> D[Executor]
D --> E[Step Handlers]
F[Variables/Context] --> D
D --> F
G[MCP Client] --> D
H[REST API] --> D
I[Dashboard] --> D
The engine consists of three primary layers:
- Parser: Validates and parses YAML workflow definitions into structured step objects
- Executor: Orchestrates step execution, manages state, and handles control flow
- Step Handlers: Provider-specific implementations for each step type
Sources: packages/core/README.md
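The layering can be illustrated with a toy executor: steps arrive already parsed, a registry maps each step type to a handler, and the executor threads a shared context through them. Everything in this sketch is hypothetical, including the `name` and `value` fields on `variable.set`; the real engine lives in packages/core and is not reproduced here.

```typescript
// Minimal sketch of the executor / step-handler split; all names are illustrative.
type Context = Record<string, unknown>;
interface Step { type: string; [key: string]: unknown }
type StepOutcome = "continue" | "stop";
type StepHandler = (step: Step, ctx: Context) => StepOutcome;

// Step handlers: one function per step type. Field names `name` and `value` are assumed.
const handlers: Record<string, StepHandler> = {
  "variable.set": (step, ctx) => {
    ctx[String(step["name"])] = step["value"];
    return "continue";
  },
  "control.stop": () => "stop",
};

// Executor: runs steps in order, shares one context, and honors control flow.
function runSteps(steps: Step[], ctx: Context = {}): Context {
  for (const step of steps) {
    if (!Object.prototype.hasOwnProperty.call(handlers, step.type)) {
      throw new Error(`Unknown step type: ${step.type}`);
    }
    if (handlers[step.type](step, ctx) === "stop") break;
  }
  return ctx;
}
```

Adding a new step type in this design means registering one more handler; the executor loop itself does not change.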
Workflow Execution Flow
sequenceDiagram
participant Client
participant Executor
participant StepHandler
participant Context
Client->>Executor: executeWorkflow(workflow, context)
Executor->>Executor: Parse YAML
Loop For Each Step
Executor->>StepHandler: Execute Step
StepHandler->>Context: Update Variables
StepHandler-->>Executor: Result
Executor->>Executor: Check Control Flow
End
Executor-->>Client: Execution Result
Step Types
The workflow engine supports five major categories of steps:
Sources: README.md:1-50
Browser Steps
Used for web automation tasks. All browser steps require an active browser connection.
| Step Type | Description |
|---|---|
| `browser.navigate` | Open a URL in the browser |
| `browser.click` | Click an element by CSS selector |
| `browser.type` | Type text into an input element |
| `browser.fill` | Fill input value directly (React-compatible) |
| `browser.scroll` | Scroll the page |
| `browser.extract` | Extract text from an element into a variable |
| `browser.extractAll` | Extract all matching elements |
| `browser.extractMap` | Extract structured data from repeated elements |
| `browser.wait` | Wait for element or fixed delay |
| `browser.exists` | Check if element exists (returns boolean) |
| `browser.hover` | Position cursor over element |
| `browser.key` | Send keyboard keys |
| `browser.snapshot` | Capture accessibility tree snapshot |
| `browser.injectCSS` | Inject CSS styles into page |
| `browser.injectJS` | Execute JavaScript in page context |
| `browser.select_option` | Select dropdown option |
| `browser.scrollIntoView` | Scroll element into viewport |
Sources: README.md:44-63
Shell Steps
Execute command-line operations on the host system.
| Step Type | Description |
|---|---|
| `shell.run` | Execute a bash/shell command |
| `terminal.open` | Open a visible terminal window (macOS) |
| `terminal.run` | Run command in visible terminal window |
Sources: README.md:64-66
LLM Steps
Integrate with large language models for AI-driven operations.
| Step Type | Description |
|---|---|
| `llm.classify` | Structured classification with predefined categories |
| `llm.generate` | Free-form text generation |
| `llm.decide` | Make decisions based on context |
Supported providers include Ollama (local), OpenAI, Anthropic, and Google.
Sources: README.md:67-73
Control Flow Steps
Manage workflow execution logic and branching.
| Step Type | Description |
|---|---|
| `control.if` | Conditional branching based on expression evaluation |
| `control.retry` | Retry block with configurable backoff |
| `control.stop` | End workflow with success/error message |
| `workflow.call` | Call another workflow with parameters |
| `variable.set` | Set a variable value |
Sources: README.md:74-78
Data Steps
Manipulate and transform data between steps.
| Step Type | Description |
|---|---|
| `data.first` | Extract first item from a list |
| `data.get` | Retrieve stored data value |
Sources: README.md:79-82
Variable Interpolation
The workflow engine uses {{variable}} syntax for referencing extracted values and parameters.
steps:
  - type: browser.extract
    selector: ".username"
    as: user
  - type: shell.run
    cmd: "echo 'Hello, {{user}}!'"
Variables can be:
- Extracted from page elements using the `as` parameter
- Passed as workflow parameters
- Set via `variable.set` steps
- Returned from nested workflow calls
Sources: README.md:103-115
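Placeholder substitution of this kind can be sketched in a few lines. The function below is an illustrative stand-in, not the engine's implementation: the name `interpolate` is hypothetical, and leaving unknown placeholders untouched is a choice made for this sketch rather than documented behavior.

```typescript
// Replaces each {{name}} placeholder with the matching value from `vars`.
// Unknown placeholders are left as-is (an assumption for this sketch).
function interpolate(template: string, vars: Record<string, unknown>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
  );
}
```

Applied to the `shell.run` command above with `user` set to `alice`, this produces `echo 'Hello, alice!'`. Note that the sketch performs no shell escaping, so values taken from a web page should be treated as untrusted before they reach a command line.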
Workflow Definition Schema
A workflow is defined with the following structure:
id: workflow_identifier
title: "Display Title"
description: "What this workflow does"
params:
  param_name: string  # or number, boolean, array, object
steps:
  - type: browser.navigate
    url: "https://example.com"
  - type: browser.extract
    selector: ".element"
    as: extracted_value
Required Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique workflow identifier |
| title | string | Human-readable title |
| steps | array | Ordered list of step definitions |
Optional Fields
| Field | Type | Description |
|---|---|---|
| description | string | Workflow description |
| params | object | Parameter definitions with types |
| category | string | Workflow category |
| platform | string | Target platform |
Sources: packages/server/src/server.ts
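As an illustration of the required and optional fields, the following sketch checks a parsed workflow object before it is handed to an executor. The `WorkflowDefinition` interface is derived from the tables above; the `validateWorkflow` function and its error messages are hypothetical and are not part of the SideButton API.

```typescript
// Workflow shape taken from the field tables above.
interface WorkflowDefinition {
  id: string;
  title: string;
  steps: Array<{ type: string; [key: string]: unknown }>;
  description?: string;
  params?: { [name: string]: string };
  category?: string;
  platform?: string;
}

// Hypothetical validator: returns a list of problems, empty when the input looks valid.
function validateWorkflow(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["workflow must be an object"];
  }
  const wf = input as { [key: string]: unknown };
  const errors: string[] = [];
  const id = wf["id"];
  const title = wf["title"];
  const steps = wf["steps"];
  if (typeof id !== "string" || id.length === 0) errors.push("id must be a non-empty string");
  if (typeof title !== "string" || title.length === 0) errors.push("title must be a non-empty string");
  if (!Array.isArray(steps)) {
    errors.push("steps must be an array");
  } else {
    steps.forEach(function (step: unknown, index: number): void {
      const isObject = typeof step === "object" && step !== null;
      const stepType = isObject ? (step as { [key: string]: unknown })["type"] : undefined;
      if (typeof stepType !== "string") errors.push("steps[" + index + "].type must be a string");
    });
  }
  return errors;
}

const exampleWorkflow: WorkflowDefinition = {
  id: "check_ticket_status",
  title: "Check Ticket Status",
  steps: [{ type: "browser.navigate", url: "https://example.com" }],
};
const problems = validateWorkflow(exampleWorkflow);
```

Returning a list of problems instead of throwing on the first one lets a caller report every missing field at once.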
Control Flow Patterns
Conditional Branching
- type: control.if
  condition: "{{current_status}} != 'Done'"
  then:
    - type: llm.classify
      prompt: "Should this ticket be closed?"
      classes: [close, keep_open]
      as: decision
Sources: README.md:117-125
Retry with Backoff
- type: control.retry
  max_attempts: 3
  backoff: 1000  # milliseconds
  steps:
    - type: shell.run
      cmd: "curl -f https://api.example.com/health"
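To make the interaction between max_attempts and backoff concrete, the sketch below computes the wait before each retry. It is an illustration under stated assumptions: the `retryDelays` helper and its `factor` parameter are hypothetical, and whether the engine keeps the delay fixed or grows it between attempts is not specified in this section.

```typescript
// Hypothetical helper: wait time in milliseconds before each retry.
// factor = 1 keeps the delay fixed; factor = 2 doubles it after every failed attempt.
function retryDelays(maxAttempts: number, backoffMs: number, factor: number = 1): number[] {
  const delays: number[] = [];
  // The first attempt runs immediately, so there are maxAttempts - 1 waits.
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(backoffMs * Math.pow(factor, retry));
  }
  return delays;
}

const fixedDelays = retryDelays(3, 1000);        // two waits of 1000 ms each
const doublingDelays = retryDelays(4, 1000, 2);  // waits of 1000, 2000, and 4000 ms
```

With the values from the YAML example (three attempts, 1000 ms), a fixed schedule adds at most two seconds of waiting before the block gives up.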
Workflow Chaining
- type: workflow.call
  workflow_id: another_workflow
  params:
    input_value: "{{extracted_data}}"
Sources: packages/server/defaults/roles/software-engineer.md
MCP Integration
The workflow engine exposes functionality through the Model Context Protocol, enabling AI assistants to execute and manage workflows.
Sources: packages/server/README.md
Available MCP Tools
| Tool | Description |
|---|---|
| run_workflow | Execute a workflow by ID |
| list_workflows | List available workflows |
| get_workflow | Get workflow YAML definition |
| get_run_log | Get execution log for a run |
| list_run_logs | List recent workflow executions |
| get_browser_status | Check browser extension connection |
| capture_page | Capture selectors from current page |
MCP Tool Handlers
graph LR
A[MCP Request] --> B[Handler]
B --> C{tool_name}
C -->|list_workflows| D[toolListWorkflows]
C -->|get_workflow| E[toolGetWorkflow]
C -->|list_run_logs| F[toolListRunLogs]
C -->|run_workflow| G[executeWorkflow]
Sources: packages/server/src/mcp/handler.ts
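The routing shown in the diagram can be sketched as a lookup from tool name to handler function. This is a simplified illustration: the `ToolHandler` type, the stub handlers, and the `dispatchTool` function are hypothetical stand-ins, and the real handlers in packages/server/src/mcp/handler.ts may have different signatures.

```typescript
// Hypothetical handler signature; the real handlers may differ.
type ToolHandler = (args: { [key: string]: unknown }) => string;

// Stub handlers standing in for toolListWorkflows and toolGetWorkflow.
const toolHandlers: { [tool: string]: ToolHandler } = {
  list_workflows: function (): string {
    return "2 workflows available";
  },
  get_workflow: function (args): string {
    return "definition of " + String(args["workflow_id"]);
  },
};

// Route a tool name to its handler and fail loudly on unknown tools.
function dispatchTool(tool: string, args: { [key: string]: unknown } = {}): string {
  if (!Object.prototype.hasOwnProperty.call(toolHandlers, tool)) {
    throw new Error("Unknown tool: " + tool);
  }
  const handler = toolHandlers[tool] as ToolHandler;
  return handler(args);
}

const dispatched = dispatchTool("get_workflow", { workflow_id: "check_ticket_status" });
```

A table-driven dispatcher keeps the set of supported tools in one place, so adding a tool means adding one entry rather than another branch.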
List Workflows Response Format
interface WorkflowListItem {
  workflow: Workflow;
  source: 'actions' | 'workflows';
}
// Response includes:
// - workflow.id
// - workflow.title
// - workflow.params (if any)
// - source identifier
Sources: packages/server/src/mcp/handler.ts:1-50
GitHub Integration
The engine provides specialized steps for GitHub operations through the GitHub CLI provider.
Sources: packages/core/src/providers/github.ts
GitHub Step Types
| Step | Description |
|---|---|
| git.listPRs | List pull requests with state filter |
| git.getPR | Get PR details and diff statistics |
| git.createPR | Create a new pull request |
| git.listIssues | List issues with filters |
| git.getIssue | Get issue details |
| issues.create | Create an issue |
| issues.comment | Add a comment to issue/PR |
| issues.transition | Change issue status |
Create Pull Request Parameters
interface CreatePRParams {
  repo?: string;   // Repository in format "owner/repo"
  title: string;   // PR title
  body?: string;   // PR description
  head: string;    // Head branch name
  base?: string;   // Base branch (default: main)
}
Sources: packages/core/src/providers/github.ts:1-50
Common GitHub Workflows
Review Open PRs:
- git.listPRs with state: "open" to view pending reviews
- git.getPR with PR number to read details and diff stats
- Use browser tools for visual diff review
Autonomous Development Cycle:
- git.listIssues: browse available issues
- git.getIssue: read candidate details
- llm.decide: select best issue based on priority
- issues.comment: signal work is starting
- git.createPR: submit completed work
Sources: packages/server/defaults/targets/_provider-github-cli.md
Execution Context
Each workflow execution maintains a context object that stores:
- Variables: Extracted values and set variables
- Parameters: Input parameters passed to the workflow
- Results: Step execution results
- Logs: Execution logs for debugging
# Context is passed to executeWorkflow
context:
  params:
    ticket_id: "PROJ-123"
  variables:
    current_status: "In Progress"
Error Handling
The engine provides multiple mechanisms for error handling:
- control.retry: Automatically retry failed steps with configurable backoff
- control.stop: Gracefully end workflow with error message
- Conditional checks: Use browser.exists to verify elements before actions
steps:
  - type: browser.exists
    selector: ".error-message"
    as: has_error
  - type: control.if
    condition: "{{has_error}}"
    then:
      - type: control.stop
        status: error
        message: "Error detected on page"
Dashboard Integration
The workflow engine integrates with the SideButton dashboard for:
- Workflow Library: Browse and install workflows
- Run History: View execution logs and results
- Quick Run: Execute workflows with parameter inputs
Sources: packages/server/src/server.ts
Workflow Installation Flow
- User navigates to workflow in library
- Dashboard renders install confirmation page
- POST request submits workflow to local server
- Server saves workflow to user's action library
- Success page confirms installation
graph TD
A[Browse Workflow] --> B[Click Install]
B --> C[POST /install/:workflowId]
C --> D[Server Validates]
D --> E[Save to Actions Library]
E --> F[Show Success Page]
Best Practices
- Use extracted variables immediately: Variable references should occur close to their extraction step
- Add wait conditions: Use browser.wait before extracting dynamic content
- Handle missing elements: Check existence before interacting
- Limit retry attempts: Configure appropriate max_attempts for unreliable operations
- Keep workflows focused: Prefer workflow chaining over monolithic single workflows
Sources: packages/server/defaults/roles/qa.md
Sources: [README.md](https://github.com/sidebutton/sidebutton/blob/main/README.md)
Step Types Reference
Related topics: Workflow Engine, Workflow Examples
SideButton workflows are composed of discrete steps that define actions to be executed sequentially. Each step type serves a specific purpose—from interacting with web pages to executing shell commands, making decisions via LLM, or orchestrating control flow. Understanding the available step types is essential for building effective automations.
Overview
Steps are the fundamental building blocks of SideButton workflows. They are defined within workflow YAML files and specify:
- What action to perform (the step type)
- What parameters to use for that action
- How to handle results (variable assignment, conditional logic)
Sources: packages/core/README.md
graph TD
A[Workflow YAML] --> B[Step Execution Engine]
B --> C[Browser Steps]
B --> D[Shell Steps]
B --> E[LLM Steps]
B --> F[Control Steps]
B --> G[Data Steps]
B --> H[Git Steps]
B --> I[Issues Steps]
Step Type Categories
SideButton organizes steps into the following categories:
| Category | Purpose | Primary Use Case |
|---|---|---|
| Browser | Web page interaction | UI automation, data extraction |
| Shell | Command execution | Build tools, CLI operations |
| LLM | AI-powered decisions | Classification, content generation |
| Control | Flow control | Conditionals, retries, sub-workflows |
| Data | Data manipulation | Variable assignment, extraction |
| Git | Version control | PRs, issues, repository ops |
| Issues | Issue tracking | Bug tracking, task management |
Sources: packages/core/README.md
Workflow Examples
Related topics: Workflow Engine, Step Types Reference
Overview
Workflow Examples in SideButton are pre-built YAML-based automation sequences that demonstrate how to combine different step types to accomplish real-world tasks. These examples serve as both practical utilities and learning references for building custom automations.
The SideButton workflow system supports multiple integration methods depending on available providers:
| Provider Preference | Method | Reliability |
|---|---|---|
| 1st | API Provider | Fastest and most reliable |
| 2nd | CLI Tool | Good for git operations |
| 3rd | Browser Automation | Universal fallback |
Sources: packages/server/defaults/roles/software-engineer.md
Workflow Structure
Every workflow in SideButton follows a standardized YAML format with the following core structure:
id: workflow_identifier
title: "Human Readable Title"
description: "What this workflow accomplishes"
steps:
  - type: step.type
    property: value
Sources: CONTRIBUTING.md
Basic Anatomy of a Workflow
graph TD
A[YAML Workflow File] --> B[Workflow Engine]
B --> C{Step Type Router}
C -->|Browser| D[Browser Tools]
C -->|Shell| E[Terminal/CLI]
C -->|LLM| F[AI Provider]
C -->|Control| G[Flow Control]
C -->|Data| H[Data Manipulation]
D --> I[Execute Action]
E --> I
F --> I
G --> I
H --> I
I --> J[Run Log Entry]
Step Types Reference
SideButton provides five categories of steps that can be combined in workflows:
Sources: packages/core/README.md
Browser Steps
| Step | Purpose |
|---|---|
| navigate | Navigate browser to URL |
| click | Click an element |
| type | Type text into an element |
| scroll | Scroll the page |
| hover | Hover over element |
| wait | Wait for condition |
| extract | Extract text from element |
| extractAll | Extract all matching elements |
| exists | Check element exists |
| key | Press keyboard key |
| snapshot | Get accessibility tree |
| screenshot | Capture screenshot |
Sources: README.md
Shell/Terminal Steps
| Step | Purpose |
|---|---|
| shell.run | Execute shell command |
| terminal.open | Open visible terminal window (macOS) |
| terminal.run | Run command in terminal window |
LLM Steps
| Step | Purpose |
|---|---|
| llm.classify | Structured classification with categories |
| llm.generate | Free-form text generation |
| llm.decide | AI-driven decision making |
Control Flow Steps
| Step | Purpose |
|---|---|
| control.if | Conditional branching |
| control.retry | Retry with backoff |
| control.stop | End workflow with message |
| workflow.call | Call another workflow with parameters |
Data Steps
| Step | Purpose |
|---|---|
| data.first | Extract first item from list |
| data.get | Get value from data object |
GitHub Workflow Examples
SideButton includes pre-built workflows for GitHub automation stored in the bundles/github/workflows/ directory.
GitHub PR Claude Review
This workflow demonstrates how to automate PR review using Claude AI. The workflow follows a typical review pattern:
graph LR
A[List Open PRs] --> B[Get PR Details]
B --> C[Extract Diff Stats]
C --> D[Navigate to PR]
D --> E[Review Files Changed]
E --> F[Generate Review Comment]
Sources: bundles/github/workflows/github_pr_claude_review.yaml
Common PR Review Sequence:
- git.listPRs with state: "open" to see what needs review
- git.getPR with number to read details and diff stats
- Use browser tools for visual diff review if needed
Sources: packages/server/defaults/targets/_provider-github-cli.md
Create Release Workflow
The release creation workflow demonstrates orchestrating multiple operations:
- Open the release page using browser automation
- Decide the next version tag based on commit history
- Create the release with proper version naming
Sources: bundles/github/workflows/create_release.yaml
Creating a PR after coding:
- git.createPR with title, head branch, base branch
- issues.comment on related issue linking the PR
Sources: packages/server/defaults/targets/_provider-github-cli.md
LLM-Based Workflows
Summarize Workflow
The llm_summarize.yaml workflow demonstrates integration with AI providers for text processing:
id: llm_summarize
title: "Summarize Content"
description: "Generate a summary using LLM"
steps:
  - type: llm.generate
    prompt: "{{input_text}}"
    instruction: "Provide a concise summary"
Sources: workflows/llm_summarize.yaml, packages/server/defaults/workflows/llm_summarize.yaml
LLM Provider Support:
LLM steps work with multiple providers:
- Ollama (local)
- OpenAI
- Anthropic
- Google
Sources: README.md
Decision Workflows
The llm.decide step type enables autonomous decision-making:
graph TD
A[Issue Received] --> B{llm.decide}
B -->|Clear & well-scoped| C[Pick and start work]
B -->|Ambiguous/blocked| D[Skip, pick next]
B -->|Same priority| E[Prefer smaller scope]
B -->|No suitable issues| F[Stop and report]
Sources: packages/server/defaults/roles/software-engineer.md
Browser-Based Workflows
Wikipedia Open Example
This demonstrates basic browser navigation and content extraction:
id: wikipedia_open
title: "Open Wikipedia Page"
steps:
  - type: browser.navigate
    url: "{{wiki_url}}"
  - type: browser.snapshot
    as: page_content
  - type: browser.extract
    selector: "{{element_selector}}"
    as: extracted_text
Sources: workflows/wikipedia_open.yaml
Best Practices for Browser Steps:
- Use snapshot to understand page structure before taking actions
- Use extract to pull specific content from pages
- Use screenshot for visual verification
Sources: packages/server/defaults/roles/software-engineer.md
Variable Interpolation
All workflows support variable interpolation using {{variable}} syntax:
steps:
  - type: browser.extract
    selector: ".username"
    as: user
  - type: shell.run
    cmd: "echo 'Hello, {{user}}!'"
  - type: llm.generate
    prompt: "Write a greeting for {{user}}"
Sources: README.md
Parameterized Workflows
Workflows can accept parameters for flexibility:
id: check_ticket_status
title: "Check Ticket Status"
params:
  ticket_id: string
steps:
  - type: browser.navigate
    url: "https://jira.example.com/browse/{{ticket_id}}"
  - type: browser.extract
    selector: "[data-testid='status-field']"
    as: current_status
Sources: README.md
Execution Flow
sequenceDiagram
participant User
participant MCP as MCP Handler
participant Engine as Workflow Engine
participant Steps as Step Executors
User->>MCP: run_workflow(workflow_id, params)
MCP->>Engine: executeWorkflow(workflow, context)
Engine->>Steps: Execute Step 1
Steps-->>Engine: Step Result
Engine->>Steps: Execute Step 2
Steps-->>Engine: Step Result
Engine->>MCP: Run Log Entry
MCP-->>User: Execution Result
Sources: packages/server/src/mcp/handler.ts, packages/core/README.md
Workflow Bundles
SideButton organizes related workflows into bundles. The GitHub bundle includes:
{
  "name": "sidebutton/github",
  "version": "1.0.0",
  "title": "GitHub Automation",
  "description": "Workflows for GitHub releases, PR reviews, and repository management",
  "workflows": [
    "open_release_page.yaml",
    "decide_next_tag.yaml",
    "create_release.yaml",
    "github_pr_claude_review.yaml"
  ],
  "requires": {
    "llm": true,
    "browser": true
  }
}
Sources: bundles/github/bundle.json
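The requires block suggests a simple pre-flight check before a bundle is enabled. The sketch below shows one way such a check could look; the `missingRequirements` function and the `RuntimeCapabilities` shape are hypothetical, and how SideButton actually evaluates requires is not described in this section.

```typescript
// Mirrors the "requires" object in bundle.json.
interface BundleRequirements {
  llm?: boolean;
  browser?: boolean;
}

// Hypothetical description of what the current environment can provide.
interface RuntimeCapabilities {
  llm: boolean;
  browser: boolean;
}

// Returns the names of required capabilities that are not available.
function missingRequirements(requires: BundleRequirements, available: RuntimeCapabilities): string[] {
  const missing: string[] = [];
  if (requires.llm === true && !available.llm) missing.push("llm");
  if (requires.browser === true && !available.browser) missing.push("browser");
  return missing;
}

const githubBundleRequires: BundleRequirements = { llm: true, browser: true };
// Example: an LLM provider is configured, but no browser extension is connected.
const missingNow = missingRequirements(githubBundleRequires, { llm: true, browser: false });
```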
Adding Custom Workflows
The easiest way to contribute is by adding workflows to the workflows/ directory:
# Create a new workflow file
cat > workflows/my_workflow.yaml << 'EOF'
id: my_workflow
title: "My Workflow"
description: "What this workflow does"
steps:
  - type: shell.run
    cmd: "echo 'Hello!'"
EOF
Sources: CONTRIBUTING.md
See Also
- Step Reference — Complete documentation for all step types
- MCP Tools — Model Context Protocol integration
- Server Documentation — Backend workflow execution
- Core Engine — Workflow engine internals
Sources: packages/server/defaults/roles/software-engineer.md
MCP Server Integration
Related topics: System Architecture, Chrome Extension, Knowledge Packs
Overview
The SideButton MCP Server is the core integration layer that exposes browser automation tools, workflow execution capabilities, and knowledge pack management through the Model Context Protocol (MCP). This enables AI assistants like Claude Desktop, Claude Code, and Cursor to control web browsers and execute automated workflows using a standardized interface.
Sources: packages/server/README.md
Architecture
The MCP server is built as part of the @sidebutton/server package and supports multiple transport mechanisms for different AI assistant clients.
Transport Modes
| Transport | Use Case | Configuration |
|---|---|---|
| HTTP/SSE | Claude Code, Cursor | type: "sse", url: "http://localhost:9876/mcp" |
| stdio | Claude Desktop | command: "npx", args: ["sidebutton", "--stdio"] |
| WebSocket | Chrome Extension | Automatic reconnection support |
Sources: packages/sidebutton/README.md
Component Flow
graph TD
subgraph "AI Assistant"
A[Claude Desktop / Claude Code / Cursor]
end
subgraph "MCP Transport"
B[stdio / HTTP-SSE]
end
subgraph "SideButton Server"
C[MCP Handler]
D[Tool Registry]
E[Workflow Engine]
F[Browser Controller]
end
subgraph "Browser Layer"
G[Chrome Extension]
H[Real DOM Access]
end
A --> B
B --> C
C --> D
C --> E
C --> F
F <--> G
G <--> H
Tool Registry
The MCP server exposes a comprehensive set of tools organized by functionality. Each tool follows a consistent schema with annotations for the Claude Connectors Directory.
Sources: packages/server/src/mcp/tools.ts:1-50
Tool Annotations
| Annotation | Purpose | Example |
|---|---|---|
| title | Human-readable display name | "Run Workflow" |
| readOnlyHint | Indicates observation-only tools | true for snapshot |
| destructiveHint | Indicates state-mutating tools | true for run_workflow |
| openWorldHint | Indicates external world interaction | true for browser tools |
Sources: packages/server/src/mcp/tools.ts:14-23
Workflow Tools
Core Workflow Operations
| Tool | Description | Mutates State |
|---|---|---|
| run_workflow | Execute a workflow automation by ID | Yes |
| list_workflows | List all available workflows | No |
| get_workflow | Get workflow YAML definition | No |
| list_run_logs | List recent workflow executions | No |
| get_run_log | Get execution log for a specific run | No |
run_workflow Parameters
{
  workflow_id: string;   // Required: Unique identifier
  params?: {             // Optional: Key-value parameters
    [key: string]: string;
  };
}
Sources: packages/server/src/mcp/tools.ts:35-52
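For readers who want to see what a run_workflow invocation looks like on the wire, the sketch below builds a JSON-RPC 2.0 tools/call request, which is the message shape the MCP specification defines for tool invocation. The `buildRunWorkflowRequest` helper and the `ToolCallRequest` interface are hypothetical names for this example; session setup and any headers the SideButton /mcp endpoint may expect are not covered.

```typescript
// Arguments accepted by run_workflow, as listed above.
interface RunWorkflowArgs {
  workflow_id: string;
  params?: { [key: string]: string };
}

// Minimal shape of an MCP tool invocation (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: string;
  id: number;
  method: string;
  params: { name: string; arguments: RunWorkflowArgs };
}

// Build the request body for invoking run_workflow.
function buildRunWorkflowRequest(requestId: number, args: RunWorkflowArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: requestId,
    method: "tools/call",
    params: { name: "run_workflow", arguments: args },
  };
}

const runRequest = buildRunWorkflowRequest(1, {
  workflow_id: "check_ticket_status",
  params: { ticket_id: "PROJ-123" },
});
const requestBody = JSON.stringify(runRequest);
```

In practice an MCP client library builds this message for you; the sketch only makes the parameter nesting explicit.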
Browser Automation Tools
The MCP server provides direct browser control through the connected Chrome Extension.
Navigation & State
| Tool | Description |
|---|---|
| navigate | Navigate browser to a URL |
| snapshot | Get page accessibility tree (DOM structure) |
| screenshot | Capture page screenshot |
| get_browser_status | Check extension connection status |
| capture_page | Capture CSS selectors from current page |
Interaction Tools
| Tool | Description | Read-Only |
|---|---|---|
| click | Click an element by selector | No |
| type | Type text into an input element | No |
| scroll | Scroll the page | No |
| hover | Hover over an element | No |
| extract | Extract text from an element | Yes |
| extract_all | Extract text from all matching elements | Yes |
| extract_map | Extract structured data from repeated elements | Yes |
| select_option | Select a dropdown option | No |
| wait | Wait for element or condition | No |
| exists | Check if element exists | Yes |
| key | Press a keyboard key | No |
Sources: packages/server/README.md
Browser Tool Annotations
{
  name: 'snapshot',
  description: 'Get page accessibility snapshot',
  inputSchema: {
    type: 'object',
    properties: {
      // Configuration options
    }
  },
  annotations: {
    title: 'Page Snapshot',
    readOnlyHint: true,   // Observation only
    openWorldHint: true   // Interacts with browser
  }
}
Provider Integration Tools
SideButton integrates with external providers for enhanced functionality:
| Category | Tools |
|---|---|
| Git | git.listPRs, git.getPR, git.createPR, git.listIssues, git.getIssue |
| Issues | issues.search, issues.get, issues.create, issues.transition, issues.comment, issues.attach |
| Chat | chat.readChannel, chat.readThread, chat.listChannels |
| Terminal | terminal.open, terminal.run |
| LLM | llm.generate, llm.decide, llm.classify |
Git Provider Implementation
The GitHub CLI connector provides programmatic access to GitHub operations:
async createPullRequest(params: {
  repo?: string;
  title: string;
  body?: string;
  head: string;
  base?: string;
}): Promise<{ number: number; url: string }>
Sources: packages/core/src/providers/github.ts
MCP Endpoint Configuration
Server Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /mcp | SSE | Server-Sent Events for Claude Code/Cursor |
| /mcp | POST | Tool invocation requests |
| /mcp | GET | Server info and capabilities |
Sources: packages/server/src/server.ts
Client Configuration Examples
#### Claude Desktop
{
  "mcpServers": {
    "sidebutton": {
      "command": "npx",
      "args": ["sidebutton", "--stdio"]
    }
  }
}
#### Claude Code
{
  "mcpServers": {
    "sidebutton": {
      "type": "sse",
      "url": "http://localhost:9876/mcp"
    }
  }
}
#### Cursor
{
  "mcpServers": {
    "sidebutton": {
      "url": "http://localhost:9876/mcp"
    }
  }
}
Sources: packages/sidebutton/README.md
CLI Commands for MCP
The sidebutton CLI provides workflow management commands:
sidebutton list # List available workflows
sidebutton run <id> # Run a workflow by ID
sidebutton status # Check server status
Sources: packages/sidebutton/README.md
Data Models
MCP Tool Schema
export interface McpTool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  annotations?: McpToolAnnotations;
}
export interface McpToolAnnotations {
  title: string;
  readOnlyHint?: true;
  destructiveHint?: true;
  openWorldHint?: true;
}
Sources: packages/server/src/mcp/tools.ts:7-23
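One way a client could act on these annotations is to ask for confirmation before running anything that is not marked read-only. The sketch below illustrates that idea only: the `needsConfirmation` policy is hypothetical, treating unannotated tools as unsafe is an assumption, and the interfaces are repeated from the schema above (without export) so the example is self-contained.

```typescript
// Types repeated from the schema above so the sketch stands alone.
interface McpToolAnnotations {
  title: string;
  readOnlyHint?: true;
  destructiveHint?: true;
  openWorldHint?: true;
}
interface McpTool {
  name: string;
  description: string;
  inputSchema: { [key: string]: unknown };
  annotations?: McpToolAnnotations;
}

// Hypothetical client-side policy: confirm before running anything not marked read-only.
function needsConfirmation(tool: McpTool): boolean {
  const hints = tool.annotations;
  if (hints === undefined) return true; // assumption: unannotated tools are treated as unsafe
  return hints.destructiveHint === true || hints.readOnlyHint !== true;
}

const snapshotTool: McpTool = {
  name: "snapshot",
  description: "Get page accessibility snapshot",
  inputSchema: { type: "object", properties: {} },
  annotations: { title: "Page Snapshot", readOnlyHint: true, openWorldHint: true },
};
const runWorkflowTool: McpTool = {
  name: "run_workflow",
  description: "Execute a workflow automation by ID",
  inputSchema: { type: "object", properties: {} },
  annotations: { title: "Run Workflow", destructiveHint: true },
};
```

Under this policy snapshot runs without a prompt, while run_workflow asks first.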
Quick Start
- Start the server: `npx sidebutton@latest`
- Connect your AI assistant using the appropriate configuration above
- Verify connection: `sidebutton status`
- Execute a workflow: `sidebutton run <workflow-id>`
Sources: AGENTS.md
See Also
- Core Workflow Engine - Workflow execution runtime
- Chrome Extension - Browser control implementation
- Knowledge Packs - Domain-specific automation packs
Source: https://github.com/sidebutton/sidebutton / Human Manual
Chrome Extension
Related topics: MCP Server Integration, Step Types Reference
The SideButton Chrome Extension is a Manifest V3 browser extension that provides real-time browser automation capabilities through a WebSocket connection to the local MCP server. It enables AI agents and workflows to interact with web pages using real DOM access via CSS selectors.
Overview
The Chrome Extension serves as the primary interface between SideButton's workflow engine and web pages. Rather than relying on pixel coordinates or screenshots, it provides direct DOM manipulation capabilities, making browser automation precise and reliable.
Distribution: Available on the Chrome Web Store
Source location: extension/ directory in the repository
Sources: CONTRIBUTING.md
Architecture
graph TD
subgraph "Browser Context"
CE[Chrome Extension]
WS[WebSocket Connection]
end
subgraph "Local Server"
MCP[MCP Server :9876]
API[REST API]
end
subgraph "Workflow Engine"
WE[Workflow Executor]
ST[Step Types]
end
CE -->|Real DOM Access| PAGE[Web Pages]
CE -->|WebSocket| WS
WS --> MCP
MCP --> WE
WE --> ST
ST -->|browser.* steps| CE
Connection Flow
- User clicks the SideButton extension icon in Chrome
- Extension establishes WebSocket connection to http://localhost:9876
- MCP server validates the connection and exposes browser tools
- Workflows execute browser steps through the extension
- Extension interacts with web pages via Chrome DevTools Protocol
Sources: README.md
Browser Commands
The extension supports 40+ browser commands, including:
| Command | Description | Use Case |
|---|---|---|
| navigate | Navigate browser to URL | Open pages for automation |
| click | Click an element by selector | Interact with buttons, links |
| type | Type text into an element | Form input |
| scroll | Scroll the page | Load more content |
| hover | Hover over element | Trigger hover states |
| extract | Extract text from element | Read page content |
| extract_all | Extract all matching elements | Get lists of items |
| extract_map | Extract structured data from repeated elements | Scrape data tables |
| select_option | Select dropdown option | Choose from selects |
| fill | Fill input value (React-compatible) | Handle React inputs |
| press_key | Send keyboard keys | Keyboard shortcuts |
| scroll_into_view | Scroll element into viewport | Ensure element visible |
| evaluate | Execute JavaScript in browser | Custom interactions |
| exists | Check if element exists | Conditional logic |
| wait | Wait for element or delay | Synchronize with page |
| screenshot | Capture page screenshot | Visual verification |
| snapshot | Get page accessibility tree | Understand page structure |
| capture_page | Capture selectors from current page | Identify elements |
| check_writing_quality | Evaluate text quality | Content validation |
Sources: README.md
Key Features
Real DOM Access
Unlike screen-based automation tools that rely on pixel coordinates or OCR, SideButton uses real DOM access through CSS selectors. This provides:
- Precise element targeting
- Support for dynamically rendered content
- Correct handling of single-page applications (SPAs)
- Faster execution than vision-based alternatives
Sources: README.md
Recording Mode
The extension includes a recording mode that captures manual actions as reusable workflows. The flow is:
- Browse manually through the desired workflow steps
- The extension records each action with its selector
- Export the recording as a YAML workflow definition
- Replay it with the workflow engine
Embed Buttons
SideButton can inject action buttons into any web page, enabling:
- Quick access to defined actions
- On-page automation triggers
- Custom UI integration
WebSocket Connection
The extension maintains a stable WebSocket connection with automatic reconnection:
sequenceDiagram
participant EXT as Extension
participant WS as WebSocket
participant MCP as MCP Server
participant PAGE as Web Page
EXT->>WS: Connect
WS->>MCP: Establish Session
MCP-->>WS: Connected
WS-->>EXT: Ready
loop On Command
MCP->>EXT: Execute Tool
EXT->>PAGE: DOM Action
PAGE-->>EXT: Result
EXT-->>MCP: Response
end
Note over EXT,WS: Auto-reconnect on disconnect
Stable Reconnection
The WebSocket implementation handles connection drops gracefully:
- Automatic retry with exponential backoff
- Works with local server instances
- Supports remote server connections
- Maintains session state across reconnections
Sources: README.md
Installation
From Chrome Web Store
- Visit the Chrome Web Store listing
- Click "Add to Chrome"
- Grant necessary permissions
From Source (Development)
- Go to chrome://extensions/
- Enable Developer mode
- Click Load unpacked and select the extension/ folder
- Navigate to any page and click the extension icon to connect
Sources: CONTRIBUTING.md
Connection States
| State | Indicator | Meaning |
|---|---|---|
| Connected | Green dot | Extension linked to server |
| Disconnected | Red dot | No active connection |
| Reconnecting | Yellow dot | Attempting to reconnect |
Verify connection status using the MCP get_browser_status tool:
{
  "tool": "get_browser_status",
  "expected": { "connected": true }
}
Sources: packages/server/defaults/roles/qa.md
Usage in Workflows
Browser steps are defined in YAML workflows:
steps:
  - type: browser.navigate
    url: "https://github.com/owner/repo/issues"
  - type: browser.snapshot
    as: page_state
  - type: browser.click
    selector: ".btn-primary"
  - type: browser.type
    selector: "#title"
    text: "{{issue_title}}"
  - type: browser.extract
    selector: ".issue-number"
    as: new_issue_id
Variable Interpolation
Use {{variable}} syntax to reference extracted values:
steps:
  - type: browser.extract
    selector: ".username"
    as: user
  - type: shell.run
    cmd: "echo 'Hello, {{user}}!'"
Sources: README.md
Step Types Reference
Navigation Steps
| Step Type | Parameters | Description |
|---|---|---|
| browser.navigate | url | Open URL in connected browser |
Interaction Steps
| Step Type | Parameters | Description |
|---|---|---|
| browser.click | selector | Click element by CSS selector |
| browser.type | selector, text | Type text into input |
| browser.fill | selector, value | Fill input value (React-compatible) |
| browser.hover | selector | Hover over element |
| browser.select_option | selector, value | Select dropdown option |
| browser.press_key | keys | Send keyboard keys |
| browser.scroll | direction, amount | Scroll page |
| browser.scroll_into_view | selector | Scroll element into view |
Extraction Steps
| Step Type | Parameters | Description |
|---|---|---|
| browser.extract | selector, as | Extract text from single element |
| browser.extract_all | selector, as | Extract all matching elements |
| browser.extract_map | selector, mapping, as | Extract structured data |
| browser.snapshot | as | Get accessibility tree |
| browser.screenshot | as | Capture screenshot |
Verification Steps
| Step Type | Parameters | Description |
|---|---|---|
| browser.exists | selector | Check if element exists |
| browser.wait | selector or ms | Wait for element or delay |
Advanced Steps
| Step Type | Parameters | Description |
|---|---|---|
| browser.capture_page | - | Capture selectors from current page |
| browser.evaluate | script | Execute JavaScript |
Sources: packages/core/README.md
Integration with Providers
The extension works with platform-specific browser providers for deeper integration:
GitHub Browser Provider
When configured with GITHUB_BROWSER_URL, the extension can:
- Navigate to repository pages
- Read PR details via snapshot
- Review diffs by clicking "Files changed" tab
- List and filter pull requests
- Create issues through the web interface
Configuration: Set GITHUB_BROWSER_URL in Settings > Environment Variables (e.g., https://github.com)
Requirements: Must be logged into GitHub in the connected browser session
Sources: packages/server/defaults/targets/_provider-github-browser.md
Provider Preference
When multiple integration methods exist, SideButton follows this preference order:
- API Provider — Fastest and most reliable
- CLI Tool — Good for git operations, builds
- Browser Automation — Universal fallback for visual tasks
graph LR
A[Task] --> B{API Available?}
B -->|Yes| C[Use API]
B -->|No| D{CLI Available?}
D -->|Yes| E[Use CLI]
D -->|No| F[Browser Automation]
C -->|Browser needed| G[Browser via Extension]
E -->|Visual review| G
Browser tools complement CLI for visual tasks like:
- Diff viewing
- Board reviews
- UI bug identification
- Screenshot evidence
Sources: packages/server/defaults/roles/software-engineer.md
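The preference order above amounts to picking the first available method from a fixed list. The sketch below shows that selection logic only; the `IntegrationMethod` type and the `pickMethod` function are hypothetical names, and the escalation to browser tools for visual tasks shown in the diagram is not modeled.

```typescript
type IntegrationMethod = "api" | "cli" | "browser";

// Preference order from the list above: API first, then CLI, then browser automation.
const METHOD_PREFERENCE: IntegrationMethod[] = ["api", "cli", "browser"];

// Hypothetical selector: returns the most preferred method that is available.
function pickMethod(available: IntegrationMethod[]): IntegrationMethod {
  for (let i = 0; i < METHOD_PREFERENCE.length; i++) {
    const candidate = METHOD_PREFERENCE[i] as IntegrationMethod;
    if (available.indexOf(candidate) !== -1) return candidate;
  }
  throw new Error("No integration method available");
}

// With no API provider configured, the CLI wins over browser automation.
const chosenMethod = pickMethod(["browser", "cli"]);
```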
Smoke Test
Verify extension connectivity during deployment testing:
Step 1: Server Health
GET http://localhost:9876/health
Expected response:
{"status":"ok","version":"...","browser_connected":true}
If browser_connected is false, stop: the Chrome extension is not connected.
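The Step 1 check can be expressed as a small gate over the /health response body. This sketch assumes the response has exactly the three fields shown above; the `checkHealth` function, its return shape, and the placeholder version string are hypothetical, and fetching the body over HTTP is left to the caller.

```typescript
// Fields taken from the expected response shown above.
interface HealthResponse {
  status: string;
  version: string;
  browser_connected: boolean;
}

// Hypothetical smoke-test gate: parse the /health body and decide whether to continue.
function checkHealth(rawBody: string): { ok: boolean; reason: string } {
  const parsed = JSON.parse(rawBody) as Partial<HealthResponse>;
  if (parsed.status !== "ok") {
    return { ok: false, reason: "server is not healthy" };
  }
  if (parsed.browser_connected !== true) {
    return { ok: false, reason: "Chrome extension is not connected" };
  }
  return { ok: true, reason: "server and extension are ready" };
}

// The version value is a placeholder for illustration only.
const readyCheck = checkHealth('{"status":"ok","version":"0.0.0","browser_connected":true}');
const blockedCheck = checkHealth('{"status":"ok","version":"0.0.0","browser_connected":false}');
```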
Step 2: Extension Connection
Use get_browser_status tool:
Expected: { "connected": true }
If disconnected:
- Open Chrome
- Verify SideButton extension is enabled at chrome://extensions
- Refresh the page
Step 3: Snapshot Test
Navigate to any page, then use snapshot:
Verify: the result is non-empty structured YAML that contains page elements with element refs (ref=N).
Sources: packages/server/defaults/roles/qa.md
Error Handling
Common extension issues and solutions:
| Issue | Cause | Solution |
|---|---|---|
| browser_connected: false | Extension not connected | Click extension icon to connect |
| WebSocket timeout | Server not running | Start with pnpm dev:server |
| Element not found | Selector changed | Use capture_page to refresh selectors |
| React input issues | Virtual DOM | Use fill instead of type |
Security Considerations
- Browser extension requires significant permissions for DOM access
- WebSocket connection is local by default
- Remote connections should use authenticated endpoints
- Never store credentials in workflow definitions
Sources: AGENTS.md
Related Documentation
- MCP Tools Reference — Full tool documentation
- Workflow Engine — Workflow execution
- REST API — HTTP API alternative
- Knowledge Packs — Domain-specific extensions
Sources: [CONTRIBUTING.md](https://github.com/sidebutton/sidebutton/blob/main/CONTRIBUTING.md)
Knowledge Packs
Related topics: MCP Server Integration, Getting Started
Knowledge Packs (also referred to as Skill Packs in CLI commands and code) are installable domain-specific modules that teach autonomous AI agents how specific web applications work. They serve as the foundational knowledge layer powering AI code review, automated testing, and enterprise AI agent deployments.
Overview
Knowledge Packs provide structured, domain-specific intelligence to the SideButton platform. Rather than requiring AI agents to learn from scratch how to interact with each web application, Knowledge Packs pre-package essential information that enables immediate, accurate automation.
The SideButton registry currently hosts 11 domains with 28+ modules published, and maintains an open registry where anyone can build and share packs for any web application.
Sources: README.md
Pack Components
Each Knowledge Pack comprises five core module types that together provide comprehensive domain understanding:
| Component | Description | Purpose |
|---|---|---|
| Selectors | CSS selectors for UI elements | Precise DOM element targeting without pixel coordinates or screenshots |
| Data Models | Entity types, fields, relationships, valid states | Structured understanding of domain objects |
| State Machines | Valid transitions per state | Predictable, safe workflow execution |
| Role Playbooks | Role-specific procedures (QA, SE, PM, SD) | Context-aware guidance for different user roles |
| Common Tasks | Step-by-step procedures, gotchas, edge cases | Handling typical operations with best practices |
Sources: README.md
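To illustrate the State Machines component, the sketch below models "valid transitions per state" as a lookup table and checks a proposed transition against it. The states, the table layout, and the `canTransition` helper are invented for this example and are not taken from any published pack.

```typescript
// Hypothetical transition table: each state maps to the states it may move to.
type TransitionTable = Record<string, readonly string[]>;

// Made-up states for an issue-like entity, for illustration only.
const issueTransitions: TransitionTable = {
  open: ["in_progress", "closed"],
  in_progress: ["in_review", "open"],
  in_review: ["closed", "in_progress"],
  closed: ["open"],
};

// Returns true only when `from` is a known state and `to` is listed as a valid next state.
function canTransition(table: TransitionTable, from: string, to: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(table, from)) return false;
  const allowed = table[from];
  return allowed !== undefined && allowed.indexOf(to) !== -1;
}
```

An agent could consult such a table before attempting an action, refusing transitions that the pack does not list.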
Selector Modules
Selectors provide CSS-based targeting for browser automation, ensuring reliability across different browsers and viewport sizes. Unlike coordinate-based or screenshot-based approaches, CSS selectors remain stable as long as the application's DOM structure is maintained.
Role Playbooks
Role playbooks define standard operating procedures for specific personas. For example, the software-engineer role includes:
- Decision guidance for issue prioritization
- Step types for common development tasks
- Integration patterns for git, issues, chat, and terminal operations
Sources: packages/server/defaults/roles/software-engineer.md
Architecture
graph TD
A[User/Agent] -->|sidebutton install| B[CLI]
B --> C{Source Type}
C -->|Local Path| D[Local Directory]
C -->|Git URL| E[Git Repository]
C -->|Registry Name| F[SideButton Registry]
D --> G[Install Skill Pack]
E --> G
F --> H[Fetch from Registry API]
H --> G
G --> I[Parse Manifest]
I --> J[Copy to ~/.sidebutton/packs/]
J --> K[Knowledge Pack Active]
L[Workflow Engine] -->|Uses| K
M[MCP Tools] -->|Reads| K
Installation Methods
Knowledge Packs can be installed from multiple sources:
| Source Type | Command Example | Use Case |
|---|---|---|
| Local directory | sidebutton install ./my-pack | Development and testing |
| Git URL | sidebutton install https://github.com/org/skill-packs | Remote repositories |
| Registry name | sidebutton install github.com | Published registry packs |
# Install from registry
sidebutton install github.com
sidebutton install atlassian.net
# Install from local path
sidebutton install ./custom-pack
# Install from Git URL
sidebutton install https://github.com/org/skill-packs
# Force reinstall
sidebutton install github.com --force
Sources: packages/server/src/cli.ts
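The three source types could be told apart with a simple string check. The heuristic below is an assumption made for illustration; the real CLI in packages/server/src/cli.ts may classify sources differently, and `classifyInstallSource` is a hypothetical name.

```typescript
type InstallSource = "local" | "git" | "registry";

// Hypothetical heuristic, not the CLI's actual logic:
// - paths starting with ./, ../, / or ~/ are treated as local directories
// - http(s)://, ssh:// and git@ prefixes are treated as Git URLs
// - anything else is treated as a registry name such as github.com
function classifyInstallSource(source: string): InstallSource {
  if (/^(\.{1,2}\/|\/|~\/)/.test(source)) return "local";
  if (/^(https?:\/\/|ssh:\/\/|git@)/.test(source)) return "git";
  return "registry";
}
```

With this sketch, the command examples in the table above fall into the three branches shown in the architecture diagram.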
Registry Management
The registry system allows centralized distribution and discovery of Knowledge Packs.
Registry CLI Commands
| Command | Description |
|---|---|
| `sidebutton registry add <path\|url>` | Register and install all packs from a registry |
| sidebutton registry update [name] | Update installed packs from registry |
| sidebutton registry remove <name> | Uninstall packs and remove registry |
| sidebutton registry list | Show registries and pack counts |
| sidebutton search [query] | Search packs across registries |
Sources: packages/server/README.md
Registry Configuration
Registries are stored in the SideButton configuration directory (~/.sidebutton/registries.json) and contain metadata about available skill pack sources.
Publishing Knowledge Packs
Publishing Process
- Initialize a new pack using sidebutton init [domain]
- Develop the pack with manifest and modules
- Validate using sidebutton validate [path]
- Authenticate with sidebutton login
- Publish via sidebutton publish
Manifest Structure
The manifest.json defines the pack's metadata:
{
"domain": "github.com",
"title": "GitHub",
"version": "1.0.0",
"description": "GitHub integration for AI agents",
"tagline": "Streamlined GitHub workflows",
"category": "development",
"modules": ["selectors", "data-models", "state-machines"],
"roles": ["software-engineer", "qa"]
}
Sources: packages/server/src/cli.ts
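As a rough illustration of what a manifest check might look like, the sketch below types the fields from the example and reports missing ones. It is not the logic behind sidebutton validate. Treating domain and version as required is an inference from the publish snippet, which gives those two fields no fallback value; the strict MAJOR.MINOR.PATCH pattern is a simplification, and `PackManifest` and `findManifestProblems` are hypothetical names.

```typescript
// Field names mirror the example manifest.json; optionality is an assumption.
interface PackManifest {
  domain: string;
  version: string;
  title?: string;
  description?: string;
  tagline?: string;
  category?: string;
  modules?: string[];
  roles?: string[];
}

// Hypothetical check: returns a list of problems, empty when nothing is flagged.
function findManifestProblems(manifest: Partial<PackManifest>): string[] {
  const problems: string[] = [];
  if (!manifest.domain) {
    problems.push("missing domain");
  }
  if (!manifest.version) {
    problems.push("missing version");
  } else if (!/^\d+\.\d+\.\d+$/.test(manifest.version)) {
    // Simplified pattern: pre-release and build suffixes are not handled here.
    problems.push("version is not MAJOR.MINOR.PATCH");
  }
  return problems;
}
```

Running it on the example manifest above would return an empty list.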
Publishing Endpoint
const res = await fetch(`${REMOTE_BASE_URL}/api/skill-packs/publish`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${auth.token}`,
},
body: JSON.stringify({
domain: manifest.domain,
name: manifest.title || manifest.name || manifest.domain,
version: manifest.version,
description: manifest.description || '',
tagline: manifest.tagline || '',
modules: manifest.modules || [],
roles: manifest.roles || [],
category: manifest.category || '',
manifest,
files,
}),
});
Sources: packages/server/src/cli.ts
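The fallback rules in the request body are easier to see when the payload is built by a separate function. The sketch below reproduces only the defaults visible in the snippet above. The `PublishManifest` and `PublishPayload` types, the `buildPublishPayload` name, and the assumption that `files` maps paths to file contents are hypothetical.

```typescript
// Manifest fields read by the publish request; optionality follows the fallbacks in the snippet.
interface PublishManifest {
  domain: string;
  version: string;
  title?: string;
  name?: string;
  description?: string;
  tagline?: string;
  category?: string;
  modules?: string[];
  roles?: string[];
}

interface PublishPayload {
  domain: string;
  name: string;
  version: string;
  description: string;
  tagline: string;
  modules: string[];
  roles: string[];
  category: string;
  manifest: PublishManifest;
  files: Record<string, string>; // assumed shape: path -> file content
}

// Applies the same defaults as the request body: name falls back from title to name to domain,
// text fields default to "", and list fields default to [].
function buildPublishPayload(manifest: PublishManifest, files: Record<string, string>): PublishPayload {
  return {
    domain: manifest.domain,
    name: manifest.title || manifest.name || manifest.domain,
    version: manifest.version,
    description: manifest.description || "",
    tagline: manifest.tagline || "",
    modules: manifest.modules || [],
    roles: manifest.roles || [],
    category: manifest.category || "",
    manifest,
    files,
  };
}
```

The result can be passed to JSON.stringify as the request body, which keeps the defaulting logic testable without a network call.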
Integration with Workflow Engine
Knowledge Packs integrate with the core SideButton workflow engine through step types that reference pack-specific configurations:
graph LR
A[Knowledge Pack] --> B[Step Type Resolution]
B --> C[Provider Selection]
C --> D[Git Provider]
C --> E[Issues Provider]
C --> F[Chat Provider]
C --> G[Browser Provider]
Available Step Types
| Category | Steps |
|---|---|
| Browser | navigate, click, type, scroll, hover, wait, extract, extractAll, exists, key |
| Shell | shell.run, terminal.open, terminal.run |
| LLM | llm.classify, llm.generate |
| Control | control.if, control.retry, control.stop, workflow.call |
| Data | data.first |
| Git | git.listPRs, git.getPR, git.createPR, git.listIssues, git.getIssue |
| Issues | issues.search, issues.get, issues.create, issues.transition, issues.comment |
| Chat | chat.readChannel, chat.readThread, chat.listChannels |
Sources: packages/core/README.md
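The table above can be encoded as data so that a step type can be mapped back to its category. This is a minimal sketch built only from the table contents; `stepCategories` and `categoryOf` are hypothetical names and do not reflect how packages/core resolves steps.

```typescript
// Step types copied from the table above, grouped by category.
const stepCategories: ReadonlyArray<{ category: string; steps: readonly string[] }> = [
  { category: "Browser", steps: ["navigate", "click", "type", "scroll", "hover", "wait", "extract", "extractAll", "exists", "key"] },
  { category: "Shell", steps: ["shell.run", "terminal.open", "terminal.run"] },
  { category: "LLM", steps: ["llm.classify", "llm.generate"] },
  { category: "Control", steps: ["control.if", "control.retry", "control.stop", "workflow.call"] },
  { category: "Data", steps: ["data.first"] },
  { category: "Git", steps: ["git.listPRs", "git.getPR", "git.createPR", "git.listIssues", "git.getIssue"] },
  { category: "Issues", steps: ["issues.search", "issues.get", "issues.create", "issues.transition", "issues.comment"] },
  { category: "Chat", steps: ["chat.readChannel", "chat.readThread", "chat.listChannels"] },
];

// Returns the category for a known step type, or undefined for anything not in the table.
function categoryOf(stepType: string): string | undefined {
  for (const entry of stepCategories) {
    if (entry.steps.indexOf(stepType) !== -1) return entry.category;
  }
  return undefined;
}
```

A lookup like this could be used to reject unknown step types early, before a workflow is executed.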
Development Workflow
Creating a New Pack
# Initialize a new knowledge pack
sidebutton init my-app.com
# Scaffolded structure:
# my-app.com/
# ├── manifest.json
# ├── modules/
# │ ├── selectors/
# │ ├── data-models/
# │ └── state-machines/
# ├── roles/
# │ └── software-engineer.md
# └── targets/
# └── github.md
Validation
Before publishing, validate the pack structure:
sidebutton validate ./my-app.com
This command lints and checks:
- Manifest completeness
- Module structure validity
- Selector syntax correctness
- File integrity
Configuration Locations
| Path | Purpose |
|---|---|
| ~/.sidebutton/packs/ | Installed Knowledge Pack directories |
| ~/.sidebutton/registries.json | Registry configurations |
| ~/.sidebutton/config.json | Main SideButton configuration |
Best Practices
- Selector Stability: Use semantic CSS selectors that won't change with visual updates
- Versioning: Follow semantic versioning for pack updates
- Error Handling: Include edge case documentation in Common Tasks
- Role Coverage: Provide at least one role playbook for each major user persona
- State Documentation: Clearly define all valid state transitions
Available Packs
The SideButton registry includes Knowledge Packs for popular platforms:
| Domain | Category | Modules |
|---|---|---|
| github.com | Development | Selectors, Data Models, SE Role |
| atlassian.net | Development | Selectors, Data Models |
| *(other registry domains)* | Various | Various |
Sources: README.md
See Also
- Core Workflow Engine - @sidebutton/core package
- MCP Server - @sidebutton/server package with REST API
- Chrome Extension - Browser extension integration
- Full Documentation
Sources: [README.md](https://github.com/sidebutton/sidebutton/blob/main/README.md)
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
First-time setup may fail or require extra isolation and rollback planning.
The project should not be treated as fully validated until this signal is reviewed.
Users cannot judge support quality until recent activity, releases, and issue response are checked.
The project may affect permissions, credentials, data exposure, or host boundaries.
Doramagic extracted 10 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: Add control.foreach step type for iterating over lists
- Severity: medium
- Finding: Installation risk is backed by a source signal: Add control.foreach step type for iterating over lists. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/sidebutton/sidebutton/issues/1
2. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | README/documentation is current enough for a first validation pass.
3. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | last_activity_observed missing
4. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | no_demo; severity=medium
5. Security or permission risk: No sandbox install has been executed yet; downstream must verify before user use.
- Severity: medium
- Finding: No sandbox install has been executed yet; downstream must verify before user use.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.safety_notes | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | No sandbox install has been executed yet; downstream must verify before user use.
6. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | no_demo; severity=medium
7. Security or permission risk: Native <select> elements cannot be programmatically selected via click/type tools
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: Native <select> elements cannot be programmatically selected via click/type tools. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/sidebutton/sidebutton/issues/12
8. Security or permission risk: v1.1.0
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: v1.1.0. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/sidebutton/sidebutton/releases/tag/v1.1.0
9. Maintenance risk: issue_or_pr_quality=unknown
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | issue_or_pr_quality=unknown
10. Maintenance risk: release_recency=unknown
- Severity: low
- Finding: release_recency=unknown.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:1124378210 | https://github.com/sidebutton/sidebutton | release_recency=unknown
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using sidebutton with real data or production workflows.
- Add control.foreach step type for iterating over lists - github / github_issue
- Native <select> elements cannot be programmatically selected via click/t - github / github_issue
- v1.1.0 - github / github_release
- README/documentation is current enough for a first validation pass. - GitHub / issue
Source: Project Pack community evidence and pitfall evidence