skyvern Manual Preview - Doramagic.ai

Doramagic Project Pack · Human Manual

skyvern

Introduction to Skyvern

Related topics: System Architecture, Browser Automation Engine

Overview

Skyvern is an open-source browser automation platform that enables AI agents to interact with websites by understanding natural language instructions. The platform combines large language model (LLM) powered reasoning with browser automation capabilities, allowing developers to create workflows that can navigate websites, fill out forms, extract data, download files, and perform complex multi-step web tasks autonomously.

Skyvern interprets user prompts and executes browser actions over a CDP (Chrome DevTools Protocol) connection, so AI applications can interact with the web as a human user would. Sources: README.md:1-50

Key Features

Multi-LLM Support

Skyvern supports integration with multiple LLM providers, enabling flexible deployment options:

Provider	Supported Models
OpenAI	GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini
Anthropic	Claude 4.7 Opus, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Haiku, Sonnet, Opus)
Azure OpenAI	Any GPT models deployed to Azure subscription
AWS Bedrock	Claude 4.7, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Sonnet, Opus)
Gemini	Gemini 3.1 Pro, Gemini 3 Flash

Sources: README.md:65-72

Workflow Automation

Skyvern enables the creation of automated workflows that can:

Navigate to websites and interact with web elements
Fill out forms and submit data
Extract structured information from web pages
Handle authentication and credential management
Download files and manage browser sessions
Handle multi-factor authentication (2FA/TOTP)
Schedule and execute tasks on a recurring basis

Sources: skyvern-frontend/src/routes/tasks/create/CreateNewTaskForm.tsx:1-30

Model Context Protocol (MCP) Integration

Skyvern provides an MCP server implementation so that AI applications can connect to Skyvern and use its browser automation capabilities through a standardized protocol. Sources: integrations/mcp/README.md:1-25

Architecture Overview

System Components

graph TD
    A[AI Application] -->|MCP Protocol| B[Skyvern MCP Server]
    B --> C[Skyvern API]
    C --> D[Task Executor]
    D --> E[Browser Automation Engine]
    E --> F[CDP Browser Instance]
    
    G[LLM Provider] -->|Reasoning| D
    H[Credential Vault] -->|Auth| D
    I[Schedule Manager] -->|Trigger| C

Browser Connection Options

Skyvern supports multiple browser connection modes:

Local CDP Browser - Connect to a locally running Chrome instance
Skyvern Cloud Browser - Use managed browser infrastructure
Browser Tunneling - Expose local browser to Skyvern Cloud via tunnel

Sources: README.md:85-120

Getting Started

Installation and Setup

Requirements: a Python 3.11+ environment. Sources: integrations/mcp/README.md:15

# Install Skyvern
pip install skyvern

# Initialize configuration
skyvern init

# Run the server (local mode only)
skyvern run server

Quickstart for Contributors

# Install dependencies using uv
uv sync --group dev

# Run setup wizard
uv run skyvern quickstart

# Access UI at http://localhost:8080

Sources: README.md:45-60

SDK Usage

Python SDK

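import asyncio

from skyvern import Skyvern


async def main() -> None:
    skyvern = Skyvern(api_key="your-api-key")
    # Attach to a Chrome instance that is already listening for CDP connections.
    skyvern.set_browser_context(
        browser_type="cdp-connect",
        remote_debugging_url="http://127.0.0.1:9222",
    )
    task = await skyvern.run_task(
        prompt="Find the top post on hackernews today"
    )
    print(task)


asyncio.run(main())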

MCP Tools

Skyvern provides comprehensive MCP tools for browser automation:

Category	Tools
Navigation	`skyvern_navigate`, `skyvern_click`, `skyvern_select_option`, `skyvern_press_key`, `skyvern_drag`
Data Extraction	`skyvern_extract`, `skyvern_screenshot`, `skyvern_find`, `skyvern_validate`, `skyvern_get_html`
Authentication	`skyvern_login`, `skyvern_credential_list`, `skyvern_credential_get`
Tabs & Frames	`skyvern_tab_new`, `skyvern_tab_list`, `skyvern_tab_switch`, `skyvern_frame_list`
Network	`skyvern_console_messages`, `skyvern_network_requests`, `skyvern_network_route`

Sources: skyvern/cli/mcp_tools/README.md:1-50

Workflows

Skyvern supports workflow-based automation where complex tasks can be defined as a series of steps with conditional logic, evaluations, and human interaction checkpoints.

graph LR
    A[Start] --> B[Block 1: Action]
    B --> C[Block 2: Condition]
    C -->|True| D[Block 3: Evaluation]
    C -->|False| E[Block 4: Fallback]
    D --> F[Human Interaction]
    F --> G[Continue to Next]
    E --> G

Workflow Block Types

Block Type	Purpose
Action	Execute browser actions (click, type, navigate)
Condition	Branch logic based on page state
Evaluation	Run JavaScript to validate or extract data
Human Interaction	Pause workflow for manual input

Sources: skyvern-frontend/src/routes/workflows/workflowRun/WorkflowRunTimelineBlockItem.tsx:1-60
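
The branching shown in the diagram can be sketched as a small block-graph walker. This is a minimal illustration, not Skyvern's implementation: `Block`, `run_workflow`, and the block labels are hypothetical names introduced here, and each block is reduced to a plain Python callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Block:
    """Hypothetical stand-in for a workflow block (not Skyvern's model)."""
    label: str
    run: Callable[[dict], None]                         # side effect on shared state
    condition: Optional[Callable[[dict], bool]] = None  # set only on condition blocks
    next_true: Optional[str] = None                     # default successor
    next_false: Optional[str] = None                    # successor when the condition fails


def run_workflow(blocks: Dict[str, Block], start: str, state: dict) -> List[str]:
    """Walk the block graph from `start` and return labels in execution order."""
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None:
        block = blocks[current]
        block.run(state)
        visited.append(block.label)
        if block.condition is not None and not block.condition(state):
            current = block.next_false
        else:
            current = block.next_true
    return visited


def noop(state: dict) -> None:
    """Placeholder for blocks whose work is irrelevant to the sketch."""


blocks = {
    "action": Block("action", run=lambda s: s.update(logged_in=True), next_true="condition"),
    "condition": Block(
        "condition",
        run=noop,
        condition=lambda s: s["logged_in"],
        next_true="evaluation",
        next_false="fallback",
    ),
    "evaluation": Block("evaluation", run=noop, next_true="human"),
    "fallback": Block("fallback", run=noop),
    "human": Block("human", run=noop),
}
order = run_workflow(blocks, "action", {})
```

Keeping the successor labels on the block itself means the walker needs no knowledge of block types; only condition blocks carry a predicate.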

Authentication and Credentials

Credential Services

Skyvern supports multiple credential backends:

Skyvern Vault (built-in)
Bitwarden
1Password
Azure Key Vault
Custom credential services via API configuration

Sources: skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:1-40

2FA/TOTP Handling

Skyvern provides automated TOTP code extraction and attachment to runs:

<PushTotpCodeForm
  showAdvancedFields
  onSuccess={handleFormSuccess}
/>

The system extracts verification codes from push notifications and attaches them to relevant workflow runs automatically.

Sources: skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:1-30

Task Creation

Tasks are defined using natural language prompts that describe what Skyvern should do:

prompt="Find the top post on hackernews today"

Advanced Settings

Parameter	Description
Navigation Payload	JSON parameters for routes/states
Proxy Location	Route through geographic proxies
Browser Session ID	Use persistent browser sessions
Browser Address	CDP server address

Sources: skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:1-50

Scheduling

Tasks and workflows can be scheduled using cron expressions with timezone support:

schedule = await skyvern.create_schedule(
    workflow_id="workflow_xxx",
    cron_expression="0 9 * * *",  # Daily at 9 AM
    timezone="America/New_York"
)

Sources: skyvern-frontend/src/routes/workflows/editor/panels/schedulePanel/CreateScheduleDialog.tsx:1-60
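
To make the timezone semantics concrete, the sketch below computes when a daily expression such as `0 9 * * *` next fires in a given timezone. It is an illustration under stated assumptions: `next_daily_run` is a hypothetical helper, it handles only the `M H * * *` form, and it says nothing about how Skyvern's scheduler evaluates cron expressions internally.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def next_daily_run(cron_expression: str, tz: tzinfo, now: datetime) -> datetime:
    """Return the next firing time of a daily 'M H * * *' expression in `tz`.

    `now` must be timezone-aware. Only a fixed minute and hour with
    wildcard day fields are supported in this sketch.
    """
    minute, hour, day_of_month, month, day_of_week = cron_expression.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("sketch supports only daily 'M H * * *' expressions")
    local_now = now.astimezone(tz)
    candidate = local_now.replace(
        hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    # If today's slot has already passed, move to the same wall-clock time tomorrow.
    if candidate <= local_now:
        candidate += timedelta(days=1)
    return candidate


# Fixed UTC-5 offset used for illustration only.
eastern_standard = timezone(timedelta(hours=-5))
first = next_daily_run(
    "0 9 * * *", eastern_standard, datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
)
```

Passing `zoneinfo.ZoneInfo("America/New_York")` instead of a fixed offset keeps the run at 9 AM local time across daylight-saving changes, which is the reason to store a named timezone alongside the expression.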

Cloud Integration

Browser Tunneling

Connect Skyvern Cloud to your local browser with existing cookies and extensions:

# Start Chrome with tunnel to Skyvern Cloud
skyvern browser serve --tunnel

This command creates a tunnel URL that can be used to run tasks with your local browser state. Sources: README.md:115-135

Claude Desktop Integration

Skyvern provides downloadable .mcpb bundles for quick Claude Desktop setup:

./scripts/package-mcpb.sh 1.0.23

Sources: skyvern/cli/mcpb/claude_desktop/README.md:1-25

Telemetry

By default, Skyvern collects basic usage statistics to understand how the platform is being used. To opt out:

export SKYVERN_TELEMETRY=false

Sources: README.md:35-38

License

Skyvern's open-source repository is licensed under AGPL-3.0. The core automation logic is available in this repository, while anti-bot measures are available in the managed cloud offering. Sources: README.md:40-43

Documentation and Support

Documentation: https://www.skyvern.com/docs
Discord Community: https://discord.gg/fG2XXEuQX3
Email Support: [email protected]
GitHub Issues: issues labeled Help Wanted

Sources: README.md:65-72

System Architecture

Related topics: Introduction to Skyvern, Browser Automation Engine, Workflow System

Overview

Skyvern is an AI-powered web automation framework that enables programmatic browser control through natural language instructions. The system architecture consists of three primary layers: a React-based frontend interface, a Python backend API (Forge), and a browser automation engine. This document provides a comprehensive technical overview of the system's components, data flows, and integration patterns.

High-Level Architecture

graph TD
    subgraph Frontend["Frontend Layer (React/TypeScript)"]
        UI[User Interface]
        Forms[Task & Workflow Forms]
        Stream[Browser Stream Viewer]
    end
    
    subgraph Backend["Backend Layer (Python/Forge)"]
        API[Forge API]
        Agent[AI Agent Engine]
        Workflow[Workflow Engine]
        Scheduler[Scheduler Service]
    end
    
    subgraph Browser["Browser Automation Layer"]
        BrowserMgr[Browser Manager]
        CDP[Chrome DevTools Protocol]
        BrowserInst[Browser Instances]
    end
    
    subgraph Storage["Storage & External Services"]
        S3[S3 Storage]
        DB[(Database)]
        LLM[LLM Providers]
    end
    
    UI --> Forms
    Forms --> API
    UI --> Stream
    Stream --> BrowserMgr
    API --> Agent
    API --> Workflow
    API --> Scheduler
    Agent --> BrowserMgr
    Agent --> LLM
    Workflow --> S3
    Scheduler --> DB
    BrowserMgr --> CDP
    CDP --> BrowserInst

Frontend Architecture

The frontend is a React-based Single Page Application (SPA) located in the skyvern-frontend/ directory. It provides user interfaces for task creation, workflow management, credentials handling, and real-time browser streaming.

Component Structure

Component Category	Location	Purpose
Task Forms	`src/routes/tasks/create/`	Task creation and management forms
Workflow Editor	`src/routes/workflows/editor/`	Visual workflow building interface
Credentials	`src/routes/credentials/`	Credential and TOTP management
Schedules	`src/routes/schedules/`	Schedule viewing and configuration
Shared Components	`src/components/`	Reusable UI components

Key Frontend Components

#### BrowserStream Component

The BrowserStream component handles real-time browser visualization. It displays animated loading states while establishing connections and renders rotating messages to indicate progress.

// skyvern-frontend/src/components/BrowserStream.tsx
<RotateThrough interval={7 * 1000}>
  <span>Hm, working on the connection...</span>
  <span>Hang tight, we're almost there...</span>
  <span>Just a moment...</span>
  <span>Backpropagating...</span>
  <span>Attention is all I need...</span>
  <span>Consulting the manual...</span>
</RotateThrough>

Sources: skyvern-frontend/src/components/BrowserStream.tsx

#### Task Forms

Task creation is handled through two primary form components:

CreateNewTaskForm: Used for creating new tasks with navigation goals
SavedTaskForm: Used for creating tasks from saved templates

Both forms support advanced settings including navigation payloads for specifying parameters, routes, or states:

// Navigation Payload field in SavedTaskForm
<FormField
  control={form.control}
  name="navigationPayload"
  render={({ field }) => (
    <FormItem>
      <FormLabel>
        <h1 className="text-lg">Navigation Payload</h1>
        <h2 className="text-base text-slate-400">
          Specify important parameters, routes, or states
        </h2>
      </FormLabel>
      <CodeEditor {...field} language="json" />
    </FormItem>
  )}
/>

Sources: skyvern-frontend/src/routes/tasks/create/SavedTaskForm.tsx

#### Workflow Editor Workspace

The workflow editor workspace provides local execution capabilities with a dialog-based interface for running code locally:

// skyvern-frontend/src/routes/workflows/editor/Workspace.tsx
function bash(command: string, code?: string) {
  return <code className="rounded bg-slate-800 px-1.5 py-0.5">{command}</code>;
}

// Installation and setup instructions
// 1. Install skyvern: pip install skyvern
// 2. Set up skyvern: skyvern quickstart
// 3. Run the code: skyvern run code --params '{...}' main.py

Sources: skyvern-frontend/src/routes/workflows/editor/Workspace.tsx

Backend Architecture (Forge)

The Forge backend is the core Python application that handles task execution, workflow orchestration, and browser automation. Key modules include:

Forge Application

The main application entry point in skyvern/forge/forge_app.py initializes the FastAPI application, configures middleware, and registers routes.

AI Agent Engine

The agent system in skyvern/forge/agent.py processes natural language instructions and generates executable browser actions. The agent:

Receives task definitions and navigation goals
Interacts with LLM providers for decision-making
Generates action sequences for browser automation
Handles error recovery and retry logic

Workflow Service

Workflow definitions are managed through the SDK service in skyvern/forge/sdk/workflow/service.py. This module provides:

Workflow creation and versioning
Script management with cache keys
Execution history tracking

Browser Automation Layer

Browser Manager

The browser manager (skyvern/webeye/browser_manager.py) orchestrates browser instances using Chrome DevTools Protocol (CDP). It provides:

Browser pool management
Session persistence
Screenshot and recording capabilities
Multi-tab support

Browser Configuration Options

The frontend exposes several browser configuration parameters:

Parameter	Type	Purpose
`proxyLocation`	string	Proxy server routing
`browserSessionId`	string	Persistent session identifier (format: `pbs_xxx`)
`cdpAddress`	string	Remote CDP endpoint (e.g., `http://127.0.0.1:9222`)

Sources: skyvern-frontend/src/routes/tasks/create/PromptBox.tsx

Data Storage and External Services

AWS Integration

Skyvern uses AWS services for storage and cloud operations. The S3Uri class provides URI parsing for S3 resources:

# skyvern/forge/sdk/api/aws.py
from urllib.parse import urlparse


class S3Uri:
    """Parse and manipulate S3 URIs."""
    
    def __init__(self, uri: str) -> None:
        self._parsed = urlparse(uri, allow_fragments=False)
    
    @property
    def bucket(self) -> str:
        return self._parsed.netloc
    
    @property
    def key(self) -> str:
        if self._parsed.query:
            return self._parsed.path.lstrip("/") + "?" + self._parsed.query
        return self._parsed.path.lstrip("/")

Sources: skyvern/forge/sdk/api/aws.py
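
The two details in that class, `allow_fragments=False` and re-appending the query string, matter because S3 object keys may legally contain `#` and `?`. The standalone function below reproduces the same parsing steps so the effect is easy to try; `split_s3_uri` is a hypothetical helper written for this illustration, not part of Skyvern.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key), keeping '#' and '?' in the key."""
    # allow_fragments=False stops urlparse from cutting the key at '#'.
    parsed = urlparse(uri, allow_fragments=False)
    key = parsed.path.lstrip("/")
    # urlparse still splits at '?', so the query part is glued back onto the key.
    if parsed.query:
        key += "?" + parsed.query
    return parsed.netloc, key


plain = split_s3_uri("s3://my-bucket/reports/2024/run.json")
with_hash = split_s3_uri("s3://my-bucket/file#1.txt")
with_query = split_s3_uri("s3://my-bucket/a?b=c")
```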

Workflow Scripts Storage

Scripts are stored with metadata including cache keys and revision counts:

Field	Description
`Cache Key Value`	Unique identifier for the script
`Total Revisions`	Number of versions
`Runs`	Execution count
`Last Updated`	Most recent modification timestamp

Sources: skyvern-frontend/src/routes/workflows/WorkflowScriptsPage.tsx

Task Execution Model

Task Creation Flow

sequenceDiagram
    participant User
    participant Frontend
    participant Forge API
    participant Agent
    participant Browser
    
    User->>Frontend: Enter navigation goal
    User->>Frontend: Configure advanced settings
    User->>Frontend: Submit task
    Frontend->>Forge API: POST /v1/tasks
    Forge API->>Agent: Create task instance
    Agent->>Browser: Initialize browser session
    Browser-->>Agent: Session established
    Agent-->>Forge API: Task created
    Forge API-->>Frontend: Task response
    Frontend-->>User: Display task status

Task Parameters

Parameter	Description
`Navigation Goal`	Primary instruction for the agent
`Navigation Payload`	Additional parameters, routes, states
`Proxy Location`	Optional proxy routing
`Browser Session ID`	Persistent session reference

Workflow System Architecture

Workflow Components

Component	Purpose
Workflow Scripts	Cached code blocks with versioning
Schedules	Cron-based execution triggers
Workflow Runs	Individual execution instances
Workflow History	Version tracking and modification history

Schedule Configuration

Schedules support timezone-aware cron expressions:

// Schedule display components
<div className="space-y-2">
  <span className="text-sm text-slate-400">Timezone</span>
  <span className="text-sm text-slate-50">{schedule.timezone}</span>
</div>
<div className="space-y-2">
  <span className="text-sm text-slate-400">Cron</span>
  <code className="font-mono text-xs">{schedule.cron_expression}</code>
</div>

Sources: skyvern-frontend/src/routes/schedules/ScheduleDetailPage.tsx

Script Versioning

Each workflow script maintains a revision history:

// Revision count calculation
{versions?.versions
  ? versions.versions.filter(
      (v) => v.version < (activeVersion ?? 0),
    ).length
  : 0}
<span className="text-sm font-normal">prior</span>

Sources: skyvern-frontend/src/routes/workflows/WorkflowScriptDetailPage.tsx
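
The same calculation, written outside JSX, is a count of versions numbered below the active one. The Python sketch below mirrors the expression for readers who do not work in the frontend; `count_prior_versions` is a hypothetical name and versions are assumed to be plain integers.

```python
from typing import Optional, Sequence


def count_prior_versions(
    versions: Optional[Sequence[int]], active_version: Optional[int]
) -> int:
    """Count versions strictly older than the active one.

    A missing version list yields 0, and a missing active version is
    treated as 0, matching the `?? 0` fallback in the frontend expression.
    """
    if not versions:
        return 0
    active = active_version if active_version is not None else 0
    return sum(1 for version in versions if version < active)


prior = count_prior_versions([1, 2, 3, 4], active_version=3)
```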

Credentials and Authentication

TOTP/2FA Management

Skyvern supports 2FA code management for authenticated workflows:

Component	Description
`PushTotpCodeForm`	Form for submitting verification codes
Identifier Filter	Filter by email or phone
OTP Type Filter	Filter by type (TOTP/Magic Link)

Sources: skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx

LLM Provider Integration

Skyvern supports multiple LLM providers through a unified interface:

Provider	Supported Models
OpenAI	GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini
Anthropic	Claude 4.7 Opus, Claude 4.6, Claude 4.5
Azure OpenAI	Any deployed GPT models
AWS Bedrock	Claude 4.7, Claude 4.6, Claude 4.5
Gemini	Gemini 3.1 Pro, Gemini 3 Flash

Sources: README.md

Development and Deployment

Local Development Setup

# 1. Create virtual environment
uv sync --group dev

# 2. Initialize configuration
uv run skyvern quickstart

# 3. Access UI
# Navigate to http://localhost:8080

Sources: README.md

Running Workflows Locally

The workspace editor provides local execution capabilities:

# 1. Install skyvern
pip install skyvern

# 2. Set up skyvern
skyvern quickstart

# 3. Run workflow code
skyvern run code --params '{"param1": "val1"}' main.py

System Data Flow

graph LR
    subgraph Input["User Input"]
        Prompt[Natural Language Prompt]
        Payload[Navigation Payload]
        Config[Configuration]
    end
    
    subgraph Processing["Forge Processing"]
        Parse[Parse & Validate]
        Agent[Agent Reasoning]
        Plan[Action Planning]
    end
    
    subgraph Execution["Browser Execution"]
        Navigate[Navigate]
        Interact[Interact]
        Extract[Extract Data]
    end
    
    subgraph Output["Results"]
        Screenshots[Screenshots]
        Data[Extracted Data]
        Logs[Execution Logs]
    end
    
    Input --> Parse
    Parse --> Agent
    Agent --> Plan
    Plan --> Execution
    Execution --> Output
    
    style Input fill:#e1f5fe
    style Processing fill:#fff3e0
    style Execution fill:#e8f5e9
    style Output fill:#f3e5f5

Summary

The Skyvern system architecture follows a modular design with clear separation of concerns:

Frontend Layer: React SPA providing task creation, workflow editing, and real-time visualization
Backend Layer: Python FastAPI application handling agent orchestration, workflow management, and scheduling
Browser Layer: Chrome DevTools Protocol-based automation engine for web interaction
Storage Layer: S3 for large objects, database for structured data, and LLM providers for reasoning

The system supports multiple LLM providers, enables persistent browser sessions, and provides comprehensive workflow versioning and scheduling capabilities.

Sources: skyvern-frontend/src/components/BrowserStream.tsx

Browser Automation Engine

Related topics: Introduction to Skyvern, AI-Powered Commands

Overview

The Browser Automation Engine is the core component of Skyvern that enables AI agents to interact with websites through browser control. Instead of relying on fragile XPath-based selectors that break with website layout changes, Skyvern leverages Vision LLMs combined with Playwright and Chrome DevTools Protocol (CDP) to visually understand and interact with web pages.

The engine provides a unified interface for:

Launching and managing browser sessions
Navigating to URLs with configurable behavior
Executing actions (click, type, scroll, hover, etc.)
Capturing screenshots for LLM analysis
Extracting structured data from web pages
Handling multi-step workflows across websites

Sources: README.md:60-80

Architecture

System Components

graph TD
    A[Agent / Task Request] --> B[Browser Manager]
    B --> C[Real Browser Manager]
    C --> D[Playwright Browser]
    C --> E[CDP Connection]
    D --> F[Browser State]
    F --> G[Screenshot Capture]
    F --> H[DOM Extraction]
    E --> I[DevTools Protocol]
    G --> J[Vision LLM Analysis]
    J --> K[Action Handler]
    K --> C

Module Structure

Module	Purpose
`webeye/__init__.py`	Public API exports and core abstractions
`browser_manager.py`	Abstract browser manager interface
`real_browser_manager.py`	Concrete Playwright-based implementation
`browser_state.py`	Page state representation and snapshot
`actions/handler.py`	Action execution and coordination
`cdp_connection.py`	Chrome DevTools Protocol communication

Sources: skyvern/forge/sdk/routes/agent_protocol.py:30-50

Browser Session Management

Session Lifecycle

stateDiagram-v2
    [*] --> Created: browser_session_id
    Created --> Launching: launch()
    Launching --> Ready: browser ready
    Ready --> Navigating: navigate(url)
    Navigating --> Ready: page loaded
    Ready --> Executing: perform_action()
    Executing --> Ready: action complete
    Ready --> Closed: close()
    Closed --> [*]
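
The lifecycle above can be expressed as a transition table, which makes illegal calls (for example, navigating before the browser is ready) fail early. This is a sketch only: `SessionState`, `TRANSITIONS`, and `advance` are hypothetical names, and the event strings are taken from the diagram labels rather than from Skyvern's code.

```python
from enum import Enum


class SessionState(Enum):
    CREATED = "created"
    LAUNCHING = "launching"
    READY = "ready"
    NAVIGATING = "navigating"
    EXECUTING = "executing"
    CLOSED = "closed"


# (current state, event) -> next state, one entry per arrow in the diagram.
TRANSITIONS = {
    (SessionState.CREATED, "launch"): SessionState.LAUNCHING,
    (SessionState.LAUNCHING, "browser_ready"): SessionState.READY,
    (SessionState.READY, "navigate"): SessionState.NAVIGATING,
    (SessionState.NAVIGATING, "page_loaded"): SessionState.READY,
    (SessionState.READY, "perform_action"): SessionState.EXECUTING,
    (SessionState.EXECUTING, "action_complete"): SessionState.READY,
    (SessionState.READY, "close"): SessionState.CLOSED,
}


def advance(state: SessionState, event: str) -> SessionState:
    """Return the state reached by `event`, or raise if the move is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.name}") from None


state = SessionState.CREATED
for event in ("launch", "browser_ready", "navigate", "page_loaded", "close"):
    state = advance(state, event)
```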

Persistent Browser Sessions

Skyvern supports persistent browser sessions that maintain cookies, local storage, and login states across task executions:

# ID of an existing persistent browser session (format: pbs_xxx)
browser_session_id = "pbs_xxxxxxxxxxxx"

# Reuse session for subsequent tasks
task = await skyvern.run_task(
    prompt="Download invoice from my account",
    browser_session_id=browser_session_id,
)

Sources: skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:40-60

Session Configuration Parameters

Parameter	Type	Description
`browser_session_id`	string	ID of a persistent browser session
`cdp_address`	string	Browser DevTools address (e.g., `http://127.0.0.1:9222`)
`proxy_location`	string	Geographic proxy for requests
`extra_http_headers`	dict	Custom HTTP headers for requests
`totp_identifier`	string	2FA identifier for authenticated flows

Sources: skyvern/forge/sdk/routes/agent_protocol.py:40-55
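
One way to assemble these parameters before submitting a run is sketched below. The field names come from the table above and the `pbs_` prefix check follows the session ID format noted in this manual; `build_session_options` itself is a hypothetical helper, and any further validation the API performs is not modelled here.

```python
from typing import Any, Dict, Optional


def build_session_options(
    browser_session_id: Optional[str] = None,
    cdp_address: Optional[str] = None,
    proxy_location: Optional[str] = None,
    extra_http_headers: Optional[Dict[str, str]] = None,
    totp_identifier: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect the session parameters that were actually set."""
    if browser_session_id is not None and not browser_session_id.startswith("pbs_"):
        raise ValueError("browser_session_id is expected to look like 'pbs_xxx'")
    options = {
        "browser_session_id": browser_session_id,
        "cdp_address": cdp_address,
        "proxy_location": proxy_location,
        "extra_http_headers": extra_http_headers,
        "totp_identifier": totp_identifier,
    }
    # Drop unset values so the request carries only explicit choices.
    return {name: value for name, value in options.items() if value is not None}


options = build_session_options(
    browser_session_id="pbs_123",
    extra_http_headers={"X-Test": "1"},
)
```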

Chrome DevTools Protocol Integration

CDP Connection

The CDP connection module provides low-level access to Chrome's debugging interface:

# CDP connection configuration
cdp_address = "http://127.0.0.1:9222"

Skyvern can connect to:

Local Chrome - Chrome with remote debugging enabled
Existing Browser - Your Chrome with cookies and extensions
Cloud Browser - Skyvern-hosted browser via tunnel

Sources: skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:65-80

Remote Debugging Setup

# Step 1: Open Chrome with remote debugging
chrome --remote-debugging-port=9222

# Or use Skyvern's CLI helper
skyvern init browser

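With remote debugging enabled, Chrome serves its DevTools HTTP endpoint at http://127.0.0.1:9222. CDP clients read the WebSocket debugger URL from that endpoint (for example, from /json/version) and send CDP commands over the WebSocket connection.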

Sources: README.md:45-65
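
Chrome's remote debugging port serves a small HTTP API, and its `/json/version` route returns a JSON document whose `webSocketDebuggerUrl` field is the address CDP clients attach to. The sketch below shows that lookup with the standard library only; the helper names are hypothetical, and `discover_websocket_url` needs a running Chrome started with `--remote-debugging-port`.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def version_endpoint(cdp_address: str) -> str:
    """Build the /json/version URL for a DevTools HTTP address."""
    return urljoin(cdp_address.rstrip("/") + "/", "json/version")


def websocket_url_from(payload: str) -> str:
    """Pull the browser-level WebSocket URL out of a /json/version response body."""
    return json.loads(payload)["webSocketDebuggerUrl"]


def discover_websocket_url(cdp_address: str, timeout: float = 5.0) -> str:
    """Ask a running Chrome for its WebSocket debugger URL."""
    with urlopen(version_endpoint(cdp_address), timeout=timeout) as response:
        return websocket_url_from(response.read().decode("utf-8"))


endpoint = version_endpoint("http://127.0.0.1:9222")
```

Separating URL construction and response parsing from the network call keeps the two pure steps easy to check without a browser running.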

Browser State Representation

State Components

graph LR
    A[Browser State] --> B[Current URL]
    A --> C[Screenshot]
    A --> D[DOM Tree]
    A --> E[Cookies]
    A --> F[Local Storage]
    A --> G[Viewport Info]

Browser State Object

Property	Description
`url`	Current page URL
`title`	Page title
`screenshot`	Base64-encoded screenshot
`dom_tree`	Parsed DOM structure
`viewport`	Viewport dimensions
`elements`	Interactive element mapping

Sources: skyvern/webeye/browser_state.py

Action Handler

Supported Actions

The action handler executes LLM-decided actions on the browser:

Action	Parameters	Description
`click`	element_selector	Click on specified element
`type`	text, element_selector	Enter text into input field
`hover`	element_selector	Mouse hover over element
`scroll`	direction, amount	Scroll page view
`select`	value, element_selector	Select dropdown option
`press_key`	key	Press keyboard key
`wait`	duration	Wait for page to settle
`navigate`	url	Go to URL
`screenshot`	-	Capture current view
`extract`	schema	Extract data per schema

Sources: skyvern/webeye/actions/handler.py

Action Execution Flow

sequenceDiagram
    participant LLM as Vision LLM
    participant AH as Action Handler
    participant BM as Browser Manager
    participant Browser as Playwright/CDP
    
    LLM->>AH: Decide action from screenshot
    AH->>BM: Execute action request
    BM->>Browser: CDP/Playwright command
    Browser-->>BM: Action result
    BM-->>AH: Updated browser state
    AH-->>LLM: State + screenshot for next decision
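
The sequence above is an observe, decide, act loop. A minimal sketch of that loop follows, with an in-memory browser and a scripted decision function standing in for Playwright and the vision LLM. `FakeBrowser`, `HANDLERS`, and `run_loop` are hypothetical names, and only two of the actions from the table are implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FakeBrowser:
    """In-memory stand-in for a Playwright/CDP browser (hypothetical)."""
    url: str = "about:blank"
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real engine would return image bytes; a description is enough here.
        return f"screenshot of {self.url}"


def do_navigate(browser: FakeBrowser, action: dict) -> None:
    browser.url = action["url"]
    browser.log.append(f"navigate {action['url']}")


def do_click(browser: FakeBrowser, action: dict) -> None:
    browser.log.append(f"click {action['element_selector']}")


# Dispatch table: action type -> handler.
HANDLERS: Dict[str, Callable[[FakeBrowser, dict], None]] = {
    "navigate": do_navigate,
    "click": do_click,
}


def run_loop(
    browser: FakeBrowser,
    decide: Callable[[str], Optional[dict]],
    max_steps: int = 10,
) -> int:
    """Repeat screenshot -> decide -> act until `decide` returns None.

    Returns the number of actions executed.
    """
    for step in range(max_steps):
        action = decide(browser.screenshot())
        if action is None:
            return step
        HANDLERS[action["type"]](browser, action)
    return max_steps


script = iter([
    {"type": "navigate", "url": "https://example.com"},
    {"type": "click", "element_selector": "#submit"},
    None,
])
browser = FakeBrowser()
steps = run_loop(browser, lambda screenshot: next(script))
```

The `max_steps` bound plays the same role as the `max_steps` setting mentioned elsewhere in this manual: it stops a run whose decision function never signals completion.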

Browser Configuration Options

Launch Configuration

Option	Default	Description
`headless`	true	Run browser without visible window
`viewport_width`	1280	Browser viewport width
`viewport_height`	720	Browser viewport height
`user_agent`	auto	User agent string
`ignore_https_errors`	false	Allow invalid certs

Navigation Configuration

Option	Type	Description
`url`	string	Target URL
`navigation_payload`	object	Parameters, routes, or initial states
`follow_redirects`	boolean	Auto-follow HTTP redirects
`timeout`	int	Navigation timeout in ms

Sources: skyvern-frontend/src/routes/tasks/detail/TaskParameters.tsx:20-40

Integration with Agent System

Agent Protocol Integration

The browser automation engine integrates with Skyvern's agent protocol:

run_request=TaskRunRequest(
    engine=RunEngine.skyvern_v2,
    prompt=task_v2.prompt,
    url=task_v2.url,
    browser_session_id=run_request.browser_session_id,
    totp_identifier=task_v2.totp_identifier,
    proxy_location=task_v2.proxy_location,
    max_steps=run_request.max_steps,
)

Workflow Block Execution

graph TD
    A[Workflow Run] --> B[Initialize Browser]
    B --> C[Go To URL Block]
    C --> D[Browser Navigation]
    D --> E[Action Block]
    E --> F[Extract/Process]
    F --> G{More Blocks?}
    G -->|Yes| E
    G -->|No| H[Close Browser]
    H --> I[Return Results]

Sources: skyvern-frontend/src/routes/workflows/workflowRun/TaskBlockParameters.tsx:10-50

Advanced Features

Custom Browser Connection

Connect Skyvern Cloud to a local browser running on your machine:

# Start Chrome with tunnel to Skyvern Cloud
skyvern browser serve --tunnel

This enables:

Use existing cookies and logins
Bypass VPN restrictions
Full browser control via Skyvern API

Sources: README.md:80-100

Proxy Support

Route browser traffic through geographic proxies:

task = await skyvern.run_task(
    prompt="Search for local restaurants",
    proxy_location="us-east-1",  # or "eu-west-1", "ap-south-1"
)

Available proxy locations provide access to region-specific content.

Sources: skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:25-35

Error Handling

Browser-Specific Errors

Error Type	Cause	Recovery
Navigation timeout	Page fails to load	Retry with extended timeout
Element not found	Dynamic content issues	Re-screenshot and retry
Browser crash	Memory/extension issues	Restart browser session
CDP connection lost	Network disruption	Reconnect and resume
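
The first recovery strategy in the table, retrying with an extended timeout, can be sketched as a small wrapper. The function name, the doubling factor, and the default values below are assumptions chosen for illustration; they are not Skyvern settings.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_extended_timeout(
    operation: Callable[[float], T],
    initial_timeout_ms: float = 30_000,
    growth: float = 2.0,
    max_attempts: int = 3,
) -> T:
    """Call `operation(timeout_ms)`, growing the timeout after each TimeoutError."""
    timeout_ms = initial_timeout_ms
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(timeout_ms)
        except TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last timeout
            timeout_ms *= growth
    raise ValueError("max_attempts must be at least 1")


attempted: List[float] = []


def slow_page(timeout_ms: float) -> str:
    """Simulated navigation that needs at least 100 seconds to finish."""
    attempted.append(timeout_ms)
    if timeout_ms < 100_000:
        raise TimeoutError("page did not load in time")
    return "loaded"


result = retry_with_extended_timeout(slow_page)
```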

Error Code Mapping

Custom error codes can be mapped for workflow-specific handling:

task = await skyvern.run_task(
    prompt="Process order",
    error_code_mapping={
        "ERR_LOGIN_FAILED": "retry_with_2fa",
        "ERR_PAYMENT_DECLINED": "notify_user",
    },
)

Sources: skyvern-frontend/src/routes/workflows/workflowRun/TaskBlockParameters.tsx:45-65

Security Considerations

Browser Tunneling Security

Warning: Always use --api-key when exposing your browser via tunnel. Without it, anyone with the URL has full control of your browser.

Best practices:

Never expose browser tunnels publicly
Use authenticated connections only
Rotate tunnel URLs frequently
Limit browser session access

Sources: README.md:95-105

Secure Credential Management

TOTP/2FA codes are handled through secure credential storage:

task = await skyvern.run_task(
    prompt="Login to bank account",
    totp_identifier="[email protected]",
)

The system extracts codes from push notifications or SMS and attaches them to relevant workflow steps.

Sources: skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:10-30
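
As an illustration of what extracting a code from a message involves, the sketch below pulls a standalone six-digit number out of free text with a regular expression. This is an assumption-laden example: it presumes six-digit numeric codes, `extract_verification_code` is a hypothetical helper, and Skyvern's own extraction may work quite differently.

```python
import re
from typing import Optional

# Exactly six digits that are not part of a longer digit run.
_CODE_PATTERN = re.compile(r"(?<!\d)(\d{6})(?!\d)")


def extract_verification_code(message: str) -> Optional[str]:
    """Return the first standalone six-digit code in `message`, if any."""
    match = _CODE_PATTERN.search(message)
    return match.group(1) if match else None


code = extract_verification_code(
    "Your verification code is 482913. It expires in 10 minutes."
)
```

The lookbehind and lookahead keep the pattern from matching six digits inside a longer number such as a phone number.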

Summary

The Browser Automation Engine provides Skyvern's core capability to automate web interactions using Vision LLMs. Key aspects:

Unified abstraction over Playwright and CDP protocols
Persistent sessions for maintaining login states
Visual understanding via screenshot-based LLM analysis
Flexible configuration for proxy, headers, and browser options
Integrated with workflows for complex multi-step automation

This architecture enables Skyvern to operate on websites it has never seen before, adapt to layout changes automatically, and apply the same workflow across many different sites.

Sources: README.md:60-80

Workflow System

Related topics: System Architecture, Database Models

Overview

The Skyvern Workflow System is a core automation framework that enables chaining multiple tasks together to form cohesive units of work. It allows users to create complex multi-step automations by composing reusable building blocks called "workflow blocks."

Architecture

graph TD
    subgraph "Frontend Layer"
        WE[Workflow Editor]
        RR[Run Workflow Form]
        DP[Debugger Panel]
    end
    
    subgraph "API Layer"
        AP[Agent Protocol Routes]
        WS[Webhook Endpoint]
    end
    
    subgraph "Service Layer"
        WFS[Workflow Service]
        BS[Block Service]
    end
    
    subgraph "Core SDK"
        WMS[Workflow Models]
        BMS[Block Models]
        WDC[Definition Converter]
        WSS[Workflow Service SDK]
    end
    
    WE --> AP
    RR --> AP
    AP --> WFS
    WFS --> BS
    WFS --> WMS
    BS --> BMS
    WDC --> WMS
    WDC --> BMS
    WSS --> WMS
    WSS --> BMS

Workflow Model

WorkflowDefinition

The WorkflowDefinition is the core model representing a workflow:

class WorkflowDefinition(BaseModel):
    title: str
    description: Optional[str] = None
    blocks: List[WorkflowBlockDefinition]
    parameters: List[WorkflowParameter] = []

Field	Type	Description
`title`	`str`	Human-readable workflow title
`description`	`Optional[str]`	Optional description of workflow purpose
`blocks`	`List[WorkflowBlockDefinition]`	Ordered list of block definitions
`parameters`	`List[WorkflowParameter]`	Input parameters for workflow execution

Sources: skyvern/forge/sdk/workflow/models/workflow.py

WorkflowParameter

Workflows accept typed input parameters:

class WorkflowParameter(BaseModel):
    key: str
    workflow_parameter_type: WorkflowParameterType
    default_value: Optional[Any] = None
    description: Optional[str] = None
    required: bool = True

Field	Type	Description
`key`	`str`	Parameter identifier
`workflow_parameter_type`	`WorkflowParameterType`	Type: `string`, `integer`, `float`, `boolean`, `json`
`default_value`	`Optional[Any]`	Default value if not provided
`description`	`Optional[str]`	Parameter description
`required`	`bool`	Whether parameter is mandatory

Sources: skyvern/forge/sdk/workflow/models/workflow.py
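
The `required` and `default_value` fields suggest how run inputs could be resolved before execution. The following is a minimal sketch, not Skyvern's implementation: `ParamSpec`, `resolve_parameters`, and the type mapping are hypothetical stand-ins built with standard-library dataclasses rather than the Pydantic models above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ParamSpec:
    """Hypothetical stand-in mirroring the WorkflowParameter fields."""
    key: str
    workflow_parameter_type: str  # "string", "integer", "float", "boolean", "json"
    default_value: Optional[Any] = None
    required: bool = True


# Assumed mapping from parameter type names to accepted Python types.
_PY_TYPES = {
    "string": (str,),
    "integer": (int,),
    "float": (int, float),
    "boolean": (bool,),
    "json": (dict, list),
}


def resolve_parameters(specs: list[ParamSpec], supplied: dict[str, Any]) -> dict[str, Any]:
    """Merge supplied values with defaults and check each value's type."""
    resolved: dict[str, Any] = {}
    for spec in specs:
        if spec.key in supplied:
            value = supplied[spec.key]
        elif spec.default_value is not None:
            value = spec.default_value
        elif spec.required:
            raise ValueError(f"missing required parameter: {spec.key}")
        else:
            continue  # Optional parameter with no default: leave it out.
        # bool is a subclass of int, so reject it explicitly for non-boolean types.
        if isinstance(value, bool) and spec.workflow_parameter_type != "boolean":
            raise TypeError(f"{spec.key}: expected {spec.workflow_parameter_type}")
        if not isinstance(value, _PY_TYPES[spec.workflow_parameter_type]):
            raise TypeError(f"{spec.key}: expected {spec.workflow_parameter_type}")
        resolved[spec.key] = value
    return resolved
```

Calling `resolve_parameters` with a missing required key raises `ValueError`, which lets a caller reject a run before any block starts.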

Block Types

Skyvern supports 23 block types for multi-step automations, each serving a specific purpose in workflow execution. The Supported Block Types table in this section documents 14 of them.

graph TD
    A[Workflow Start] --> B{Block Type}
    B --> C[Browser Tasks]
    B --> D[Data Operations]
    B --> E[Control Flow]
    B --> F[External Integration]
    
    C --> C1[Task v2]
    C --> C2[Browser Action]
    C --> C3[Navigation]
    C --> C4[Login]
    
    D --> D1[Extraction]
    D --> D2[HTTP Request]
    D --> D3[File Download]
    
    E --> E1[Conditional]
    E --> E2[For Loop]
    E --> E3[Wait]
    
    F --> F1[Email]
    F --> F2[Text Prompt]
    F --> F3[Print Page]

Supported Block Types

Block Type	Purpose	Key Parameters
`Taskv2`	Multi-step browser automation	`prompt`, `url`, `max_steps`, `totp_verification_url`, `disable_cache`
`URL`	Navigate to a URL	`url`, `continue_on_failure`
`Wait`	Pause execution	`duration`
`TextPrompt`	LLM text generation	`prompt`, `llm_key`, `json_schema`
`HTTPRequest`	External API calls	`url`, `method`, `headers`, `body`
`Extraction`	Data extraction from page	`prompt`, `llm_key`
`Validation`	Validate extracted data	`prompt`, `error_codes`
`PrintPage`	Print to PDF	`format`, `landscape`, `print_background`
`HumanInteraction`	Pause for human input	`instructions`, `positive_descriptor`, `negative_descriptor`
`Conditional`	Branch logic	`expression`
`ForLoop`	Iterate over items	`items`, `variable_name`
`FileDownload`	Download files	`url`, `follow_redirects`, `save_response_as_file`
`BrowserAction`	Single browser action	`action_type`, `element_id`
`Login`	Handle authentication	`credential_id`, `totp_identifier`

Sources: skyvern/forge/sdk/workflow/models/block.py, skyvern/cli/mcp_tools/README.md

Block Execution Model

WorkflowBlockExecution

Each block execution is tracked with its status:

class WorkflowBlockExecution(BaseModel):
    workflow_run_id: str
    block_id: str
    block_type: WorkflowBlockType
    status: WorkflowBlockStatus
    output: Optional[Any] = None
    failure_reason: Optional[str] = None
    executed_branch_expression: Optional[str] = None
    executed_branch_result: Optional[bool] = None
    executed_branch_next_block: Optional[str] = None

Status	Description
`created`	Block added to execution queue
`queued`	Waiting for execution
`running`	Currently executing
`completed`	Successfully finished
`failed`	Execution failed
`cancelled`	Cancelled by user

Block Parameters by Type

#### Taskv2BlockParameters

class Taskv2BlockParameters(BaseModel):
    prompt: str
    url: Optional[str] = None
    max_steps: Optional[int] = None
    totp_verification_url: Optional[str] = None
    totp_identifier: Optional[str] = None
    disable_cache: bool = False

Parameter	Type	Default	Description
`prompt`	`str`	-	Navigation goal for the browser agent
`url`	`Optional[str]`	`None`	Starting URL for navigation
`max_steps`	`Optional[int]`	`None`	Maximum steps before stopping
`totp_verification_url`	`Optional[str]`	`None`	URL for 2FA verification
`totp_identifier`	`Optional[str]`	`None`	Identifier for TOTP credentials
`disable_cache`	`bool`	`False`	Disable action caching

Sources: skyvern/forge/sdk/workflow/models/block.py

#### GotoUrlBlockParameters

class GotoUrlBlockParameters(BaseModel):
    url: str
    continue_on_failure: bool = False

Parameter	Type	Default	Description
`url`	`str`	-	Target URL to navigate to
`continue_on_failure`	`bool`	`False`	Continue workflow on navigation failure

#### WaitBlockParameters

class WaitBlockParameters(BaseModel):
    duration: int

#### PrintPageBlockParameters

class PrintPageBlockParameters(BaseModel):
    format: PrintFormat = PrintFormat.A4
    landscape: bool = False
    print_background: bool = False
    include_timestamp: bool = True
    custom_filename: Optional[str] = None

Parameter	Type	Default	Description
`format`	`PrintFormat`	`A4`	Page format: `A4`, `Letter`, `Legal`
`landscape`	`bool`	`False`	Use landscape orientation
`print_background`	`bool`	`False`	Print background colors
`include_timestamp`	`bool`	`True`	Include timestamp in footer
`custom_filename`	`Optional[str]`	`None`	Custom output filename

#### HumanInteractionBlockParameters

class HumanInteractionBlockParameters(BaseModel):
    instructions: Optional[str] = None
    positive_descriptor: Optional[str] = None
    negative_descriptor: Optional[str] = None

Parameter	Type	Description
`instructions`	`Optional[str]`	Instructions for the human
`positive_descriptor`	`Optional[str]`	Label for positive confirmation
`negative_descriptor`	`Optional[str]`	Label for negative/cancellation action

Workflow Execution Flow

sequenceDiagram
    participant Client
    participant API
    participant WorkflowService
    participant BlockService
    participant Executor

    Client->>API: POST /workflows/{id}/run
    API->>WorkflowService: create_workflow_run()
    WorkflowService->>WorkflowService: Validate parameters
    WorkflowService->>WorkflowService: Create WorkflowRun record
    WorkflowService-->>API: WorkflowRun
    
    loop For each block
        API->>BlockService: execute_block()
        BlockService->>Executor: Process block
        Executor-->>BlockService: Block result
        BlockService-->>API: WorkflowBlockExecution
    end
    
    API->>Client: Webhook callback (optional)

Workflow Service API

Core Operations

Method	Description	Source
`create_workflow`	Create new workflow	skyvern/forge/sdk/workflow/service.py
`get_workflow`	Retrieve workflow by ID	skyvern/forge/sdk/workflow/service.py
`update_workflow`	Update workflow definition	skyvern/forge/sdk/workflow/service.py
`delete_workflow`	Delete workflow	skyvern/forge/sdk/workflow/service.py
`list_workflows`	List all workflows	skyvern/forge/sdk/workflow/service.py
`run_workflow`	Execute workflow	skyvern/services/workflow_service.py
`cancel_workflow_run`	Cancel running workflow	skyvern/services/workflow_service.py

Running Workflows

Workflows can be executed via:

API: POST /workflows/{workflow_id}/run
CLI: skyvern_workflow_run tool
Schedule: Cron-based scheduled execution

Run Parameters

When running a workflow, the following parameters can be specified:

Parameter	Type	Description
`parameters`	`Dict[str, Any]`	Workflow input parameters
`webhook_callback_url`	`Optional[str]`	URL for result callback
`proxy_location`	`Optional[ProxyLocation]`	Geographic proxy location
`run_with`	`RunWith`	`agent` or `code` execution mode
`ai_fallback`	`bool`	Fall back to AI if code generation fails

Sources: skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx
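
As an illustration of how these run parameters might be assembled into a request body, here is a minimal sketch. The field names come from the table above; the exact request schema, the helper name `build_run_request`, and the choice to omit unset optional fields are assumptions.

```python
import json
from typing import Any, Optional


def build_run_request(
    parameters: dict[str, Any],
    webhook_callback_url: Optional[str] = None,
    proxy_location: Optional[str] = None,
    run_with: str = "agent",
    ai_fallback: bool = False,
) -> str:
    """Serialize run options to a JSON body, omitting unset optional fields."""
    if run_with not in ("agent", "code"):
        raise ValueError("run_with must be 'agent' or 'code'")
    body: dict[str, Any] = {
        "parameters": parameters,
        "run_with": run_with,
        "ai_fallback": ai_fallback,
    }
    if webhook_callback_url is not None:
        body["webhook_callback_url"] = webhook_callback_url
    if proxy_location is not None:
        body["proxy_location"] = proxy_location
    return json.dumps(body, sort_keys=True)
```

The resulting string could then be sent as the body of the `POST /workflows/{workflow_id}/run` request with any HTTP client.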

Webhook Integration

Workflows support webhook callbacks for asynchronous result delivery:

graph LR
    A[Workflow Run] --> B{Complete?}
    B -->|Yes| C[Send webhook]
    B -->|No| D[Retry queue]
    D --> B
    C --> E[Customer Endpoint]

The webhook payload includes:

{
    "workflow_run_id": str,
    "workflow_id": str,
    "status": WorkflowRunStatus,
    "output": Optional[Any],
    "failure_reason": Optional[str],
    "created_at": datetime,
    "modified_at": datetime,
    "blocks": List[WorkflowBlockExecution]
}

MCP Integration

Skyvern provides MCP (Model Context Protocol) tools for workflow management:

Available Tools

Tool	Description
`skyvern_workflow_create`	Create new workflow
`skyvern_workflow_list`	List all workflows
`skyvern_workflow_get`	Get workflow details
`skyvern_workflow_run`	Execute workflow
`skyvern_workflow_status`	Check run status
`skyvern_workflow_update`	Update workflow
`skyvern_workflow_delete`	Delete workflow
`skyvern_workflow_cancel`	Cancel running workflow
`skyvern_block_schema`	Get block type schema
`skyvern_block_validate`	Validate block definition

Sources: skyvern/cli/mcp_tools/README.md

Frontend Components

Workflow Editor

Located at /workflows/{workflow_id}/build, the editor provides:

Visual block composition
Block parameter configuration
Workflow validation
Preview mode

Run Workflow Form

Located at /workflows/{workflow_id}/run, supports:

Parameter input with type validation
Run method selection (agent or code)
Webhook URL configuration
Proxy location selection

Debugger Panel

Located at /workflows/{workflow_id}/debug, provides:

Real-time execution status
Block-by-block output inspection
Extracted information viewer
Failure reason analysis

Workflow Run Timeline

Displays execution history with:

Block status indicators
Execution timestamps
Extracted data per block
Navigation to diagnostics

Data Flow

graph TD
    subgraph "Definition Layer"
        WD[Workflow Definition]
        BD[Block Definitions]
        WP[Workflow Parameters]
    end
    
    subgraph "Execution Layer"
        WR[Workflow Run]
        BR[Block Executions]
        ST[State Management]
    end
    
    subgraph "Output Layer"
        OT[Output Data]
        ER[Error Reports]
        WH[Webhook Events]
    end
    
    WD --> WR
    BD --> BR
    WP --> WR
    BR --> ST
    ST --> OT
    BR -->|on failure| ER
    WR --> WH

Key Features

Conditional Execution

The Conditional block evaluates expressions and branches workflow execution:

class ConditionalBlockParameters(BaseModel):
    expression: str  # e.g., "data.status == 'approved'"

After evaluation, the system records:

executed_branch_expression: The evaluated expression
executed_branch_result: Boolean result
executed_branch_next_block: Next block ID based on result
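
The recording step can be sketched as follows. The source does not describe how expressions are evaluated, so this sketch takes a plain Python callable as a stand-in for the evaluator; `BranchRecord`, `run_conditional`, and the `if_true`/`if_false` block IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class BranchRecord:
    """Hypothetical container for the three recorded branch fields."""
    executed_branch_expression: str
    executed_branch_result: bool
    executed_branch_next_block: Optional[str]


def run_conditional(
    expression: str,
    predicate: Callable[[dict[str, Any]], Any],
    context: dict[str, Any],
    if_true: Optional[str],
    if_false: Optional[str],
) -> BranchRecord:
    """Evaluate the predicate and record which block should run next."""
    result = bool(predicate(context))
    return BranchRecord(expression, result, if_true if result else if_false)


record = run_conditional(
    "data.status == 'approved'",
    lambda ctx: ctx["data"]["status"] == "approved",
    {"data": {"status": "approved"}},
    if_true="block_send_email",
    if_false="block_request_review",
)
```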

For Loop Iteration

The ForLoop block iterates over collections:

class ForLoopBlockParameters(BaseModel):
    items: List[Any]
    variable_name: str  # Variable to expose in loop context
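
To illustrate what exposing `variable_name` in the loop context could mean, here is a minimal sketch. The function `run_for_loop` and the dictionary-based context are assumptions made for illustration, not Skyvern's executor.

```python
from typing import Any, Callable, Optional


def run_for_loop(
    items: list[Any],
    variable_name: str,
    body: Callable[[dict[str, Any]], Any],
    context: Optional[dict[str, Any]] = None,
) -> list[Any]:
    """Run `body` once per item, binding the item under `variable_name`."""
    outputs = []
    for item in items:
        # Copy the context so one iteration cannot leak state into the next.
        scope = dict(context or {})
        scope[variable_name] = item
        outputs.append(body(scope))
    return outputs


urls = run_for_loop(
    ["a", "b"],
    "slug",
    lambda scope: f"{scope['base']}/{scope['slug']}",
    context={"base": "https://example.com"},
)
```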

Error Handling

Blocks support a continue_on_failure flag for graceful degradation:

class GotoUrlBlockParameters(BaseModel):
    url: str
    continue_on_failure: bool = False

When enabled, the workflow continues to the next block after a failure instead of stopping.
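
The effect of the flag on sequential execution can be sketched as below. `Step` and `run_steps` are hypothetical names, and the status strings simply reuse `completed` and `failed` from the block status table; the real executor may behave differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """Hypothetical stand-in for a block: a label, an action, and the flag."""
    label: str
    run: Callable[[], None]
    continue_on_failure: bool = False


def run_steps(steps: list[Step]) -> list[tuple[str, str]]:
    """Run steps in order, stopping at the first failure that is not tolerated."""
    results: list[tuple[str, str]] = []
    for step in steps:
        try:
            step.run()
            results.append((step.label, "completed"))
        except Exception:
            results.append((step.label, "failed"))
            if not step.continue_on_failure:
                break  # Halt: later steps are never attempted.
    return results
```

A step that fails with the flag set is still recorded as `failed`; only the decision to keep going changes.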

TOTP/2FA Support

Browser tasks can handle two-factor authentication:

class Taskv2BlockParameters(BaseModel):
    # Only the TOTP-related fields are shown here.
    totp_verification_url: Optional[str] = None
    totp_identifier: Optional[str] = None

Users can push verification codes via the frontend or API.

Security Considerations

Webhook Signature Validation

Webhook endpoints must validate signatures:

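import hmac

from fastapi import HTTPException, Request, Response

# generate_skyvern_signature and settings are provided by the Skyvern codebase.


async def webhook(request: Request) -> Response:
    signature = request.headers.get("x-skyvern-signature")
    timestamp = request.headers.get("x-skyvern-timestamp")

    # Reject requests that lack either header.
    if not signature or not timestamp:
        raise HTTPException(status_code=400)

    payload = await request.body()
    expected = generate_skyvern_signature(
        payload.decode("utf-8"),
        settings.SKYVERN_API_KEY,
    )
    # Compare in constant time and reject mismatched signatures.
    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401)
    return Response(status_code=200)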
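
For readers who want to see what such a signature check can look like end to end, here is a self-contained sketch. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw payload keyed by the API key; the source does not confirm the algorithm, and `sign_payload` and `is_valid_signature` are hypothetical names.

```python
import hashlib
import hmac


def sign_payload(payload: str, api_key: str) -> str:
    """Assumed scheme: hex HMAC-SHA256 of the payload, keyed by the API key."""
    return hmac.new(
        api_key.encode("utf-8"),
        payload.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def is_valid_signature(payload: str, api_key: str, received: str) -> bool:
    """Compare in constant time so timing does not reveal how much matched."""
    return hmac.compare_digest(sign_payload(payload, api_key), received)
```

Verifying against the raw request body, rather than a re-serialized copy, avoids mismatches caused by key ordering or whitespace.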

Credential Management

Workflows requiring authentication reference stored credentials by ID rather than embedding sensitive data.

CLI Commands

# Switch between environments
skyvern mcp switch

# List workflows
skyvern workflow list

# Run workflow
skyvern workflow run <workflow_id>

AI-Powered Commands

Related topics: Browser Automation Engine, LLM Provider Configuration


Skyvern provides a comprehensive suite of AI-powered commands that enable intelligent browser automation through natural language instructions. These commands leverage Large Language Models (LLMs) to interpret user intent and execute complex browser interactions autonomously.

Overview

AI-Powered Commands replace scripted, step-by-step automation with intent-based browser control. Users describe the goal in natural language, and Skyvern's AI agents interpret it and execute the necessary browser actions.

The system integrates with multiple LLM providers including OpenAI (GPT-4.1, o3, o4-mini), Anthropic (Claude 4.5-4.7), Azure OpenAI, AWS Bedrock, and Google Gemini to power the AI decision-making engine.

Architecture

graph TD
    A[User Input / Natural Language] --> B[Copilot Agent]
    B --> C[LLM Provider]
    C --> D[Decision Engine]
    D --> E[Browser Actions]
    E --> F[Element Interaction]
    F --> G[State Validation]
    G --> H[Continue / Complete]
    
    B --> I[Tool Selection]
    I --> J[Data Extraction]
    I --> K[Visual Validation]
    I --> L[Network Monitoring]
    
    subgraph Tools
        J
        K
        L
    end

Core Components

Copilot Agent (`skyvern/forge/sdk/copilot/agent.py`)

The Copilot Agent serves as the central orchestration layer for AI-powered commands. It maintains conversation context, manages tool selection, and coordinates the execution flow between user instructions and browser actions.

Component	Responsibility
Context Manager	Maintains conversation history and state
Tool Selector	Chooses appropriate tools based on intent
Action Executor	Executes browser actions
Response Formatter	Formats AI responses for user consumption

Tool System (`skyvern/forge/sdk/copilot/tools.py`)

Skyvern's tool system provides a comprehensive set of primitives for browser automation. Each tool is designed to handle specific interaction patterns while being composable for complex workflows.

Browser Page AI (`skyvern/library/skyvern_browser_page_ai.py`)

This module provides the foundational AI capabilities for understanding and interacting with web page content. It includes element identification, content extraction, and visual analysis capabilities.

Element Interactions

Command	Purpose	Parameters
`skyvern_click`	Click on identified elements	`element_selector`, `options`
`skyvern_type`	Enter text into input fields	`text`, `element_selector`
`skyvern_hover`	Hover over elements	`element_selector`
`skyvern_scroll`	Scroll within page or elements	`direction`, `amount`
`skyvern_select_option`	Select dropdown options	`value`, `element_selector`
`skyvern_press_key`	Press keyboard keys	`key`, `modifiers`
`skyvern_drag`	Drag and drop operations	`source`, `target`
`skyvern_wait`	Wait for conditions	`condition`, `timeout`
`skyvern_file_upload`	Upload files to elements	`file_path`, `element_selector`

Navigation

Command	Purpose
`skyvern_navigate`	Navigate to URLs
`skyvern_go_back`	Navigate browser history back
`skyvern_go_forward`	Navigate browser history forward
`skyvern_reload`	Reload current page

Tab and Frame Management

Command	Purpose
`skyvern_tab_new`	Open new browser tab
`skyvern_tab_list`	List all open tabs
`skyvern_tab_switch`	Switch to specific tab
`skyvern_tab_close`	Close current or specified tab
`skyvern_tab_wait_for_new`	Wait for new tab to open
`skyvern_frame_list`	List all iframes on page
`skyvern_frame_switch`	Switch to iframe context

Data Extraction Commands

Skyvern provides multiple methods for extracting structured data from web pages:

Structured Extraction

Command	Purpose	Output Format
`skyvern_extract`	Extract structured data	JSON with defined schema
`skyvern_get_html`	Get page HTML	Raw HTML string
`skyvern_get_value`	Get form element values	String or JSON

Visual Extraction

Command	Purpose
`skyvern_screenshot`	Capture full or partial screenshots
`skyvern_get_styles`	Get computed CSS styles
`skyvern_find`	Find elements by visual similarity

Content Analysis

The extraction system uses AI to understand page structure and extract relevant information based on user intent. It supports:

Dynamic schema generation based on natural language requests
Multi-field extraction from complex layouts
Nested data structures and repeating elements
Confidence scoring for extracted values

Validation and Verification

AI-Powered Validation

Command	Purpose
`skyvern_validate`	Validate element states or page conditions
`skyvern_evaluate`	Run JavaScript for custom validation
`skyvern_evaluate_async`	Execute async JavaScript operations

Validation commands use the LLM to interpret complex conditions that would be difficult to express in traditional selectors or XPath expressions.

Screenshot Validation

The screenshot command supports comparison against reference images and can detect visual regressions:

result = await skyvern.screenshot(
    full_page=True,
    compare_with="baseline.png",
    threshold=0.1  # 10% allowed difference
)
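
The source does not state how the difference is measured. One plausible reading of a 10% threshold is the fraction of pixels that differ, sketched below on plain nested lists; `diff_ratio` and `within_threshold` are hypothetical helpers, not part of Skyvern.

```python
def diff_ratio(a: list[list[int]], b: list[list[int]]) -> float:
    """Return the fraction of pixel positions whose values differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    changed = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return changed / total


def within_threshold(a: list[list[int]], b: list[list[int]], threshold: float = 0.1) -> bool:
    """Accept the screenshot when at most `threshold` of the pixels changed."""
    return diff_ratio(a, b) <= threshold
```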

Network and Console Commands

Network Monitoring

Command	Purpose
`skyvern_network_requests`	List network requests
`skyvern_network_request_detail`	Get request/response details
`skyvern_network_route`	Intercept and modify requests
`skyvern_network_unroute`	Remove request interception
`skyvern_har_start`	Start HAR recording
`skyvern_har_stop`	Stop and export HAR data

Console Inspection

Command	Purpose
`skyvern_console_messages`	Retrieve console logs
`skyvern_get_errors`	Get JavaScript errors
`skyvern_handle_dialog`	Handle browser dialogs (alert, confirm, prompt)

Authentication and Credentials

Skyvern supports intelligent login flows with multiple authentication methods:

Command	Purpose
`skyvern_login`	Execute automated login
`skyvern_credential_list`	List stored credentials
`skyvern_credential_get`	Retrieve specific credentials
`skyvern_credential_delete`	Remove stored credentials

Credential Management

The credential system integrates with:

Skyvern Vault: Built-in secure storage
Bitwarden: Enterprise password management
1Password: Team password sharing
Azure Key Vault: Cloud credential storage

Two-Factor Authentication

Skyvern handles 2FA/TOTP flows automatically:

Detects OTP requirement during login
Extracts codes from configured sources
Supports magic link authentication
Push notification handling (Sources: skyvern/cli/skills/README.md)

State Management

Session State

Command	Purpose
`skyvern_state_save`	Save current browser state
`skyvern_state_load`	Restore saved state
`skyvern_get_session_storage`	Read session storage
`skyvern_set_session_storage`	Write to session storage
`skyvern_clear_session_storage`	Clear session storage
`skyvern_clear_local_storage`	Clear local storage

Clipboard Operations

Command	Purpose
`skyvern_clipboard_read`	Read from clipboard
`skyvern_clipboard_write`	Write to clipboard

Workflow Integration

AI-Powered Commands can be orchestrated into complete workflows:

graph LR
    A[Navigation] --> B[Authentication]
    B --> C[Data Extraction]
    C --> D[Validation]
    D --> E{Success?}
    E -->|No| F[Retry Logic]
    F --> B
    E -->|Yes| G[Output Results]

Workflow Commands

Command	Purpose
`skyvern_workflow_create`	Create new workflow
`skyvern_workflow_list`	List available workflows
`skyvern_workflow_get`	Get workflow details
`skyvern_workflow_run`	Execute workflow
`skyvern_workflow_cancel`	Cancel running workflow

Agent Functions (`skyvern/forge/agent_functions.py`)

The agent functions module provides the core building blocks for AI-driven browser automation:

Function Categories

Navigation Functions: Handle URL navigation, back/forward, and reload
Interaction Functions: Click, type, hover, scroll, and element manipulation
Extraction Functions: HTML retrieval, value extraction, screenshot capture
Validation Functions: Element presence, state verification, screenshot comparison
State Functions: Local/session storage, clipboard, authentication state

Function Interface

All agent functions follow a consistent interface:

async def agent_function(
    task_id: str,
    step_id: str,
    **kwargs  # Function-specific parameters
) -> AgentFunctionCallResult:
    """
    Execute AI-powered browser action
    
    Returns:
        AgentFunctionCallResult with:
        - success: bool
        - extracted_data: Optional[dict]
        - screenshot: Optional[str] base64
        - error: Optional[str]
    """

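A function following the interface described in this section might look like the sketch below. The result fields are taken from the docstring; the dataclass definition and the `get_title` function are hypothetical, written only to show the calling convention.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AgentFunctionCallResult:
    """Stand-in result type with the four fields listed in the interface."""
    success: bool
    extracted_data: Optional[dict] = None
    screenshot: Optional[str] = None  # base64-encoded image
    error: Optional[str] = None


async def get_title(task_id: str, step_id: str, **kwargs: Any) -> AgentFunctionCallResult:
    """Hypothetical function: reports a title passed in through kwargs."""
    title = kwargs.get("title")
    if title is None:
        return AgentFunctionCallResult(success=False, error="title not provided")
    return AgentFunctionCallResult(success=True, extracted_data={"title": title})


result = asyncio.run(get_title("task_1", "step_1", title="Home"))
```

Returning a failed result rather than raising keeps error reporting uniform across functions, which is one reason to include an `error` field in the result type.
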
Integration with Skills Package

The skills package (skyvern/cli/skills/README.md) bundles AI-powered commands for coding agents:

Available Skills

Skill	Description
`qa`	QA test frontend changes in real browser
`skyvern`	Full CLI reference for browser automation
`smoke-test`	CI-oriented smoke testing

QA Skill Workflow

graph TD
    A[git diff] --> B[Generate Tests]
    B --> C[Run Against Dev Server]
    C --> D[Report Results]
    D --> E{Screenshots}
    E --> F[Pass/Fail Status]

Configuration

Environment Variables

Variable	Purpose	Default
`SKYVERN_TELEMETRY`	Enable/disable usage telemetry	`true`
`SKYVERN_BASE_URL`	API endpoint for Skyvern Cloud	Local server
`SKYVERN_API_KEY`	Authentication key	None

Browser Configuration

Parameter	Purpose
`BROWSER_TYPE`	Browser engine (chromium, firefox, webkit)
`BROWSER_HEADLESS`	Run without visible UI
`BROWSER_REMOTE_DEBUGGING_URL`	Connect to remote browser instance

Best Practices

Effective Command Usage

Be Specific with Selectors: Use precise element identifiers when available
Add Validation Steps: Always validate state changes after actions
Handle Timing: Use wait commands for dynamic content
Screenshot for Debugging: Capture screenshots at key decision points

Error Handling

try:
    result = await skyvern.act("click", selector="#submit-button")
    if not result.success:
        # Fallback or retry logic
        await skyvern.validate("element_visible", selector="#error-message")
except Exception:
    # Capture the page state for debugging, then re-raise.
    await skyvern.screenshot()
    raise

Summary

AI-Powered Commands in Skyvern transform browser automation from rigid scripting to intelligent, adaptive interactions. By combining natural language understanding with comprehensive browser control primitives, developers can create robust automation flows that handle complexity and edge cases gracefully.

The modular architecture allows commands to be used individually for simple tasks or combined into sophisticated workflows for enterprise-scale automation needs.

Source: https://github.com/Skyvern-AI/skyvern / Human Manual

Database Models

Related topics: Artifact Storage, Workflow System


Overview

Skyvern's database layer is built using SQLAlchemy ORM with Alembic for database migrations. The persistence layer is located in skyvern/forge/sdk/db/ and provides the data models for all core entities including Tasks, Workflows, Workflow Runs, Browser Profiles, Credentials, and Schedules.

The database models define the schema for persistent storage of automation tasks, execution state, workflow definitions, and runtime data.

Architecture

graph TD
    A[API Layer] --> B[Repository Layer]
    B --> C[SQLAlchemy Models]
    C --> D[(PostgreSQL Database)]
    B --> E[Task Repository]
    B --> F[Workflow Repository]
    B --> G[Workflow Run Repository]

Core Entities

Task Model

The Task model represents an automation task with its configuration and execution state.

Field	Type	Description
task_id	String	Unique identifier (UUID)
workflow_run_id	String (nullable)	Associated workflow run
status	TaskStatus	Current task status
request	JSON	Task request configuration
navigation_goal	String	Navigation objective
navigation_payload	JSON	Additional navigation parameters
data_extraction_goal	String	Data extraction objective
extracted_information_schema	JSON	Expected output schema
created_at	DateTime	Creation timestamp
modified_at	DateTime	Last modification timestamp
organization_id	String	Organization ownership

Sources: skyvern/forge/sdk/db/models.py

Workflow Model

The Workflow model stores workflow definitions and configurations.

Field	Type	Description
workflow_id	String	Unique workflow identifier
title	String	Workflow name
description	String	Workflow description
workflow_definition	JSON	Workflow structure and steps
webhook_callback_url	String (nullable)	Callback URL for completion
organization_id	String	Organization ownership
created_at	DateTime	Creation timestamp
modified_at	DateTime	Last modification timestamp

Sources: skyvern/forge/sdk/db/models.py

Workflow Run Model

The WorkflowRun model tracks individual executions of workflows.

Field	Type	Description
workflow_run_id	String	Unique run identifier
workflow_id	String	Parent workflow reference
status	WorkflowRunStatus	Run status
organization_id	String	Organization ownership
started_at	DateTime	Execution start time
completed_at	DateTime (nullable)	Execution completion time
error	String (nullable)	Error message if failed

Sources: skyvern/forge/sdk/db/models.py

Task Status Enum

The TaskStatus enum defines possible task states:

class TaskStatus(str, Enum):
    created = "created"
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"

Sources: skyvern/forge/sdk/db/enums.py

Task Status Flow

stateDiagram-v2
    [*] --> created: Task Created
    created --> pending: Queued for Execution
    pending --> running: Agent Starts
    running --> completed: Success
    running --> failed: Error
    running --> cancelled: User Cancelled
    completed --> [*]
    failed --> [*]
    cancelled --> [*]
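
The transitions in the diagram can be encoded as a lookup table. This is a sketch of that idea only: the enum repeats the values listed above, while `_ALLOWED` and `can_transition` are hypothetical and not taken from the Skyvern codebase.

```python
from enum import Enum


class TaskStatus(str, Enum):
    created = "created"
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"


# Transitions drawn in the state diagram; terminal states have no entry.
_ALLOWED = {
    TaskStatus.created: {TaskStatus.pending},
    TaskStatus.pending: {TaskStatus.running},
    TaskStatus.running: {TaskStatus.completed, TaskStatus.failed, TaskStatus.cancelled},
}


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    """Return True when the diagram contains an edge from current to target."""
    return target in _ALLOWED.get(current, set())
```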

Repository Pattern

Skyvern uses a repository pattern to abstract database operations.

TaskRepository

Provides CRUD operations for Task entities:

create_task() - Create new task record
get_task() - Retrieve task by ID
update_task() - Update task fields
get_tasks_for_workflow_run() - Get tasks for workflow execution
get_tasks_by_organization() - List organization tasks

Sources: skyvern/forge/sdk/db/repositories/tasks.py

WorkflowRepository

Manages Workflow entity persistence:

create_workflow() - Create new workflow
get_workflow() - Retrieve workflow definition
update_workflow() - Update workflow
get_workflows_by_organization() - List organization workflows

Sources: skyvern/forge/sdk/db/repositories/workflows.py

WorkflowRunRepository

Handles WorkflowRun entity operations:

create_workflow_run() - Start new workflow execution
get_workflow_run() - Get run details
update_workflow_run() - Update run status
get_workflow_runs_for_workflow() - List runs for a workflow

Sources: skyvern/forge/sdk/db/repositories/workflow_runs.py

Database Migrations

Alembic manages database schema migrations in the alembic/versions/ directory.

Migration files follow the naming convention: {version}_{description}.py

Example migration operations:

Adding new columns to existing tables
Creating new tables for additional entities
Index creation for query optimization
Data type modifications

Sources: alembic/versions

Relationships

erDiagram
    Organization ||--o{ Task : owns
    Organization ||--o{ Workflow : owns
    Organization ||--o{ WorkflowRun : owns
    Workflow ||--o{ WorkflowRun : executes
    WorkflowRun ||--o{ Task : contains

Additional Models

The database layer also includes models for:

Model	Purpose
BrowserProfile	Browser configuration settings
Credential	Authentication credentials storage
Schedule	Cron-based task scheduling
ScheduleRun	Scheduled execution tracking

Sources: skyvern/forge/sdk/db/models.py

Usage Example

import asyncio

from skyvern.forge.sdk.db.repositories.tasks import TaskRepository


async def main() -> None:
    task_repo = TaskRepository()
    # Create a task record owned by the given organization.
    new_task = await task_repo.create_task(
        organization_id="org_123",
        navigation_goal="Search for flights",
        navigation_payload={"origin": "SFO", "destination": "LAX"},
    )


asyncio.run(main())

Configuration

Database connection is configured via environment variables:

Variable	Description
DATABASE_URL	PostgreSQL connection string
SKYVERN_ORG_ID	Default organization ID

Sources: skyvern/forge/sdk/db/models.py


Artifact Storage

Overview

Artifact Storage is a core system in Skyvern responsible for persisting and retrieving various artifacts generated during task execution and workflow runs. These artifacts include screenshots, HTML content, LLM prompts and responses, element trees, download files, and execution logs. The system provides a pluggable storage backend architecture that supports multiple storage providers while maintaining a consistent API.

The storage layer abstracts away the complexity of different storage backends (local filesystem, Amazon S3, Azure Blob Storage) from the rest of the application, allowing deployments to choose the most appropriate storage solution for their infrastructure requirements.

Architecture

High-Level Architecture

graph TD
    A[API Clients] --> B[Agent Protocol Routes]
    B --> C[Artifact Manager]
    C --> D[Storage Factory]
    D --> E[Local Storage]
    D --> F[S3 Storage]
    D --> G[Azure Blob Storage]
    
    H[Artifact Models] --> C
    C --> H
    
    I[Configuration] --> D

Component Responsibilities

Component	File	Responsibility
Artifact Manager	`artifact/manager.py`	Orchestrates artifact operations, lifecycle management
Storage Factory	`artifact/storage/factory.py`	Creates appropriate storage backend based on configuration
Local Storage	`artifact/storage/local.py`	Filesystem-based storage implementation
S3 Storage	`artifact/storage/s3.py`	AWS S3 / S3-compatible storage implementation
Azure Blob Storage	`artifact/storage/azure.py`	Azure Blob Storage implementation
Artifact Models	`artifact/models.py`	Data models for artifacts and artifact types
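
The factory's role can be sketched as a lookup from a configuration value to a backend class. The class names follow the table above, but the configuration keys (`local`, `s3`, `azure`), the `name` attribute, and `create_storage` are assumptions made for illustration.

```python
class BaseStorage:
    """Minimal stand-in for the shared storage interface."""
    name = "base"


class LocalStorage(BaseStorage):
    name = "local"


class S3Storage(BaseStorage):
    name = "s3"


class AzureBlobStorage(BaseStorage):
    name = "azure"


# Hypothetical configuration keys mapped to backend classes.
_BACKENDS = {
    "local": LocalStorage,
    "s3": S3Storage,
    "azure": AzureBlobStorage,
}


def create_storage(storage_type: str) -> BaseStorage:
    """Instantiate the backend named by the configuration value."""
    try:
        backend_cls = _BACKENDS[storage_type.lower()]
    except KeyError:
        raise ValueError(f"unknown storage type: {storage_type}") from None
    return backend_cls()
```

Because callers only see `BaseStorage`, a deployment can switch backends by changing configuration rather than code.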

Artifact Types

Skyvern distinguishes between multiple artifact types, each serving a specific purpose in documenting and debugging task execution.

Supported Artifact Types

class ArtifactType(str, Enum):
    SCREENSHOT_LLM = "screenshot_llm"
    SCREENSHOT_ACTION = "screenshot_action"
    HTML_SCRAPE = "html_scrape"
    ELEMENT_TREE = "element_tree"
    ELEMENT_TREE_VISIBLE = "element_tree_visible"
    LLM_PROMPT = "llm_prompt"
    LLM_RESPONSE_PARSED = "llm_response_parsed"
    DOWNLOAD = "download"
    SKYVERN_LOG = "skyvern_log"

Type	Description	Content-Type
`SCREENSHOT_LLM`	Annotated screenshots for LLM context	image/png
`SCREENSHOT_ACTION`	Action screenshots captured during execution	image/png
`HTML_SCRAPE`	Raw HTML content from web pages	text/html
`ELEMENT_TREE`	Complete DOM element tree	application/json
`ELEMENT_TREE_VISIBLE`	Filtered visible elements tree	application/json
`LLM_PROMPT`	Prompt sent to LLM for decision making	text/plain
`LLM_RESPONSE_PARSED`	Parsed LLM response with action list	application/json
`DOWNLOAD`	Downloaded file content	application/octet-stream
`SKYVERN_LOG`	Skyvern execution logs	text/plain

Sources: skyvern/forge/sdk/artifact/models.py

Data Models

Artifact Model

The Artifact model represents a single stored artifact with metadata:

class Artifact(BaseModel):
    artifact_id: str
    organization_id: str
    run_id: str | None = None
    task_id: str | None = None
    step_id: str | None = None
    workflow_run_id: str | None = None
    workflow_block_execution_id: str | None = None
    artifact_type: ArtifactType
    uri: str
    filename: str | None = None
    content_type: str | None = None
    metadata: dict[str, Any] | None = None
    created_at: datetime
    modified_at: datetime | None = None

Sources: skyvern/forge/sdk/artifact/models.py

Content-Type Mapping

_ARTIFACT_CONTENT_TYPES: dict[ArtifactType, str] = {
    ArtifactType.SCREENSHOT_LLM: "image/png",
    ArtifactType.SCREENSHOT_ACTION: "image/png",
    ArtifactType.HTML_SCRAPE: "text/html",
    ArtifactType.ELEMENT_TREE: "application/json",
    ArtifactType.ELEMENT_TREE_VISIBLE: "application/json",
    ArtifactType.LLM_PROMPT: "text/plain",
    ArtifactType.LLM_RESPONSE_PARSED: "application/json",
    ArtifactType.DOWNLOAD: "application/octet-stream",
    ArtifactType.SKYVERN_LOG: "text/plain",
}

Storage Backends

Local Storage

The local storage backend stores artifacts on the filesystem, ideal for development and single-instance deployments.

class LocalStorage(BaseStorage):
    def __init__(self, artifact_path: str = settings.ARTIFACT_STORAGE_PATH) -> None:
        self.artifact_path = artifact_path

Key implementation details:

Path Construction: Uses organization and artifact IDs to create hierarchical directory structures
Windows Compatibility: Replaces colons with dashes in timestamps and removes invalid filename characters on Windows systems
SHA256 Verification: Computes SHA256 checksums for stored files

def _safe_timestamp() -> str:
    ts = datetime.utcnow().isoformat()
    return ts.replace(":", "-") if WINDOWS else ts

def _windows_safe_filename(name: str) -> str:
    if not WINDOWS:
        return name
    invalid = '<>:"/\\|?*'
    name = "".join("-" if ch in invalid else ch for ch in name)
    return name.rstrip(" .")

Sources: skyvern/forge/sdk/artifact/storage/local.py
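
The path construction and checksum steps listed above can be sketched as follows. This is a minimal illustration, not the code in local.py: the helper names `build_artifact_path` and `sha256_hex` and the `root/organization/artifact/filename` layout are assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


# Hypothetical helper: the directory layout is an assumption for this sketch,
# not necessarily the layout the local storage backend uses.
def build_artifact_path(root: str, organization_id: str, artifact_id: str, filename: str) -> str:
    """Join root/organization/artifact/filename into one hierarchical path."""
    return str(PurePosixPath(root) / organization_id / artifact_id / filename)


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA256 digest of the stored bytes."""
    return hashlib.sha256(data).hexdigest()


path = build_artifact_path("/tmp/skyvern/artifacts", "o_123", "a_456", "screenshot.png")
checksum = sha256_hex(b"example artifact bytes")
print(path)
print(checksum)
```

Grouping files by organization first keeps each tenant's artifacts under one directory, which is one plausible reason for a hierarchical layout.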

S3 Storage

The S3 backend provides scalable object storage suitable for production deployments.

Configuration Environment Variables:

Variable	Description
`AWS_ACCESS_KEY_ID`	AWS access key for authentication
`AWS_SECRET_ACCESS_KEY`	AWS secret key for authentication
`AWS_REGION`	AWS region for bucket operations
`S3_BUCKET_NAME`	Name of the S3 bucket
`ARTIFACT_S3_ENDPOINT_URL`	Custom S3-compatible endpoint (optional)

Sources: skyvern/forge/sdk/artifact/storage/s3.py

Azure Blob Storage

The Azure backend integrates with Azure Blob Storage for cloud deployments.

Configuration Environment Variables:

Variable	Description
`AZURE_STORAGE_CONNECTION_STRING`	Azure storage connection string
`AZURE_STORAGE_CONTAINER_NAME`	Container name for artifacts

Sources: skyvern/forge/sdk/artifact/storage/azure.py

Storage Factory

The storage factory pattern enables runtime selection of the appropriate storage backend:

graph LR
    A[Configuration] --> B[Storage Factory]
    B --> C{Backend Type}
    C -->|local| D[LocalStorage]
    C -->|s3| E[S3Storage]
    C -->|azure| F[AzureBlobStorage]

Backend Selection Logic:

def get_storage_backend() -> BaseStorage:
    if settings.ARTIFACT_STORAGE_BACKEND == "s3":
        return S3Storage()
    elif settings.ARTIFACT_STORAGE_BACKEND == "azure":
        return AzureBlobStorage()
    else:
        return LocalStorage()

Sources: skyvern/forge/sdk/artifact/storage/factory.py

API Endpoints

Get Artifact Content

Retrieves raw content of an artifact with support for range requests and HMAC-signed URLs.

Endpoint: GET /api/v1/artifacts/{artifact_id}/content

Parameters (query string unless marked as header):

Parameter	Type	Description
`sig`	string	HMAC signature for URL authentication
`expiry`	string	Expiration timestamp for signed URLs
`kid`	string	Key identifier for signature verification
`artifact_name`	string	Optional filename override
`artifact_type`	string	Expected artifact type
`x-api-key`	string	API key authentication (header)
`authorization`	string	Bearer token authentication (header)

Responses:

Status	Description
200	Raw artifact content
206	Partial content (Range request)
403	Invalid or expired artifact URL
404	Artifact not found
416	Range not satisfiable

Content-Disposition Behavior:

if artifact.artifact_type == ArtifactType.DOWNLOAD:
    # Use attachment disposition for downloads
    return media_type, _build_attachment_disposition(raw_name)
return media_type, "inline"  # Inline for all other types

Sources: skyvern/forge/sdk/routes/agent_protocol.py

Range Request Support

The artifact content endpoint supports HTTP range requests for partial content retrieval:

def _parse_range_header(range_header: str | None, content_length: int) -> tuple[int, int] | None:
    """Return one satisfiable byte range, _RANGE_UNSATISFIABLE when unsatisfiable, or None when ignored."""
    if not range_header:
        return None
    # Parses "bytes=start-end" format
    # Validates ASCII digits, rejects negatives

Range Header Format: bytes=start-end (RFC 7233 compliant)

Sources: skyvern/forge/sdk/routes/agent_protocol.py

HMAC URL Signing

Artifact URLs can be signed using HMAC for time-limited access without requiring API key authentication:

sequenceDiagram
    Client->>Server: Request with sig, expiry, kid
    Server->>Server: Validate HMAC signature
    Server->>Storage: Fetch artifact
    Storage-->>Server: Artifact content
    Server-->>Client: Artifact content

Signing Requirements:

An HMAC keyring must be configured via ARTIFACT_CONTENT_HMAC_KEYRING
URL must include valid sig, expiry, and kid query parameters
Signature is verified before returning artifact content

Sources: skyvern/forge/sdk/routes/agent_protocol.py
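
The signing requirements above can be illustrated with a small sketch built on Python's standard `hmac` module. Everything specific here is an assumption: the keyring shape (key id mapped to secret bytes), the signed message format (`artifact_id:expiry`), and the function names. The source does not show how Skyvern composes or encodes its signatures.

```python
import hashlib
import hmac

# Hypothetical keyring: key id -> secret. The real keyring format is not shown above.
KEYRING = {"k1": b"example-secret-1", "k2": b"example-secret-2"}


def sign(artifact_id: str, expiry: int, kid: str) -> str:
    """Return a hex HMAC-SHA256 signature over the artifact id and expiry."""
    message = f"{artifact_id}:{expiry}".encode()
    return hmac.new(KEYRING[kid], message, hashlib.sha256).hexdigest()


def verify(artifact_id: str, expiry: int, kid: str, sig: str, now: int) -> bool:
    """Accept only unexpired URLs whose signature matches a known key."""
    if kid not in KEYRING or now > expiry:
        return False
    expected = sign(artifact_id, expiry, kid)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode())


signature = sign("a_123", expiry=1_700_000_000, kid="k1")
print(verify("a_123", 1_700_000_000, "k1", signature, now=1_699_999_000))
```

Carrying a `kid` in the URL lets the server rotate keys: new URLs are signed with the newest key while older keys stay in the keyring until their URLs expire.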

Configuration Options

Storage Configuration

Environment Variable	Default	Description
`ARTIFACT_STORAGE_BACKEND`	`local`	Storage backend type (local/s3/azure)
`ARTIFACT_STORAGE_PATH`	`/tmp/skyvern/artifacts`	Local storage path
`ARTIFACT_CONTENT_HMAC_KEYRING`	-	HMAC keyring for signed URLs

S3 Configuration

Environment Variable	Description
`AWS_ACCESS_KEY_ID`	AWS credentials
`AWS_SECRET_ACCESS_KEY`	AWS credentials
`AWS_REGION`	Region setting
`S3_BUCKET_NAME`	Target bucket
`ARTIFACT_S3_ENDPOINT_URL`	S3-compatible endpoint

Azure Configuration

Environment Variable	Description
`AZURE_STORAGE_CONNECTION_STRING`	Connection string
`AZURE_STORAGE_CONTAINER_NAME`	Container name

File Extension Mapping

The storage layer maintains a mapping from artifact types to file extensions for consistent naming:

FILE_EXTENTSION_MAP: dict[ArtifactType, str] = {
    ArtifactType.SCREENSHOT_LLM: ".png",
    ArtifactType.SCREENSHOT_ACTION: ".png",
    ArtifactType.HTML_SCRAPE: ".html",
    ArtifactType.ELEMENT_TREE: ".json",
    ArtifactType.ELEMENT_TREE_VISIBLE: ".json",
    ArtifactType.LLM_PROMPT: ".txt",
    ArtifactType.LLM_RESPONSE_PARSED: ".json",
    ArtifactType.DOWNLOAD: ".bin",
    ArtifactType.SKYVERN_LOG: ".log",
}

Sources: skyvern/forge/sdk/artifact/storage/base.py
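
As a rough illustration of how such a mapping might be used, the sketch below derives a filename from an artifact ID and type. The enum is a trimmed copy of the one shown earlier, and `artifact_filename` with its `<artifact_id><extension>` scheme is a hypothetical helper, not a function from the storage layer.

```python
from enum import Enum


class ArtifactType(str, Enum):  # trimmed copy of the enum shown earlier
    SCREENSHOT_LLM = "screenshot_llm"
    HTML_SCRAPE = "html_scrape"
    DOWNLOAD = "download"


EXTENSIONS: dict[ArtifactType, str] = {
    ArtifactType.SCREENSHOT_LLM: ".png",
    ArtifactType.HTML_SCRAPE: ".html",
    ArtifactType.DOWNLOAD: ".bin",
}


def artifact_filename(artifact_id: str, artifact_type: ArtifactType) -> str:
    """Hypothetical naming scheme: '<artifact_id><extension>'."""
    # Fall back to a generic binary extension for unmapped types.
    return f"{artifact_id}{EXTENSIONS.get(artifact_type, '.bin')}"


print(artifact_filename("a_123", ArtifactType.SCREENSHOT_LLM))
```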

Usage Patterns

Storing an Artifact

# Via Artifact Manager
artifact = await artifact_manager.create_artifact(
    organization_id=org_id,
    artifact_type=ArtifactType.SCREENSHOT_LLM,
    content=image_bytes,
    task_id=task_id,
    step_id=step_id,
)

Retrieving an Artifact

# Get artifact metadata
artifact = await artifact_manager.get_artifact(artifact_id)

# Get presigned or signed URL
url = await artifact_manager.get_artifact_url(artifact)

Range Request for Large Files

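# `client` is assumed to be an async HTTP client such as httpx.AsyncClient
headers = {"Range": "bytes=0-1023"}  # request the first 1024 bytes
response = await client.get(f"/api/v1/artifacts/{artifact_id}/content", headers=headers)
# A satisfiable range is answered with 206 Partial Content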
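
For files too large to fetch in one request, the single Range header above generalizes to a sequence of ranges. The sketch below only computes the header values; `range_headers` is a hypothetical helper, and it assumes the total size is already known (for example from a Content-Length header).

```python
from typing import Iterator


def range_headers(total_size: int, chunk_size: int) -> Iterator[str]:
    """Yield 'bytes=start-end' values that cover total_size bytes in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # end is inclusive
        yield f"bytes={start}-{end}"


print(list(range_headers(2500, 1024)))
```

Each yielded value can be sent as the `Range` header of one request, and the response bodies concatenated in order.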

Security Considerations

Signed URLs: HMAC-signed URLs provide time-limited access without exposing storage credentials
Attachment Disposition: Download artifacts use Content-Disposition: attachment to prevent browser rendering of potentially malicious content
Organization Isolation: Artifacts are namespaced by organization ID to prevent cross-tenant access
Content-Type Validation: Responses set appropriate content-types based on artifact type

Frontend Integration

The frontend displays artifacts through dedicated UI components:

Component	Location	Purpose
`StepArtifacts.tsx`	`routes/tasks/detail/`	Task artifact viewer with tabbed interface
`Artifact` component	Shared	Renders different artifact types
`ZoomableImage`	Shared	Displays screenshots with zoom capability

The artifact viewer supports multiple tabs for different artifact types:

Info
Annotated Screenshots
Action Screenshots
HTML Element Tree
Element Tree
Prompt
Action List
HTML (Raw)

Sources: skyvern-frontend/src/routes/tasks/detail/StepArtifacts.tsx

Sources: skyvern/forge/sdk/artifact/models.py

Credential Management

Related topics: Browser Automation Engine, Workflow System


Overview

Credential Management in Skyvern provides a secure, unified system for storing, retrieving, and managing authentication credentials across tasks and workflows. Skyvern supports multiple credential vault types, enabling integration with external password managers and custom credential services while maintaining a native internal vault.

Credentials in Skyvern can be of three primary types:

Credential Type	Description
`password`	Username/password credential pairs for basic authentication
`credit_card`	Credit card information for payment forms
`secret`	Generic secret values for API keys, tokens, and other sensitive data

Sources: skyvern-frontend/src/routes/workflows/components/CredentialSelector.tsx:1-100

Architecture

Skyvern's credential management system is designed with a multi-vault architecture that allows seamless integration with various credential providers while maintaining a consistent internal API.

graph TD
    subgraph "Client Layer"
        UI[Web UI]
        API[API Client]
        MCP[MCP Tools]
    end
    
    subgraph "Credential Services"
        SkyvernVault[Skyvern Internal Vault]
        Bitwarden[Bitwarden Service]
        Azure[Azure Key Vault Service]
        Custom[Custom Credential Service]
    end
    
    subgraph "Storage Layer"
        DB[(Database)]
    end
    
    UI --> API
    MCP --> API
    API --> SkyvernVault
    API --> Bitwarden
    API --> Azure
    API --> Custom
    SkyvernVault --> DB

Sources: skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:1-50

Credential Vault Types

Skyvern Internal Vault

The default vault type stores credentials directly in Skyvern's database. This is the simplest option for getting started and requires no external configuration.

Bitwarden Integration

Skyvern can integrate with Bitwarden to leverage existing credentials stored in your Bitwarden vault. This integration supports:

Reading existing credentials from Bitwarden
Writing new credentials back to Bitwarden
Automatic 2FA/TOTP handling

Sources: skyvern/cli/mcp_tools/README.md:1-50

Azure Key Vault

For enterprise environments, Skyvern supports Azure Key Vault integration, allowing credentials stored in Azure's secure key management system to be used in tasks and workflows.

Sources: skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:1-80

Custom Credential Service

Organizations with proprietary credential management systems can implement a custom credential service. This requires:

API Configuration: Set up API base URL and authentication token
Service Implementation: Implement the credential service interface
Vault Type Selection: Configure parameters to use vault_type="custom"

The custom credential service configuration includes:

api_base_url: The base URL of your credential service API
api_token: Authentication token for the service

Sources: skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:60-75

Using Credentials in Workflows

Credential Parameter Types

Credentials can be referenced as workflow parameters, allowing secure injection of sensitive data into task execution. The system supports the following parameter types:

Parameter Type	Usage	Example Reference
`credential`	Credential objects from vault	`{{ my_credential.username }}`
`context`	Context parameters from previous steps	`{{ context.source_param }}`
`custom`	Custom credential service credentials	Uses vault_type selection

Sources: skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:40-65

Credential Reference Syntax

Within HTTP Request nodes, credentials are referenced using template syntax:

Password credential: {{ my_credential.username }} / {{ my_credential.password }}
Secret credential: {{ my_secret.secret_value }}

Sources: skyvern-frontend/src/routes/workflows/editor/nodes/HttpRequestNode/HttpRequestNode.tsx:1-50
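
To make the reference syntax concrete, the following sketch resolves `{{ name.field }}` placeholders against an in-memory dictionary. It is a simplified stand-in written with a regular expression, not Skyvern's template engine; the `render` function and the credential dictionary shape are assumptions for the example.

```python
import re

# Matches "{{ name.field }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(template: str, credentials: dict[str, dict[str, str]]) -> str:
    """Replace each '{{ name.field }}' with the matching credential field.

    Unknown names or fields raise KeyError instead of being left in the
    output, so a missing credential fails loudly.
    """
    def substitute(match: re.Match) -> str:
        name, field = match.group(1), match.group(2)
        return credentials[name][field]

    return PLACEHOLDER.sub(substitute, template)


creds = {
    "my_credential": {"username": "alice", "password": "example-password"},
    "my_secret": {"secret_value": "example-token"},
}
print(render("Authorization: Bearer {{ my_secret.secret_value }}", creds))
```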

Credential Parameter Validation

When running workflows, credential parameters are validated to ensure:

Required Fields: Boolean and credential parameters must have values
JSON Validation: JSON-type credential parameters must parse correctly
Missing Credential Detection: The system detects orphaned credential parameters where the referenced credential no longer exists in the vault

// Validation example from workflow execution
if (parameter.workflow_parameter_type === "credential") {
    if (value === null || value === undefined) {
        return "This field is required";
    }
}

Sources: skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx:1-100

Orphaned Credential Detection

The system provides warnings when workflow parameters reference credentials that no longer exist in the vault:

⚠️ my_credential (missing credential)

This warning helps identify workflows that need to be updated after credential deletion or vault changes.

Sources: skyvern-frontend/src/routes/workflows/editor/nodes/TaskNode/ParametersMultiSelect.tsx:1-50
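
Orphan detection amounts to a set-membership check. In the sketch below, `find_orphaned` is a hypothetical helper, and the assumption is that each workflow parameter stores the ID of the credential it references.

```python
def find_orphaned(parameters: dict[str, str], vault_ids: set[str]) -> list[str]:
    """Return parameter keys whose credential id is absent from the vault.

    `parameters` maps a parameter key to the credential id it references.
    """
    return sorted(key for key, cred_id in parameters.items() if cred_id not in vault_ids)


params = {"my_credential": "cred_1", "billing_card": "cred_2"}
for key in find_orphaned(params, vault_ids={"cred_2"}):
    print(f"{key} (missing credential)")
```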

Two-Factor Authentication (TOTP)

Skyvern supports automated Two-Factor Authentication through TOTP (Time-based One-Time Password) handling. This is critical for automating workflows that require 2FA verification.

Push TOTP Code Flow

Challenge: A task reaches a 2FA prompt and needs a verification code to continue
Message Delivery: The user receives the verification message (SMS, email, or authenticator app)
Code Extraction: The message is pushed to Skyvern, which extracts the code from it
Attachment: The code is automatically attached to the relevant workflow run

interface TOTPConfig {
    totp_identifier: string;  // Email or phone for receiving codes
    totp_url?: string;        // Direct verification URL if available
    totp_type: 'totp' | 'magic_link';
}

Sources: skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:1-80

TOTP Parameter Filtering

The credential management interface supports filtering TOTP credentials by:

Identifier: Filter by email or phone number
OTP Type: Filter by numeric code or magic link

MCP Integration

Skyvern's Model Context Protocol (MCP) tools provide programmatic access to credential management:

{
  "mcpServers": {
    "skyvern": {
      "type": "streamable-http",
      "url": "https://api.skyvern.com/mcp/",
      "headers": { "x-api-key": "YOUR_API_KEY" }
    }
  }
}

Available MCP Credential Tools

Tool	Description
`skyvern_credential_list`	List all credentials in the vault
`skyvern_credential_get`	Retrieve a specific credential
`skyvern_credential_delete`	Remove a credential from the vault
`skyvern_login`	Authenticate using stored credentials

Supported vault integrations: Skyvern vault, Bitwarden, 1Password, and Azure Key Vault with automatic 2FA/TOTP support.

Sources: integrations/mcp/README.md:1-80

Security Considerations

Browser Tunneling Security

When exposing Skyvern through browser tunneling, ensure API key authentication is enabled:

WARNING: Always use --api-key when exposing your browser via a tunnel. Without it, anyone with the URL has full control of your browser.

Sources: README.md:1-100

Credential Masking

Sensitive credential data is masked in UI displays:

Tokens longer than 8 characters are truncated: sk_live_xxx...
Full values are never displayed in logs or error messages

Sources: skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:20-35
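
A masking rule of this kind can be sketched in a few lines. The helper name `mask_token` and the choice to keep exactly the first 8 characters are assumptions; the source only states that tokens longer than 8 characters are truncated.

```python
def mask_token(token: str, visible: int = 8) -> str:
    """Keep the first `visible` characters and replace the rest with '...'."""
    if len(token) <= visible:
        return token  # short tokens are shown unchanged under this rule
    return token[:visible] + "..."


print(mask_token("sk_live_abcdef123456"))
```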

External Vault Security

When using external credential services:

Store API tokens securely (environment variables preferred)
Use HTTPS for all credential service communications
Implement IP allowlisting where supported
Rotate credentials regularly

Configuration Reference

Environment Variables

Variable	Description
`SKYVERN_API_KEY`	API key for Skyvern authentication
`SKYVERN_BASE_URL`	Base URL for self-hosted deployments
`SKYVERN_TELEMETRY`	Set to `false` to opt out of telemetry

Credential Service Configuration

Field	Required	Description
`api_base_url`	Yes (custom)	Base URL of the credential service
`api_token`	Yes (custom)	Authentication token
`token_type`	No	Type of authentication token
`tested_url`	No	URL used to test credential validity

Best Practices

Use Type-Specific Credentials: Store credentials with appropriate types (password, credit_card, secret) for better organization and retrieval
Implement Custom Services for Enterprise: For large-scale deployments, implement a custom credential service for centralized management
Enable TOTP Automation: Configure TOTP handling for automated 2FA workflows
Monitor Orphaned Parameters: Regularly check for and clean up orphaned credential references
Rotate API Tokens: Periodically rotate API tokens for custom credential services
Leverage Bitwarden for Existing Teams: If your team already uses Bitwarden, integrate it to avoid credential duplication

Sources: skyvern-frontend/src/routes/workflows/components/CredentialSelector.tsx:1-100

LLM Provider Configuration

Skyvern leverages Large Language Models (LLMs) as the core intelligence engine for AI-powered browser automation. The LLM Provider Configuration system provides a flexible abstraction layer that enables Skyvern to connect with multiple LLM providers including OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, and Google Gemini. This architecture decouples the automation logic from specific LLM implementations, allowing users to select their preferred provider without modifying core application code.

Supported LLM Providers

Skyvern supports a comprehensive range of LLM providers to accommodate diverse enterprise requirements and budget considerations. The framework utilizes litellm as a unified transport layer, which normalizes API interactions across different providers through a consistent interface.

Provider	Supported Models
OpenAI	GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini
Anthropic	Claude 4.7 Opus, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Haiku, Sonnet, Opus)
Azure OpenAI	Any GPT models deployed to your Azure subscription
AWS Bedrock	Claude 4.7, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Sonnet, Opus)
Google Gemini	Gemini 3.1 Pro, Gemini 3 Flash

Sources: README.md:1-20

Provider Selection Criteria

When selecting an LLM provider for Skyvern deployments, consider the following factors:

OpenAI: Strong general-purpose performance with the broadest model availability
Anthropic: The Claude series excels in instruction following and extended reasoning tasks, making it particularly suitable for complex multi-step browser automation workflows
Azure OpenAI: Enterprise-grade security and compliance features with the ability to use custom model deployments
AWS Bedrock: Seamless integration with other AWS services and HIPAA-compliant deployments
Google Gemini: Competitive pricing with strong multimodal capabilities

Configuration Architecture

The LLM Provider Configuration system follows a layered architecture that separates provider selection, credential management, and runtime dispatch. This design enables runtime provider switching and supports fallback mechanisms for production deployments.

graph TD
    A[Task Request] --> B[LLM API Handler]
    B --> C{LLM Provider Selection}
    C -->|OpenAI| D[OpenAI Transport]
    C -->|Anthropic| E[Anthropic Transport]
    C -->|Azure| F[Azure OpenAI Transport]
    C -->|AWS| G[Bedrock Transport]
    C -->|Gemini| H[Gemini Transport]
    D --> I[litellm Unified Interface]
    E --> I
    F --> I
    G --> I
    H --> I
    I --> J[Provider API Endpoint]

Core Configuration Components

The configuration system comprises several interconnected components that manage provider selection, authentication, and request handling. The API handler serves as the primary entry point for LLM interactions, coordinating between the task execution engine and the underlying transport layer. Models define the data structures for requests, responses, and provider-specific configurations. The litellm transport provides the unified interface that normalizes differences between provider APIs.

Environment Configuration

Basic Setup

LLM provider credentials are configured through environment variables in the .env file. Running skyvern quickstart or skyvern init starts a setup wizard that guides you through provider selection and credential configuration.

# Required for OpenAI
OPENAI_API_KEY=sk-...

# Required for Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Required for Azure OpenAI
AZURE_OPENAI_API_KEY=your-azure-key
AZURE_OPENAI_BASE_URL=https://your-resource.openai.azure.com

# Required for AWS Bedrock
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1

# Required for Gemini
GOOGLE_GENERATIVE_AI_API_KEY=your-gemini-key

Sources: README.md:1-50

Provider-Specific Configuration

#### OpenAI Configuration

For OpenAI providers, Skyvern supports both standard OpenAI endpoints and custom base URLs for proxy or gateway scenarios. Model selection can be specified at the task level or configured as the default in the environment.

#### Anthropic Configuration

Anthropic Claude models require the ANTHROPIC_API_KEY environment variable. The setup wizard can automatically configure this during initialization. Claude models are particularly well-suited for Skyvern's browser automation tasks due to their strong instruction-following capabilities.

#### Azure OpenAI Configuration

Azure OpenAI deployments require additional configuration for deployment-specific endpoints. The AZURE_OPENAI_BASE_URL should point to your Azure OpenAI resource endpoint, and the system supports any GPT models deployed to your Azure subscription.

#### AWS Bedrock Configuration

AWS Bedrock integration uses standard AWS credential chain resolution, including environment variables, IAM roles, and AWS profile configurations. The AWS_REGION variable determines which AWS region your Bedrock endpoints are hosted in.

#### Google Gemini Configuration

Gemini models are configured using the GOOGLE_GENERATIVE_AI_API_KEY. The framework supports both Gemini 3.1 Pro for complex reasoning tasks and Gemini 3 Flash for faster, cost-effective operations.

Provider Selection in Code

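When using Skyvern programmatically through the SDK, you can specify the LLM provider at task creation time, and the framework uses the configured credentials for the selected provider. The example below does not name a provider, so it runs with the configured default.

import asyncio
from skyvern import Skyvern

async def main() -> None:
    skyvern = Skyvern(api_key="your-api-key")
    # No provider is named here; the configured default is used.
    task = await skyvern.run_task(
        prompt="Find the top post on hackernews today",
    )

asyncio.run(main())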

Sources: README.md:50-80

Cloud vs Local Configuration

Skyvern supports two operational modes for LLM configuration. In Skyvern Cloud mode, the platform manages provider configuration and billing. In local mode, you configure your own LLM provider credentials, and Skyvern routes requests through your specified provider.

For local deployments, the setup wizard configures credentials automatically during initialization. For custom configurations, you can manually edit the .env file with your provider-specific credentials.

Advanced Configuration Options

Custom Endpoint Configuration

For enterprise deployments requiring proxy servers or custom API gateways, Skyvern supports base URL customization through provider-specific environment variables. This enables integration with internal LLM deployments, specialized inference endpoints, or regional API endpoints.

Multi-Provider Fallback

Production deployments can implement multi-provider fallback strategies by configuring multiple provider credentials. When the primary provider is unavailable, Skyvern can automatically route requests to backup providers based on priority configuration.
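
The general shape of a priority-ordered fallback is sketched below. It is a generic illustration, not Skyvern's routing code: `ProviderError`, `complete_with_fallback`, and the provider callables are all hypothetical, and no real LLM client is invoked.

```python
from typing import Callable


class ProviderError(Exception):
    """Raised by a provider callable when it cannot serve the request."""


def complete_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, response)."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why this provider failed
    raise ProviderError("all providers failed: " + "; ".join(errors))


def primary(prompt: str) -> str:
    raise ProviderError("unavailable")  # simulate an outage


def backup(prompt: str) -> str:
    return f"echo: {prompt}"


print(complete_with_fallback("hello", [("primary", primary), ("backup", backup)]))
```

Only `ProviderError` triggers a fallback here, so programming errors in a provider callable still surface instead of being silently retried elsewhere.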

Model Selection Per Task

Individual tasks can specify model preferences that override the default configuration. This enables cost optimization by using lighter models for simple tasks while reserving more capable models for complex automation sequences.

Credential Security

Credential management follows security best practices by storing sensitive information exclusively in environment variables. The .env file should never be committed to version control. Skyvern's initialization process creates the .env file from .env.example if it does not exist, ensuring template credentials are never exposed.

Sources: README.md:1-30

Troubleshooting

Common LLM provider configuration issues include incorrect API keys, network connectivity problems, and quota exhaustion. The setup wizard validates credentials during configuration to catch most issues early. For runtime errors, Skyvern provides detailed error messages that identify the specific provider and error type.

If you encounter authentication errors, verify that your API keys are correctly set in the .env file and that the corresponding provider account has sufficient credits or quota available.

Sources: README.md:1-20

Model Context Protocol (MCP) Integration

Related topics: LLM Provider Configuration


Overview

Skyvern's Model Context Protocol (MCP) integration enables AI applications to connect to Skyvern's browser automation capabilities. This integration allows AI-powered applications to perform browser-based tasks such as filling out forms, downloading files, researching information on the web, and executing complex web automation workflows through natural language commands.

The MCP server implementation serves as a bridge between AI applications and Skyvern's browser engine, providing a standardized interface for browser automation tasks.

Sources: integrations/mcp/README.md

Architecture

The MCP integration supports multiple deployment models and connection methods:

Connection Modes

Mode	Description	Use Case
Skyvern Cloud	Connect to managed cloud service	Production without self-hosting
Local Skyvern Server	Self-hosted deployment	Development, privacy, custom infrastructure

MCP Client Configuration

#### Cloud Configuration (streamable-http)

{
  "mcpServers": {
    "skyvern": {
      "type": "streamable-http",
      "url": "https://api.skyvern.com/mcp/",
      "headers": { "x-api-key": "YOUR_API_KEY" }
    }
  }
}

#### Local Configuration

{
  "mcpServers": {
    "skyvern": {
      "command": "python3",
      "args": ["-m", "skyvern", "run", "mcp"],
      "env": {
        "SKYVERN_BASE_URL": "http://localhost:8000",
        "SKYVERN_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Sources: skyvern/cli/mcp_tools/README.md
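
If the client configuration is generated by a script rather than written by hand, the two JSON documents above can be built as plain dictionaries. The function names below are hypothetical; the keys and values are copied from the configurations shown above.

```python
import json


def cloud_mcp_config(api_key: str) -> dict:
    """Build the cloud (streamable-http) client entry shown above."""
    return {
        "mcpServers": {
            "skyvern": {
                "type": "streamable-http",
                "url": "https://api.skyvern.com/mcp/",
                "headers": {"x-api-key": api_key},
            }
        }
    }


def local_mcp_config(api_key: str, base_url: str = "http://localhost:8000") -> dict:
    """Build the local client entry that launches the MCP server as a subprocess."""
    return {
        "mcpServers": {
            "skyvern": {
                "command": "python3",
                "args": ["-m", "skyvern", "run", "mcp"],
                "env": {"SKYVERN_BASE_URL": base_url, "SKYVERN_API_KEY": api_key},
            }
        }
    }


print(json.dumps(cloud_mcp_config("YOUR_API_KEY"), indent=2))
```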

Available MCP Tools

Browser Session Management

Tool	Description
`skyvern_browser_session_create`	Create a new browser session
`skyvern_browser_session_close`	Close an existing browser session
`skyvern_browser_session_list`	List all active browser sessions
`skyvern_browser_session_get`	Get details of a specific session
`skyvern_browser_session_connect`	Connect to an existing session

Browser Actions

Tool	Description
`skyvern_act`	Execute natural language actions
`skyvern_navigate`	Navigate to a URL
`skyvern_click`	Click on an element
`skyvern_type`	Type text into a field
`skyvern_hover`	Hover over an element
`skyvern_scroll`	Scroll the page
`skyvern_select_option`	Select an option from dropdown
`skyvern_press_key`	Press a keyboard key
`skyvern_drag`	Drag an element
`skyvern_file_upload`	Upload a file
`skyvern_wait`	Wait for page to load

Data Extraction & Validation

Tool	Description
`skyvern_extract`	Extract structured JSON data from page
`skyvern_screenshot`	Take a screenshot
`skyvern_find`	Find elements on the page
`skyvern_validate`	Validate page content
`skyvern_evaluate`	Run JavaScript code
`skyvern_get_html`	Get page HTML

Sources: skyvern/cli/mcp_tools/README.md

Quick Start Guide

Prerequisites

REQUIREMENT: Skyvern currently runs only in a Python 3.11 environment.

Installation Steps

Install Skyvern

pip install skyvern

Configure Skyvern

Run the setup wizard, which guides you through the configuration process:

skyvern init

You can connect to either Skyvern Cloud or a local version of Skyvern.

Launch Local Server (Optional)

Only required in local mode:

skyvern run server

Sources: integrations/mcp/README.md

Claude Desktop Integration

Skyvern provides a downloadable .mcpb bundle that installs Skyvern Cloud into Claude Desktop without requiring the user to install Node.js.

Building the MCP Bundle

./scripts/package-mcpb.sh 1.0.23

Publishing to Releases

./scripts/package-mcpb.sh 1.0.23 skyvern-claude-desktop.mcpb \
  skyvern/cli/mcpb/releases/skyvern-claude-desktop.mcpb

Sources: skyvern/cli/mcpb/claude_desktop/README.md

Usage Patterns

Natural Language Actions

The skyvern_act tool allows you to describe actions in natural language, which Skyvern's AI interprets and executes:

"Click the login button"
"Fill in the email field with [email protected]"
"Select 'Premium' from the subscription dropdown"

Data Extraction

Use skyvern_extract to extract structured JSON data from web pages by describing the data you need:

"Extract all product names, prices, and ratings"

Screenshot and Validation Loops

For debugging and verification, use screenshot + validate loops:

# Pseudocode: each call stands for an MCP tool invocation made by the client
# Take screenshot
screenshot = skyvern_screenshot()

# Validate content
validation = skyvern_validate("The login form is visible")

# If validation fails, take another screenshot for debugging
if not validation.success:
    screenshot = skyvern_screenshot()

Integration with AI Applications

The MCP integration enables AI applications to:

Automate form filling: Submit complex forms with AI-guided input
Research web content: Extract structured data from multiple sources
Download files: Navigate to and download files from websites
Execute workflows: Run browser automation workflows
Handle 2FA flows: Manage TOTP (Time-based One-Time Password) authentication

Credential Management

Skyvern's MCP tools support secure credential management for login flows:

Credential Type	Usage Pattern
Password	`{{ my_credential.username }}` / `{{ my_credential.password }}`
Secret	`{{ my_secret.secret_value }}`
Custom Service	Configure via CustomCredentialServiceConfigForm

API Reference

HTTP Request Block Tips

When using HTTP request blocks with MCP tools:

Use "Import cURL" to quickly convert API documentation examples
Use "Quick Headers" to add common authentication and content headers
The request will return response data including status, headers, and body
Reference response data in later blocks with parameters

Response Data

HTTP request block responses include:

Field	Description
`status`	HTTP status code
`headers`	Response headers
`body`	Response body content

Workflow Integration

MCP tools can be integrated into Skyvern workflows for:

Browser automation blocks: Execute MCP actions as part of workflow steps
Conditional logic: Use validation results to control workflow branching
Data extraction: Feed extracted data into subsequent workflow blocks
Scheduled execution: Run MCP-powered workflows on cron schedules
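
For the scheduling item, a cron schedule is conventionally written as five space-separated fields. The sketch below labels those fields so a schedule can be sanity-checked before it is saved. It assumes the standard five-field format; whether Skyvern accepts extensions such as a seconds field is not stated here, and the helper name `describe_cron` is hypothetical.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict[str, str]:
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# "0 9 * * 1-5" reads as: minute 0, hour 9, Monday through Friday.
schedule = describe_cron("0 9 * * 1-5")
```

This check only catches a wrong field count; it does not validate the value ranges inside each field.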

Best Practices

Session Management: Always close browser sessions when done to free resources
Error Handling: Use validation tools to check page state before proceeding
Screenshot Debugging: Take screenshots at key points for debugging failed automations
Credential Security: Use environment variables and secure credential storage
Rate Limiting: Be mindful of API rate limits when making frequent requests
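
The session management advice can be enforced structurally with a context manager, so the session is closed even when a step raises. This is a minimal sketch: `FakeSession` stands in for a real browser session object, and both it and the `browser_session` helper are hypothetical names, not part of Skyvern's API.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a browser session; only tracks whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


@contextmanager
def browser_session(open_session):
    """Open a session and guarantee it is closed, even if a step fails."""
    session = open_session()
    try:
        yield session
    finally:
        session.close()


session_ref = None
try:
    with browser_session(FakeSession) as session:
        session_ref = session
        raise RuntimeError("automation step failed")  # simulate a failing step
except RuntimeError:
    pass  # the session was still closed by the finally block
```

The same shape works with a real session object as long as it exposes a method that releases the browser.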

Sources: integrations/mcp/README.md

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Security or permission risk: what ensures it’s the correct one in that context?

Severity: high
Finding: Security or permission risk is backed by a source signal: "what ensures it’s the correct one in that context?" Treat it as a review item until the current version is checked.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/issues/5637

2. Installation risk: Release v1.0.29

Severity: medium
Finding: Installation risk is backed by a source signal: Release v1.0.29. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.29

3. Installation risk: Task Execution Performance: Seeking guidance on optimizing execution speed

Severity: medium
Finding: Installation risk is backed by a source signal: Task Execution Performance: Seeking guidance on optimizing execution speed. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/issues/4375

4. Installation risk: [Feature Request] Multi-session VNC support for local/self-hosted deployments (Live view & Take Control)

Severity: medium
Finding: Installation risk is backed by a source signal: [Feature Request] Multi-session VNC support for local/self-hosted deployments (Live view & Take Control). Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/issues/4392

5. Configuration risk: Performance bottleneck: High latency for simple form-filling workflows

Severity: medium
Finding: Configuration risk is backed by a source signal: Performance bottleneck: High latency for simple form-filling workflows. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/issues/4439

6. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | README/documentation is current enough for a first validation pass.

7. Maintenance risk: Release v1.0.34

Severity: medium
Finding: Maintenance risk is backed by a source signal: Release v1.0.34. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.34

8. Maintenance risk: Release v1.0.35

Severity: medium
Finding: Maintenance risk is backed by a source signal: Release v1.0.35. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.35

9. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | last_activity_observed missing

10. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | no_demo; severity=medium

11. Security or permission risk: No sandbox install has been executed yet; downstream must verify before user use.

Severity: medium
Finding: No sandbox install has been executed yet; downstream must verify before user use.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.safety_notes | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | No sandbox install has been executed yet; downstream must verify before user use.

12. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.scoring_risks | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | no_demo; severity=medium

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12. Count of project-level external discussion links exposed on this manual page.

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using skyvern with real data or production workflows.

Clarification on the Custom credential documentation on the Delete API w - github / github_issue
GROQ error - github / github_issue
Task Execution Performance: Seeking guidance on optimizing execution spe - github / github_issue
[Feature Request] Multi-session VNC support for local/self-hosted deploy (https://github.com/Skyvern-AI/skyvern/issues/4392) - github / github_issue
persist_browser_session flag saves sessions but never retrieves them on - github / github_issue
Performance bottleneck: High latency for simple form-filling workflows - github / github_issue
what ensures it’s the correct one in that context? - github / github_issue
Release v1.0.36 - github / github_release
Release v1.0.35 - github / github_release
Release v1.0.34 - github / github_release
Release v1.0.33 - github / github_release
Release v1.0.32 - github / github_release

Source: Project Pack community evidence and pitfall evidence
