# https://github.com/Skyvern-AI/skyvern Project Documentation

Generated: 2026-05-16 07:39:12 UTC

## Table of Contents

- [Introduction to Skyvern](#introduction)
- [System Architecture](#system-architecture)
- [Browser Automation Engine](#browser-automation)
- [Workflow System](#workflow-system)
- [AI-Powered Commands](#ai-commands)
- [Database Models](#database-models)
- [Artifact Storage](#artifact-storage)
- [Credential Management](#credential-management)
- [LLM Provider Configuration](#llm-providers)
- [Model Context Protocol (MCP) Integration](#mcp-integration)

<a id='introduction'></a>

## Introduction to Skyvern

### Related Pages

Related topics: [System Architecture](#system-architecture), [Browser Automation Engine](#browser-automation)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/Skyvern-AI/skyvern/blob/main/README.md)
- [integrations/mcp/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md)
- [skyvern/cli/mcp_tools/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcp_tools/README.md)
- [skyvern/cli/mcpb/claude_desktop/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcpb/claude_desktop/README.md)
- [skyvern/forge/sdk/api/aws.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/api/aws.py)
</details>

# Introduction to Skyvern

## Overview

Skyvern is an open-source browser automation platform that enables AI agents to interact with websites by understanding natural language instructions. The platform combines large language model (LLM) powered reasoning with browser automation capabilities, allowing developers to create workflows that can navigate websites, fill out forms, extract data, download files, and perform complex multi-step web tasks autonomously.

Skyvern interprets user prompts and executes browser actions over a CDP (Chrome DevTools Protocol) connection, so AI applications can interact with the web much as a human user would. Source: [README.md:1-50]()

## Key Features

### Multi-LLM Support

Skyvern supports integration with multiple LLM providers, enabling flexible deployment options:

| Provider | Supported Models |
|----------|------------------|
| OpenAI | GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini |
| Anthropic | Claude 4.7 Opus, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Haiku, Sonnet, Opus) |
| Azure OpenAI | Any GPT models deployed to Azure subscription |
| AWS Bedrock | Claude 4.7, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Sonnet, Opus) |
| Gemini | Gemini 3.1 Pro, Gemini 3 Flash |

Source: [README.md:65-72]()

### Workflow Automation

Skyvern enables the creation of automated workflows that can:

- Navigate to websites and interact with web elements
- Fill out forms and submit data
- Extract structured information from web pages
- Handle authentication and credential management
- Download files and manage browser sessions
- Handle multi-factor authentication (2FA/TOTP)
- Schedule and execute tasks on a recurring basis

Source: [skyvern-frontend/src/routes/tasks/create/CreateNewTaskForm.tsx:1-30]()

### Model Context Protocol (MCP) Integration

Skyvern provides an MCP server implementation so that AI applications can connect to Skyvern and use its browser automation capabilities through a standardized protocol. Source: [integrations/mcp/README.md:1-25]()

## Architecture Overview

### System Components

```mermaid
graph TD
    A[AI Application] -->|MCP Protocol| B[Skyvern MCP Server]
    B --> C[Skyvern API]
    C --> D[Task Executor]
    D --> E[Browser Automation Engine]
    E --> F[CDP Browser Instance]
    
    G[LLM Provider] -->|Reasoning| D
    H[Credential Vault] -->|Auth| D
    I[Schedule Manager] -->|Trigger| C
```

### Browser Connection Options

Skyvern supports multiple browser connection modes:

1. **Local CDP Browser** - Connect to a locally running Chrome instance
2. **Skyvern Cloud Browser** - Use managed browser infrastructure
3. **Browser Tunneling** - Expose local browser to Skyvern Cloud via tunnel

Source: [README.md:85-120]()

## Getting Started

### Installation and Setup

Requirements: a Python 3.11+ environment. Source: [integrations/mcp/README.md:15]()

```bash
# Install Skyvern
pip install skyvern

# Initialize configuration
skyvern init

# Run the server (local mode only)
skyvern run server
```

### Quickstart for Contributors

```bash
# Install dependencies using uv
uv sync --group dev

# Run setup wizard
uv run skyvern quickstart

# Access UI at http://localhost:8080
```

Source: [README.md:45-60]()

## SDK Usage

### Python SDK

```python
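import asyncio

from skyvern import Skyvern


async def main() -> None:
    skyvern = Skyvern(api_key="your-api-key")
    # Attach to a local Chrome instance over CDP (remote debugging on port 9222)
    skyvern.set_browser_context(
        browser_type="cdp-connect",
        remote_debugging_url="http://127.0.0.1:9222",
    )
    # run_task is a coroutine, so it must be awaited inside an async function
    task = await skyvern.run_task(
        prompt="Find the top post on hackernews today"
    )
    print(task)

asyncio.run(main())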
```

### MCP Tools

Skyvern provides comprehensive MCP tools for browser automation:

| Category | Tools |
|----------|-------|
| Navigation | `skyvern_navigate`, `skyvern_click`, `skyvern_select_option`, `skyvern_press_key`, `skyvern_drag` |
| Data Extraction | `skyvern_extract`, `skyvern_screenshot`, `skyvern_find`, `skyvern_validate`, `skyvern_get_html` |
| Authentication | `skyvern_login`, `skyvern_credential_list`, `skyvern_credential_get` |
| Tabs & Frames | `skyvern_tab_new`, `skyvern_tab_list`, `skyvern_tab_switch`, `skyvern_frame_list` |
| Network | `skyvern_console_messages`, `skyvern_network_requests`, `skyvern_network_route` |

Source: [skyvern/cli/mcp_tools/README.md:1-50]()

## Workflows

Skyvern supports workflow-based automation where complex tasks can be defined as a series of steps with conditional logic, evaluations, and human interaction checkpoints.

```mermaid
graph LR
    A[Start] --> B[Block 1: Action]
    B --> C[Block 2: Condition]
    C -->|True| D[Block 3: Evaluation]
    C -->|False| E[Block 4: Fallback]
    D --> F[Human Interaction]
    F --> G[Continue to Next]
    E --> G
```

### Workflow Block Types

| Block Type | Purpose |
|------------|---------|
| Action | Execute browser actions (click, type, navigate) |
| Condition | Branch logic based on page state |
| Evaluation | Run JavaScript to validate or extract data |
| Human Interaction | Pause workflow for manual input |

Source: [skyvern-frontend/src/routes/workflows/workflowRun/WorkflowRunTimelineBlockItem.tsx:1-60]()
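The branching shown in the diagram and table above can be illustrated with a small, self-contained interpreter. This is only a sketch of the idea, not Skyvern's workflow engine: the `Block` dataclass, the `run_workflow` function, and the block names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Block:
    """One workflow step; `kind` mirrors the block types in the table above."""
    name: str
    kind: str  # "action", "condition", "evaluation", or "human_interaction"
    run: Callable[[dict], object]
    on_true: Optional[str] = None   # next block (taken unconditionally for non-condition blocks)
    on_false: Optional[str] = None  # next block when a condition evaluates falsy


def run_workflow(blocks: dict[str, Block], start: str, state: dict) -> list[str]:
    """Walk the block graph from `start` and return the visited block names in order."""
    visited: list[str] = []
    current: Optional[str] = start
    while current is not None:
        block = blocks[current]
        visited.append(block.name)
        result = block.run(state)
        if block.kind == "condition" and not result:
            current = block.on_false
        else:
            current = block.on_true
    return visited


# Hypothetical workflow: act, branch on a condition, then evaluate or fall back.
blocks = {
    "open": Block("open", "action", lambda s: s.update(page="login"), on_true="check"),
    "check": Block("check", "condition", lambda s: s.get("logged_in", False),
                   on_true="evaluate", on_false="fallback"),
    "evaluate": Block("evaluate", "evaluation", lambda s: s.update(valid=True), on_true="approve"),
    "approve": Block("approve", "human_interaction", lambda s: s.update(approved=True)),
    "fallback": Block("fallback", "action", lambda s: s.update(error="not logged in")),
}

happy_path = run_workflow(blocks, "open", {"logged_in": True})
fallback_path = run_workflow(blocks, "open", {})
```

Keeping the next-block names on the block itself means the graph can be stored as plain data, which is one plausible reason a visual editor would favor this shape.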

## Authentication and Credentials

### Credential Services

Skyvern supports multiple credential backends:

- Skyvern Vault (built-in)
- Bitwarden
- 1Password
- Azure Key Vault
- Custom credential services via API configuration

Source: [skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:1-40]()

### 2FA/TOTP Handling

Skyvern provides automated TOTP code extraction and attachment to runs:

```tsx
<PushTotpCodeForm
  showAdvancedFields
  onSuccess={handleFormSuccess}
/>
```

The system extracts verification codes from push notifications and attaches them to relevant workflow runs automatically.

Source: [skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:1-30]()

## Task Creation

### Navigation Goal

Tasks are defined using natural language prompts that describe what Skyvern should do:

```
prompt="Find the top post on hackernews today"
```

### Advanced Settings

| Parameter | Description |
|-----------|-------------|
| Navigation Payload | JSON parameters for routes/states |
| Proxy Location | Route through geographic proxies |
| Browser Session ID | Use persistent browser sessions |
| Browser Address | CDP server address |

Source: [skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:1-50]()

## Scheduling

Tasks and workflows can be scheduled using cron expressions with timezone support:

```python
schedule = await skyvern.create_schedule(
    workflow_id="workflow_xxx",
    cron_expression="0 9 * * *",  # Daily at 9 AM
    timezone="America/New_York"
)
```

Source: [skyvern-frontend/src/routes/workflows/editor/panels/schedulePanel/CreateScheduleDialog.tsx:1-60]()
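To see why the timezone matters for a cron expression such as `0 9 * * *`, the sketch below computes the next daily run in UTC using only the standard library. It assumes a fixed hour and minute rather than a full cron parser, and `next_daily_run` is a hypothetical helper that is not part of Skyvern.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_run(now_utc: datetime, hour: int, minute: int, tz_name: str) -> datetime:
    """Return the next occurrence of hour:minute in `tz_name`, expressed in UTC."""
    local_now = now_utc.astimezone(ZoneInfo(tz_name))
    candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= local_now:
        # Today's slot has already passed in local time, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate.astimezone(timezone.utc)


# The same "daily at 9 AM New York time" schedule, evaluated in winter and in summer.
winter = next_daily_run(datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc), 9, 0, "America/New_York")
summer = next_daily_run(datetime(2026, 7, 15, 12, 0, tzinfo=timezone.utc), 9, 0, "America/New_York")
print(winter.isoformat(), summer.isoformat())
```

The two results differ by one hour in UTC because of daylight saving time, which is the behavior a timezone-aware schedule is meant to preserve.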

## Cloud Integration

### Browser Tunneling

Connect Skyvern Cloud to your local browser with existing cookies and extensions:

```bash
# Start Chrome with tunnel to Skyvern Cloud
skyvern browser serve --tunnel
```

This command creates a tunnel URL that can be used to run tasks with your local browser state. Source: [README.md:115-135]()

### Claude Desktop Integration

Skyvern provides downloadable `.mcpb` bundles for quick Claude Desktop setup:

```bash
./scripts/package-mcpb.sh 1.0.23
```

Source: [skyvern/cli/mcpb/claude_desktop/README.md:1-25]()

## Telemetry

By default, Skyvern collects basic usage statistics to understand how the platform is being used. To opt out:

```bash
export SKYVERN_TELEMETRY=false
```

Source: [README.md:35-38]()

## License

Skyvern's open-source repository is licensed under AGPL-3.0. The core automation logic is available in this repository, while anti-bot measures are available in the managed cloud offering. Source: [README.md:40-43]()

## Documentation and Support

- **Documentation**: [https://www.skyvern.com/docs](https://www.skyvern.com/docs)
- **Discord Community**: [https://discord.gg/fG2XXEuQX3](https://discord.gg/fG2XXEuQX3)
- **Email Support**: founders@skyvern.com
- **GitHub Issues**: [Issues labeled Help Wanted](https://github.com/skyvern-ai/skyvern/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)

## Related Documentation

For more detailed information on specific features:

- [MCP Integration Guide](../integrations/mcp/README.md)
- [CLI MCP Tools Reference](../skyvern/cli/mcp_tools/README.md)
- [Claude Desktop Setup](../skyvern/cli/mcpb/claude_desktop/README.md)
- [Browser Connection Configuration](https://www.skyvern.com/docs/optimization/browser-tunneling)

---

<a id='system-architecture'></a>

## System Architecture

### Related Pages

Related topics: [Introduction to Skyvern](#introduction), [Browser Automation Engine](#browser-automation), [Workflow System](#workflow-system)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [skyvern-frontend/src/routes/tasks/create/SavedTaskForm.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/tasks/create/SavedTaskForm.tsx)
- [skyvern-frontend/src/routes/tasks/create/CreateNewTaskForm.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/tasks/create/CreateNewTaskForm.tsx)
- [skyvern-frontend/src/routes/workflows/WorkflowScriptsPage.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/WorkflowScriptsPage.tsx)
- [skyvern-frontend/src/routes/workflows/editor/Workspace.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/editor/Workspace.tsx)
- [skyvern-frontend/src/components/BrowserStream.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/components/BrowserStream.tsx)
- [skyvern/forge/sdk/api/aws.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/api/aws.py)
- [README.md](https://github.com/Skyvern-AI/skyvern/blob/main/README.md)
</details>

# System Architecture

## Overview

Skyvern is an AI-powered web automation framework that enables programmatic browser control through natural language instructions. The system architecture consists of three primary layers: a React-based frontend interface, a Python backend API (Forge), and a browser automation engine. This document provides a comprehensive technical overview of the system's components, data flows, and integration patterns.

## High-Level Architecture

```mermaid
graph TD
    subgraph Frontend["Frontend Layer (React/TypeScript)"]
        UI[User Interface]
        Forms[Task & Workflow Forms]
        Stream[Browser Stream Viewer]
    end
    
    subgraph Backend["Backend Layer (Python/Forge)"]
        API[Forge API]
        Agent[AI Agent Engine]
        Workflow[Workflow Engine]
        Scheduler[Scheduler Service]
    end
    
    subgraph Browser["Browser Automation Layer"]
        BrowserMgr[Browser Manager]
        CDP[Chrome DevTools Protocol]
        BrowserInst[Browser Instances]
    end
    
    subgraph Storage["Storage & External Services"]
        S3[S3 Storage]
        DB[(Database)]
        LLM[LLM Providers]
    end
    
    UI --> Forms
    Forms --> API
    UI --> Stream
    Stream --> BrowserMgr
    API --> Agent
    API --> Workflow
    API --> Scheduler
    Agent --> BrowserMgr
    Agent --> LLM
    Workflow --> S3
    Scheduler --> DB
    BrowserMgr --> CDP
    CDP --> BrowserInst
```

## Frontend Architecture

The frontend is a React-based Single Page Application (SPA) located in the `skyvern-frontend/` directory. It provides user interfaces for task creation, workflow management, credentials handling, and real-time browser streaming.

### Component Structure

| Component Category | Location | Purpose |
|-------------------|----------|---------|
| Task Forms | `src/routes/tasks/create/` | Task creation and management forms |
| Workflow Editor | `src/routes/workflows/editor/` | Visual workflow building interface |
| Credentials | `src/routes/credentials/` | Credential and TOTP management |
| Schedules | `src/routes/schedules/` | Schedule viewing and configuration |
| Shared Components | `src/components/` | Reusable UI components |

### Key Frontend Components

#### BrowserStream Component

The `BrowserStream` component handles real-time browser visualization. It displays animated loading states while establishing connections and renders rotating messages to indicate progress.

```typescript
// skyvern-frontend/src/components/BrowserStream.tsx
<RotateThrough interval={7 * 1000}>
  <span>Hm, working on the connection...</span>
  <span>Hang tight, we're almost there...</span>
  <span>Just a moment...</span>
  <span>Backpropagating...</span>
  <span>Attention is all I need...</span>
  <span>Consulting the manual...</span>
</RotateThrough>
```

Source: [skyvern-frontend/src/components/BrowserStream.tsx]()

#### Task Forms

Task creation is handled through two primary form components:

1. **CreateNewTaskForm**: Used for creating new tasks with navigation goals
2. **SavedTaskForm**: Used for creating tasks from saved templates

Both forms support advanced settings including navigation payloads for specifying parameters, routes, or states:

```typescript
// Navigation Payload field in SavedTaskForm
<FormField
  control={form.control}
  name="navigationPayload"
  render={({ field }) => (
    <FormItem>
      <FormLabel>
        <h1 className="text-lg">Navigation Payload</h1>
        <h2 className="text-base text-slate-400">
          Specify important parameters, routes, or states
        </h2>
      </FormLabel>
      <CodeEditor {...field} language="json" />
    </FormItem>
  )}
/>
```

Source: [skyvern-frontend/src/routes/tasks/create/SavedTaskForm.tsx]()

#### Workflow Editor Workspace

The workflow editor workspace provides local execution capabilities with a dialog-based interface for running code locally:

```typescript
// skyvern-frontend/src/routes/workflows/editor/Workspace.tsx
function bash(command: string, code?: string) {
  return <code className="rounded bg-slate-800 px-1.5 py-0.5">{command}</code>;
}

// Installation and setup instructions
// 1. Install skyvern: pip install skyvern
// 2. Set up skyvern: skyvern quickstart
// 3. Run the code: skyvern run code --params '{...}' main.py
```

Source: [skyvern-frontend/src/routes/workflows/editor/Workspace.tsx]()

## Backend Architecture (Forge)

The Forge backend is the core Python application that handles task execution, workflow orchestration, and browser automation. Key modules include:

### Forge Application

The main application entry point in `skyvern/forge/forge_app.py` initializes the FastAPI application, configures middleware, and registers routes.

### AI Agent Engine

The agent system in `skyvern/forge/agent.py` processes natural language instructions and generates executable browser actions. The agent:

1. Receives task definitions and navigation goals
2. Interacts with LLM providers for decision-making
3. Generates action sequences for browser automation
4. Handles error recovery and retry logic

### Workflow Service

Workflow definitions are managed through the SDK service in `skyvern/forge/sdk/workflow/service.py`. This module provides:

- Workflow creation and versioning
- Script management with cache keys
- Execution history tracking

## Browser Automation Layer

### Browser Manager

The browser manager (`skyvern/webeye/browser_manager.py`) orchestrates browser instances using Chrome DevTools Protocol (CDP). It provides:

- Browser pool management
- Session persistence
- Screenshot and recording capabilities
- Multi-tab support

### Browser Configuration Options

The frontend exposes several browser configuration parameters:

| Parameter | Type | Purpose |
|-----------|------|---------|
| `proxyLocation` | string | Proxy server routing |
| `browserSessionId` | string | Persistent session identifier (format: `pbs_xxx`) |
| `cdpAddress` | string | Remote CDP endpoint (e.g., `http://127.0.0.1:9222`) |

Source: [skyvern-frontend/src/routes/tasks/create/PromptBox.tsx]()

## Data Storage and External Services

### AWS Integration

Skyvern uses AWS services for storage and cloud operations. The `S3Uri` class provides URI parsing for S3 resources:

```python
# skyvern/forge/sdk/api/aws.py
from urllib.parse import urlparse

class S3Uri:
    """Parse and manipulate S3 URIs."""
    
    def __init__(self, uri: str) -> None:
        self._parsed = urlparse(uri, allow_fragments=False)
    
    @property
    def bucket(self) -> str:
        return self._parsed.netloc
    
    @property
    def key(self) -> str:
        if self._parsed.query:
            return self._parsed.path.lstrip("/") + "?" + self._parsed.query
        return self._parsed.path.lstrip("/")
```

Source: [skyvern/forge/sdk/api/aws.py]()
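A short usage sketch may help show what the `bucket` and `key` properties return. The class is repeated here in minimal form so the example runs on its own; the bucket and object names are invented for illustration.

```python
from urllib.parse import urlparse


class S3Uri:
    """Minimal copy of the class shown above, repeated so this example is standalone."""

    def __init__(self, uri: str) -> None:
        # allow_fragments=False keeps any '#' as part of the key instead of dropping it
        self._parsed = urlparse(uri, allow_fragments=False)

    @property
    def bucket(self) -> str:
        return self._parsed.netloc

    @property
    def key(self) -> str:
        # urlparse splits on '?', so the query is glued back onto the object key
        if self._parsed.query:
            return self._parsed.path.lstrip("/") + "?" + self._parsed.query
        return self._parsed.path.lstrip("/")


plain = S3Uri("s3://example-bucket/artifacts/run-1/screenshot.png")
odd = S3Uri("s3://example-bucket/reports/q1?draft#v2.pdf")
print(plain.bucket, plain.key)
print(odd.bucket, odd.key)
```

The second URI shows the purpose of the extra handling: S3 object keys may legally contain `?` and `#`, so the parser's query and fragment splitting has to be undone to recover the full key.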

### Workflow Scripts Storage

Scripts are stored with metadata including cache keys and revision counts:

| Field | Description |
|-------|-------------|
| `Cache Key Value` | Unique identifier for the script |
| `Total Revisions` | Number of versions |
| `Runs` | Execution count |
| `Last Updated` | Most recent modification timestamp |

Source: [skyvern-frontend/src/routes/workflows/WorkflowScriptsPage.tsx]()

## Task Execution Model

### Task Creation Flow

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant Forge API
    participant Agent
    participant Browser
    
    User->>Frontend: Enter navigation goal
    User->>Frontend: Configure advanced settings
    User->>Frontend: Submit task
    Frontend->>Forge API: POST /v1/tasks
    Forge API->>Agent: Create task instance
    Agent->>Browser: Initialize browser session
    Browser-->>Agent: Session established
    Agent-->>Forge API: Task created
    Forge API-->>Frontend: Task response
    Frontend-->>User: Display task status
```

### Task Parameters

| Parameter | Description |
|-------|-------------|
| `Navigation Goal` | Primary instruction for the agent |
| `Navigation Payload` | Additional parameters, routes, states |
| `Proxy Location` | Optional proxy routing |
| `Browser Session ID` | Persistent session reference |

## Workflow System Architecture

### Workflow Components

| Component | Purpose |
|-----------|---------|
| Workflow Scripts | Cached code blocks with versioning |
| Schedules | Cron-based execution triggers |
| Workflow Runs | Individual execution instances |
| Workflow History | Version tracking and modification history |

### Schedule Configuration

Schedules support timezone-aware cron expressions:

```typescript
// Schedule display components
<div className="space-y-2">
  <span className="text-sm text-slate-400">Timezone</span>
  <span className="text-sm text-slate-50">{schedule.timezone}</span>
</div>
<div className="space-y-2">
  <span className="text-sm text-slate-400">Cron</span>
  <code className="font-mono text-xs">{schedule.cron_expression}</code>
</div>
```

Source: [skyvern-frontend/src/routes/schedules/ScheduleDetailPage.tsx]()

### Script Versioning

Each workflow script maintains a revision history:

```typescript
// Revision count calculation
{versions?.versions
  ? versions.versions.filter(
      (v) => v.version < (activeVersion ?? 0),
    ).length
  : 0}
<span className="text-sm font-normal">prior</span>
```

Source: [skyvern-frontend/src/routes/workflows/WorkflowScriptDetailPage.tsx]()
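The revision count in the snippet above is simply the number of versions older than the active one. A Python rendering of the same logic, assuming each version is a dictionary with a `version` number (the function name is hypothetical), might look like this:

```python
from typing import Optional


def count_prior_revisions(versions: Optional[list[dict]], active_version: Optional[int]) -> int:
    """Count versions strictly older than the active one, defaulting to 0 when data is missing."""
    if not versions:
        return 0
    # Mirrors `activeVersion ?? 0` in the TypeScript snippet
    active = active_version if active_version is not None else 0
    return sum(1 for v in versions if v["version"] < active)


history = [{"version": 1}, {"version": 2}, {"version": 3}]
prior = count_prior_revisions(history, 3)
print(prior)
```

Because a missing active version is treated as 0, no revision counts as "prior" until an active version is known.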

## Credentials and Authentication

### TOTP/2FA Management

Skyvern supports 2FA code management for authenticated workflows:

| Component | Description |
|-----------|-------------|
| `PushTotpCodeForm` | Form for submitting verification codes |
| Identifier Filter | Filter by email or phone |
| OTP Type Filter | Filter by type (TOTP/Magic Link) |

Source: [skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx]()

## LLM Provider Integration

Skyvern supports multiple LLM providers through a unified interface:

| Provider | Supported Models |
|----------|------------------|
| OpenAI | GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini |
| Anthropic | Claude 4.7 Opus, Claude 4.6, Claude 4.5 |
| Azure OpenAI | Any deployed GPT models |
| AWS Bedrock | Claude 4.7, Claude 4.6, Claude 4.5 |
| Gemini | Gemini 3.1 Pro, Gemini 3 Flash |

Source: [README.md]()

## Development and Deployment

### Local Development Setup

```bash
# 1. Create virtual environment
uv sync --group dev

# 2. Initialize configuration
uv run skyvern quickstart

# 3. Access UI
# Navigate to http://localhost:8080
```

Source: [README.md]()

### Running Workflows Locally

The workspace editor provides local execution capabilities:

```bash
# 1. Install skyvern
pip install skyvern

# 2. Set up skyvern
skyvern quickstart

# 3. Run workflow code
skyvern run code --params '{"param1": "val1"}' main.py
```

## System Data Flow

```mermaid
graph LR
    subgraph Input["User Input"]
        Prompt[Natural Language Prompt]
        Payload[Navigation Payload]
        Config[Configuration]
    end
    
    subgraph Processing["Forge Processing"]
        Parse[Parse & Validate]
        Agent[Agent Reasoning]
        Plan[Action Planning]
    end
    
    subgraph Execution["Browser Execution"]
        Navigate[Navigate]
        Interact[Interact]
        Extract[Extract Data]
    end
    
    subgraph Output["Results"]
        Screenshots[Screenshots]
        Data[Extracted Data]
        Logs[Execution Logs]
    end
    
    Input --> Parse
    Parse --> Agent
    Agent --> Plan
    Plan --> Execution
    Execution --> Output
    
    style Input fill:#e1f5fe
    style Processing fill:#fff3e0
    style Execution fill:#e8f5e9
    style Output fill:#f3e5f5
```

## Summary

The Skyvern system architecture follows a modular design with clear separation of concerns:

1. **Frontend Layer**: React SPA providing task creation, workflow editing, and real-time visualization
2. **Backend Layer**: Python FastAPI application handling agent orchestration, workflow management, and scheduling
3. **Browser Layer**: Chrome DevTools Protocol-based automation engine for web interaction
4. **Storage and External Services**: S3 for large objects, a database for structured data, and LLM providers for reasoning

The system supports multiple LLM providers, enables persistent browser sessions, and provides comprehensive workflow versioning and scheduling capabilities.

---

<a id='browser-automation'></a>

## Browser Automation Engine

### Related Pages

Related topics: [Introduction to Skyvern](#introduction), [AI-Powered Commands](#ai-commands)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [skyvern/webeye/__init__.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/__init__.py)
- [skyvern/webeye/browser_manager.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/browser_manager.py)
- [skyvern/webeye/browser_state.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/browser_state.py)
- [skyvern/webeye/real_browser_manager.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/real_browser_manager.py)
- [skyvern/webeye/actions/handler.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/actions/handler.py)
- [skyvern/webeye/cdp_connection.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/webeye/cdp_connection.py)
</details>

# Browser Automation Engine

## Overview

The Browser Automation Engine is the core component of Skyvern that enables AI agents to interact with websites through browser control. Instead of relying on fragile XPath-based selectors that break with website layout changes, Skyvern leverages Vision LLMs combined with Playwright and Chrome DevTools Protocol (CDP) to visually understand and interact with web pages.

The engine provides a unified interface for:

- Launching and managing browser sessions
- Navigating to URLs with configurable behavior
- Executing actions (click, type, scroll, hover, etc.)
- Capturing screenshots for LLM analysis
- Extracting structured data from web pages
- Handling multi-step workflows across websites

Source: [README.md:60-80]()

## Architecture

### System Components

```mermaid
graph TD
    A[Agent / Task Request] --> B[Browser Manager]
    B --> C[Real Browser Manager]
    C --> D[Playwright Browser]
    C --> E[CDP Connection]
    D --> F[Browser State]
    F --> G[Screenshot Capture]
    F --> H[DOM Extraction]
    E --> I[DevTools Protocol]
    G --> J[Vision LLM Analysis]
    J --> K[Action Handler]
    K --> C
```

### Module Structure

| Module | Purpose |
|--------|---------|
| `webeye/__init__.py` | Public API exports and core abstractions |
| `browser_manager.py` | Abstract browser manager interface |
| `real_browser_manager.py` | Concrete Playwright-based implementation |
| `browser_state.py` | Page state representation and snapshot |
| `actions/handler.py` | Action execution and coordination |
| `cdp_connection.py` | Chrome DevTools Protocol communication |

Source: [skyvern/forge/sdk/routes/agent_protocol.py:30-50]()

## Browser Session Management

### Session Lifecycle

```mermaid
stateDiagram-v2
    [*] --> Created: browser_session_id
    Created --> Launching: launch()
    Launching --> Ready: browser ready
    Ready --> Navigating: navigate(url)
    Navigating --> Ready: page loaded
    Ready --> Executing: perform_action()
    Executing --> Ready: action complete
    Ready --> Closed: close()
    Closed --> [*]
```

### Persistent Browser Sessions

Skyvern supports persistent browser sessions that maintain cookies, local storage, and login states across task executions:

```python
# Create a persistent browser session
browser_session_id = "pbs_xxxxxxxxxxxx"

# Reuse session for subsequent tasks
task = await skyvern.run_task(
    prompt="Download invoice from my account",
    browser_session_id=browser_session_id,
)
```

Source: [skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:40-60]()

### Session Configuration Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `browser_session_id` | string | ID of a persistent browser session |
| `cdp_address` | string | Browser DevTools address (e.g., `http://127.0.0.1:9222`) |
| `proxy_location` | string | Geographic proxy for requests |
| `extra_http_headers` | dict | Custom HTTP headers for requests |
| `totp_identifier` | string | 2FA identifier for authenticated flows |

Source: [skyvern/forge/sdk/routes/agent_protocol.py:40-55]()
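One way to keep these parameters consistent before sending them is to collect them in a small typed container. The sketch below is an illustration only: the `SessionSettings` class and its validation rules are assumptions (the `pbs_` prefix follows the format shown in this document, and the accepted URL schemes are a guess), not part of the Skyvern SDK.

```python
from dataclasses import asdict, dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class SessionSettings:
    """Hypothetical container for the session parameters listed in the table above."""
    browser_session_id: Optional[str] = None
    cdp_address: Optional[str] = None
    proxy_location: Optional[str] = None
    extra_http_headers: Optional[dict] = None
    totp_identifier: Optional[str] = None

    def validate(self) -> None:
        # Assumption: persistent session IDs use the "pbs_" prefix
        if self.browser_session_id is not None and not self.browser_session_id.startswith("pbs_"):
            raise ValueError("browser_session_id should look like 'pbs_xxx'")
        if self.cdp_address is not None:
            parsed = urlparse(self.cdp_address)
            # Assumption: only these schemes make sense for a DevTools address
            if parsed.scheme not in ("http", "https", "ws", "wss") or not parsed.netloc:
                raise ValueError("cdp_address should be a URL such as http://127.0.0.1:9222")

    def to_payload(self) -> dict:
        """Validate, then return only the fields that were actually set."""
        self.validate()
        return {name: value for name, value in asdict(self).items() if value is not None}


payload = SessionSettings(
    browser_session_id="pbs_123",
    cdp_address="http://127.0.0.1:9222",
).to_payload()
print(payload)
```

Dropping unset fields keeps the request body small and lets the server apply its own defaults.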

## Chrome DevTools Protocol Integration

### CDP Connection

The CDP connection module provides low-level access to Chrome's debugging interface:

```python
# CDP connection configuration
cdp_address = "http://127.0.0.1:9222"
```

Skyvern can connect to:

1. **Local Chrome** - Chrome with remote debugging enabled
2. **Existing Browser** - Your Chrome with cookies and extensions
3. **Cloud Browser** - A browser hosted on Skyvern's managed infrastructure

Source: [skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:65-80]()

### Remote Debugging Setup

```bash
# Step 1: Open Chrome with remote debugging
chrome --remote-debugging-port=9222

# Or use Skyvern's CLI helper
skyvern init browser
```

The browser then exposes its DevTools endpoint at `http://127.0.0.1:9222`. CDP commands themselves travel over a WebSocket connection whose URL Chrome advertises from this address.

Source: [README.md:45-65]()
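When Chrome runs with remote debugging enabled, its HTTP endpoint `/json/version` returns a JSON document whose `webSocketDebuggerUrl` field is the address CDP clients connect to. The sketch below shows that lookup with the standard library only. The helper names are hypothetical, and `fetch_websocket_url` needs a running Chrome instance, so the example exercises only the parsing step against a sample payload.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def version_endpoint(cdp_address: str) -> str:
    """Build the /json/version URL for a DevTools HTTP address."""
    return urljoin(cdp_address.rstrip("/") + "/", "json/version")


def websocket_url_from_payload(payload: str) -> str:
    """Extract the browser-level WebSocket URL from a /json/version response body."""
    return json.loads(payload)["webSocketDebuggerUrl"]


def fetch_websocket_url(cdp_address: str, timeout: float = 5.0) -> str:
    """Query a running browser; requires Chrome started with --remote-debugging-port."""
    with urlopen(version_endpoint(cdp_address), timeout=timeout) as response:
        return websocket_url_from_payload(response.read().decode("utf-8"))


# Sample payload with an invented browser ID, used instead of a live browser.
sample = '{"Browser": "Chrome/0.0.0.0", "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/abc"}'
endpoint = version_endpoint("http://127.0.0.1:9222")
ws_url = websocket_url_from_payload(sample)
print(endpoint, ws_url)
```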

## Browser State Representation

### State Components

```mermaid
graph LR
    A[Browser State] --> B[Current URL]
    A --> C[Screenshot]
    A --> D[DOM Tree]
    A --> E[Cookies]
    A --> F[Local Storage]
    A --> G[Viewport Info]
```

### Browser State Object

| Property | Description |
|----------|-------------|
| `url` | Current page URL |
| `title` | Page title |
| `screenshot` | Base64-encoded screenshot |
| `dom_tree` | Parsed DOM structure |
| `viewport` | Viewport dimensions |
| `elements` | Interactive element mapping |

Source: [skyvern/webeye/browser_state.py]()

## Action Handler

### Supported Actions

The action handler executes LLM-decided actions on the browser:

| Action | Parameters | Description |
|--------|------------|-------------|
| `click` | element_selector | Click on specified element |
| `type` | text, element_selector | Enter text into input field |
| `hover` | element_selector | Mouse hover over element |
| `scroll` | direction, amount | Scroll page view |
| `select` | value, element_selector | Select dropdown option |
| `press_key` | key | Press keyboard key |
| `wait` | duration | Wait for page to settle |
| `navigate` | url | Go to URL |
| `screenshot` | - | Capture current view |
| `extract` | schema | Extract data per schema |

Source: [skyvern/webeye/actions/handler.py]()
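A common way to route a table of actions like the one above is a dispatch dictionary keyed by action name. The following sketch illustrates that pattern with a fake page object. It does not reproduce `skyvern/webeye/actions/handler.py`; `FakePage`, `HANDLERS`, and `handle_action` are hypothetical names.

```python
from typing import Callable


class FakePage:
    """Stand-in for a real browser page; it only records what was requested."""

    def __init__(self) -> None:
        self.log: list[str] = []


def do_click(page: FakePage, params: dict) -> None:
    page.log.append(f"click {params['element_selector']}")


def do_type(page: FakePage, params: dict) -> None:
    page.log.append(f"type {params['text']!r} into {params['element_selector']}")


def do_navigate(page: FakePage, params: dict) -> None:
    page.log.append(f"navigate {params['url']}")


# Action name -> handler function; only three of the listed actions are sketched here.
HANDLERS: dict[str, Callable[[FakePage, dict], None]] = {
    "click": do_click,
    "type": do_type,
    "navigate": do_navigate,
}


def handle_action(page: FakePage, action: dict) -> None:
    """Look up the handler for an action and run it, rejecting unknown action names."""
    handler = HANDLERS.get(action["action"])
    if handler is None:
        raise ValueError(f"unsupported action: {action['action']}")
    handler(page, action.get("params", {}))


page = FakePage()
handle_action(page, {"action": "navigate", "params": {"url": "https://example.com"}})
handle_action(page, {"action": "type", "params": {"text": "hello", "element_selector": "#q"}})
handle_action(page, {"action": "click", "params": {"element_selector": "#submit"}})
print(page.log)
```

Rejecting unknown names explicitly matters here because the action names come from model output rather than from trusted code.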

### Action Execution Flow

```mermaid
sequenceDiagram
    participant LLM as Vision LLM
    participant AH as Action Handler
    participant BM as Browser Manager
    participant Browser as Playwright/CDP
    
    LLM->>AH: Decide action from screenshot
    AH->>BM: Execute action request
    BM->>Browser: CDP/Playwright command
    Browser-->>BM: Action result
    BM-->>AH: Updated browser state
    AH-->>LLM: State + screenshot for next decision
```
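The sequence above is an observe, decide, act loop bounded by a step limit. A minimal sketch with stubbed components follows. The function and parameter names are hypothetical, and the "LLM" is a scripted iterator rather than a real model call.

```python
from typing import Callable, Optional


def run_agent_loop(
    observe: Callable[[], dict],
    decide: Callable[[dict], Optional[dict]],
    act: Callable[[dict], None],
    max_steps: int = 10,
) -> int:
    """Repeat observe -> decide -> act until `decide` returns None or max_steps is reached.

    Returns the number of actions executed.
    """
    steps = 0
    while steps < max_steps:
        state = observe()        # e.g. URL plus screenshot in a real system
        action = decide(state)   # a vision LLM call in a real system
        if action is None:       # the model signals that the goal is complete
            break
        act(action)
        steps += 1
    return steps


# Stubs: a dictionary as the "browser" and a fixed script as the "LLM".
page = {"url": "about:blank"}
script = iter([{"action": "navigate", "url": "https://example.com"}, None])

steps_taken = run_agent_loop(
    observe=lambda: dict(page),
    decide=lambda state: next(script),
    act=lambda action: page.update(url=action["url"]),
)
print(steps_taken, page["url"])
```

The step limit is what keeps a confused model from looping forever, which is the role `max_steps` plays in a run request.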

## Browser Configuration Options

### Launch Configuration

| Option | Default | Description |
|--------|---------|-------------|
| `headless` | true | Run browser without visible window |
| `viewport_width` | 1280 | Browser viewport width |
| `viewport_height` | 720 | Browser viewport height |
| `user_agent` | auto | User agent string |
| `ignore_https_errors` | false | Allow invalid certs |

### Navigation Options

| Option | Type | Description |
|--------|------|-------------|
| `url` | string | Target URL |
| `navigation_payload` | object | Parameters, routes, or initial states |
| `follow_redirects` | boolean | Auto-follow HTTP redirects |
| `timeout` | int | Navigation timeout in ms |

Source: [skyvern-frontend/src/routes/tasks/detail/TaskParameters.tsx:20-40]()

## Integration with Agent System

### Agent Protocol Integration

The browser automation engine integrates with Skyvern's agent protocol:

```python
run_request=TaskRunRequest(
    engine=RunEngine.skyvern_v2,
    prompt=task_v2.prompt,
    url=task_v2.url,
    browser_session_id=run_request.browser_session_id,
    totp_identifier=task_v2.totp_identifier,
    proxy_location=task_v2.proxy_location,
    max_steps=run_request.max_steps,
)
```

### Workflow Block Execution

```mermaid
graph TD
    A[Workflow Run] --> B[Initialize Browser]
    B --> C[Go To URL Block]
    C --> D[Browser Navigation]
    D --> E[Action Block]
    E --> F[Extract/Process]
    F --> G{More Blocks?}
    G -->|Yes| E
    G -->|No| H[Close Browser]
    H --> I[Return Results]
```

Source: [skyvern-frontend/src/routes/workflows/workflowRun/TaskBlockParameters.tsx:10-50]()

## Advanced Features

### Custom Browser Connection

Connect Skyvern Cloud to a local browser running on your machine:

```bash
# Start Chrome with tunnel to Skyvern Cloud
skyvern browser serve --tunnel
```

This enables:
- Use existing cookies and logins
- Bypass VPN restrictions
- Full browser control via Skyvern API

Source: [README.md:80-100]()

### Proxy Support

Route browser traffic through geographic proxies:

```python
await skyvern.run_task(
    prompt="Search for local restaurants",
    proxy_location="us-east-1",  # or "eu-west-1", "ap-south-1"
)
```

Available proxy locations provide access to region-specific content.

Source: [skyvern-frontend/src/routes/tasks/create/PromptBox.tsx:25-35]()

## Error Handling

### Browser-Specific Errors

| Error Type | Cause | Recovery |
|------------|-------|----------|
| Navigation timeout | Page fails to load | Retry with extended timeout |
| Element not found | Dynamic content issues | Re-screenshot and retry |
| Browser crash | Memory/extension issues | Restart browser session |
| CDP connection lost | Network disruption | Reconnect and resume |
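
The "retry with extended timeout" recovery in the first row can be sketched as a small helper. This is a minimal illustration under stated assumptions: `retry_with_extended_timeout`, its parameters, and the doubling factor are hypothetical, and `navigate` is any caller-supplied function that raises `TimeoutError` when the page fails to load in time.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_extended_timeout(
    navigate: Callable[[int], T],
    initial_timeout_ms: int = 30_000,
    max_attempts: int = 3,
    backoff_factor: float = 2.0,
) -> T:
    """Call navigate(timeout_ms), growing the timeout after each TimeoutError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    timeout_ms = initial_timeout_ms
    for attempt in range(1, max_attempts + 1):
        try:
            return navigate(timeout_ms)
        except TimeoutError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last timeout to the caller.
            timeout_ms = int(timeout_ms * backoff_factor)
    raise AssertionError("unreachable")


# Stand-in navigation function that only succeeds with at least 100 seconds.
seen_timeouts: List[int] = []


def fake_navigate(timeout_ms: int) -> str:
    seen_timeouts.append(timeout_ms)
    if timeout_ms < 100_000:
        raise TimeoutError(f"page did not load within {timeout_ms} ms")
    return "loaded"


result = retry_with_extended_timeout(fake_navigate)
print(result, seen_timeouts)
```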

### Error Code Mapping

Custom error codes can be mapped for workflow-specific handling:

```python
task = await skyvern.run_task(
    prompt="Process order",
    error_code_mapping={
        "ERR_LOGIN_FAILED": "retry_with_2fa",
        "ERR_PAYMENT_DECLINED": "notify_user",
    },
)
```

Source: [skyvern-frontend/src/routes/workflows/workflowRun/TaskBlockParameters.tsx:45-65]()

## Security Considerations

### Browser Tunneling Security

> [!WARNING]
> Always use `--api-key` when exposing your browser via tunnel. Without it, anyone with the URL has full control of your browser.

Best practices:
- Never expose browser tunnels publicly
- Use authenticated connections only
- Rotate tunnel URLs frequently
- Limit browser session access

Source: [README.md:95-105]()

### Secure Credential Management

TOTP/2FA codes are handled through secure credential storage:

```python
task = await skyvern.run_task(
    prompt="Login to bank account",
    totp_identifier="banking_2fa_user@example.com",
)
```

The system extracts codes from push notifications or SMS and attaches them to relevant workflow steps.

Source: [skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:10-30]()

## Summary

The Browser Automation Engine provides Skyvern's core capability to automate web interactions using Vision LLMs. Key aspects:

- **Unified abstraction** over Playwright and CDP protocols
- **Persistent sessions** for maintaining login states
- **Visual understanding** via screenshot-based LLM analysis
- **Flexible configuration** for proxy, headers, and browser options
- **Integrated with workflows** for complex multi-step automation

This architecture enables Skyvern to operate on websites it has never seen before, adapt to layout changes automatically, and apply the same workflow across many different sites.

---

<a id='workflow-system'></a>

## Workflow System

### Related Pages

Related topics: [System Architecture](#system-architecture), [Database Models](#database-models)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [skyvern/forge/sdk/workflow/models/block.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/workflow/models/block.py)
- [skyvern/forge/sdk/workflow/models/workflow.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/workflow/models/workflow.py)
- [skyvern/forge/sdk/workflow/service.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/workflow/service.py)
- [skyvern/forge/sdk/workflow/workflow_definition_converter.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/workflow/workflow_definition_converter.py)
- [skyvern/services/workflow_service.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/services/workflow_service.py)
- [skyvern/services/block_service.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/services/block_service.py)
</details>

# Workflow System

## Overview

The Skyvern Workflow System is a core automation framework that enables chaining multiple tasks together to form cohesive units of work. It allows users to create complex multi-step automations by composing reusable building blocks called "workflow blocks."

## Architecture

```mermaid
graph TD
    subgraph "Frontend Layer"
        WE[Workflow Editor]
        RR[Run Workflow Form]
        DP[Debugger Panel]
    end
    
    subgraph "API Layer"
        AP[Agent Protocol Routes]
        WS[Webhook Endpoint]
    end
    
    subgraph "Service Layer"
        WFS[Workflow Service]
        BS[Block Service]
    end
    
    subgraph "Core SDK"
        WMS[Workflow Models]
        BMS[Block Models]
        WDC[Definition Converter]
        WSS[Workflow Service SDK]
    end
    
    WE --> AP
    RR --> AP
    AP --> WFS
    WFS --> BS
    WFS --> WMS
    BS --> BMS
    WDC --> WMS
    WDC --> BMS
    WSS --> WMS
    WSS --> BMS
```

## Workflow Model

### WorkflowDefinition

The `WorkflowDefinition` is the core model representing a workflow:

```python
class WorkflowDefinition(BaseModel):
    title: str
    description: Optional[str] = None
    blocks: List[WorkflowBlockDefinition]
    parameters: List[WorkflowParameter] = []
```

| Field | Type | Description |
|-------|------|-------------|
| `title` | `str` | Human-readable workflow title |
| `description` | `Optional[str]` | Optional description of workflow purpose |
| `blocks` | `List[WorkflowBlockDefinition]` | Ordered list of block definitions |
| `parameters` | `List[WorkflowParameter]` | Input parameters for workflow execution |

Source: [skyvern/forge/sdk/workflow/models/workflow.py]()

### WorkflowParameter

Workflows accept typed input parameters:

```python
class WorkflowParameter(BaseModel):
    key: str
    workflow_parameter_type: WorkflowParameterType
    default_value: Optional[Any] = None
    description: Optional[str] = None
    required: bool = True
```

| Field | Type | Description |
|-------|------|-------------|
| `key` | `str` | Parameter identifier |
| `workflow_parameter_type` | `WorkflowParameterType` | Type: `string`, `integer`, `float`, `boolean`, `json` |
| `default_value` | `Optional[Any]` | Default value if not provided |
| `description` | `Optional[str]` | Parameter description |
| `required` | `bool` | Whether parameter is mandatory |

Source: [skyvern/forge/sdk/workflow/models/workflow.py]()
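
To make the interplay of `required`, `default_value`, and the parameter type concrete, here is a minimal standard-library sketch of how supplied values might be resolved before a run. `ParameterSpec` and `resolve_parameters` are hypothetical stand-ins written for this illustration; the real model is the Pydantic class shown above, and its validation rules may differ.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Python types accepted for each parameter type named in the table.
_ACCEPTED_TYPES = {
    "string": (str,),
    "integer": (int,),
    "float": (int, float),
    "boolean": (bool,),
}


@dataclass
class ParameterSpec:
    """Simplified stand-in for WorkflowParameter (not the real model)."""

    key: str
    workflow_parameter_type: str
    default_value: Optional[Any] = None
    required: bool = True


def resolve_parameters(specs: List[ParameterSpec], supplied: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults, enforce required keys, and check basic types."""
    resolved: Dict[str, Any] = {}
    for spec in specs:
        if spec.key in supplied:
            value = supplied[spec.key]
        elif spec.default_value is not None:
            value = spec.default_value
        elif spec.required:
            raise ValueError(f"missing required parameter: {spec.key}")
        else:
            continue  # Optional parameter with no default: leave it out.

        if spec.workflow_parameter_type == "json":
            json.dumps(value)  # Raises TypeError if not JSON-serializable.
        else:
            accepted = _ACCEPTED_TYPES[spec.workflow_parameter_type]
            # bool is a subclass of int, so reject it for numeric types explicitly.
            is_bool_misuse = isinstance(value, bool) and bool not in accepted
            if is_bool_misuse or not isinstance(value, accepted):
                raise TypeError(f"{spec.key} must be of type {spec.workflow_parameter_type}")
        resolved[spec.key] = value
    return resolved


specs = [
    ParameterSpec("invoice_id", "string"),
    ParameterSpec("max_pages", "integer", default_value=5, required=False),
]
print(resolve_parameters(specs, {"invoice_id": "INV-001"}))
```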

## Block Types

Skyvern supports 23 block types for multi-step automations, each serving a specific purpose in workflow execution. The diagram and table in this section cover a subset of them.

```mermaid
graph TD
    A[Workflow Start] --> B{Block Type}
    B --> C[Browser Tasks]
    B --> D[Data Operations]
    B --> E[Control Flow]
    B --> F[External Integration]
    
    C --> C1[Task v2]
    C --> C2[Browser Action]
    C --> C3[Navigation]
    C --> C4[Login]
    
    D --> D1[Extraction]
    D --> D2[HTTP Request]
    D --> D3[File Download]
    
    E --> E1[Conditional]
    E --> E2[For Loop]
    E --> E3[Wait]
    
    F --> F1[Email]
    F --> F2[Text Prompt]
    F --> F3[Print Page]
```

### Supported Block Types

| Block Type | Purpose | Key Parameters |
|------------|---------|----------------|
| `Taskv2` | Multi-step browser automation | `prompt`, `url`, `max_steps`, `totp_verification_url`, `disable_cache` |
| `URL` | Navigate to a URL | `url`, `continue_on_failure` |
| `Wait` | Pause execution | `duration` |
| `TextPrompt` | LLM text generation | `prompt`, `llm_key`, `json_schema` |
| `HTTPRequest` | External API calls | `url`, `method`, `headers`, `body` |
| `Extraction` | Data extraction from page | `prompt`, `llm_key` |
| `Validation` | Validate extracted data | `prompt`, `error_codes` |
| `PrintPage` | Print to PDF | `format`, `landscape`, `print_background` |
| `HumanInteraction` | Pause for human input | `instructions`, `positive_descriptor`, `negative_descriptor` |
| `Conditional` | Branch logic | `expression` |
| `ForLoop` | Iterate over items | `items`, `variable_name` |
| `FileDownload` | Download files | `url`, `follow_redirects`, `save_response_as_file` |
| `BrowserAction` | Single browser action | `action_type`, `element_id` |
| `Login` | Handle authentication | `credential_id`, `totp_identifier` |

Source: [skyvern/forge/sdk/workflow/models/block.py](), [skyvern/cli/mcp_tools/README.md]()

## Block Execution Model

### WorkflowBlockExecution

Each block execution is tracked with its status:

```python
class WorkflowBlockExecution(BaseModel):
    workflow_run_id: str
    block_id: str
    block_type: WorkflowBlockType
    status: WorkflowBlockStatus
    output: Optional[Any] = None
    failure_reason: Optional[str] = None
    executed_branch_expression: Optional[str] = None
    executed_branch_result: Optional[bool] = None
    executed_branch_next_block: Optional[str] = None
```

| Status | Description |
|--------|-------------|
| `created` | Block added to execution queue |
| `queued` | Waiting for execution |
| `running` | Currently executing |
| `completed` | Successfully finished |
| `failed` | Execution failed |
| `cancelled` | Cancelled by user |
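
A client that polls block executions usually needs to know when to stop. The sketch below models the six statuses as an enum and adds a helper for that check. `BlockStatus` is a stand-in for `WorkflowBlockStatus`, and treating `completed`, `failed`, and `cancelled` as terminal is an assumption of this example rather than something the source states.

```python
from enum import Enum


class BlockStatus(str, Enum):
    """Stand-in for WorkflowBlockStatus using the values from the table."""

    CREATED = "created"
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: a block in one of these states will not change state again.
TERMINAL_STATUSES = frozenset(
    {BlockStatus.COMPLETED, BlockStatus.FAILED, BlockStatus.CANCELLED}
)


def is_terminal(status: BlockStatus) -> bool:
    """Return True when polling for this block can stop."""
    return status in TERMINAL_STATUSES


in_flight = [s.value for s in BlockStatus if not is_terminal(s)]
print(in_flight)
```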

### Block Parameters by Type

#### Taskv2BlockParameters

```python
class Taskv2BlockParameters(BaseModel):
    prompt: str
    url: Optional[str] = None
    max_steps: Optional[int] = None
    totp_verification_url: Optional[str] = None
    totp_identifier: Optional[str] = None
    disable_cache: bool = False
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | `str` | - | Navigation goal for the browser agent |
| `url` | `Optional[str]` | `None` | Starting URL for navigation |
| `max_steps` | `Optional[int]` | `None` | Maximum steps before stopping |
| `totp_verification_url` | `Optional[str]` | `None` | URL for 2FA verification |
| `totp_identifier` | `Optional[str]` | `None` | Identifier for TOTP credentials |
| `disable_cache` | `bool` | `False` | Disable action caching |

Source: [skyvern/forge/sdk/workflow/models/block.py]()

#### GotoUrlBlockParameters

```python
class GotoUrlBlockParameters(BaseModel):
    url: str
    continue_on_failure: bool = False
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `url` | `str` | - | Target URL to navigate to |
| `continue_on_failure` | `bool` | `False` | Continue workflow on navigation failure |

#### WaitBlockParameters

```python
class WaitBlockParameters(BaseModel):
    duration: int
```

#### PrintPageBlockParameters

```python
class PrintPageBlockParameters(BaseModel):
    format: PrintFormat = PrintFormat.A4
    landscape: bool = False
    print_background: bool = False
    include_timestamp: bool = True
    custom_filename: Optional[str] = None
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `format` | `PrintFormat` | `A4` | Page format: `A4`, `Letter`, `Legal` |
| `landscape` | `bool` | `False` | Use landscape orientation |
| `print_background` | `bool` | `False` | Print background colors |
| `include_timestamp` | `bool` | `True` | Include timestamp in footer |
| `custom_filename` | `Optional[str]` | `None` | Custom output filename |

#### HumanInteractionBlockParameters

```python
class HumanInteractionBlockParameters(BaseModel):
    instructions: Optional[str] = None
    positive_descriptor: Optional[str] = None
    negative_descriptor: Optional[str] = None
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `instructions` | `Optional[str]` | Instructions for the human |
| `positive_descriptor` | `Optional[str]` | Label for positive confirmation |
| `negative_descriptor` | `Optional[str]` | Label for negative/cancellation action |

## Workflow Execution Flow

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant WorkflowService
    participant BlockService
    participant Executor

    Client->>API: POST /workflows/{id}/run
    API->>WorkflowService: create_workflow_run()
    WorkflowService->>WorkflowService: Validate parameters
    WorkflowService->>WorkflowService: Create WorkflowRun record
    WorkflowService-->>API: WorkflowRun
    
    loop For each block
        API->>BlockService: execute_block()
        BlockService->>Executor: Process block
        Executor-->>BlockService: Block result
        BlockService-->>API: WorkflowBlockExecution
    end
    
    API->>Client: Webhook callback (optional)
```

## Workflow Service API

### Core Operations

| Method | Description | Source |
|--------|-------------|--------|
| `create_workflow` | Create new workflow | [skyvern/forge/sdk/workflow/service.py]() |
| `get_workflow` | Retrieve workflow by ID | [skyvern/forge/sdk/workflow/service.py]() |
| `update_workflow` | Update workflow definition | [skyvern/forge/sdk/workflow/service.py]() |
| `delete_workflow` | Delete workflow | [skyvern/forge/sdk/workflow/service.py]() |
| `list_workflows` | List all workflows | [skyvern/forge/sdk/workflow/service.py]() |
| `run_workflow` | Execute workflow | [skyvern/services/workflow_service.py]() |
| `cancel_workflow_run` | Cancel running workflow | [skyvern/services/workflow_service.py]() |

### Running Workflows

Workflows can be executed via:

1. **API**: `POST /workflows/{workflow_id}/run`
2. **CLI**: `skyvern workflow run <workflow_id>`
3. **MCP**: `skyvern_workflow_run` tool
4. **Schedule**: Cron-based scheduled execution

### Run Parameters

When running a workflow, the following parameters can be specified:

| Parameter | Type | Description |
|-----------|------|-------------|
| `parameters` | `Dict[str, Any]` | Workflow input parameters |
| `webhook_callback_url` | `Optional[str]` | URL for result callback |
| `proxy_location` | `Optional[ProxyLocation]` | Geographic proxy location |
| `run_with` | `RunWith` | `agent` or `code` execution mode |
| `ai_fallback` | `bool` | Fall back to AI if code generation fails |

Source: [skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx]()
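
As a sketch of how these parameters might be assembled into a request body for `POST /workflows/{workflow_id}/run`, the helper below serializes them to JSON. The field names come from the table above; the exact wire format, the `build_run_request` helper itself, and the defaults for `run_with` and `ai_fallback` are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_run_request(
    parameters: Dict[str, Any],
    webhook_callback_url: Optional[str] = None,
    proxy_location: Optional[str] = None,
    run_with: str = "agent",  # assumed default
    ai_fallback: bool = False,  # assumed default
) -> str:
    """Serialize run parameters to a JSON body, omitting unset optional fields."""
    if run_with not in ("agent", "code"):
        raise ValueError("run_with must be 'agent' or 'code'")
    body: Dict[str, Any] = {
        "parameters": parameters,
        "run_with": run_with,
        "ai_fallback": ai_fallback,
    }
    if webhook_callback_url is not None:
        body["webhook_callback_url"] = webhook_callback_url
    if proxy_location is not None:
        body["proxy_location"] = proxy_location
    return json.dumps(body, sort_keys=True)


request_body = build_run_request({"invoice_id": "INV-001"}, run_with="code", ai_fallback=True)
print(request_body)
```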

## Webhook Integration

Workflows support webhook callbacks for asynchronous result delivery:

```mermaid
graph LR
    A[Workflow Run] --> B{Complete?}
    B -->|Yes| C[Send webhook]
    B -->|No| D[Retry queue]
    D --> B
    C --> E[Customer Endpoint]
```

The webhook payload includes:

```python
{
    "workflow_run_id": str,
    "workflow_id": str,
    "status": WorkflowRunStatus,
    "output": Optional[Any],
    "failure_reason": Optional[str],
    "created_at": datetime,
    "modified_at": datetime,
    "blocks": List[WorkflowBlockExecution]
}
```
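
A receiving endpoint typically reduces this payload to the few fields it acts on. The following sketch assumes the payload arrives as decoded JSON, that statuses are serialized as plain strings such as `"failed"`, and that each entry in `blocks` carries a `block_id`; `summarize_run` is a hypothetical helper, not part of Skyvern.

```python
from typing import Any, Dict, List


def summarize_run(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a webhook payload to its status and the blocks that failed."""
    failed: List[str] = [
        block["block_id"]
        for block in payload.get("blocks", [])
        if block.get("status") == "failed"
    ]
    return {
        "workflow_run_id": payload["workflow_run_id"],
        "status": payload["status"],
        "failed_blocks": failed,
        "failure_reason": payload.get("failure_reason"),
    }


# Illustrative payload; identifiers and values are made up.
sample = {
    "workflow_run_id": "wr_123",
    "workflow_id": "w_456",
    "status": "failed",
    "failure_reason": "login rejected",
    "blocks": [
        {"block_id": "block_1", "status": "completed"},
        {"block_id": "block_2", "status": "failed"},
    ],
}
summary = summarize_run(sample)
print(summary)
```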

## MCP Integration

Skyvern provides MCP (Model Context Protocol) tools for workflow management:

### Available Tools

| Tool | Description |
|------|-------------|
| `skyvern_workflow_create` | Create new workflow |
| `skyvern_workflow_list` | List all workflows |
| `skyvern_workflow_get` | Get workflow details |
| `skyvern_workflow_run` | Execute workflow |
| `skyvern_workflow_status` | Check run status |
| `skyvern_workflow_update` | Update workflow |
| `skyvern_workflow_delete` | Delete workflow |
| `skyvern_workflow_cancel` | Cancel running workflow |
| `skyvern_block_schema` | Get block type schema |
| `skyvern_block_validate` | Validate block definition |

Source: [skyvern/cli/mcp_tools/README.md]()

## Frontend Components

### Workflow Editor

Located at `/workflows/{workflow_id}/build`, the editor provides:

- Visual block composition
- Block parameter configuration
- Workflow validation
- Preview mode

### Run Workflow Form

Located at `/workflows/{workflow_id}/run`, supports:

- Parameter input with type validation
- Run method selection (`agent` or `code`)
- Webhook URL configuration
- Proxy location selection

### Debugger Panel

Located at `/workflows/{workflow_id}/debug`, provides:

- Real-time execution status
- Block-by-block output inspection
- Extracted information viewer
- Failure reason analysis

### Workflow Run Timeline

Displays execution history with:

- Block status indicators
- Execution timestamps
- Extracted data per block
- Navigation to diagnostics

## Data Flow

```mermaid
graph TD
    subgraph "Definition Layer"
        WD[Workflow Definition]
        BD[Block Definitions]
        WP[Workflow Parameters]
    end
    
    subgraph "Execution Layer"
        WR[Workflow Run]
        BR[Block Executions]
        ST[State Management]
    end
    
    subgraph "Output Layer"
        OT[Output Data]
        ER[Error Reports]
        WH[Webhook Events]
    end
    
    WD --> WR
    BD --> BR
    WP --> WR
    BR --> ST
    ST --> OT
    BR -->|on failure| ER
    WR --> WH
```

## Key Features

### Conditional Execution

The `Conditional` block evaluates expressions and branches workflow execution:

```python
class ConditionalBlockParameters(BaseModel):
    expression: str  # e.g., "data.status == 'approved'"
```

After evaluation, the system records:
- `executed_branch_expression`: The evaluated expression
- `executed_branch_result`: Boolean result
- `executed_branch_next_block`: Next block ID based on result

### For Loop Iteration

The `ForLoop` block iterates over collections:

```python
class ForLoopBlockParameters(BaseModel):
    items: List[Any]
    variable_name: str  # Variable to expose in loop context
```
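
The role of `variable_name` is easiest to see in a small sketch: each item is exposed to the loop body under that name in a per-iteration context. `run_for_loop` and the dictionary-based context are hypothetical simplifications, not how Skyvern's executor is necessarily implemented.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_for_loop(
    items: Iterable[Any],
    variable_name: str,
    body: Callable[[Dict[str, Any]], Any],
    outer_context: Dict[str, Any],
) -> List[Any]:
    """Run body once per item, exposing the item under variable_name."""
    outputs = []
    for item in items:
        # Build a fresh context per iteration so the loop variable
        # never persists in the outer context.
        context = {**outer_context, variable_name: item}
        outputs.append(body(context))
    return outputs


outer = {"base_url": "https://example.com"}
urls = run_for_loop(
    items=["INV-001", "INV-002"],
    variable_name="invoice_id",
    body=lambda ctx: f"{ctx['base_url']}/invoices/{ctx['invoice_id']}",
    outer_context=outer,
)
print(urls)
```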

### Error Handling

Blocks support a `continue_on_failure` flag for graceful degradation:

```python
class GotoUrlBlockParameters(BaseModel):
    url: str
    continue_on_failure: bool = False
```

When enabled, the workflow continues to the next block after a failure.
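
The effect of the flag on a sequence of blocks can be sketched as follows. `Block` and `run_blocks` are hypothetical stand-ins for illustration only; the sketch assumes a failure surfaces as an exception and that a block without the flag halts the run.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Block:
    """Minimal stand-in for a workflow block (hypothetical)."""

    label: str
    run: Callable[[], None]
    continue_on_failure: bool = False


def run_blocks(blocks: List[Block]) -> List[Tuple[str, str]]:
    """Run blocks in order; stop at the first failure that is not tolerated."""
    results = []
    for block in blocks:
        try:
            block.run()
            results.append((block.label, "completed"))
        except Exception:
            results.append((block.label, "failed"))
            if not block.continue_on_failure:
                break  # Remaining blocks are never started.
    return results


def fail() -> None:
    raise RuntimeError("navigation failed")


outcome = run_blocks([
    Block("open_optional_banner", fail, continue_on_failure=True),
    Block("open_dashboard", lambda: None),
    Block("open_reports", fail),
    Block("never_reached", lambda: None),
])
print(outcome)
```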

### TOTP/2FA Support

Browser tasks can handle two-factor authentication:

```python
class Taskv2BlockParameters(BaseModel):
    # Only the TOTP-related fields are shown here.
    totp_verification_url: Optional[str] = None
    totp_identifier: Optional[str] = None
```

Users can push verification codes via the frontend or API.

## Security Considerations

### Webhook Signature Validation

Webhook endpoints must validate signatures:

```python
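import hmac

from fastapi import HTTPException, Request, Response

# `generate_skyvern_signature` and `settings` are assumed to be imported from the Skyvern codebase.


async def webhook(request: Request) -> Response:
    signature = request.headers.get("x-skyvern-signature")
    timestamp = request.headers.get("x-skyvern-timestamp")

    if not signature or not timestamp:
        raise HTTPException(status_code=400)

    payload = await request.body()
    expected = generate_skyvern_signature(
        payload.decode("utf-8"),
        settings.SKYVERN_API_KEY,
    )
    # Constant-time comparison; reject payloads whose signature does not match.
    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401)
    return Response(status_code=200)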
```
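
The handler above relies on `generate_skyvern_signature`, whose implementation is not shown in this document. As a self-contained illustration of the general technique, the sketch below assumes a hex-encoded HMAC-SHA256 of the raw payload keyed with the API key; `sign_payload` and `verify_signature` are hypothetical names, and the real scheme may differ.

```python
import hashlib
import hmac


def sign_payload(payload: str, api_key: str) -> str:
    """Hex-encoded HMAC-SHA256 of the payload, keyed with the API key (assumed scheme)."""
    return hmac.new(
        api_key.encode("utf-8"),
        payload.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def verify_signature(payload: str, api_key: str, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid leaking timing information."""
    expected = sign_payload(payload, api_key)
    return hmac.compare_digest(expected, received_signature)


body = '{"workflow_run_id": "wr_123", "status": "completed"}'
signature = sign_payload(body, "example-api-key")
print(verify_signature(body, "example-api-key", signature))        # untouched payload
print(verify_signature(body + " ", "example-api-key", signature))  # tampered payload
```

Verification must run against the raw request body exactly as received, since re-serializing the JSON can change whitespace or key order and invalidate the signature.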

### Credential Management

Workflows requiring authentication reference stored credentials by ID rather than embedding sensitive data.

## CLI Commands

```bash
# Switch between environments
skyvern mcp switch

# List workflows
skyvern workflow list

# Run workflow
skyvern workflow run <workflow_id>
```

## See Also

- [Task System](tasks.md) - Single-task automation
- [Browser Agent](browser-agent.md) - AI-powered web navigation
- [Credential Management](credentials.md) - Secure credential storage
- [Scheduling System](schedules.md) - Cron-based workflow execution

---

<a id='ai-commands'></a>

## AI-Powered Commands

### Related Pages

Related topics: [Browser Automation Engine](#browser-automation), [LLM Provider Configuration](#llm-providers)

<details>
<summary>Relevant source files</summary>

The following source files were used to generate this page:

- [skyvern/forge/agent_functions.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/agent_functions.py)
- [skyvern/forge/sdk/copilot/agent.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/copilot/agent.py)
- [skyvern/forge/sdk/copilot/tools.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/copilot/tools.py)
- [skyvern/library/skyvern_browser_page_ai.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/library/skyvern_browser_page_ai.py)
- [skyvern/cli/mcp_tools/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcp_tools/README.md)
- [skyvern/cli/skills/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/skills/README.md)
</details>

# AI-Powered Commands

Skyvern provides a comprehensive suite of AI-powered commands that enable intelligent browser automation through natural language instructions. These commands leverage Large Language Models (LLMs) to interpret user intent and execute complex browser interactions autonomously.

## Overview

AI-Powered Commands in Skyvern replace traditional scripted automation with intent-based browser control. Instead of writing precise step-by-step instructions, users describe what they want to achieve in natural language, and Skyvern's AI agents interpret and execute the necessary browser actions.

The system integrates with multiple LLM providers including OpenAI (GPT-4.1, o3, o4-mini), Anthropic (Claude 4.5-4.7), Azure OpenAI, AWS Bedrock, and Google Gemini to power the AI decision-making engine.

## Architecture

```mermaid
graph TD
    A[User Input / Natural Language] --> B[Copilot Agent]
    B --> C[LLM Provider]
    C --> D[Decision Engine]
    D --> E[Browser Actions]
    E --> F[Element Interaction]
    F --> G[State Validation]
    G --> H[Continue / Complete]
    
    B --> I[Tool Selection]
    I --> J[Data Extraction]
    I --> K[Visual Validation]
    I --> L[Network Monitoring]
    
    subgraph Tools
        J
        K
        L
    end
```

## Core Components

### Copilot Agent (`skyvern/forge/sdk/copilot/agent.py`)

The Copilot Agent serves as the central orchestration layer for AI-powered commands. It maintains conversation context, manages tool selection, and coordinates the execution flow between user instructions and browser actions.

| Component | Responsibility |
|-----------|----------------|
| Context Manager | Maintains conversation history and state |
| Tool Selector | Chooses appropriate tools based on intent |
| Action Executor | Executes browser actions |
| Response Formatter | Formats AI responses for user consumption |

### Tool System (`skyvern/forge/sdk/copilot/tools.py`)

Skyvern's tool system provides a comprehensive set of primitives for browser automation. Each tool is designed to handle specific interaction patterns while being composable for complex workflows.

### Browser Page AI (`skyvern/library/skyvern_browser_page_ai.py`)

This module provides the foundational AI capabilities for understanding and interacting with web page content. It includes element identification, content extraction, and visual analysis capabilities.

## Navigation and Interaction Commands

### Element Interactions

| Command | Purpose | Parameters |
|---------|---------|------------|
| `skyvern_click` | Click on identified elements | `element_selector`, `options` |
| `skyvern_type` | Enter text into input fields | `text`, `element_selector` |
| `skyvern_hover` | Hover over elements | `element_selector` |
| `skyvern_scroll` | Scroll within page or elements | `direction`, `amount` |
| `skyvern_select_option` | Select dropdown options | `value`, `element_selector` |
| `skyvern_press_key` | Press keyboard keys | `key`, `modifiers` |
| `skyvern_drag` | Drag and drop operations | `source`, `target` |
| `skyvern_wait` | Wait for conditions | `condition`, `timeout` |
| `skyvern_file_upload` | Upload files to elements | `file_path`, `element_selector` |

### Browser Navigation

| Command | Purpose |
|---------|---------|
| `skyvern_navigate` | Navigate to URLs |
| `skyvern_go_back` | Navigate browser history back |
| `skyvern_go_forward` | Navigate browser history forward |
| `skyvern_reload` | Reload current page |

### Tab and Frame Management

| Command | Purpose |
|---------|---------|
| `skyvern_tab_new` | Open new browser tab |
| `skyvern_tab_list` | List all open tabs |
| `skyvern_tab_switch` | Switch to specific tab |
| `skyvern_tab_close` | Close current or specified tab |
| `skyvern_tab_wait_for_new` | Wait for new tab to open |
| `skyvern_frame_list` | List all iframes on page |
| `skyvern_frame_switch` | Switch to iframe context |

## Data Extraction Commands

Skyvern provides multiple methods for extracting structured data from web pages:

### Structured Extraction

| Command | Purpose | Output Format |
|---------|---------|---------------|
| `skyvern_extract` | Extract structured data | JSON with defined schema |
| `skyvern_get_html` | Get page HTML | Raw HTML string |
| `skyvern_get_value` | Get form element values | String or JSON |

### Visual Extraction

| Command | Purpose |
|---------|---------|
| `skyvern_screenshot` | Capture full or partial screenshots |
| `skyvern_get_styles` | Get computed CSS styles |
| `skyvern_find` | Find elements by visual similarity |

### Content Analysis

The extraction system uses AI to understand page structure and extract relevant information based on user intent. It supports:

- Dynamic schema generation based on natural language requests
- Multi-field extraction from complex layouts
- Nested data structures and repeating elements
- Confidence scoring for extracted values

## Validation and Verification

### AI-Powered Validation

| Command | Purpose |
|---------|---------|
| `skyvern_validate` | Validate element states or page conditions |
| `skyvern_evaluate` | Run JavaScript for custom validation |
| `skyvern_evaluate_async` | Execute async JavaScript operations |

Validation commands use the LLM to interpret complex conditions that would be difficult to express in traditional selectors or XPath expressions.

### Screenshot Validation

The screenshot command supports comparison against reference images and can detect visual regressions:

```python
result = await skyvern.screenshot(
    full_page=True,
    compare_with="baseline.png",
    threshold=0.1  # 10% allowed difference
)
```
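
One plausible reading of `threshold` is the fraction of pixels allowed to differ from the baseline. The sketch below illustrates that interpretation on in-memory pixel lists; `diff_ratio` and `within_threshold` are hypothetical helpers, and the comparison Skyvern actually performs is not documented here.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def diff_ratio(baseline: Sequence[Pixel], current: Sequence[Pixel]) -> float:
    """Fraction of pixels that differ between two equally sized images."""
    if len(baseline) != len(current):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    changed = sum(1 for a, b in zip(baseline, current) if a != b)
    return changed / len(baseline)


def within_threshold(
    baseline: Sequence[Pixel], current: Sequence[Pixel], threshold: float = 0.1
) -> bool:
    """True when the share of changed pixels does not exceed the threshold."""
    return diff_ratio(baseline, current) <= threshold


white: Pixel = (255, 255, 255)
red: Pixel = (255, 0, 0)
baseline_image: List[Pixel] = [white] * 10
one_changed = [red] + [white] * 9
two_changed = [red, red] + [white] * 8
print(within_threshold(baseline_image, one_changed))
print(within_threshold(baseline_image, two_changed))
```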

## Network and Console Commands

### Network Monitoring

| Command | Purpose |
|---------|---------|
| `skyvern_network_requests` | List network requests |
| `skyvern_network_request_detail` | Get request/response details |
| `skyvern_network_route` | Intercept and modify requests |
| `skyvern_network_unroute` | Remove request interception |
| `skyvern_har_start` | Start HAR recording |
| `skyvern_har_stop` | Stop and export HAR data |

### Console Inspection

| Command | Purpose |
|---------|---------|
| `skyvern_console_messages` | Retrieve console logs |
| `skyvern_get_errors` | Get JavaScript errors |
| `skyvern_handle_dialog` | Handle browser dialogs (alert, confirm, prompt) |

## Authentication and Credentials

### Login Commands

Skyvern supports intelligent login flows with multiple authentication methods:

| Command | Purpose |
|---------|---------|
| `skyvern_login` | Execute automated login |
| `skyvern_credential_list` | List stored credentials |
| `skyvern_credential_get` | Retrieve specific credentials |
| `skyvern_credential_delete` | Remove stored credentials |

### Credential Management

The credential system integrates with:

- **Skyvern Vault**: Built-in secure storage
- **Bitwarden**: Enterprise password management
- **1Password**: Team password sharing
- **Azure Key Vault**: Cloud credential storage

### Two-Factor Authentication

Skyvern handles 2FA/TOTP flows automatically:

1. Detects the OTP requirement during login
2. Extracts codes from configured sources
3. Supports magic link authentication
4. Handles push notifications (see `skyvern/cli/skills/README.md`)

## State Management

### Session State

| Command | Purpose |
|---------|---------|
| `skyvern_state_save` | Save current browser state |
| `skyvern_state_load` | Restore saved state |
| `skyvern_get_session_storage` | Read session storage |
| `skyvern_set_session_storage` | Write to session storage |
| `skyvern_clear_session_storage` | Clear session storage |
| `skyvern_clear_local_storage` | Clear local storage |

### Clipboard Operations

| Command | Purpose |
|---------|---------|
| `skyvern_clipboard_read` | Read from clipboard |
| `skyvern_clipboard_write` | Write to clipboard |

## Workflow Integration

AI-Powered Commands can be orchestrated into complete workflows:

```mermaid
graph LR
    A[Navigation] --> B[Authentication]
    B --> C[Data Extraction]
    C --> D[Validation]
    D --> E{Success?}
    E -->|No| F[Retry Logic]
    F --> B
    E -->|Yes| G[Output Results]
```

### Workflow Commands

| Command | Purpose |
|---------|---------|
| `skyvern_workflow_create` | Create new workflow |
| `skyvern_workflow_list` | List available workflows |
| `skyvern_workflow_get` | Get workflow details |
| `skyvern_workflow_run` | Execute workflow |
| `skyvern_workflow_cancel` | Cancel running workflow |

## Agent Functions (`skyvern/forge/agent_functions.py`)

The agent functions module provides the core building blocks for AI-driven browser automation:

### Function Categories

1. **Navigation Functions**: Handle URL navigation, back/forward, and reload
2. **Interaction Functions**: Click, type, hover, scroll, and element manipulation
3. **Extraction Functions**: HTML retrieval, value extraction, screenshot capture
4. **Validation Functions**: Element presence, state verification, screenshot comparison
5. **State Functions**: Local/session storage, clipboard, authentication state

### Function Interface

All agent functions follow a consistent interface:

```python
async def agent_function(
    task_id: str,
    step_id: str,
    **kwargs  # Function-specific parameters
) -> AgentFunctionCallResult:
    """
    Execute AI-powered browser action
    
    Returns:
        AgentFunctionCallResult with:
        - success: bool
        - extracted_data: Optional[dict]
        - screenshot: Optional[str] base64
        - error: Optional[str]
    """
```
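
To show the interface end to end, the sketch below defines a stand-in result type with the four fields named in the docstring and one toy function that follows the same signature. `get_page_title` and its `title` keyword are hypothetical, and the real `AgentFunctionCallResult` may carry more fields.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentFunctionCallResult:
    """Stand-in result type with the four fields named in the docstring."""

    success: bool
    extracted_data: Optional[dict] = None
    screenshot: Optional[str] = None  # base64-encoded image
    error: Optional[str] = None


async def get_page_title(task_id: str, step_id: str, **kwargs) -> AgentFunctionCallResult:
    """Hypothetical agent function: report the title supplied by the caller."""
    title = kwargs.get("title")
    if title is None:
        # Failures are reported in the result rather than raised.
        return AgentFunctionCallResult(success=False, error="no title available")
    return AgentFunctionCallResult(success=True, extracted_data={"title": title})


ok_result = asyncio.run(get_page_title("tsk_1", "stp_1", title="Invoices"))
failed_result = asyncio.run(get_page_title("tsk_1", "stp_2"))
print(ok_result)
print(failed_result)
```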

## Integration with Skills Package

The skills package (`skyvern/cli/skills/README.md`) bundles AI-powered commands for coding agents:

### Available Skills

| Skill | Description |
|-------|-------------|
| `qa` | QA test frontend changes in real browser |
| `skyvern` | Full CLI reference for browser automation |
| `smoke-test` | CI-oriented smoke testing |

### QA Skill Workflow

```mermaid
graph TD
    A[git diff] --> B[Generate Tests]
    B --> C[Run Against Dev Server]
    C --> D[Report Results]
    D --> E{Screenshots}
    E --> F[Pass/Fail Status]
```

## Configuration

### Environment Variables

| Variable | Purpose | Default |
|----------|---------|---------|
| `SKYVERN_TELEMETRY` | Enable/disable usage telemetry | `true` |
| `SKYVERN_BASE_URL` | API endpoint for Skyvern Cloud | Local server |
| `SKYVERN_API_KEY` | Authentication key | None |

### Browser Configuration

| Parameter | Purpose |
|-----------|---------|
| `BROWSER_TYPE` | Browser engine (chromium, firefox, webkit) |
| `BROWSER_HEADLESS` | Run without visible UI |
| `BROWSER_REMOTE_DEBUGGING_URL` | Connect to remote browser instance |

## Best Practices

### Effective Command Usage

1. **Be Specific with Selectors**: Use precise element identifiers when available
2. **Add Validation Steps**: Always validate state changes after actions
3. **Handle Timing**: Use wait commands for dynamic content
4. **Screenshot for Debugging**: Capture screenshots at key decision points

### Error Handling

```python
try:
    result = await skyvern.act("click", selector="#submit-button")
    if not result.success:
        # Fallback or retry logic
        await skyvern.validate("element_visible", selector="#error-message")
except Exception as e:
    await skyvern.screenshot()
    raise
```

## Summary

AI-Powered Commands in Skyvern transform browser automation from rigid scripting to intelligent, adaptive interactions. By combining natural language understanding with comprehensive browser control primitives, developers can create robust automation flows that handle complexity and edge cases gracefully.

The modular architecture allows commands to be used individually for simple tasks or combined into sophisticated workflows for enterprise-scale automation needs.

---

<a id='database-models'></a>

## Database Models

### Related Pages

Related topics: [Artifact Storage](#artifact-storage), [Workflow System](#workflow-system)

# Database Models

## Overview

Skyvern's database layer is built using SQLAlchemy ORM with Alembic for database migrations. The persistence layer is located in `skyvern/forge/sdk/db/` and provides the data models for all core entities including Tasks, Workflows, Workflow Runs, Browser Profiles, Credentials, and Schedules.

The database models define the schema for persistent storage of automation tasks, execution state, workflow definitions, and runtime data.

## Architecture

```mermaid
graph TD
    A[API Layer] --> B[Repository Layer]
    B --> C[SQLAlchemy Models]
    C --> D[(PostgreSQL Database)]
    B --> E[Task Repository]
    B --> F[Workflow Repository]
    B --> G[Workflow Run Repository]
```

## Core Entities

### Task Model

The Task model represents an automation task with its configuration and execution state.

| Field | Type | Description |
|-------|------|-------------|
| task_id | String | Unique identifier (UUID) |
| workflow_run_id | String (nullable) | Associated workflow run |
| status | TaskStatus | Current task status |
| request | JSON | Task request configuration |
| navigation_goal | String | Navigation objective |
| navigation_payload | JSON | Additional navigation parameters |
| data_extraction_goal | String | Data extraction objective |
| extracted_information_schema | JSON | Expected output schema |
| created_at | DateTime | Creation timestamp |
| modified_at | DateTime | Last modification timestamp |
| organization_id | String | Organization ownership |

Source: [skyvern/forge/sdk/db/models.py]()

### Workflow Model

The Workflow model stores workflow definitions and configurations.

| Field | Type | Description |
|-------|------|-------------|
| workflow_id | String | Unique workflow identifier |
| title | String | Workflow name |
| description | String | Workflow description |
| workflow_definition | JSON | Workflow structure and steps |
| webhook_callback_url | String (nullable) | Callback URL for completion |
| organization_id | String | Organization ownership |
| created_at | DateTime | Creation timestamp |
| modified_at | DateTime | Last modification timestamp |

Source: [skyvern/forge/sdk/db/models.py]()

### Workflow Run Model

The WorkflowRun model tracks individual executions of workflows.

| Field | Type | Description |
|-------|------|-------------|
| workflow_run_id | String | Unique run identifier |
| workflow_id | String | Parent workflow reference |
| status | WorkflowRunStatus | Run status |
| organization_id | String | Organization ownership |
| started_at | DateTime | Execution start time |
| completed_at | DateTime (nullable) | Execution completion time |
| error | String (nullable) | Error message if failed |

Source: [skyvern/forge/sdk/db/models.py]()

## Task Status Enum

The TaskStatus enum defines possible task states:

```python
class TaskStatus(str, Enum):
    created = "created"
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"
```

Source: [skyvern/forge/sdk/db/enums.py]()

### Task Status Flow

```mermaid
stateDiagram-v2
    [*] --> created: Task Created
    created --> pending: Queued for Execution
    pending --> running: Agent Starts
    running --> completed: Success
    running --> failed: Error
    running --> cancelled: User Cancelled
    completed --> [*]
    failed --> [*]
    cancelled --> [*]
```
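
The transitions in the diagram can be expressed as a lookup table. The following is a minimal sketch, not Skyvern's implementation: `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical names, and the enum is repeated only to keep the example self-contained.

```python
from enum import Enum


class TaskStatus(str, Enum):
    created = "created"
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"


# Hypothetical table mirroring the edges of the state diagram above.
ALLOWED_TRANSITIONS: dict[TaskStatus, set[TaskStatus]] = {
    TaskStatus.created: {TaskStatus.pending},
    TaskStatus.pending: {TaskStatus.running},
    TaskStatus.running: {
        TaskStatus.completed,
        TaskStatus.failed,
        TaskStatus.cancelled,
    },
    # Terminal states have no outgoing edges.
    TaskStatus.completed: set(),
    TaskStatus.failed: set(),
    TaskStatus.cancelled: set(),
}


def can_transition(current: TaskStatus, new: TaskStatus) -> bool:
    """Return True when the diagram contains an edge from current to new."""
    return new in ALLOWED_TRANSITIONS[current]
```

A caller could check `can_transition(task.status, new_status)` before persisting a status update, so that a terminal task is never moved back to `running`.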

## Repository Pattern

Skyvern uses a repository pattern to abstract database operations.

### TaskRepository

Provides CRUD operations for Task entities:

- `create_task()` - Create new task record
- `get_task()` - Retrieve task by ID
- `update_task()` - Update task fields
- `get_tasks_for_workflow_run()` - Get tasks for workflow execution
- `get_tasks_by_organization()` - List organization tasks

Source: [skyvern/forge/sdk/db/repositories/tasks.py]()

### WorkflowRepository

Manages Workflow entity persistence:

- `create_workflow()` - Create new workflow
- `get_workflow()` - Retrieve workflow definition
- `update_workflow()` - Update workflow
- `get_workflows_by_organization()` - List organization workflows

Source: [skyvern/forge/sdk/db/repositories/workflows.py]()

### WorkflowRunRepository

Handles WorkflowRun entity operations:

- `create_workflow_run()` - Start new workflow execution
- `get_workflow_run()` - Get run details
- `update_workflow_run()` - Update run status
- `get_workflow_runs_for_workflow()` - List runs for a workflow

Source: [skyvern/forge/sdk/db/repositories/workflow_runs.py]()
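
To illustrate the shape of the repository pattern described above, here is a minimal in-memory sketch. It is an assumption-laden stand-in, not Skyvern's code: `TaskRecord` and `InMemoryTaskRepository` are hypothetical names, the method signatures are simplified, and a dictionary replaces the SQLAlchemy session.

```python
from __future__ import annotations

import asyncio
import uuid
from dataclasses import dataclass, field


@dataclass
class TaskRecord:
    """Hypothetical, simplified stand-in for the Task model."""

    organization_id: str
    navigation_goal: str
    status: str = "created"
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class InMemoryTaskRepository:
    """Keeps records in a dict so callers never touch storage details."""

    def __init__(self) -> None:
        self._tasks: dict[str, TaskRecord] = {}

    async def create_task(self, organization_id: str, navigation_goal: str) -> TaskRecord:
        task = TaskRecord(organization_id=organization_id, navigation_goal=navigation_goal)
        self._tasks[task.task_id] = task
        return task

    async def get_task(self, task_id: str) -> TaskRecord | None:
        return self._tasks.get(task_id)

    async def update_task(self, task_id: str, **fields: str) -> TaskRecord:
        task = self._tasks[task_id]
        for name, value in fields.items():
            setattr(task, name, value)
        return task

    async def get_tasks_by_organization(self, organization_id: str) -> list[TaskRecord]:
        return [t for t in self._tasks.values() if t.organization_id == organization_id]


async def demo() -> list[str]:
    repo = InMemoryTaskRepository()
    task = await repo.create_task("org_123", "Search for flights")
    await repo.update_task(task.task_id, status="running")
    return [t.status for t in await repo.get_tasks_by_organization("org_123")]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because callers depend only on the method names, a database-backed class with the same interface could replace the in-memory one without changes to calling code.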

## Database Migrations

Alembic manages database schema migrations in the `alembic/versions/` directory.

Migration files follow the naming convention: `{version}_{description}.py`

Example migration operations:
- Adding new columns to existing tables
- Creating new tables for additional entities
- Index creation for query optimization
- Data type modifications

Source: [alembic/versions]()

## Relationships

```mermaid
erDiagram
    Organization ||--o{ Task : owns
    Organization ||--o{ Workflow : owns
    Organization ||--o{ WorkflowRun : owns
    Workflow ||--o{ WorkflowRun : executes
    WorkflowRun ||--o{ Task : contains
```

## Additional Models

The database layer also includes models for:

| Model | Purpose |
|-------|---------|
| BrowserProfile | Browser configuration settings |
| Credential | Authentication credentials storage |
| Schedule | Cron-based task scheduling |
| ScheduleRun | Scheduled execution tracking |

Source: [skyvern/forge/sdk/db/models.py]()

## Usage Example

```python
import asyncio

from skyvern.forge.sdk.db.repositories.tasks import TaskRepository


async def main() -> None:
    # Create a task record owned by an organization
    task_repo = TaskRepository()
    new_task = await task_repo.create_task(
        organization_id="org_123",
        navigation_goal="Search for flights",
        navigation_payload={"origin": "SFO", "destination": "LAX"},
    )
    print(new_task.task_id)


asyncio.run(main())
```

## Configuration

Database connection is configured via environment variables:

| Variable | Description |
|----------|-------------|
| DATABASE_URL | PostgreSQL connection string |
| SKYVERN_ORG_ID | Default organization ID |

Source: [skyvern/forge/sdk/db/models.py]()

---

<a id='artifact-storage'></a>

## Artifact Storage

### Related Pages

Related topics: [Database Models](#database-models)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [skyvern/forge/sdk/artifact/storage/local.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/storage/local.py)
- [skyvern/forge/sdk/artifact/models.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/models.py)
- [skyvern/forge/sdk/routes/agent_protocol.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/routes/agent_protocol.py)
- [skyvern/forge/sdk/artifact/manager.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/manager.py)
- [skyvern/forge/sdk/artifact/storage/factory.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/storage/factory.py)
- [skyvern/forge/sdk/artifact/storage/s3.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/storage/s3.py)
- [skyvern/forge/sdk/artifact/storage/azure.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/artifact/storage/azure.py)
</details>

# Artifact Storage

## Overview

Artifact Storage is a core system in Skyvern responsible for persisting and retrieving various artifacts generated during task execution and workflow runs. These artifacts include screenshots, HTML content, LLM prompts and responses, element trees, download files, and execution logs. The system provides a pluggable storage backend architecture that supports multiple storage providers while maintaining a consistent API.

The storage layer abstracts away the complexity of different storage backends (local filesystem, Amazon S3, Azure Blob Storage) from the rest of the application, allowing deployments to choose the most appropriate storage solution for their infrastructure requirements.

## Architecture

### High-Level Architecture

```mermaid
graph TD
    A[API Clients] --> B[Agent Protocol Routes]
    B --> C[Artifact Manager]
    C --> D[Storage Factory]
    D --> E[Local Storage]
    D --> F[S3 Storage]
    D --> G[Azure Blob Storage]
    
    H[Artifact Models] --> C
    C --> H
    
    I[Configuration] --> D
```

### Component Responsibilities

| Component | File | Responsibility |
|-----------|------|-----------------|
| Artifact Manager | `artifact/manager.py` | Orchestrates artifact operations, lifecycle management |
| Storage Factory | `artifact/storage/factory.py` | Creates appropriate storage backend based on configuration |
| Local Storage | `artifact/storage/local.py` | Filesystem-based storage implementation |
| S3 Storage | `artifact/storage/s3.py` | AWS S3 / S3-compatible storage implementation |
| Azure Blob Storage | `artifact/storage/azure.py` | Azure Blob Storage implementation |
| Artifact Models | `artifact/models.py` | Data models for artifacts and artifact types |

## Artifact Types

Skyvern distinguishes between multiple artifact types, each serving a specific purpose in documenting and debugging task execution.

### Supported Artifact Types

```python
class ArtifactType(str, Enum):
    SCREENSHOT_LLM = "screenshot_llm"
    SCREENSHOT_ACTION = "screenshot_action"
    HTML_SCRAPE = "html_scrape"
    ELEMENT_TREE = "element_tree"
    ELEMENT_TREE_VISIBLE = "element_tree_visible"
    LLM_PROMPT = "llm_prompt"
    LLM_RESPONSE_PARSED = "llm_response_parsed"
    DOWNLOAD = "download"
    SKYVERN_LOG = "skyvern_log"
```

| Type | Description | Content-Type |
|------|-------------|--------------|
| `SCREENSHOT_LLM` | Annotated screenshots for LLM context | image/png |
| `SCREENSHOT_ACTION` | Action screenshots captured during execution | image/png |
| `HTML_SCRAPE` | Raw HTML content from web pages | text/html |
| `ELEMENT_TREE` | Complete DOM element tree | application/json |
| `ELEMENT_TREE_VISIBLE` | Filtered visible elements tree | application/json |
| `LLM_PROMPT` | Prompt sent to LLM for decision making | text/plain |
| `LLM_RESPONSE_PARSED` | Parsed LLM response with action list | application/json |
| `DOWNLOAD` | Downloaded file content | application/octet-stream |
| `SKYVERN_LOG` | Skyvern execution logs | text/plain |

Source: [skyvern/forge/sdk/artifact/models.py]()

## Data Models

### Artifact Model

The `Artifact` model represents a single stored artifact with metadata:

```python
class Artifact(BaseModel):
    artifact_id: str
    organization_id: str
    run_id: str | None = None
    task_id: str | None = None
    step_id: str | None = None
    workflow_run_id: str | None = None
    workflow_block_execution_id: str | None = None
    artifact_type: ArtifactType
    uri: str
    filename: str | None = None
    content_type: str | None = None
    metadata: dict[str, Any] | None = None
    created_at: datetime
    modified_at: datetime | None = None
```

Source: [skyvern/forge/sdk/artifact/models.py]()

### Content-Type Mapping

```python
_ARTIFACT_CONTENT_TYPES: dict[ArtifactType, str] = {
    ArtifactType.SCREENSHOT_LLM: "image/png",
    ArtifactType.SCREENSHOT_ACTION: "image/png",
    ArtifactType.HTML_SCRAPE: "text/html",
    ArtifactType.ELEMENT_TREE: "application/json",
    ArtifactType.ELEMENT_TREE_VISIBLE: "application/json",
    ArtifactType.LLM_PROMPT: "text/plain",
    ArtifactType.LLM_RESPONSE_PARSED: "application/json",
    ArtifactType.DOWNLOAD: "application/octet-stream",
    ArtifactType.SKYVERN_LOG: "text/plain",
}
```

## Storage Backends

### Local Storage

The local storage backend stores artifacts on the filesystem, ideal for development and single-instance deployments.

```python
class LocalStorage(BaseStorage):
    def __init__(self, artifact_path: str = settings.ARTIFACT_STORAGE_PATH) -> None:
        self.artifact_path = artifact_path
```

Key implementation details:

- **Path Construction**: Uses organization and artifact IDs to create hierarchical directory structures
- **Windows Compatibility**: Replaces colons with dashes in timestamps and removes invalid filename characters on Windows systems
- **SHA256 Verification**: Computes SHA256 checksums for stored files

```python
def _safe_timestamp() -> str:
    ts = datetime.utcnow().isoformat()
    return ts.replace(":", "-") if WINDOWS else ts

def _windows_safe_filename(name: str) -> str:
    if not WINDOWS:
        return name
    invalid = '<>:"/\\|?*'
    name = "".join("-" if ch in invalid else ch for ch in name)
    return name.rstrip(" .")
```

Source: [skyvern/forge/sdk/artifact/storage/local.py]()
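
The path construction and checksum steps listed above can be sketched with the standard library. This is an illustrative assumption, not the `LocalStorage` implementation: `store_artifact` and the exact `root/organization_id/artifact_id/filename` layout are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def store_artifact(
    root: Path,
    organization_id: str,
    artifact_id: str,
    filename: str,
    data: bytes,
) -> tuple[Path, str]:
    """Write data under root/organization_id/artifact_id/ and return (path, SHA256 hex digest)."""
    target_dir = root / organization_id / artifact_id
    # Create the hierarchical directories on first use.
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / filename
    path.write_bytes(data)
    checksum = hashlib.sha256(data).hexdigest()
    return path, checksum


def demo() -> bool:
    # Store a small payload in a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as tmp:
        path, checksum = store_artifact(
            Path(tmp), "org_123", "a_456", "screenshot.png", b"fake image bytes"
        )
        same_bytes = path.read_bytes() == b"fake image bytes"
        same_hash = hashlib.sha256(path.read_bytes()).hexdigest() == checksum
        return same_bytes and same_hash
```

Storing the digest next to the artifact metadata would let a later read detect a corrupted or truncated file by recomputing the hash.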

### S3 Storage

The S3 backend provides scalable object storage suitable for production deployments.

**Configuration Environment Variables:**

| Variable | Description |
|----------|-------------|
| `AWS_ACCESS_KEY_ID` | AWS access key for authentication |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key for authentication |
| `AWS_REGION` | AWS region for bucket operations |
| `S3_BUCKET_NAME` | Name of the S3 bucket |
| `ARTIFACT_S3_ENDPOINT_URL` | Custom S3-compatible endpoint (optional) |

Source: [skyvern/forge/sdk/artifact/storage/s3.py]()

### Azure Blob Storage

The Azure backend integrates with Azure Blob Storage for cloud deployments.

**Configuration Environment Variables:**

| Variable | Description |
|----------|-------------|
| `AZURE_STORAGE_CONNECTION_STRING` | Azure storage connection string |
| `AZURE_STORAGE_CONTAINER_NAME` | Container name for artifacts |

Source: [skyvern/forge/sdk/artifact/storage/azure.py]()

## Storage Factory

The storage factory pattern enables runtime selection of the appropriate storage backend:

```mermaid
graph LR
    A[Configuration] --> B[Storage Factory]
    B --> C{Backend Type}
    C -->|local| D[LocalStorage]
    C -->|s3| E[S3Storage]
    C -->|azure| F[AzureBlobStorage]
```

**Backend Selection Logic:**

```python
def get_storage_backend() -> BaseStorage:
    if settings.ARTIFACT_STORAGE_BACKEND == "s3":
        return S3Storage()
    elif settings.ARTIFACT_STORAGE_BACKEND == "azure":
        return AzureBlobStorage()
    else:
        return LocalStorage()
```

Source: [skyvern/forge/sdk/artifact/storage/factory.py]()

## API Endpoints

### Get Artifact Content

Retrieves raw content of an artifact with support for range requests and HMAC-signed URLs.

**Endpoint**: `GET /api/v1/artifacts/{artifact_id}/content`

**Query Parameters and Headers:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `sig` | string | HMAC signature for URL authentication |
| `expiry` | string | Expiration timestamp for signed URLs |
| `kid` | string | Key identifier for signature verification |
| `artifact_name` | string | Optional filename override |
| `artifact_type` | string | Expected artifact type |
| `x-api-key` | string | API key authentication (header) |
| `authorization` | string | Bearer token authentication (header) |

**Responses:**

| Status | Description |
|--------|-------------|
| 200 | Raw artifact content |
| 206 | Partial content (Range request) |
| 403 | Invalid or expired artifact URL |
| 404 | Artifact not found |
| 416 | Range not satisfiable |

**Content-Disposition Behavior:**

```python
if artifact.artifact_type == ArtifactType.DOWNLOAD:
    # Use attachment disposition for downloads
    return media_type, _build_attachment_disposition(raw_name)
return media_type, "inline"  # Inline for all other types
```

Source: [skyvern/forge/sdk/routes/agent_protocol.py]()
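
The body of `_build_attachment_disposition` is not shown in the source. As a rough sketch of what such a helper might do, the hypothetical `build_attachment_disposition` below emits an ASCII fallback name plus an RFC 5987 encoded name, and drops non-printable characters so a filename cannot inject extra header lines.

```python
from urllib.parse import quote


def build_attachment_disposition(filename: str) -> str:
    """Build a Content-Disposition value for a download (illustrative only)."""
    # Drop control characters such as CR/LF that could split the header.
    cleaned = "".join(ch for ch in filename if ch.isprintable())
    # ASCII-only fallback for clients that ignore filename*.
    fallback = cleaned.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "_").replace('"', "_")
    # Percent-encoded UTF-8 form for clients that understand filename*.
    encoded = quote(cleaned, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"
```

For a plain ASCII name such as `report.pdf`, both parameters carry the same value; for a name with accented characters, the fallback substitutes `?` while `filename*` preserves the original.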

### Range Request Support

The artifact content endpoint supports HTTP range requests for partial content retrieval:

```python
def _parse_range_header(range_header: str | None, content_length: int) -> tuple[int, int] | None:
    """Return one satisfiable byte range, _RANGE_UNSATISFIABLE when unsatisfiable, or None when ignored."""
    if not range_header:
        return None
    # Parses "bytes=start-end" format
    # Validates ASCII digits, rejects negatives
```

**Range Header Format:** `bytes=start-end` (RFC 7233 compliant)

Source: [skyvern/forge/sdk/routes/agent_protocol.py]()
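
Since the excerpt above elides the parsing logic, the following is a minimal sketch of single-range parsing under stated assumptions. `parse_range_header` and `RANGE_UNSATISFIABLE` are hypothetical names, multi-range requests are simply ignored, and the real handler may differ in its edge cases.

```python
from __future__ import annotations

RANGE_UNSATISFIABLE = object()  # hypothetical sentinel mapped to a 416 response


def _is_ascii_digits(text: str) -> bool:
    # str.isdigit() alone would also accept non-ASCII digit characters.
    return text.isascii() and text.isdigit()


def parse_range_header(
    range_header: str | None, content_length: int
) -> tuple[int, int] | object | None:
    """Return (start, end) inclusive, RANGE_UNSATISFIABLE, or None when the header is ignored."""
    if not range_header or not range_header.startswith("bytes="):
        return None
    spec = range_header[len("bytes="):].strip()
    if "," in spec or "-" not in spec:
        return None  # multi-range or malformed: serve the full content
    start_text, end_text = spec.split("-", 1)

    if start_text == "":
        # Suffix form "bytes=-N": the last N bytes.
        if not _is_ascii_digits(end_text):
            return None
        length = int(end_text)
        if length == 0 or content_length == 0:
            return RANGE_UNSATISFIABLE
        return max(content_length - length, 0), content_length - 1

    if not _is_ascii_digits(start_text):
        return None
    start = int(start_text)
    if end_text == "":
        end = content_length - 1  # open-ended form "bytes=N-"
    else:
        if not _is_ascii_digits(end_text):
            return None
        end = int(end_text)
        if end < start:
            return None  # invalid spec is ignored rather than rejected
        end = min(end, content_length - 1)
    if start >= content_length:
        return RANGE_UNSATISFIABLE
    return start, end
```

A route handler would answer `206` with a `Content-Range` header for a tuple, `416` for the sentinel, and a normal `200` for `None`.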

## HMAC URL Signing

Artifact URLs can be signed using HMAC for time-limited access without requiring API key authentication:

```mermaid
sequenceDiagram
    Client->>Server: Request with sig, expiry, kid
    Server->>Server: Validate HMAC signature
    Server->>Storage: Fetch artifact
    Storage-->>Server: Artifact content
    Server-->>Client: Artifact content
```

**Signing Requirements:**

1. HMAC keyring must be configured: `ARTIFACT_CONTENT_HMAC_KEYRING`
2. URL must include valid `sig`, `expiry`, and `kid` query parameters
3. Signature is verified before returning artifact content

Source: [skyvern/forge/sdk/routes/agent_protocol.py]()
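
The signing scheme can be illustrated with Python's `hmac` module. This sketch makes several assumptions that the source does not confirm: the keyring is modeled as a plain `kid -> secret` dictionary, the signed message is `artifact_id:expiry:kid`, and SHA-256 is the digest. The actual format of `ARTIFACT_CONTENT_HMAC_KEYRING` and of the signed payload may differ.

```python
from __future__ import annotations

import hashlib
import hmac
import time

# Hypothetical keyring: key identifier -> secret bytes.
KEYRING: dict[str, bytes] = {"k1": b"example-secret"}


def sign(artifact_id: str, expiry: int, kid: str) -> str:
    """Return a hex HMAC over the artifact ID, expiry and key ID."""
    message = f"{artifact_id}:{expiry}:{kid}".encode()
    return hmac.new(KEYRING[kid], message, hashlib.sha256).hexdigest()


def verify(
    artifact_id: str, expiry: int, kid: str, sig: str, now: float | None = None
) -> bool:
    """Check key ID, expiry and signature; any failure maps to a 403."""
    if kid not in KEYRING:
        return False
    current = time.time() if now is None else now
    if current > expiry:
        return False
    expected = sign(artifact_id, expiry, kid)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode())
```

Carrying a `kid` in the URL allows keys to be rotated: new URLs are signed with the newest key while older keys stay in the keyring until their URLs expire.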

## Configuration Options

### Storage Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `ARTIFACT_STORAGE_BACKEND` | `local` | Storage backend type (local/s3/azure) |
| `ARTIFACT_STORAGE_PATH` | `/tmp/skyvern/artifacts` | Local storage path |
| `ARTIFACT_CONTENT_HMAC_KEYRING` | - | HMAC keyring for signed URLs |

### S3 Configuration

| Environment Variable | Description |
|---------------------|-------------|
| `AWS_ACCESS_KEY_ID` | AWS credentials |
| `AWS_SECRET_ACCESS_KEY` | AWS credentials |
| `AWS_REGION` | Region setting |
| `S3_BUCKET_NAME` | Target bucket |
| `ARTIFACT_S3_ENDPOINT_URL` | S3-compatible endpoint |

### Azure Configuration

| Environment Variable | Description |
|---------------------|-------------|
| `AZURE_STORAGE_CONNECTION_STRING` | Connection string |
| `AZURE_STORAGE_CONTAINER_NAME` | Container name |

## File Extension Mapping

The storage layer maintains a mapping from artifact types to file extensions for consistent naming:

```python
FILE_EXTENTSION_MAP: dict[ArtifactType, str] = {
    ArtifactType.SCREENSHOT_LLM: ".png",
    ArtifactType.SCREENSHOT_ACTION: ".png",
    ArtifactType.HTML_SCRAPE: ".html",
    ArtifactType.ELEMENT_TREE: ".json",
    ArtifactType.ELEMENT_TREE_VISIBLE: ".json",
    ArtifactType.LLM_PROMPT: ".txt",
    ArtifactType.LLM_RESPONSE_PARSED: ".json",
    ArtifactType.DOWNLOAD: ".bin",
    ArtifactType.SKYVERN_LOG: ".log",
}
```

Source: [skyvern/forge/sdk/artifact/storage/base.py]()

## Usage Patterns

### Storing an Artifact

```python
# Via Artifact Manager
artifact = await artifact_manager.create_artifact(
    organization_id=org_id,
    artifact_type=ArtifactType.SCREENSHOT_LLM,
    content=image_bytes,
    task_id=task_id,
    step_id=step_id,
)
```

### Retrieving an Artifact

```python
# Get artifact metadata
artifact = await artifact_manager.get_artifact(artifact_id)

# Get presigned or signed URL
url = await artifact_manager.get_artifact_url(artifact)
```

### Range Request for Large Files

```python
# Request only the first 1024 bytes of the artifact
headers = {"Range": "bytes=0-1023"}
response = await client.get(f"/api/v1/artifacts/{artifact_id}/content", headers=headers)
```

## Security Considerations

1. **Signed URLs**: HMAC-signed URLs provide time-limited access without exposing storage credentials
2. **Attachment Disposition**: Download artifacts use `Content-Disposition: attachment` to prevent browser rendering of potentially malicious content
3. **Organization Isolation**: Artifacts are namespaced by organization ID to prevent cross-tenant access
4. **Content-Type Validation**: Responses set appropriate content-types based on artifact type

## Frontend Integration

The frontend displays artifacts through dedicated UI components:

| Component | Location | Purpose |
|-----------|----------|---------|
| `StepArtifacts.tsx` | `routes/tasks/detail/` | Task artifact viewer with tabbed interface |
| `Artifact` component | Shared | Renders different artifact types |
| `ZoomableImage` | Shared | Displays screenshots with zoom capability |

The artifact viewer supports multiple tabs for different artifact types:

- Info
- Annotated Screenshots
- Action Screenshots
- HTML Element Tree
- Element Tree
- Prompt
- Action List
- HTML (Raw)

Source: [skyvern-frontend/src/routes/tasks/detail/StepArtifacts.tsx]()

---

<a id='credential-management'></a>

## Credential Management

### Related Pages

Related topics: [Browser Automation Engine](#browser-automation), [Workflow System](#workflow-system)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this documentation page:

- [skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx)
- [skyvern-frontend/src/routes/workflows/components/CredentialSelector.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/components/CredentialSelector.tsx)
- [skyvern-frontend/src/routes/workflows/editor/nodes/HttpRequestNode/HttpRequestNode.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/editor/nodes/HttpRequestNode/HttpRequestNode.tsx)
- [skyvern-frontend/src/routes/workflows/editor/nodes/TaskNode/ParametersMultiSelect.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/editor/nodes/TaskNode/ParametersMultiSelect.tsx)
- [skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx)
- [skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx)
- [skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx)
- [integrations/mcp/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md)
- [skyvern/cli/mcp_tools/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcp_tools/README.md)
</details>

# Credential Management

## Overview

Credential Management in Skyvern provides a secure, unified system for storing, retrieving, and managing authentication credentials across tasks and workflows. Skyvern supports multiple credential vault types, enabling integration with external password managers and custom credential services while maintaining a native internal vault.

Credentials in Skyvern can be of three primary types:

| Credential Type | Description |
|-----------------|-------------|
| `password` | Username/password credential pairs for basic authentication |
| `credit_card` | Credit card information for payment forms |
| `secret` | Generic secret values for API keys, tokens, and other sensitive data |

Source: [skyvern-frontend/src/routes/workflows/components/CredentialSelector.tsx:1-100]()

## Architecture

Skyvern's credential management system is designed with a multi-vault architecture that allows seamless integration with various credential providers while maintaining a consistent internal API.

```mermaid
graph TD
    subgraph "Client Layer"
        UI[Web UI]
        API[API Client]
        MCP[MCP Tools]
    end
    
    subgraph "Credential Services"
        SkyvernVault[Skyvern Internal Vault]
        Bitwarden[Bitwarden Service]
        Azure[Azure Key Vault Service]
        Custom[Custom Credential Service]
    end
    
    subgraph "Storage Layer"
        DB[(Database)]
    end
    
    UI --> API
    MCP --> API
    API --> SkyvernVault
    API --> Bitwarden
    API --> Azure
    API --> Custom
    SkyvernVault --> DB
```

Source: [skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:1-50]()

## Credential Vault Types

### Skyvern Internal Vault

The default vault type stores credentials directly in Skyvern's database. This is the simplest option for getting started and requires no external configuration.

### Bitwarden Integration

Skyvern can integrate with Bitwarden to leverage existing credentials stored in your Bitwarden vault. This integration supports:
- Reading existing credentials from Bitwarden
- Writing new credentials back to Bitwarden
- Automatic 2FA/TOTP handling

Source: [skyvern/cli/mcp_tools/README.md:1-50]()

### Azure Key Vault

For enterprise environments, Skyvern supports Azure Key Vault integration, allowing credentials stored in Azure's secure key management system to be used in tasks and workflows.

Source: [skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:1-80]()

### Custom Credential Service

Organizations with proprietary credential management systems can implement a custom credential service. This requires:

1. **API Configuration**: Set up API base URL and authentication token
2. **Service Implementation**: Implement the credential service interface
3. **Vault Type Selection**: Configure parameters to use `vault_type="custom"`

The custom credential service configuration includes:
- `api_base_url`: The base URL of your credential service API
- `api_token`: Authentication token for the service

Source: [skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:60-75]()

## Using Credentials in Workflows

### Credential Parameter Types

Credentials can be referenced as workflow parameters, allowing secure injection of sensitive data into task execution. The system supports the following parameter types:

| Parameter Type | Usage | Example Reference |
|----------------|-------|-------------------|
| `credential` | Credential objects from vault | `{{ my_credential.username }}` |
| `context` | Context parameters from previous steps | `{{ context.source_param }}` |
| `custom` | Custom credential service credentials | Uses vault_type selection |

Source: [skyvern-frontend/src/routes/workflows/editor/panels/WorkflowParameterEditPanel.tsx:40-65]()

### Credential Reference Syntax

Within HTTP Request nodes, credentials are referenced using template syntax:

```
Password credential: {{ my_credential.username }} / {{ my_credential.password }}
Secret credential: {{ my_secret.secret_value }}
```

Source: [skyvern-frontend/src/routes/workflows/editor/nodes/HttpRequestNode/HttpRequestNode.tsx:1-50]()
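
To show how `{{ name.field }}` references resolve to values, here is a small sketch that uses a regular expression as a stand-in for the real template engine. The `render` function and the shape of the `credentials` mapping are assumptions for illustration only.

```python
import re

# Matches references of the form {{ name.field }} with optional whitespace.
_REFERENCE = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(template: str, credentials: dict[str, dict[str, str]]) -> str:
    """Replace each {{ name.field }} reference with the matching credential field."""

    def substitute(match: "re.Match[str]") -> str:
        name, field = match.group(1), match.group(2)
        # A KeyError here signals a reference to an unknown credential or field.
        return credentials[name][field]

    return _REFERENCE.sub(substitute, template)
```

For example, `render("{{ my_secret.secret_value }}", {"my_secret": {"secret_value": "abc"}})` yields `abc`, while a reference to a credential that is not in the mapping raises `KeyError` instead of silently producing an empty string.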

### Credential Parameter Validation

When running workflows, credential parameters are validated to ensure:

1. **Required Fields**: Boolean and credential parameters must have values
2. **JSON Validation**: JSON-type credential parameters must parse correctly
3. **Missing Credential Detection**: The system detects orphaned credential parameters where the referenced credential no longer exists in the vault

```typescript
// Validation example from workflow execution
if (parameter.workflow_parameter_type === "credential") {
    if (value === null || value === undefined) {
        return "This field is required";
    }
}
```

Source: [skyvern-frontend/src/routes/workflows/RunWorkflowForm.tsx:1-100]()

### Orphaned Credential Detection

The system provides warnings when workflow parameters reference credentials that no longer exist in the vault:

```
⚠️ my_credential (missing credential)
```

This warning helps identify workflows that need to be updated after credential deletion or vault changes.

Source: [skyvern-frontend/src/routes/workflows/editor/nodes/TaskNode/ParametersMultiSelect.tsx:1-50]()
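
The detection logic amounts to a set-membership check. The sketch below is hypothetical: `find_orphaned_parameters` and the `key` and `credential_id` field names are assumptions, and only `workflow_parameter_type` appears in the source.

```python
def find_orphaned_parameters(
    parameters: list[dict[str, str]],
    existing_credential_ids: set[str],
) -> list[str]:
    """Return keys of credential parameters whose credential is not in the vault."""
    return [
        param["key"]
        for param in parameters
        if param.get("workflow_parameter_type") == "credential"
        and param.get("credential_id") not in existing_credential_ids
    ]
```

Each returned key corresponds to a parameter that the UI would label as a missing credential.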

## Two-Factor Authentication (TOTP)

Skyvern supports automated Two-Factor Authentication through TOTP (Time-based One-Time Password) handling. This is critical for automating workflows that require 2FA verification.
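
For background, a TOTP code is derived from a shared secret and the current time as specified in RFC 6238. The sketch below implements that standard algorithm with the Python standard library; it is not taken from Skyvern, and the source does not show how Skyvern itself computes or obtains codes. The `totp` function name and its parameters are illustrative.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32: str, at: float | None = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a base32-encoded secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because both sides derive the code from the same secret and clock, an automation that holds the secret can produce the current code without any user interaction.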

### Push TOTP Code Flow

1. **Challenge**: A task reaches a 2FA challenge and the site sends a verification message
2. **Code Receipt**: The user receives the verification message (SMS, email, or authenticator app)
3. **Push**: The content of the verification message is pushed to Skyvern
4. **Code Extraction**: Skyvern extracts the code from the verification message
5. **Attachment**: The code is automatically attached to the relevant workflow run

```typescript
interface TOTPConfig {
    totp_identifier: string;  // Email or phone for receiving codes
    totp_url?: string;        // Direct verification URL if available
    totp_type: 'totp' | 'magic_link';
}
```

Source: [skyvern-frontend/src/routes/credentials/CredentialsTotpTab.tsx:1-80]()

### TOTP Parameter Filtering

The credential management interface supports filtering TOTP credentials by:
- **Identifier**: Filter by email or phone number
- **OTP Type**: Filter by numeric code or magic link

## MCP Integration

Skyvern's Model Context Protocol (MCP) tools provide programmatic access to credential management:

```json
{
  "mcpServers": {
    "skyvern": {
      "type": "streamable-http",
      "url": "https://api.skyvern.com/mcp/",
      "headers": { "x-api-key": "YOUR_API_KEY" }
    }
  }
}
```

### Available MCP Credential Tools

| Tool | Description |
|------|-------------|
| `skyvern_credential_list` | List all credentials in the vault |
| `skyvern_credential_get` | Retrieve a specific credential |
| `skyvern_credential_delete` | Remove a credential from the vault |
| `skyvern_login` | Authenticate using stored credentials |

Supported vault integrations: Skyvern vault, Bitwarden, 1Password, and Azure Key Vault with automatic 2FA/TOTP support.

Source: [integrations/mcp/README.md:1-80]()

## Security Considerations

### Browser Tunneling Security

When exposing Skyvern through browser tunneling, ensure API key authentication is enabled:

> **WARNING**: Always use `--api-key` when exposing your browser via a tunnel. Without it, anyone with the URL has full control of your browser.

Source: [README.md:1-100]()

### Credential Masking

Sensitive credential data is masked in UI displays:
- Tokens longer than 8 characters are truncated: `sk_live_xxx...`
- Full values are never displayed in logs or error messages

Source: [skyvern-frontend/src/components/CustomCredentialServiceConfigForm.tsx:20-35]()
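
The truncation rule can be sketched as follows. This is an assumption about the behavior rather than a port of the frontend code: `mask_token` is a hypothetical name, the first 8 characters are kept, and tokens of 8 characters or fewer are returned unchanged.

```python
def mask_token(token: str, visible: int = 8) -> str:
    """Keep the first `visible` characters of a long token and elide the rest."""
    if len(token) <= visible:
        return token
    return token[:visible] + "..."
```

A stricter variant could mask short tokens entirely, since returning them unchanged reveals the full value.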

### External Vault Security

When using external credential services:
1. Store API tokens securely (environment variables preferred)
2. Use HTTPS for all credential service communications
3. Implement IP allowlisting where supported
4. Rotate credentials regularly

## Configuration Reference

### Environment Variables

| Variable | Description |
|----------|-------------|
| `SKYVERN_API_KEY` | API key for Skyvern authentication |
| `SKYVERN_BASE_URL` | Base URL for self-hosted deployments |
| `SKYVERN_TELEMETRY` | Set to `false` to opt out of telemetry |

### Credential Service Configuration

| Field | Required | Description |
|-------|----------|-------------|
| `api_base_url` | Yes (custom) | Base URL of the credential service |
| `api_token` | Yes (custom) | Authentication token |
| `token_type` | No | Type of authentication token |
| `tested_url` | No | URL used to test credential validity |

## Best Practices

1. **Use Type-Specific Credentials**: Store credentials with appropriate types (password, credit_card, secret) for better organization and retrieval
2. **Implement Custom Services for Enterprise**: For large-scale deployments, implement a custom credential service for centralized management
3. **Enable TOTP Automation**: Configure TOTP handling for automated 2FA workflows
4. **Monitor Orphaned Parameters**: Regularly check for and clean up orphaned credential references
5. **Rotate API Tokens**: Periodically rotate API tokens for custom credential services
6. **Leverage Bitwarden for Existing Teams**: If your team already uses Bitwarden, integrate it to avoid credential duplication

---

<a id='llm-providers'></a>

## LLM Provider Configuration

### Related Pages

Related topics: [AI-Powered Commands](#ai-commands)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [skyvern/forge/sdk/api/llm/api_handler.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/api/llm/api_handler.py)
- [skyvern/forge/sdk/api/llm/models.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/api/llm/models.py)
- [skyvern/forge/sdk/api/llm/litellm_transport.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/api/llm/litellm_transport.py)
- [skyvern/forge/prompts.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/prompts.py)
- [skyvern/config.py](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/config.py)
</details>

# LLM Provider Configuration

Skyvern leverages Large Language Models (LLMs) as the core intelligence engine for AI-powered browser automation. The LLM Provider Configuration system provides a flexible abstraction layer that enables Skyvern to connect with multiple LLM providers including OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, and Google Gemini. This architecture decouples the automation logic from specific LLM implementations, allowing users to select their preferred provider without modifying core application code.

## Supported LLM Providers

Skyvern supports the LLM providers listed below. It uses litellm as a unified transport layer, which normalizes API calls across providers behind one consistent interface.

| Provider | Supported Models |
|----------|-----------------|
| OpenAI | GPT-5.5, GPT-5.4, GPT-5, GPT-4.1, o3, o4-mini |
| Anthropic | Claude 4.7 Opus, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Haiku, Sonnet, Opus) |
| Azure OpenAI | Any GPT models deployed to your Azure subscription |
| AWS Bedrock | Claude 4.7, Claude 4.6 (Sonnet, Opus), Claude 4.5 (Sonnet, Opus) |
| Google Gemini | Gemini 3.1 Pro, Gemini 3 Flash |

Source: [README.md:1-20]()

### Provider Selection Criteria

When selecting an LLM provider for a Skyvern deployment, consider the following factors:

- **OpenAI**: Strong general-purpose performance with the broadest model availability.
- **Anthropic**: The Claude series excels at instruction following and extended reasoning, which suits complex multi-step browser automation workflows.
- **Azure OpenAI**: Enterprise-grade security and compliance features, with the ability to use custom model deployments.
- **AWS Bedrock**: Integration with other AWS services and HIPAA-compliant deployments.
- **Google Gemini**: Competitive pricing with strong multimodal capabilities.

## Configuration Architecture

The LLM Provider Configuration system follows a layered architecture that separates provider selection, credential management, and runtime dispatch. This design enables runtime provider switching and supports fallback mechanisms for production deployments.

```mermaid
graph TD
    A[Task Request] --> B[LLM API Handler]
    B --> C{LLM Provider Selection}
    C -->|OpenAI| D[OpenAI Transport]
    C -->|Anthropic| E[Anthropic Transport]
    C -->|Azure| F[Azure OpenAI Transport]
    C -->|AWS| G[Bedrock Transport]
    C -->|Gemini| H[Gemini Transport]
    D --> I[litellm Unified Interface]
    E --> I
    F --> I
    G --> I
    H --> I
    I --> J[Provider API Endpoint]
```

### Core Configuration Components

The configuration system comprises several interconnected components that manage provider selection, authentication, and request handling. The API handler serves as the primary entry point for LLM interactions, coordinating between the task execution engine and the underlying transport layer. Models define the data structures for requests, responses, and provider-specific configurations. The litellm transport provides the unified interface that normalizes differences between provider APIs.
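
The layering described above can be sketched as a small registry that maps a configuration key to a provider and model, then hands one provider-prefixed model string to a unified transport. This is a hypothetical illustration only: `LLMConfig`, `REGISTRY`, `resolve_model`, the key names, and the model strings are all assumptions and are not taken from Skyvern's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical provider entry: provider, model, and required env vars."""

    provider: str
    model: str
    required_env: tuple[str, ...]


# Hypothetical registry; the keys and model names are placeholders.
REGISTRY: dict[str, LLMConfig] = {
    "EXAMPLE_OPENAI": LLMConfig("openai", "example-gpt-model", ("OPENAI_API_KEY",)),
    "EXAMPLE_ANTHROPIC": LLMConfig(
        "anthropic", "example-claude-model", ("ANTHROPIC_API_KEY",)
    ),
}


def resolve_model(llm_key: str) -> str:
    """Return a provider-prefixed model string for a unified transport."""
    try:
        config = REGISTRY[llm_key]
    except KeyError:
        raise ValueError(f"Unknown LLM key: {llm_key}") from None
    return f"{config.provider}/{config.model}"


print(resolve_model("EXAMPLE_ANTHROPIC"))  # anthropic/example-claude-model
```

Keeping provider selection in a lookup table like this is one way to let the calling code stay unchanged when a different provider is chosen.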

## Environment Configuration

### Basic Setup

LLM provider credentials are configured through environment variables in the `.env` file. After running `skyvern quickstart` or `skyvern init`, the setup wizard will guide you through provider selection and credential configuration.

```bash
# Required for OpenAI
OPENAI_API_KEY=sk-...

# Required for Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Required for Azure OpenAI
AZURE_API_KEY=your-azure-key
AZURE_API_BASE=https://your-resource.openai.azure.com

# Required for AWS Bedrock
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1

# Required for Gemini
GEMINI_API_KEY=your-gemini-key
```

Source: [README.md:1-50]()

### Provider-Specific Configuration

#### OpenAI Configuration

For OpenAI providers, Skyvern supports both standard OpenAI endpoints and custom base URLs for proxy or gateway scenarios. Model selection can be specified at the task level or configured as the default in the environment.

#### Anthropic Configuration

Anthropic Claude models require the `ANTHROPIC_API_KEY` environment variable. The setup wizard can automatically configure this during initialization. Claude models are particularly well-suited for Skyvern's browser automation tasks due to their strong instruction-following capabilities.

#### Azure OpenAI Configuration

Azure OpenAI deployments require additional configuration for deployment-specific endpoints. The `AZURE_API_BASE` variable should point to your Azure OpenAI resource endpoint, and the system supports any GPT models deployed to your Azure subscription.

#### AWS Bedrock Configuration

AWS Bedrock integration uses standard AWS credential chain resolution, including environment variables, IAM roles, and AWS profile configurations. The `AWS_REGION` variable determines which AWS region your Bedrock endpoints are hosted in.

#### Google Gemini Configuration

Gemini models are configured using the `GEMINI_API_KEY` variable. The framework supports both Gemini 3.1 Pro for complex reasoning tasks and Gemini 3 Flash for faster, cost-effective operations.

## Provider Selection in Code

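When using Skyvern programmatically through the SDK, you can specify the LLM provider at task creation time. The framework will use the configured credentials for the selected provider. The example below passes only a prompt, so it relies on the provider already configured for your account or environment.

```python
import asyncio
from skyvern import Skyvern

async def main() -> None:
    # `await` is only valid inside a coroutine, so the call is wrapped in main().
    skyvern = Skyvern(api_key="your-api-key")
    task = await skyvern.run_task(
        prompt="Find the top post on hackernews today",
    )
    print(task)

asyncio.run(main())
```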

Source: [README.md:50-80]()

### Cloud vs Local Configuration

Skyvern supports two operational modes for LLM configuration. In Skyvern Cloud mode, the platform manages provider configuration and billing. In local mode, you configure your own LLM provider credentials, and Skyvern routes requests through your specified provider.

For local deployments, the setup wizard configures credentials automatically during initialization. For custom configurations, you can manually edit the `.env` file with your provider-specific credentials.
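
When editing `.env` by hand, it can help to confirm which providers have a complete set of credentials before starting Skyvern. The helper below is a sketch, not part of Skyvern: `REQUIRED_VARS` and `configured_providers` are hypothetical names, and the mapping covers only a subset of the variables from the Basic Setup listing.

```python
import os
from collections.abc import Mapping

# Hypothetical mapping built from a subset of the Basic Setup variables.
REQUIRED_VARS: dict[str, tuple[str, ...]] = {
    "openai": ("OPENAI_API_KEY",),
    "anthropic": ("ANTHROPIC_API_KEY",),
    "bedrock": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"),
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return providers whose required variables are all set and non-empty."""
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name, "").strip() for name in names)
    ]


if __name__ == "__main__":
    print("Providers with credentials:", configured_providers())
```

The function accepts any mapping, so it can be checked against a parsed `.env` file as well as the live process environment.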

## Advanced Configuration Options

### Custom Endpoint Configuration

For enterprise deployments requiring proxy servers or custom API gateways, Skyvern supports base URL customization through provider-specific environment variables. This enables integration with internal LLM deployments, specialized inference endpoints, or regional API endpoints.

### Multi-Provider Fallback

Production deployments can implement multi-provider fallback strategies by configuring multiple provider credentials. When the primary provider is unavailable, Skyvern can automatically route requests to backup providers based on priority configuration.
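
The fallback idea can be sketched as trying providers in priority order and returning the first successful response. The source does not describe how Skyvern implements this internally, so `call_with_fallback`, `AllProvidersFailed`, and the stand-in provider functions below are hypothetical.

```python
from collections.abc import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the priority list has failed."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in priority order; return (name, response)."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in provider functions for the demo; real ones would call an LLM API.
def primary(prompt: str) -> str:
    raise ConnectionError("primary unavailable")


def backup(prompt: str) -> str:
    return f"backup answered: {prompt}"


print(call_with_fallback("hello", [("primary", primary), ("backup", backup)]))
# ('backup', 'backup answered: hello')
```

Collecting each failure message before raising keeps the final error useful when every provider is down.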

### Model Selection Per Task

Individual tasks can specify model preferences that override the default configuration. This enables cost optimization by using lighter models for simple tasks while reserving more capable models for complex automation sequences.

## Credential Security

Credential management follows security best practices by storing sensitive information exclusively in environment variables. The `.env` file should never be committed to version control. Skyvern's initialization process creates the `.env` file from `.env.example` if it does not exist, ensuring template credentials are never exposed.

Source: [README.md:1-30]()

## Troubleshooting

Common LLM provider configuration issues include incorrect API keys, network connectivity problems, and quota exhaustion. The setup wizard validates credentials during configuration to catch most issues early. For runtime errors, Skyvern provides detailed error messages that identify the specific provider and error type.

If you encounter authentication errors, verify that your API keys are correctly set in the `.env` file and that the corresponding provider account has sufficient credits or quota available.

---

<a id='mcp-integration'></a>

## Model Context Protocol (MCP) Integration

### Related Pages

Related topics: [LLM Provider Configuration](#llm-providers)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [integrations/mcp/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md)
- [skyvern/cli/mcp_tools/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcp_tools/README.md)
- [skyvern/cli/mcpb/claude_desktop/README.md](https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/cli/mcpb/claude_desktop/README.md)
</details>

# Model Context Protocol (MCP) Integration

## Overview

Skyvern's Model Context Protocol (MCP) integration enables AI applications to connect to Skyvern's browser automation capabilities. This integration allows AI-powered applications to perform browser-based tasks such as filling out forms, downloading files, researching information on the web, and executing complex web automation workflows through natural language commands.

The MCP server implementation serves as a bridge between AI applications and Skyvern's browser engine, providing a standardized interface for browser automation tasks.

Source: [integrations/mcp/README.md]()

## Architecture

The MCP integration supports multiple deployment models and connection methods:

### Connection Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| **Skyvern Cloud** | Connect to managed cloud service | Production without self-hosting |
| **Local Skyvern Server** | Self-hosted deployment | Development, privacy, custom infrastructure |

### MCP Client Configuration

#### Cloud Configuration (streamable-http)

```json
{
  "mcpServers": {
    "skyvern": {
      "type": "streamable-http",
      "url": "https://api.skyvern.com/mcp/",
      "headers": { "x-api-key": "YOUR_API_KEY" }
    }
  }
}
```

#### Local Configuration

```json
{
  "mcpServers": {
    "skyvern": {
      "command": "python3",
      "args": ["-m", "skyvern", "run", "mcp"],
      "env": {
        "SKYVERN_BASE_URL": "http://localhost:8000",
        "SKYVERN_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
```

Source: [skyvern/cli/mcp_tools/README.md]()
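
If you generate client configuration from a script rather than editing JSON by hand, the two entries above can be built as plain dictionaries. This is a sketch under assumptions: `cloud_config` and `local_config` are hypothetical helper names, and using `sys.executable` in place of `python3` is a choice made here so that the client launches the interpreter where Skyvern is installed.

```python
import json
import sys


def cloud_config(api_key: str) -> dict:
    """Build the streamable-http entry shown above for Skyvern Cloud."""
    return {
        "mcpServers": {
            "skyvern": {
                "type": "streamable-http",
                "url": "https://api.skyvern.com/mcp/",
                "headers": {"x-api-key": api_key},
            }
        }
    }


def local_config(api_key: str, base_url: str = "http://localhost:8000") -> dict:
    """Build the command-based entry shown above for a self-hosted server."""
    return {
        "mcpServers": {
            "skyvern": {
                # Assumption: launch with the current interpreter, not `python3`.
                "command": sys.executable,
                "args": ["-m", "skyvern", "run", "mcp"],
                "env": {"SKYVERN_BASE_URL": base_url, "SKYVERN_API_KEY": api_key},
            }
        }
    }


print(json.dumps(local_config("YOUR_API_KEY"), indent=2))
```

Where the resulting JSON must be written depends on the MCP client, so the sketch only prints it.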

## Available MCP Tools

### Browser Session Management

| Tool | Description |
|------|-------------|
| `skyvern_browser_session_create` | Create a new browser session |
| `skyvern_browser_session_close` | Close an existing browser session |
| `skyvern_browser_session_list` | List all active browser sessions |
| `skyvern_browser_session_get` | Get details of a specific session |
| `skyvern_browser_session_connect` | Connect to an existing session |

### Browser Actions

| Tool | Description |
|------|-------------|
| `skyvern_act` | Execute natural language actions |
| `skyvern_navigate` | Navigate to a URL |
| `skyvern_click` | Click on an element |
| `skyvern_type` | Type text into a field |
| `skyvern_hover` | Hover over an element |
| `skyvern_scroll` | Scroll the page |
| `skyvern_select_option` | Select an option from dropdown |
| `skyvern_press_key` | Press a keyboard key |
| `skyvern_drag` | Drag an element |
| `skyvern_file_upload` | Upload a file |
| `skyvern_wait` | Wait for page to load |

### Data Extraction & Validation

| Tool | Description |
|------|-------------|
| `skyvern_extract` | Extract structured JSON data from page |
| `skyvern_screenshot` | Take a screenshot |
| `skyvern_find` | Find elements on the page |
| `skyvern_validate` | Validate page content |
| `skyvern_evaluate` | Run JavaScript code |
| `skyvern_get_html` | Get page HTML |

Source: [skyvern/cli/mcp_tools/README.md]()
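
MCP clients invoke tools such as these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `skyvern_navigate`. The helper name `tool_call_request` is hypothetical, and the `url` argument name is an assumption; the tool's published input schema is the authority on argument names.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# The `url` argument name is an assumption; check the tool's input schema.
request = tool_call_request("skyvern_navigate", {"url": "https://example.com"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for you; the sketch only shows the shape of what travels over the wire.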

## Quick Start Guide

### Prerequisites

> **REQUIREMENT**: Skyvern currently runs only in a Python 3.11 environment.

### Installation Steps

1. **Install Skyvern**
   ```bash
   pip install skyvern
   ```

2. **Configure Skyvern**
   Run the setup wizard which will guide you through the configuration process:
   ```bash
   skyvern init
   ```
   You can connect to either Skyvern Cloud or a local version of Skyvern.

3. **Launch Local Server (Optional)**
   Only required in local mode:
   ```bash
   skyvern run server
   ```

Source: [integrations/mcp/README.md]()

## Claude Desktop Integration

Skyvern provides a downloadable `.mcpb` bundle that installs Skyvern Cloud into Claude Desktop without requiring the user to install Node.js.

### Building the MCP Bundle

```bash
./scripts/package-mcpb.sh 1.0.23
```

### Publishing to Releases

```bash
./scripts/package-mcpb.sh 1.0.23 skyvern-claude-desktop.mcpb \
  skyvern/cli/mcpb/releases/skyvern-claude-desktop.mcpb
```

Source: [skyvern/cli/mcpb/claude_desktop/README.md]()

## Usage Patterns

### Natural Language Actions

The `skyvern_act` tool allows you to describe actions in natural language, which Skyvern's AI interprets and executes:

```
"Click the login button"
"Fill in the email field with user@example.com"
"Select 'Premium' from the subscription dropdown"
```

### Data Extraction

Use `skyvern_extract` to extract structured JSON data from web pages by describing the data you need:

```
"Extract all product names, prices, and ratings"
```

### Screenshot and Validation Loops

For debugging and verification, use screenshot + validate loops:

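```python
# Pseudocode: each call stands for an MCP tool invocation made by your client,
# not an importable Python function. The `success` attribute is illustrative.

# Take a screenshot
screenshot = skyvern_screenshot()

# Validate page content
validation = skyvern_validate("The login form is visible")

# If validation fails, take another screenshot for debugging
if not validation.success:
    screenshot = skyvern_screenshot()
```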
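
A runnable version of the same loop needs some client function that sends tool calls. The sketch below assumes a hypothetical `call_tool(name, arguments)` callable that returns a dictionary, a `prompt` argument for `skyvern_validate`, and a `success` field in its result. None of these are confirmed by the source, and a fake client stands in for a real MCP connection.

```python
from collections.abc import Callable


def validate_with_retries(
    call_tool: Callable[[str, dict], dict],
    condition: str,
    max_attempts: int = 3,
) -> tuple[bool, list[dict]]:
    """Validate a condition, capturing a screenshot after each failed attempt."""
    screenshots: list[dict] = []
    for _ in range(max_attempts):
        result = call_tool("skyvern_validate", {"prompt": condition})
        if result.get("success"):
            return True, screenshots
        screenshots.append(call_tool("skyvern_screenshot", {}))
    return False, screenshots


# Fake client for the demo: validation succeeds on the second attempt.
attempts = {"validate": 0}


def fake_call_tool(name: str, arguments: dict) -> dict:
    if name == "skyvern_validate":
        attempts["validate"] += 1
        return {"success": attempts["validate"] >= 2}
    return {"image": f"screenshot-{attempts['validate']}"}


print(validate_with_retries(fake_call_tool, "The login form is visible"))
# (True, [{'image': 'screenshot-1'}])
```

Returning the collected screenshots alongside the result gives you the page state at each failed attempt for later debugging.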

## Integration with AI Applications

The MCP integration enables AI applications to:

- **Automate form filling**: Submit complex forms with AI-guided input
- **Research web content**: Extract structured data from multiple sources
- **Download files**: Navigate to and download files from websites
- **Execute workflows**: Run browser automation workflows
- **Handle 2FA flows**: Manage TOTP (Time-based One-Time Password) authentication

## Credential Management

Skyvern's MCP tools support secure credential management for login flows:

| Credential Type | Usage Pattern |
|-----------------|---------------|
| **Password** | `{{ my_credential.username }}` / `{{ my_credential.password }}` |
| **Secret** | `{{ my_secret.secret_value }}` |
| **Custom Service** | Configure via CustomCredentialServiceConfigForm |
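
To make the placeholder syntax concrete, the sketch below substitutes `{{ name.field }}` references from an in-memory dictionary. Skyvern resolves these placeholders itself; this is not its implementation. The `render` function and the sample values are hypothetical, and real secrets should never be printed.

```python
import re

# Matches placeholders of the form {{ name.field }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(template: str, credentials: dict[str, dict[str, str]]) -> str:
    """Replace each {{ name.field }} with the matching credential value."""

    def substitute(match: re.Match) -> str:
        name, field = match.group(1), match.group(2)
        try:
            return credentials[name][field]
        except KeyError:
            raise KeyError(f"Unknown credential reference: {name}.{field}") from None

    return _PLACEHOLDER.sub(substitute, template)


creds = {"my_credential": {"username": "alice", "password": "s3cret"}}
print(render("Log in as {{ my_credential.username }}", creds))  # Log in as alice
```

Failing loudly on an unknown reference, rather than leaving the placeholder in the text, helps surface misspelled credential names early.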

## API Reference

### HTTP Request Block Tips

When using HTTP request blocks with MCP tools:

- Use "Import cURL" to quickly convert API documentation examples
- Use "Quick Headers" to add common authentication and content headers
- The request will return response data including status, headers, and body
- Reference response data in later blocks with parameters

### Response Data

An HTTP request block returns the following response fields:

| Field | Description |
|-------|-------------|
| `status` | HTTP status code |
| `headers` | Response headers |
| `body` | Response body content |

## Workflow Integration

MCP tools can be integrated into Skyvern workflows for:

- **Browser automation blocks**: Execute MCP actions as part of workflow steps
- **Conditional logic**: Use validation results to control workflow branching
- **Data extraction**: Feed extracted data into subsequent workflow blocks
- **Scheduled execution**: Run MCP-powered workflows on cron schedules

## Best Practices

1. **Session Management**: Always close browser sessions when done to free resources
2. **Error Handling**: Use validation tools to check page state before proceeding
3. **Screenshot Debugging**: Take screenshots at key points for debugging failed automations
4. **Credential Security**: Use environment variables and secure credential storage
5. **Rate Limiting**: Be mindful of API rate limits when making frequent requests
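
The session management advice above can be enforced with a context manager that always closes the session, even when an action raises. The `call_tool` callable, the `session_id` field name, and the tool argument names are assumptions made for this sketch, and a fake client records calls so it runs without a server.

```python
from collections.abc import Callable, Iterator
from contextlib import contextmanager


@contextmanager
def browser_session(call_tool: Callable[[str, dict], dict]) -> Iterator[str]:
    """Create a session and guarantee it is closed, even if the body raises."""
    created = call_tool("skyvern_browser_session_create", {})
    session_id = created["session_id"]  # field name is an assumption
    try:
        yield session_id
    finally:
        call_tool("skyvern_browser_session_close", {"session_id": session_id})


# Fake client that records calls so the demo runs without a server.
log: list[str] = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    log.append(name)
    return {"session_id": "demo-session"}


with browser_session(fake_call_tool) as session_id:
    fake_call_tool(
        "skyvern_navigate", {"session_id": session_id, "url": "https://example.com"}
    )

print(log)
# ['skyvern_browser_session_create', 'skyvern_navigate', 'skyvern_browser_session_close']
```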

---


## Doramagic Pitfall Log

Project: Skyvern-AI/skyvern

Summary: Found 23 potential pitfalls, 1 of which is high/blocking; top priority: security/permissions pitfall - source evidence: what ensures it’s the correct one in that context?

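## 1. Security/permissions pitfall · Source evidence: what ensures it’s the correct one in that context?

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: what ensures it’s the correct one in that context?
- User impact: May block installation or the first run.
- Suggested check: The source issue is still open; the Pack Agent needs to re-check whether it still affects the current version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_77591d02b4fa4efdb89fba55a9a7f08a | https://github.com/Skyvern-AI/skyvern/issues/5637 | Unverified usage condition surfaced by source type github_issue.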

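## 2. Installation pitfall · Source evidence: Release v1.0.29

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Release v1.0.29
- User impact: May block installation or the first run.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_d8e67eb179f5406fb5af75a063639216 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.29 | The source discussion mentions python-related conditions; re-check before installing or trialing.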

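## 3. Installation pitfall · Source evidence: Task Execution Performance: Seeking guidance on optimizing execution speed

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Task Execution Performance: Seeking guidance on optimizing execution speed
- User impact: May raise the cost of trials and production onboarding for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_7f47a5abded54abfb766032115bfa71c | https://github.com/Skyvern-AI/skyvern/issues/4375 | Unverified usage condition surfaced by source type github_issue.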

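## 4. Installation pitfall · Source evidence: [Feature Request] Multi-session VNC support for local/self-hosted deployments (Live view & Take Control)

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: [Feature Request] Multi-session VNC support for local/self-hosted deployments (Live view & Take Control)
- User impact: May raise the cost of trials and production onboarding for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_8f2671eacb334774963837f8f7e8edf4 | https://github.com/Skyvern-AI/skyvern/issues/4392 | The source discussion mentions docker-related conditions; re-check before installing or trialing.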

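## 5. Configuration pitfall · Source evidence: Performance bottleneck: High latency for simple form-filling workflows

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: Performance bottleneck: High latency for simple form-filling workflows
- User impact: May raise the cost of trials and production onboarding for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_9820831a47af4ca28ef7abcc0fd7095e | https://github.com/Skyvern-AI/skyvern/issues/4439 | Unverified usage condition surfaced by source type github_issue.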

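## 6. Capability pitfall · Capability assessment depends on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Turn the assumption into a downstream verification checklist.
- Safeguard: Assumptions must become verification items; they cannot be written as facts before verification results exist.
- Evidence: capability.assumptions | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | README/documentation is current enough for a first validation pass.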

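## 7. Maintenance pitfall · Source evidence: Release v1.0.34

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified maintenance/versioning issue in this project: Release v1.0.34
- User impact: May raise the cost of trials and production onboarding for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_660eece743754686b70883d67faffd46 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.34 | Unverified usage condition surfaced by source type github_release.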

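## 8. Maintenance pitfall · Source evidence: Release v1.0.35

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified maintenance/versioning issue in this project: Release v1.0.35
- User impact: May raise the cost of trials and production onboarding for new users.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_53f24132e3214005ba1dc606965c0eb7 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.35 | Unverified usage condition surfaced by source type github_release.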

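## 9. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: New, abandoned, and active projects get lumped together, which lowers confidence in recommendations.
- Suggested check: Add GitHub signals for recent commits, releases, and issue/PR responsiveness.
- Safeguard: While maintenance activity is unknown, the recommendation strength cannot be marked as high trust.
- Evidence: evidence.maintainer_signals | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | last_activity_observed missing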

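## 10. Security/permissions pitfall · Downstream validation found risk items

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: Downstream has already requested a review; this must not be downplayed on the page.
- Suggested check: Add to the security/permissions governance review queue.
- Safeguard: While downstream risks exist, the review/recommendation must stay downgraded.
- Evidence: downstream_validation.risk_items | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | no_demo; severity=medium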

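## 11. Security/permissions pitfall · Safety notes present

- Severity: medium
- Evidence strength: source_linked
- Finding: No sandbox install has been executed yet; downstream must verify before user use.
- User impact: Users need to know the permission boundaries and sensitive operations before installing.
- Suggested check: Convert into an explicit permissions list and security review notice.
- Safeguard: Safety notes must be shown to users up front.
- Evidence: risks.safety_notes | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | No sandbox install has been executed yet; downstream must verify before user use.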

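## 12. Security/permissions pitfall · Scoring risks present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Record the risk in the boundary card and confirm whether manual review is needed.
- Safeguard: Scoring risks must go into the boundary card, not remain only an internal score.
- Evidence: risks.scoring_risks | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | no_demo; severity=medium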

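## 13. Security/permissions pitfall · Source evidence: Clarification on the Custom credential documentation on the Delete API with empty body

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Clarification on the Custom credential documentation on the Delete API with empty body
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_a5164e5767dc4b618ce1c862ed440eaa | https://github.com/Skyvern-AI/skyvern/issues/4256 | Unverified usage condition surfaced by source type github_issue.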

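## 14. Security/permissions pitfall · Source evidence: GROQ error

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: GROQ error
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_fb8dab5ab6224386a6b341234a8e90be | https://github.com/Skyvern-AI/skyvern/issues/4366 | The source discussion mentions docker-related conditions; re-check before installing or trialing.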

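## 15. Security/permissions pitfall · Source evidence: Release v1.0.27

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.27
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_23488c9e7f354979bdbf8e9554b4b647 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.27 | Unverified usage condition surfaced by source type github_release.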

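## 16. Security/permissions pitfall · Source evidence: Release v1.0.30

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.30
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_6929e0e36e1549e9bd95f29d1d4cfbdf | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.30 | Unverified usage condition surfaced by source type github_release.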

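## 17. Security/permissions pitfall · Source evidence: Release v1.0.31

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.31
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_40ca78dbdbe141f383c8d202f08db28b | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.31 | The source discussion mentions docker-related conditions; re-check before installing or trialing.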

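## 18. Security/permissions pitfall · Source evidence: Release v1.0.32

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.32
- User impact: May block installation or the first run.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_4eaa09d9a74f49efb0fb81c9bdebe6a3 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.32 | The source discussion mentions python-related conditions; re-check before installing or trialing.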

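## 19. Security/permissions pitfall · Source evidence: Release v1.0.33

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.33
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_ba7d530f080e4c88ad47ac26ba2781a7 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.33 | The source discussion mentions docker-related conditions; re-check before installing or trialing.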

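## 20. Security/permissions pitfall · Source evidence: Release v1.0.36

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Release v1.0.36
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_236e050a3837462692b739322738a7a2 | https://github.com/Skyvern-AI/skyvern/releases/tag/v1.0.36 | The source discussion mentions node-related conditions; re-check before installing or trialing.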

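## 21. Security/permissions pitfall · Source evidence: persist_browser_session flag saves sessions but never retrieves them on subsequent runs

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: persist_browser_session flag saves sessions but never retrieves them on subsequent runs
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_b56e85161a3b4b7682c8ab13127be8d0 | https://github.com/Skyvern-AI/skyvern/issues/4390 | Unverified usage condition surfaced by source type github_issue.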

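## 22. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will maintain the project when they hit a problem.
- Suggested check: Sample recent issues/PRs to judge whether they go unhandled for long periods.
- Safeguard: While issue/PR responsiveness is unknown, a maintenance risk notice is required.
- Evidence: evidence.maintainer_signals | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | issue_or_pr_quality=unknown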

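## 23. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and docs may lag behind the code, raising the chance that users hit pitfalls.
- Suggested check: Confirm that the latest release/tag matches the README install commands.
- Safeguard: While release cadence is unknown or stale, installation instructions must note possible drift.
- Evidence: evidence.maintainer_signals | art_9274907e6629499384a5a574e4caa877 | https://github.com/Skyvern-AI/skyvern#readme | release_recency=unknown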

<!-- canonical_name: Skyvern-AI/skyvern; human_manual_source: deepwiki_human_wiki -->
