# https://github.com/nottelabs/notte Project Documentation

Generated: 2026-05-16 12:22:12 UTC

## Table of Contents

- [Introduction to Notte](#introduction)
- [Quickstart Guide](#quickstart)
- [System Architecture](#architecture)
- [Agent Core System](#agent-core)
- [Structured Output](#structured-output)
- [Agent Fallback System](#agent-fallback)
- [Browser Sessions](#sessions)
- [Actions and Browser Controls](#actions-controls)
- [Vaults and Credential Management](#vaults-credentials)
- [Agent Personas](#personas)

<a id='introduction'></a>

## Introduction to Notte

### Related Pages

Related topics: [Quickstart Guide](#quickstart), [System Architecture](#architecture)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/nottelabs/notte/blob/main/README.md)
- [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)
- [packages/notte-core/src/notte_core/actions/actions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/actions/actions.py)
- [packages/notte-core/src/notte_core/errors/processing.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/errors/processing.py)
- [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)
- [packages/notte-agent/src/notte_agent/falco/prompt.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/prompt.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/personas.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/personas.py)
</details>


Notte is a software suite for internet-native agentic systems: a framework for building, deploying, and managing AI agents that interact with web content programmatically. The project is developed by Notte Labs, Inc. and licensed under the Server Side Public License v1.0.

Sources: [README.md](https://github.com/nottelabs/notte/blob/main/README.md)

## Overview

Notte transforms the internet into a structured, navigable space where each website becomes an accessible map for intelligent agents. The technology enables AI systems to interpret and interact with web content with precision, creating a programmatic layer over the web.

The suite offers multiple capabilities:

- **Web Scraping**: Extract structured data from any website
- **Browser Automation**: Navigate and interact with web pages programmatically
- **Agent Framework**: Build AI agents that can perform complex web tasks
- **Session Management**: Maintain stateful browsing sessions with cookie handling

Sources: [README.md](https://github.com/nottelabs/notte/blob/main/README.md), [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

## Architecture

The Notte architecture consists of several interconnected packages:

```mermaid
graph TD
    A[Client SDK] --> B[notte-core]
    A --> C[notte-llm]
    A --> D[notte-agent]
    B --> E[Actions Module]
    B --> F[Error Handling]
    C --> G[Prompts]
    C --> H[Data Extraction]
    D --> I[Falco Agent]
    D --> J[Gufo Agent]
```

### Core Packages

| Package | Purpose |
|---------|---------|
| `notte-core` | Core actions, error handling, and browser interaction primitives |
| `notte-llm` | LLM integration, prompts for document analysis, data extraction, and action generation |
| `notte-agent` | Agent implementations (Falco, Gufo) with validation and execution logic |
| `notte-sdk` | Python SDK for easy integration and API consumption |

Sources: [packages/notte-core/src/notte_core/actions/actions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/actions/actions.py), [packages/notte-core/src/notte_core/errors/processing.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/errors/processing.py)

## Browser Actions

Notte provides a comprehensive set of browser actions that agents can execute. Actions are defined as typed classes with execution messages and parameter validation.

### Navigation Actions

| Action | Description | Parameters |
|--------|-------------|------------|
| `goto` | Navigate to a URL | `url: str` |
| `goto_new_tab` | Open URL in a new tab | `url: str` |
| `close_tab` | Close the current tab | None |

**Example Usage:**

```python
session.execute(type="goto", url="https://console.notte.cc")
session.execute(type="goto_new_tab", url="https://example.com")
session.execute(type="close_tab")
```

Sources: [packages/notte-core/src/notte_core/actions/actions.py:1-100](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/actions/actions.py)

## Session Management

The SDK provides a session-based interface for browser automation. Sessions maintain state across multiple interactions and support cookie persistence.

### Session Lifecycle

```mermaid
graph LR
    A[Start Session] --> B[Execute Actions]
    B --> C[Observe State]
    C --> B
    B --> D[Stop Session]
    D --> E[Save Cookies]
```

### Basic Session Usage

```python
from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://www.notte.cc")
    obs = session.observe()
```

Sources: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

### Observation Types

Sessions support two perception modes for observing page state:

| Mode | Description | Use Case |
|------|-------------|----------|
| `fast` | Simple page perception for quick queries | Basic element detection |
| `deep` | LLM-powered formatting for rich action spaces | Complex interactions |

```python
# Fast observation
obs = session.observe(perception_type='fast')

# Deep observation for LLM-ready action space
obs = session.observe(perception_type='deep')
print(obs.space.description)
```

Sources: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

### Cookie Management

Sessions automatically handle cookie persistence:

```python
client = NotteClient(cookie_file="cookies.json")
with client.Session() as session:
    # Cookies are loaded on start and saved on stop
    session.execute(type="goto", url="https://example.com")
```

Sources: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

## Agent System

Notte includes sophisticated agent implementations for autonomous web navigation.

### Action Identification System

Agents identify interactive elements using a structured ID system:

| Prefix | Element Type | Examples |
|--------|--------------|----------|
| `I` | Input fields | Textboxes, selects, checkboxes |
| `B` | Buttons | Clickable buttons |
| `L` | Links | Hypertext links |
| `F` | Figures/Images | Visual elements |
| `O` | Select options | Dropdown options |
| `M` | Miscellaneous | Modals, dialogs |

**ID Format:** `<role_first_letter><index>`, optionally followed by `:` and a suffix (e.g., `B1`, `I2`, `L3:button`)

> **Note:** IDs can change at each step. Agents must not assume IDs persist across observations.

Sources: [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)
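
The ID scheme in the table can be sketched as a small parser. This is a minimal illustration, not code from the Notte repository: the regular expression, the `ElementId` type, and `parse_element_id` are hypothetical names, and the pattern assumes a single prefix letter, a numeric index, and an optional `:suffix`.

```python
import re
from typing import NamedTuple, Optional

# Prefix-to-element-type mapping taken from the prefix table.
ELEMENT_TYPES = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}

# Assumed pattern: one prefix letter, a numeric index, optional ":suffix".
_ID_PATTERN = re.compile(r"^([IBLFOM])(\d+)(?::(\w+))?$")


class ElementId(NamedTuple):
    prefix: str
    index: int
    suffix: Optional[str]

    @property
    def element_type(self) -> str:
        """Human-readable element type for this ID's prefix."""
        return ELEMENT_TYPES[self.prefix]


def parse_element_id(raw: str) -> ElementId:
    """Split an ID such as 'B1' or 'L3:button' into its parts."""
    match = _ID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"not a recognised element ID: {raw!r}")
    prefix, index, suffix = match.groups()
    return ElementId(prefix, int(index), suffix)
```

Because IDs can change between observations, a parsed `ElementId` should be treated as valid only for the observation it came from.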

### CAPTCHA Handling

Agents have built-in CAPTCHA detection and handling:

- Never interact directly with CAPTCHA elements
- Use the `captcha_solve` action when detection occurs
- Supported types: reCAPTCHA, hCaptcha, image verification, checkbox verification

```json
{
  "action": "captcha_solve",
  "captcha_type": "recaptcha"
}
```

Sources: [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)

### Validation System

The agent framework includes a validation pipeline:

```mermaid
graph TD
    A[Execute Action] --> B[Validate Output]
    B --> C{Has Observations?}
    C -->|No| D[Return Error]
    C -->|Yes| E[LLM Validation]
    E --> F{Is Valid?}
    F -->|Yes| G[Return Success]
    F -->|No| H[Return Failure]
```

The validator uses vision models when available to verify action outcomes against expected results.

Sources: [packages/notte-agent/src/notte_agent/common/validator.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/common/validator.py)
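
The branching in the validation diagram can be sketched in a few lines. This is an illustrative stand-in, not the validator in `validator.py`: `ValidationResult`, `validate_output`, and the `llm_judge` callable are hypothetical, and the LLM call is replaced by a plain function so the control flow can be read in isolation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class ValidationResult:
    success: bool
    reason: str


def validate_output(
    observations: Sequence[str],
    llm_judge: Callable[[Sequence[str]], bool],
) -> ValidationResult:
    """Follow the diagram: no observations is an error, else ask the judge."""
    if not observations:
        return ValidationResult(False, "no observations to validate")
    if llm_judge(observations):
        return ValidationResult(True, "validator accepted the output")
    return ValidationResult(False, "validator rejected the output")
```

Passing the judge in as a callable keeps the sketch testable without a model; a real validator would presumably also pass screenshots when a vision model is available.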

## Data Extraction

Notte provides structured data extraction capabilities through LLM-powered document analysis.

### Document Analysis Pipeline

| Stage | Description |
|-------|-------------|
| Analysis | Identify sections, content types, and structured data |
| Category | Classify document type (search-results, item, other) |
| Extraction | Transform content into structured format |

### Output Format

Extracted data is organized into two sections:

1. **`<document-analysis>`**: Logical breakdown of the document structure
2. **`<data-extraction>`**: Structured Markdown output with tables and lists

**Example Categories:**

| Category | Use Case |
|----------|----------|
| `search-results` | Google Flights, search engine results |
| `item` | Recipe pages, product details |
| `other` | General content (Allrecipes homepage) |

Sources: [packages/notte-llm/src/notte_llm/prompts/document-category/base/user.md](https://github.com/nottelabs/notte/blob/main/packages/notte-llm/src/notte_llm/prompts/document-category/base/user.md)

## API Integration

### REST API Endpoint

```bash
curl -X POST 'https://api.notte.cc/scrape' \
  -H 'Authorization: Bearer <NOTTE-API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://notte.cc",
    "only_main_content": false
  }'
```
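
For callers who want the same request without the SDK, the cURL call can be mirrored with the Python standard library. This is a sketch under the assumption that the endpoint and JSON fields are exactly those shown in the cURL example; `build_scrape_request` is a hypothetical helper name, and the function only builds the request object without sending it.

```python
import json
import urllib.request


def build_scrape_request(
    api_key: str, url: str, only_main_content: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL call."""
    body = json.dumps(
        {"url": url, "only_main_content": only_main_content}
    ).encode("utf-8")
    return urllib.request.Request(
        "https://api.notte.cc/scrape",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of passing the returned object to `urllib.request.urlopen`; the response schema is not described here, so it is left out of the sketch.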

### SDK Client Usage

```python
from notte_sdk import NotteClient
from pydantic import BaseModel

client = NotteClient()

# Basic scraping
response = client.scrape(
    url="https://notte.cc",
    scrape_links=True,
    only_main_content=True
)

# Structured scraping
class Article(BaseModel):
    title: str
    content: str
    date: str

response = client.scrape(
    url="https://example.com/blog",
    response_format=Article,
    instructions="Extract only the title, date and content of the articles"
)
```

Sources: [README.md](https://github.com/nottelabs/notte/blob/main/README.md)

## Personas

Notte supports persona-based operations for enhanced privacy and automation:

```python
import notte

persona = notte.Persona("<your-persona-id>")
sms = persona.sms(only_unread=True)
```

### Available Operations

| Method | Description |
|--------|-------------|
| `sms()` | Retrieve SMS messages for the persona |
| `create_number()` | Create a phone number |
| `delete_number()` | Delete the persona's phone number |

Sources: [packages/notte-sdk/src/notte_sdk/endpoints/personas.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/personas.py)

## Error Handling

Notte defines a comprehensive error hierarchy for different failure scenarios:

### Core Error Classes

| Error | Description |
|-------|-------------|
| `InvalidA11yTreeType` | Invalid accessibility tree type |
| `InvalidA11yChildrenError` | Invalid child element count |
| `InvalidPlaceholderError` | Unhandled placeholder in vault |
| `ScrapeFailedError` | Structured data extraction failure |

All errors provide developer advice and user-facing messages for appropriate handling.

Sources: [packages/notte-core/src/notte_core/errors/processing.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/errors/processing.py)

## Search Demo

Notte provides a live search demonstration using MCP server integration:

- **Demo URL**: [https://search.notte.cc/](https://search.notte.cc/)
- **Features**: Real-time search in LLM chatbots leveraging the scraping endpoint

Sources: [README.md](https://github.com/nottelabs/notte/blob/main/README.md)

## License and Citation

This project is licensed under the **Server Side Public License v1.0 (SSPL-1.0)**.

For academic or commercial use, cite as:

```bibtex
@software{notte2025,
  author = {Pinto, Andrea and Giordano, Lucas and {nottelabs-team}},
  title = {Notte: Software suite for internet-native agentic systems},
  url = {https://github.com/nottelabs/notte},
  year = {2025},
  publisher = {GitHub},
  license = {SSPL-1.0},
  version = {1.4.4}
}
```

Sources: [README.md](https://github.com/nottelabs/notte/blob/main/README.md)

---

<a id='quickstart'></a>

## Quickstart Guide

### Related Pages

Related topics: [Introduction to Notte](#introduction)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [pyproject.toml](https://github.com/nottelabs/notte/blob/main/pyproject.toml)
- [.env.example](https://github.com/nottelabs/notte/blob/main/.env.example)
- [docs/setup.md](https://github.com/nottelabs/notte/blob/main/docs/setup.md)
- [examples/quickstart.py](https://github.com/nottelabs/notte/blob/main/examples/quickstart.py)
</details>


The Quickstart Guide provides a streamlined path for developers to begin using the Notte SDK within minutes. It covers environment configuration, SDK initialization, and the fundamental workflows for web scraping and browser automation.

## Prerequisites

Before starting, ensure your development environment meets the following requirements:

| Requirement | Minimum Version | Notes |
|-------------|-----------------|-------|
| Python | 3.10+ | Required for type annotations and modern async features |
| pip | 21.0+ | For package installation |

## Environment Setup

### 1. Obtain API Credentials

Register at [notte.cc](https://notte.cc) to obtain your API key. The service requires authentication via Bearer token for all API requests.

### 2. Configure Environment Variables

Create a `.env` file in your project root with the following variables:

```bash
NOTTE_API_KEY=your_api_key_here
NOTTE_API_URL=https://api.notte.cc  # Optional, defaults to this value
```

Sources: [.env.example:1-2]()
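
Python does not read a `.env` file on its own, so the variables must be exported in the shell or loaded by a tool such as `python-dotenv` before they are visible. The sketch below shows one way to read them with the fallback URL described above; `load_notte_settings` and `DEFAULT_API_URL` are hypothetical names for illustration, not part of the SDK.

```python
import os
from typing import Mapping, Optional, Tuple

# Default taken from the comment in the .env example.
DEFAULT_API_URL = "https://api.notte.cc"


def load_notte_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (api_key, api_url), failing early if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("NOTTE_API_KEY")
    if not api_key:
        raise RuntimeError("NOTTE_API_KEY is not set")
    api_url = env.get("NOTTE_API_URL", DEFAULT_API_URL)
    return api_key, api_url
```

Accepting the mapping as a parameter makes the function easy to exercise in tests without touching the real process environment.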

### 3. Install the SDK

Install the Notte SDK using pip:

```bash
pip install notte
```

For additional providers or extras, install from the project root:

```bash
pip install -e ".[providers]"
```

Sources: [pyproject.toml:1-50]()

## Basic Usage

### SDK Client Initialization

Initialize the Notte client using environment variables or direct configuration:

```python
from notte_sdk import NotteClient

# Using environment variables (recommended)
client = NotteClient()

# Or with explicit parameters
client = NotteClient(
    api_key="your_api_key",
    base_url="https://api.notte.cc"
)
```

### Simple Web Scraping

Perform basic webpage scraping with minimal configuration:

```python
response = client.scrape(
    url="https://notte.cc",
    scrape_links=True,
    only_main_content=True
)
print(response.content)
```

Sources: [examples/quickstart.py:1-20]()

### Structured Data Extraction

Extract structured data using Pydantic models for type-safe responses:

```python
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    content: str
    date: str

response = client.scrape(
    url="https://example.com/blog",
    response_format=Article,
    instructions="Extract only the title, date and content of the articles"
)
```

Sources: [README.md:1-30]()

## Session-Based Automation

For complex interactions requiring multiple steps, use the Session API:

```python
with client.Session() as session:
    # Navigate to a page
    session.execute(type="goto", url="https://example.com")
    
    # Observe available actions
    obs = session.observe(perception_type="deep")
    
    # Execute form filling or clicking actions
    session.execute(type="click", id="B1")
```

Sources: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-50]()

## API Reference

### Client Configuration Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `api_key` | `str` | Yes | - | Your Notte API key |
| `base_url` | `str` | No | `https://api.notte.cc` | Base URL for API requests |
| `timeout` | `int` | No | `60` | Request timeout in seconds |

### Scrape Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | `str` | Yes | - | Target URL to scrape |
| `only_main_content` | `bool` | No | `True` | Exclude navbars and footers |
| `scrape_links` | `bool` | No | `True` | Include hyperlinks in response |
| `response_format` | `BaseModel` | No | `None` | Pydantic model for structured output |
| `instructions` | `str` | No | `None` | Natural language extraction instructions |

## Workflow Diagram

```mermaid
graph TD
    A[Start] --> B[Install SDK]
    B --> C[Configure API Key]
    C --> D{Use Case}
    D -->|Simple Scrape| E[client.scrape]
    D -->|Structured Data| F[Define BaseModel]
    F --> G[client.scrape with response_format]
    D -->|Complex Automation| H[Create Session]
    H --> I[Observe Actions]
    I --> J[Execute Actions]
    J --> K[Return Results]
    E --> L[End]
    G --> L
    K --> L
```

## cURL Alternative

For environments without Python, use the REST API directly:

```bash
curl -X POST 'https://api.notte.cc/scrape' \
  -H 'Authorization: Bearer <NOTTE-API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://notte.cc",
    "only_main_content": false
  }'
```

Sources: [README.md:40-50]()

## Next Steps

- Review the [Setup Documentation](https://github.com/nottelabs/notte/blob/main/docs/setup.md) for advanced configuration
- Explore the [Examples Directory](https://github.com/nottelabs/notte/tree/main/examples) for complete use cases
- Check the [Agent Documentation](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/README.md) for browser automation with AI agents

---

<a id='architecture'></a>

## System Architecture

### Related Pages

Related topics: [Introduction to Notte](#introduction), [Agent Core System](#agent-core), [Browser Sessions](#sessions)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-agent/src/notte_agent/__init__.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/__init__.py)
- [packages/notte-browser/src/notte_browser/__init__.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/__init__.py)
- [packages/notte-core/src/notte_core/__init__.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/__init__.py)
- [packages/notte-sdk/src/notte_sdk/__init__.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/__init__.py)
- [packages/notte-llm/src/notte_llm/__init__.py](https://github.com/nottelabs/notte/blob/main/packages/notte-llm/src/notte_llm/__init__.py)
</details>


Notte is a software suite designed for internet-native agentic systems, providing a comprehensive infrastructure for browser automation, web interaction, and AI-driven document processing. The architecture follows a modular design pattern with distinct layers for browser control, agent orchestration, LLM integration, and SDK accessibility.

## Overview

The Notte system is composed of five primary packages that work together to enable autonomous web interaction:

| Package | Purpose |
|---------|---------|
| `notte-core` | Core utilities, error handling, and shared data structures |
| `notte-browser` | Browser interaction and CDP integration layer |
| `notte-agent` | AI agent orchestration (Gufo and Falco subsystems) |
| `notte-llm` | LLM prompt management and document processing |
| `notte-sdk` | Public SDK for client applications |

## High-Level Architecture

```mermaid
graph TD
    Client[Client Application]
    SDK[notte-sdk]
    Agent[notte-agent]
    Browser[notte-browser]
    LLM[notte-llm]
    Core[notte-core]
    CDP[Chrome DevTools Protocol]
    Remote[Remote Browser]

    Client --> SDK
    SDK --> Agent
    SDK --> LLM
    Agent --> Browser
    Browser --> CDP
    CDP --> Remote
    LLM --> Core
    Core --> Browser
```

## Core Package (`notte-core`)

The `notte-core` package provides foundational components used across all other packages, including error handling, accessibility tree processing, and placeholder management.

### Error Handling Architecture

The error system is built on a hierarchical class structure rooted in `NotteBaseError`:

```python
# Source: packages/notte-core/src/notte_core/errors/processing.py
class NotteBaseError(Exception):
    def __init__(self, agent_message, user_message, dev_message): ...
```

**Error Categories:**

| Error Class | Purpose |
|-------------|---------|
| `InvalidInternalCheckError` | Internal validation failures with developer guidance |
| `InvalidA11yTreeType` | Unsupported accessibility tree format |
| `InvalidA11yChildrenError` | Accessibility tree structure violations |
| `InvalidPlaceholderError` | Vault placeholder resolution failures |
| `ScrapeFailedError` | Structured data extraction failures |

### Placeholder System

The placeholder system enables secure credential management through a vault mechanism. When an action requires sensitive data, the system substitutes placeholders that are resolved at runtime.

```python
# Source: packages/notte-core/src/notte_core/errors/processing.py
class InvalidPlaceholderError(NotteBaseError):
    def __init__(self, placeholder: str) -> None:
        dev_message = f"The placeholder {placeholder} is not handled by your current vault."
        agent_message = f"Could not perform action with value {placeholder}. Try picking a different value"
```
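
To make the three-audience message pattern concrete, here is a self-contained sketch. It deliberately uses stand-in class names (`BaseErrorSketch`, `PlaceholderErrorSketch`) rather than the real ones, and the `resolve_placeholder` helper, the dict-based vault, and the user-facing wording are all assumptions made for illustration.

```python
from typing import Dict


class BaseErrorSketch(Exception):
    """Stand-in for NotteBaseError: one message per audience."""

    def __init__(self, agent_message: str, user_message: str, dev_message: str) -> None:
        super().__init__(dev_message)
        self.agent_message = agent_message
        self.user_message = user_message
        self.dev_message = dev_message


class PlaceholderErrorSketch(BaseErrorSketch):
    """Stand-in for InvalidPlaceholderError."""

    def __init__(self, placeholder: str) -> None:
        super().__init__(
            agent_message=(
                f"Could not perform action with value {placeholder}. "
                "Try picking a different value"
            ),
            # Assumed wording: the excerpt does not show the user message.
            user_message="A required credential could not be resolved.",
            dev_message=f"The placeholder {placeholder} is not handled by your current vault.",
        )


def resolve_placeholder(vault: Dict[str, str], placeholder: str) -> str:
    """Hypothetical vault lookup: swap a placeholder for its secret at runtime."""
    try:
        return vault[placeholder]
    except KeyError:
        raise PlaceholderErrorSketch(placeholder) from None
```

Keeping the agent message separate from the developer message lets the agent retry with a different value while the developer still sees which placeholder the vault could not handle.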

## SDK Package (`notte-sdk`)

The SDK provides the primary interface for client applications to interact with the Notte system. It exposes session management, observation capabilities, and action execution through a Pythonic API.

### Session Management

Sessions are the fundamental unit of work in Notte, representing a single browser session with associated state and context.

```python
# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
class Session:
    def __init__(self, ..., timeout_minutes: int = ...):
        self.response = None
        self._cookie_file = None
```

**Session Lifecycle:**

| State | Description |
|-------|-------------|
| `created` | Session object instantiated |
| `started` | `session.start()` called, `session_id` available |
| `active` | Browser operations in progress |
| `stopped` | `session.stop()` called, session terminated |
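
The lifecycle states can be modelled as a small state machine. This is a sketch only: the state names come from the table, but the transition rules, `SessionState`, and `transition` are assumptions, since the table lists the states without spelling out which moves are legal.

```python
from enum import Enum


class SessionState(Enum):
    CREATED = "created"
    STARTED = "started"
    ACTIVE = "active"
    STOPPED = "stopped"


# Assumed transition rules; a stopped session is treated as terminal.
ALLOWED_TRANSITIONS = {
    SessionState.CREATED: {SessionState.STARTED},
    SessionState.STARTED: {SessionState.ACTIVE, SessionState.STOPPED},
    SessionState.ACTIVE: {SessionState.ACTIVE, SessionState.STOPPED},
    SessionState.STOPPED: set(),
}


def transition(current: SessionState, target: SessionState) -> SessionState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Using the session as a context manager, as in the other examples on this page, avoids having to track these states by hand.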

### Observation System

The observation system retrieves the current state of the webpage and available interactive elements:

```python
# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
def observe(self, *, perception_type: str = None, instructions: str = None, **data):
    if data.get("perception_type") is None:
        data["perception_type"] = self.default_perception_type
    return self.client.page.observe(session_id=self.session_id, **data)
```

**Perception Types:**

| Type | Description | Use Case |
|------|-------------|----------|
| `fast` | Simple page perception for quick queries | Default, rapid action space generation |
| `deep` | LLM-powered element formatting | Complex pages requiring structured analysis |

### Action Execution

Actions are executed through the unified `execute()` method with type-based dispatch:

```python
# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
def execute(self, *, raise_on_failure: bool = None, **kwargs: Unpack[FormFillActionDict]) -> ExecutionResult: ...
```

### Cookie Management

Sessions automatically persist cookies to file for session continuity:

```python
# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
if self._cookie_file is not None:
    cookies = self.get_cookies()
    create_or_append_cookies_to_file(self._cookie_file, cookies)
```
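
The excerpt calls `create_or_append_cookies_to_file` without showing its body. A plausible reading of the name is "create the file if missing, otherwise append to the stored list". The sketch below implements that reading with a hypothetical helper, `append_cookies_to_file`; the JSON-list file layout is an assumption, not the SDK's confirmed format.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Union

Cookie = Dict[str, Any]


def append_cookies_to_file(path: Union[str, Path], cookies: List[Cookie]) -> List[Cookie]:
    """Create the cookie file if needed, append new cookies, return the full list."""
    file = Path(path)
    # Assumed layout: the file holds a single JSON array of cookie objects.
    existing: List[Cookie] = json.loads(file.read_text()) if file.exists() else []
    existing.extend(cookies)
    file.write_text(json.dumps(existing, indent=2))
    return existing
```

This version does not deduplicate cookies or handle an empty or corrupt file; a production helper would likely need both.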

## Agent Package (`notte-agent`)

The agent package contains two distinct agent subsystems: **Gufo** and **Falco**. Both subsystems handle browser automation but use different prompt strategies and action registries.

### Agent Subsystem Comparison

| Aspect | Gufo | Falco |
|--------|------|-------|
| System Prompt | `gufo/system.md` | `falco/system.md` |
| Element Format | Markdown with backticks | `id[:]<type>text</type>` |
| Tools | Configurable via `BaseTool` | Configurable via `BaseTool` |
| Action Registry | Custom implementation | `ActionRegistry` class |

### Element Identification System

Both agents use a consistent element identification scheme for interactive elements:

| Prefix | Element Type | Examples |
|--------|--------------|----------|
| `I` | Input fields | Textbox, select, checkbox, radio |
| `B` | Buttons | Submit, clickable elements |
| `L` | Links | Hypertext navigation |
| `F` | Figures/Images | Visual content |
| `O` | Options | Select dropdown items |
| `M` | Miscellaneous | Modals, dialogs, overlays |

```json
{
  "id": "I1",
  "type": "input",
  "label": "email",
  "value": "user@example.com"
}
```

### Gufo Agent System

The Gufo agent (`packages/notte-agent/src/notte_agent/gufo/system.md`) operates through structured JSON commands:

```json
{
  "actions": [{"type": "click", "id": "B1"}],
  "reasoning": "User wants to submit the form"
}
```

### Falco Agent System

The Falco agent (`packages/notte-agent/src/notte_agent/falco/prompt.py`) uses a prompt-based approach with configurable tools:

```python
# Source: packages/notte-agent/src/notte_agent/falco/prompt.py
class FalcoPrompt(BasePrompt):
    def __init__(
        self,
        prompt_file: Path | None = None,
        tools: list[BaseTool] | None = None,
    ) -> None:
        self.action_registry: ActionRegistry = ActionRegistry(tools)
```

### CAPTCHA Handling

Both agents implement strict CAPTCHA detection and handling:

```python
# Source: packages/notte-agent/src/notte_agent/gufo/system.md
# CAPTCHA HANDLING - CRITICAL RULES:
# - NEVER click on captcha elements directly
# - NEVER use "click", "type", or any other action on captcha elements
# - If detected, use ONLY the "captcha_solve" action
```

### Action Examples

**Form Filling** (source: `packages/notte-agent/src/notte_agent/falco/prompt.py`):
```json
{
  "type": "form_fill",
  "value": {
    "address1": "<my address>",
    "city": "<my city>",
    "state": "<my state>"
  }
}
```

**Navigation and Extraction:**
```json
{
  "type": "scrape",
  "instructions": "Extract the search results from the page"
}
```

## LLM Package (`notte-llm`)

The LLM package manages prompts and processing for document analysis, categorization, and structured data extraction.

### Prompt Categories

| Category | Purpose | Output Format |
|----------|---------|---------------|
| `document-category` | Classify web documents | `<document-category>type</document-category>` |
| `data-extraction` | Extract structured data | Markdown with sections |
| `action-listing` | List available actions | JSON action array |
| `extract-without-json-schema` | LLM-native extraction | Structured JSON |

### Document Categorization

Documents are classified into categories for downstream processing:

| Category | Description | Example |
|----------|-------------|---------|
| `search-results` | Search engine results page | Google search |
| `item` | Individual item/product page | Recipe, product listing |
| `other` | Uncategorized content | General pages |

```markdown
<!-- Source: packages/notte-llm/src/notte_llm/prompts/document-category/base/user.md -->
<document-category>other</document-category>
```

### Data Extraction Templates

The system supports multiple extraction formats:

| Template | Sections | Use Case |
|----------|----------|----------|
| `two_sections` | `<document-analysis>`, `<data-extraction>` | Standard extraction |
| `all_data` | Analysis + detailed extraction | Comprehensive data |
| `user.md` (base) | Custom format | Flexible extraction |

### Structured Output Generation

```markdown
<!-- Source: packages/notte-llm/src/notte_llm/prompts/data-extraction/user.md -->
<document-analysis>
Found X menus, Y text elements, Z interactive elements
[Analysis content...]
</document-analysis>
<data-extraction>
[Extracted data in Markdown format...]
</data-extraction>
```
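
Downstream code has to split this two-section output before using it. A minimal sketch of that step follows, assuming the model emits exactly the tag names shown above; `split_sections` is a hypothetical helper, not a function from `notte-llm`.

```python
import re
from typing import Dict

# Match either section tag and capture its body; \1 forces a matching close tag.
_SECTION = re.compile(r"<(document-analysis|data-extraction)>(.*?)</\1>", re.DOTALL)


def split_sections(llm_output: str) -> Dict[str, str]:
    """Map each section tag found in the output to its stripped body."""
    return {name: body.strip() for name, body in _SECTION.findall(llm_output)}
```

A missing section simply results in a missing key, so callers can decide whether an absent `data-extraction` block should count as an extraction failure.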

## Browser Package (`notte-browser`)

The browser package provides the low-level interface to the Chrome DevTools Protocol (CDP) for controlling headless browsers.

### Key Responsibilities

- Page navigation and loading
- Element interaction (click, type, scroll)
- Screenshot capture
- Accessibility tree generation
- Cookie management

### CDP Integration

```python
# Source: packages/notte-sdk/README.md
from notte_sdk import NotteClient

notte = NotteClient()
with notte.Session() as session:
    # Browser operations via CDP
    _ = session.execute(type="goto", url="https://example.com")
```

## Data Flow Architecture

```mermaid
graph LR
    User[User Request]
    SDK[SDK Session]
    Agent[Agent Processor]
    LLM[LLM Processing]
    Browser[Browser Engine]
    CDP[CDP Commands]
    
    User --> SDK
    SDK --> Agent
    Agent --> LLM
    LLM --> Agent
    Agent --> Browser
    Browser --> CDP
    CDP --> Browser
    Browser --> Agent
    Agent --> SDK
    SDK --> User
```

## Session State Machine

```mermaid
graph TD
    Init[Session Created] --> Start[session.start]
    Start --> Active[Session Active]
    Active --> Observe[session.observe]
    Active --> Execute[session.execute]
    Observe --> Active
    Execute --> Active
    Active --> Stop[session.stop]
    Stop --> End[Session Ended]
    
    Start -->|Error| Error[Error State]
    Error -->|Retry| Start
```

## Configuration Options

### SDK Initialization

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `timeout_minutes` | `int` | - | Session timeout |
| `open_viewer` | `bool` | `False` | Open browser viewer |
| `proxies` | `dict` | `None` | Proxy configuration |
| `_cookie_file` | `str` | `None` | Cookie persistence file |

### Perception Configuration

| Parameter | Type | Description |
|-----------|------|-------------|
| `perception_type` | `str` | `fast` or `deep` |
| `instructions` | `str` | Natural language filtering |

## Error Handling Flow

```mermaid
graph TD
    Action[Action Request]
    Validate{Validation}
    Validate -->|Pass| Execute
    Validate -->|Fail| InvalidError[InvalidInternalCheckError]
    
    Execute --> CDP[CDP Call]
    CDP -->|Success| Result
    CDP -->|Failure| ScrapeError[ScrapeFailedError]
    
    Placeholder{Placeholder Check}
    Result --> Placeholder
    Placeholder -->|Found| PlaceholderError[InvalidPlaceholderError]
    Placeholder -->|None| Complete
```

## Integration Examples

### Basic Session Usage

```python
# Source: packages/notte-sdk/README.md
from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://example.com")
    obs = session.observe()
    action = obs.space.sample(type='click')
    result = session.execute(action)
```

### Agent Deployment

```python
# Source: packages/notte-sdk/README.md
from notte_sdk import NotteClient

notte = NotteClient()
with notte.Session(open_viewer=True) as session:
    agent = notte.Agent(session=session)
    agent.start(
        task="Summarize the content of the page",
        url="https://www.google.com"
    )
```

## Summary

The Notte architecture provides a robust, layered approach to browser automation:

1. **Core Layer** (`notte-core`): Provides shared utilities, error handling, and base data structures
2. **Browser Layer** (`notte-browser`): Abstracts Chrome DevTools Protocol for browser control
3. **Agent Layer** (`notte-agent`): Implements AI-driven automation with Gufo and Falco subsystems
4. **LLM Layer** (`notte-llm`): Manages document processing and prompt engineering
5. **SDK Layer** (`notte-sdk`): Exposes the complete API to client applications

The modular design allows each layer to be used independently or in combination, enabling flexible deployment scenarios from simple web scraping to complex autonomous agent workflows.

---

<a id='agent-core'></a>

## Agent Core System

### Related Pages

Related topics: [Structured Output](#structured-output), [Agent Fallback System](#agent-fallback), [Browser Sessions](#sessions)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-agent/src/notte_agent/agent.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/agent.py)
- [packages/notte-agent/src/notte_agent/common/conversation.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/common/conversation.py)
- [packages/notte-agent/src/notte_agent/common/perception.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/common/perception.py)
- [packages/notte-agent/src/notte_agent/falco/agent.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/agent.py)
- [packages/notte-agent/src/notte_agent/gufo/agent.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/agent.py)
- [packages/notte-core/src/notte_core/agent_types.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/agent_types.py)
</details>

# Agent Core System

The Agent Core System is the central orchestration layer in Notte that enables AI agents to autonomously navigate and interact with web pages. It provides a unified interface for browser automation tasks, handling perception of web elements, action execution, and conversation management between the agent and web content.

## Architecture Overview

The Agent Core System consists of multiple layered components that work together to enable autonomous web interaction.

```mermaid
graph TD
    A[Agent Client] --> B[Agent Core]
    B --> C[Browser Session]
    C --> D[Web Page]
    B --> E[Perception Module]
    B --> F[Conversation Module]
    E --> G[Action Registry]
    G --> H[Falco Actions]
    G --> I[Gufo Actions]
```

## Agent Types

Notte supports multiple agent implementations, each designed for specific automation scenarios. The agent type determines the underlying action execution engine and available capabilities.

| Agent Type | Description | Use Case |
|------------|-------------|----------|
| `falco` | Standard browser automation agent | General web interaction, form filling, navigation |
| `gufo` | Advanced automation with stealth features | CAPTCHA handling, proxy rotation, anti-detection |

Source: [packages/notte-core/src/notte_core/agent_types.py:1-50]()

## Core Components

### Agent Base Class

The base `Agent` class provides the primary interface for task execution and state management.

```python
class Agent:
    def __init__(
        self,
        session: Session,
        vault: Vault | None = None,
        max_steps: int = 10,
        agent_type: AgentType = AgentType.FALCO,
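    ) -> None: ...

    # Signatures only; bodies are elided.
    def run(self, task: str, url: str | None = None) -> AgentResponse: ...
    def start(self, task: str, url: str | None = None) -> None: ...
    def status(self) -> AgentStatus: ...
    def stop(self) -> None: ...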
```

**Key Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `session` | `Session` | Required | Browser session for interaction |
| `vault` | `Vault \| None` | `None` | Secure credential storage |
| `max_steps` | `int` | `10` | Maximum action steps before termination |
| `agent_type` | `AgentType` | `FALCO` | Backend engine selection |

Source: [packages/notte-agent/src/notte_agent/agent.py:1-100]()

### Perception Module

The perception module analyzes web page structure and generates interactive action spaces for the agent.

```python
class Perception:
    def observe(
        self,
        perception_type: str = 'fast',
        instructions: str | None = None
    ) -> ObservationResponse: ...
```

**Perception Types:**

| Type | Description | Performance |
|------|-------------|-------------|
| `fast` | Simple page element parsing | Low latency |
| `deep` | LLM-powered element formatting | Higher accuracy, slower |

The perception system generates element IDs using a role-based prefix system:

- `I` - Input fields (textbox, select, checkbox)
- `B` - Buttons
- `L` - Links
- `F` - Figures/Images
- `O` - Select options
- `M` - Miscellaneous elements (modals, dialogs)

Source: [packages/notte-agent/src/notte_agent/common/perception.py:1-80]()
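
To make the prefix scheme concrete, here is a small sketch that maps an element ID back to its role. The prefix table is copied from the list above; the ID format (one prefix letter followed by a number, such as `B3`) and the helper name `role_of` are assumptions made for illustration.

```python
# Prefix-to-role table copied from the list above.
ROLE_BY_PREFIX = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}


def role_of(element_id: str) -> str:
    """Return the role encoded in an ID such as 'B3' (assumed format: prefix + number)."""
    prefix, index = element_id[:1], element_id[1:]
    if prefix not in ROLE_BY_PREFIX or not index.isdigit():
        raise ValueError(f"unrecognized element ID: {element_id!r}")
    return ROLE_BY_PREFIX[prefix]


print(role_of("B3"))   # button
print(role_of("I12"))  # input
```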

### Conversation Module

The conversation module manages the dialogue history between the agent and web content, maintaining context across multiple interaction steps.

```python
class Conversation:
    def __init__(self, system_prompt: str) -> None: ...
    def add_user_message(self, content: str) -> None: ...
    def add_agent_message(self, content: str) -> None: ...
    def get_messages(self) -> list[Message]: ...
```

The conversation system tracks:
- User task requests
- Agent reasoning and decisions
- Action execution results
- Page observations

Source: [packages/notte-agent/src/notte_agent/common/conversation.py:1-60]()
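
As a rough illustration of the interface above, the following sketch stores messages in order and always returns the system prompt first. The `Message` shape and the role names are assumptions; the real module may store richer data.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str  # "system", "user", or "assistant" (assumed role names)
    content: str


@dataclass
class Conversation:
    system_prompt: str
    _messages: list[Message] = field(default_factory=list)

    def add_user_message(self, content: str) -> None:
        self._messages.append(Message("user", content))

    def add_agent_message(self, content: str) -> None:
        self._messages.append(Message("assistant", content))

    def get_messages(self) -> list[Message]:
        # The system prompt always comes first, followed by the turns in order.
        return [Message("system", self.system_prompt), *self._messages]


conv = Conversation("You are a web agent.")
conv.add_user_message("Find the pricing page")
conv.add_agent_message("Clicking link L4")
print([m.role for m in conv.get_messages()])
```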

## Action System

### Action Types

The agent supports a comprehensive set of browser automation actions through a registry pattern.

```mermaid
graph LR
    A[Agent Decision] --> B[Action Registry]
    B --> C[FormFillAction]
    B --> D[ClickAction]
    B --> E[ScrapeAction]
    B --> F[CaptchaSolveAction]
    B --> G[GotoAction]
```

| Action | Description | Parameters |
|--------|-------------|------------|
| `goto` | Navigate to URL | `url: str` |
| `click` | Click element | `id: str` |
| `fill` | Fill form fields | `value: dict[str, str]` |
| `scrape` | Extract structured data | `instructions: str` |
| `captcha_solve` | Solve CAPTCHA | `captcha_type: str` |

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py:1-150]()

### Action Registry

The `ActionRegistry` maintains available actions and their schemas, enabling dynamic action discovery.

```python
class ActionRegistry:
    def __init__(self, tools: list[BaseTool]) -> None: ...
    def get_action_schemas(self) -> list[ActionSchema]: ...
    def register(self, action_cls: type[BaseTool]) -> None: ...
```
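
The registry pattern can be sketched as follows. This is a simplified illustration: `BaseTool`, `ActionSchema`, and the two tool classes are hypothetical stand-ins, and the constructor takes no arguments here, unlike the signature shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSchema:
    name: str
    parameters: dict[str, str]


class BaseTool:
    """Hypothetical base class: subclasses set `name` and `parameters`."""

    name: str = ""
    parameters: dict[str, str] = {}


class GotoTool(BaseTool):
    name = "goto"
    parameters = {"url": "str"}


class ClickTool(BaseTool):
    name = "click"
    parameters = {"id": "str"}


class ActionRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, type[BaseTool]] = {}

    def register(self, action_cls: type[BaseTool]) -> None:
        # Refuse duplicates so one name always maps to one tool.
        if action_cls.name in self._tools:
            raise ValueError(f"duplicate action: {action_cls.name}")
        self._tools[action_cls.name] = action_cls

    def get_action_schemas(self) -> list[ActionSchema]:
        return [ActionSchema(t.name, dict(t.parameters)) for t in self._tools.values()]


registry = ActionRegistry()
registry.register(GotoTool)
registry.register(ClickTool)
print([schema.name for schema in registry.get_action_schemas()])
```

Keeping schemas derived from the registered classes means the list an agent sees cannot drift from the tools that are actually available.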

### Supported Action Formats

Actions are serialized using JSON schema format for agent consumption:

```json
{
  "type": "object",
  "properties": {
    "id": {"type": "string", "description": "Element identifier"},
    "value": {"type": "string", "description": "Action value"}
  }
}
```

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py:30-80]()

## Agent Implementations

### Falco Agent

Falco is the default agent implementation and uses standard Playwright-based browser automation.

```python
class FalcoAgent(BaseAgent):
    def __init__(self, tools: list[BaseTool] | None = None) -> None: ...
    def execute(self, action: dict) -> ExecutionResult: ...
```

**Features:**
- Standard form filling
- Click-based navigation
- Basic scraping operations
- Simple CAPTCHA detection

Source: [packages/notte-agent/src/notte_agent/falco/agent.py:1-100]()

### Gufo Agent

Gufo is an advanced agent with stealth capabilities for bypassing detection systems.

```python
class GufoAgent(BaseAgent):
    def __init__(self, tools: list[BaseTool] | None = None) -> None: ...
    def execute(self, action: dict) -> ExecutionResult: ...
```

**Features:**
- Automatic CAPTCHA solving
- Proxy rotation support
- User agent spoofing
- Cookie management

Source: [packages/notte-agent/src/notte_agent/gufo/agent.py:1-100]()

## Session Integration

Agents operate within browser sessions that provide the execution environment.

```python
from notte_sdk import NotteClient

client = NotteClient()
with client.Session(browser_type="chrome", open_viewer=True) as session:
    agent = client.Agent(session=session, max_steps=15)
    response = agent.run(
        task="Navigate to the form and submit with sample data",
        url="https://example.com/form"
    )
```

### Session Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `browser_type` | `str` | `"chrome"` | Browser engine |
| `open_viewer` | `bool` | `False` | Display browser window |
| `timeout_minutes` | `int` | `5` | Session timeout |
| `proxies` | `bool \| str` | `False` | Proxy configuration |
| `solve_captchas` | `bool` | `False` | Auto-CAPTCHA solving |

## Workflow Execution

```mermaid
sequenceDiagram
    participant User
    participant Agent
    participant Session
    participant Browser
    User->>Agent: run(task)
    Agent->>Session: observe()
    Session->>Browser: Get page state
    Browser-->>Session: Page elements
    Session-->>Agent: Observation
    Agent->>Agent: Plan action
    Agent->>Session: execute(action)
    Session->>Browser: Perform action
    Browser-->>Session: Result
    Session-->>Agent: ExecutionResult
    Agent->>Agent: Check completion
    Note over Agent,Browser: Loop until task complete or max_steps
```
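
The observe, plan, execute loop in the diagram can be sketched without any browser. This is a toy model under stated assumptions: `observe`, `plan`, and `execute` are caller-supplied callables, and `plan` signals completion by returning `None`, which is a convention chosen for this example only.

```python
from __future__ import annotations

from typing import Callable


def run_loop(
    observe: Callable[[], str],
    plan: Callable[[str], str | None],
    execute: Callable[[str], None],
    max_steps: int = 10,
) -> tuple[bool, int]:
    """Return (completed, steps_used). `plan` returns None when the task is done."""
    for step in range(max_steps):
        observation = observe()
        action = plan(observation)
        if action is None:  # completion check
            return True, step
        execute(action)
    return False, max_steps  # step budget exhausted


# Toy environment: the "page" is a counter and the task is to reach 3.
state = {"n": 0}
done, steps = run_loop(
    observe=lambda: str(state["n"]),
    plan=lambda obs: None if obs == "3" else "increment",
    execute=lambda action: state.update(n=state["n"] + 1),
)
print(done, steps)
```

The `max_steps` bound is what guarantees termination when the planner never reports completion.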

## Error Handling

The system provides structured error handling for various failure scenarios.

| Error Class | Description | Resolution |
|-------------|-------------|------------|
| `InvalidPlaceholderError` | Vault credential unavailable | Select alternative value |
| `ScrapeFailedError` | Data extraction failed | Retry with different instructions |
| `InvalidA11yTreeType` | Unknown accessibility tree type | Check code implementation |
| `InvalidA11yChildrenError` | Element hierarchy mismatch | Verify page structure |

Source: [packages/notte-core/src/notte_core/errors/processing.py:1-100]()

## SDK Usage Example

```python
from notte_sdk import NotteClient

client = NotteClient()

# Basic agent usage
with client.Session(open_viewer=True) as session:
    agent = client.Agent(session=session, max_steps=10)
    response = agent.run(
        task="Find the search box and search for 'python tutorials'",
        url="https://www.google.com"
    )
    print(response.answer)

# With persona (digital identity)
with client.Persona(create_phone_number=True) as persona:
    with client.Session(browser_type="chrome") as session:
        agent = client.Agent(session=session, persona=persona, max_steps=15)
        response = agent.run(
            task="Complete the registration form",
            url="https://example.com/register"
        )
```

## Best Practices

1. **Set an appropriate `max_steps`** - Balance task completion against resource usage
2. **Match perception to the task** - Use `fast` perception for simple tasks and `deep` perception for complex page analysis
3. **Implement vault storage** for reusable credentials across sessions
4. **Handle errors gracefully** - Check execution results before proceeding
5. **Use stealth features** when bypassing detection is required

---

<a id='structured-output'></a>

## Structured Output

### Related Pages

Related topics: [Agent Core System](#agent-core)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-core/src/notte_core/utils/pydantic_schema.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/utils/pydantic_schema.py)
- [packages/notte-browser/src/notte_browser/scraping/schema.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/scraping/schema.py)
- [packages/notte-llm/src/notte_llm/prompts/generate-json-schema/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-llm/src/notte_llm/prompts/generate-json-schema/system.md)
- [packages/notte-llm/src/notte_llm/prompts/extract-without-json-schema/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-llm/src/notte_llm/prompts/extract-without-json-schema/system.md)
</details>

# Structured Output

Structured Output enables Notte to extract web page content and return it as typed, structured data using Pydantic models. This feature bridges the gap between unstructured web content and programmatic data processing, allowing developers to define expected output schemas and receive validated, type-safe data.

## Overview

Notte's Structured Output system consists of two complementary approaches:

1. **Schema-Based Extraction**: Uses dynamically generated JSON schemas from Pydantic models
2. **Natural Language Extraction**: Uses instructions-based extraction without strict schema enforcement

The system leverages Pydantic for schema definition, ensuring type safety and validation at the application layer. When a `response_format` is provided, Notte generates a corresponding JSON Schema that guides the LLM in producing correctly structured output.

Source: [packages/notte-core/src/notte_core/utils/pydantic_schema.py](packages/notte-core/src/notte_core/utils/pydantic_schema.py)

## Architecture

```mermaid
graph TD
    A[User Request] --> B[Define Pydantic Model]
    B --> C[response_format Parameter]
    C --> D{Schema Type}
    D -->|With Schema| E[Generate JSON Schema]
    D -->|Without Schema| F[Instructions-Only Extraction]
    E --> G[Prompt Engineering]
    F --> G
    G --> H[LLM Processing]
    H --> I[Output Validation]
    I --> J[Typed Response]
```

## Core Components

### Pydantic Schema Generation

The `pydantic_schema.py` module provides utilities for converting Python Pydantic models into JSON schemas that can be consumed by LLMs.

| Function | Purpose |
|----------|---------|
| `model_to_json_schema()` | Converts Pydantic model class to JSON schema |
| `validate_response()` | Validates LLM output against expected schema |
| `extract_structured_data()` | Extracts and parses structured data from response |

Source: [packages/notte-core/src/notte_core/utils/pydantic_schema.py](packages/notte-core/src/notte_core/utils/pydantic_schema.py)

### Scraping Schema

The `schema.py` module defines the scraping configuration and response handling for structured output.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `response_format` | `type[BaseModel]` | `None` | Pydantic model for structured output |
| `instructions` | `str` | `None` | Natural language extraction instructions |
| `raise_on_failure` | `bool` | `True` | Raise exception on extraction failure |

Source: [packages/notte-browser/src/notte_browser/scraping/schema.py](packages/notte-browser/src/notte_browser/scraping/schema.py)
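
The effect of `raise_on_failure` can be illustrated with a standard-library sketch. Everything here is a stand-in: `StructuredData`, `ScrapeFailedError`, and `parse_output` are defined locally for the example, and a plain required-field check replaces Pydantic validation.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


class ScrapeFailedError(Exception):
    """Local stand-in for the error named in this document."""


@dataclass
class StructuredData:
    success: bool
    data: dict | None = None
    error: str | None = None


def parse_output(raw: str, required: set[str], raise_on_failure: bool = True) -> StructuredData:
    """Parse LLM output and check that every required field is present."""
    try:
        payload = json.loads(raw)
        if not isinstance(payload, dict):
            raise ValueError("expected a JSON object")
        missing = required - set(payload)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        if raise_on_failure:
            raise ScrapeFailedError(str(exc)) from exc
        return StructuredData(success=False, error=str(exc))
    return StructuredData(success=True, data=payload)


ok = parse_output('{"title": "Lamp", "price": "$20"}', {"title", "price"})
bad = parse_output('{"title": "Lamp"}', {"title", "price"}, raise_on_failure=False)
print(ok.success, bad.success, bad.error)
```

With `raise_on_failure=False` the caller must inspect `success` before touching `data`; with the default, a failure cannot be silently ignored.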

## Usage Patterns

### Basic Structured Extraction

```python
from notte_sdk import NotteClient
from pydantic import BaseModel

class Product(BaseModel):
    title: str
    price: str
    description: str | None = None

client = NotteClient()
data = client.scrape(
    url="https://example.com/product",
    response_format=Product,
    instructions="Extract the product title, price, and description"
)
```

Source: [packages/notte-sdk/src/notte_sdk/client.py](packages/notte-sdk/src/notte_sdk/client.py)

### Session-Based Extraction

```python
from notte_sdk import NotteClient
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    content: str
    date: str

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://example.com/blog")
    data = session.scrape(
        response_format=Article,
        instructions="Extract the title, date and content of the articles"
    )
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

### Error Handling

```python
from notte_sdk import NotteClient
from pydantic import BaseModel


class Product(BaseModel):
    title: str
    price: str
    description: str | None = None


client = NotteClient()

# With raise_on_failure=False, returns a StructuredData wrapper
# that carries the failure instead of raising an exception
result = client.scrape(
    url="https://example.com",
    response_format=Product,
    raise_on_failure=False
)

if not result.success:
    print(f"Extraction failed: {result.error}")
else:
    data = result.data
```

Source: [packages/notte-core/src/notte_core/errors/processing.py](packages/notte-core/src/notte_core/errors/processing.py)

## Prompt Engineering

### JSON Schema Generation Prompt

The `generate-json-schema/system.md` template guides the LLM in producing valid JSON output conforming to a specified schema. This prompt includes:

- Success examples showing correct JSON output
- Failure examples demonstrating invalid output handling
- Timestamp context for time-sensitive extraction

````markdown
Today is: {{timestamp}}

Transform the following document into structured JSON output based on the provided user request:

```markdown
{{& document}}
```
````

Source: [packages/notte-llm/src/notte_llm/prompts/generate-json-schema/system.md](packages/notte-llm/src/notte_llm/prompts/generate-json-schema/system.md)

### Extraction Without Schema

The `extract-without-json-schema/system.md` template provides an alternative approach for natural language extraction:

````markdown
```json
{{& success_example}}
```

Example of a valid output if you cannot answer the user request:
```json
{{& failure_example}}
```
````

This approach allows flexible extraction when strict schema conformance is not required.

Source: [packages/notte-llm/src/notte_llm/prompts/extract-without-json-schema/system.md](packages/notte-llm/src/notte_llm/prompts/extract-without-json-schema/system.md)

## Scrape Action Configuration

The underlying scrape action provides granular control over extraction:

```python
session.execute(type="scrape", only_images=True)  # Scrape only images
session.execute(type="scrape", response_format={"type": "object", "properties": {...}})  # With JSON schema
```

| Action Parameter | Description |
|-----------------|-------------|
| `instructions` | Natural language instructions for extraction |
| `only_main_content` | Exclude navbars, footers (default: `True`) |
| `selector` | Playwright selector to scope extraction |
| `only_images` | Extract images only |
| `scrape_links` | Include links in output (default: `True`) |
| `scrape_images` | Include image data |

Source: [packages/notte-core/src/notte_core/actions/actions.py](packages/notte-core/src/notte_core/actions/actions.py)

## Workflow

```mermaid
sequenceDiagram
    participant User
    participant SDK as NotteClient
    participant LLM as LLM Engine
    participant API as Notte API

    User->>SDK: scrape(url, response_format=Model)
    SDK->>SDK: Generate JSON Schema from Model
    SDK->>API: POST /scrape with schema + instructions
    API->>LLM: Process page with schema
    LLM-->>API: Structured JSON response
    API-->>SDK: Validated response
    SDK-->>User: Typed Model instance
```

## Best Practices

1. **Define Clear Schemas**: Use descriptive field names and include type annotations
2. **Provide Contextual Instructions**: Give the LLM context about what to extract
3. **Handle Optional Fields**: Use `| None` for fields that may not always be present
4. **Validate Output**: Keep `raise_on_failure=True` (the default) for production use
5. **Scope Extraction**: Use selectors when extracting from specific page regions

## Error Handling

| Error Type | Cause | Resolution |
|------------|-------|------------|
| `ScrapeFailedError` | Extraction validation failed | Check instructions and schema compatibility |
| `LLMParsingError` | Malformed JSON in response | Ensure schema is properly generated |
| `InvalidPlaceholderError` | Missing credential reference | Configure vault for required credentials |

Source: [packages/notte-core/src/notte_core/errors/processing.py](packages/notte-core/src/notte_core/errors/processing.py)

---

<a id='agent-fallback'></a>

## Agent Fallback System

### Related Pages

Related topics: [Agent Core System](#agent-core), [Actions and Browser Controls](#actions-controls)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-agent/src/notte_agent/agent_fallback.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/agent_fallback.py)
- [packages/notte-sdk/src/notte_sdk/agent_fallback.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/agent_fallback.py)
- [packages/notte-agent/src/notte_agent/falco/prompt.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/prompt.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)
- [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)
</details>

# Agent Fallback System

## Overview

The Agent Fallback System in Notte handles agent execution failures and recovers from errors during browser automation tasks. When an agent hits an execution issue (CAPTCHA detection, an action failure, or another runtime error), the fallback system intercepts the failure, analyzes the error context, and selects a recovery strategy.

The system operates across two primary layers: the SDK layer (`notte_sdk`) and the agent layer (`notte_agent`), ensuring consistent error handling and recovery whether agents are executed locally or through the Notte API.

## Architecture

```mermaid
graph TD
    A[Agent Execution] --> B{Action Execution Success?}
    B -->|Yes| C[Continue Normal Flow]
    B -->|No| D[Fallback System Triggered]
    D --> E{Error Type Classification}
    E -->|Captcha| F[Captcha Handler]
    E -->|Action Failure| G[Retry Strategy]
    E -->|Fatal Error| H[Graceful Degradation]
    F --> I[Attempt Resolution]
    G --> J[Apply Retry Policy]
    H --> K[Report to User]
    I --> C
    J --> C
    K --> L[Session Cleanup]
```

## Core Components

### FallbackAction Class

The `FallbackAction` class represents the fundamental unit of fallback handling. It encapsulates the error context and provides structured data for downstream processing.

| Property | Type | Description |
|----------|------|-------------|
| `error_type` | `str` | Classification of the error (e.g., "captcha", "action_failure") |
| `error_message` | `str` | Detailed description of the failure |
| `metadata` | `dict` | Additional context including HTML element data, playwright code, and execution state |
| `retry_count` | `int` | Number of retry attempts performed |
| `timestamp` | `datetime` | When the error occurred |

Source: [packages/notte-agent/src/notte_agent/agent_fallback.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/agent_fallback.py)

### AgentFallbackManager

The `AgentFallbackManager` serves as the central coordinator for fallback operations, managing the lifecycle of error handling and recovery strategies.

```mermaid
sequenceDiagram
    participant Agent as Agent
    participant FallbackManager as FallbackManager
    participant RecoveryStrategy as RecoveryStrategy
    participant Session as Session
    
    Agent->>FallbackManager: Register failure context
    FallbackManager->>FallbackManager: Classify error type
    FallbackManager->>RecoveryStrategy: Select appropriate strategy
    RecoveryStrategy->>Session: Apply recovery action
    Session-->>Agent: Resume or terminate
```

#### Key Methods

| Method | Purpose |
|--------|---------|
| `register_failure()` | Records a new failure event in the fallback system |
| `classify_error()` | Determines the error category based on error characteristics |
| `select_recovery_strategy()` | Chooses the optimal recovery approach |
| `apply_recovery()` | Executes the chosen recovery mechanism |
| `should_retry()` | Evaluates whether additional attempts are warranted |

Source: [packages/notte-sdk/src/notte_sdk/agent_fallback.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/agent_fallback.py)

## Error Classification

The fallback system categorizes failures into distinct types, each with tailored handling strategies:

| Error Type | Description | Default Recovery |
|------------|-------------|------------------|
| `captcha` | CAPTCHA or verification challenges detected | Initiate captcha solving flow |
| `action_failure` | Interactive element action failed | Retry with modified selectors |
| `navigation_error` | Page navigation or URL resolution failure | Retry with exponential backoff |
| `timeout_error` | Operation exceeded time limits | Extend timeout and retry |
| `invalid_state` | Agent reached an inconsistent state | Reset to known good state |
| `fatal_error` | Unrecoverable error requiring termination | Graceful session cleanup |
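
One way to picture the table is as a lookup from error type to default recovery. The sketch below is hypothetical: the error types come from the table, but the recovery labels and the keyword-based `classify_error` heuristic are invented for illustration and are not taken from the Notte source.

```python
# Default recovery per error type; keys come from the table above,
# values are illustrative labels only.
DEFAULT_RECOVERY = {
    "captcha": "solve_captcha",
    "action_failure": "retry_with_modified_selectors",
    "navigation_error": "retry_with_backoff",
    "timeout_error": "extend_timeout_and_retry",
    "invalid_state": "reset_state",
    "fatal_error": "cleanup_session",
}


def classify_error(message: str) -> str:
    """Very rough keyword classifier (hypothetical heuristic)."""
    text = message.lower()
    if "captcha" in text:
        return "captcha"
    if "timeout" in text or "timed out" in text:
        return "timeout_error"
    if "navigat" in text:
        return "navigation_error"
    if "element" in text or "selector" in text:
        return "action_failure"
    return "fatal_error"


def select_recovery(message: str) -> str:
    return DEFAULT_RECOVERY[classify_error(message)]


print(select_recovery("Timeout 30000ms exceeded while clicking"))
```

Unrecognized messages fall through to `fatal_error`, so an unknown failure is never retried blindly.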

## Recovery Strategies

### 1. Captcha Resolution Strategy

When the system detects CAPTCHA challenges, it automatically triggers the captcha solving mechanism.

```python
# Captcha detection triggers automatic resolution
if error_type == "captcha":
    captcha_type = detect_captcha_type(metadata)
    captcha_action = CaptchaSolveAction(captcha_type=captcha_type)
    result = session.execute(captcha_action)
```

Source: [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)

### 2. Retry with Backoff Strategy

For transient failures, the system implements configurable retry logic with exponential backoff:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_retries` | `3` | Maximum retry attempts |
| `base_delay` | `1000` | Initial delay in milliseconds |
| `backoff_factor` | `2.0` | Multiplier for each retry |
| `jitter` | `True` | Randomization to prevent thundering herd |
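
With the defaults in the table, the delay before retry `n` (counting from 0) is `base_delay * backoff_factor ** n`. The sketch below computes that schedule. The jitter variant shown (a uniform draw between 0 and the computed delay, often called full jitter) is an assumption, since the table only says that randomization is applied.

```python
from __future__ import annotations

import random


def retry_delays_ms(
    max_retries: int = 3,
    base_delay: float = 1000,
    backoff_factor: float = 2.0,
    jitter: bool = True,
    rng: random.Random | None = None,
) -> list[float]:
    """Return the delay in milliseconds to wait before each retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        delay = base_delay * backoff_factor**attempt
        if jitter:
            # Assumed variant: pick uniformly between 0 and the computed delay.
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


print(retry_delays_ms(jitter=False))  # [1000.0, 2000.0, 4000.0]
```

Without jitter, clients that failed at the same moment would all retry at the same moments too, which is the thundering-herd effect the table refers to.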

### 3. Graceful Degradation Strategy

When recovery is not possible, the system performs controlled cleanup:

```python
# Graceful termination sequence
try:
    session.stop()
except Exception as e:
    logger.warning(f"Session cleanup warning: {e}")
finally:
    fallback_manager.record_final_state(error_context)
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)

## Integration with Session Management

The fallback system integrates tightly with the Notte session lifecycle:

```mermaid
graph LR
    A[Session Start] --> B[Agent Initialization]
    B --> C[Task Execution]
    C --> D{Success?}
    D -->|Yes| E[Complete Task]
    D -->|No| F[Fallback Check]
    F --> G{Retryable?}
    G -->|Yes| H[Apply Recovery]
    G -->|No| I[Log Failure]
    H --> C
    I --> J[Session Cleanup]
    E --> J
```

### Session Callback Integration

The SDK exposes callback hooks for fallback integration:

```python
session.on_failure(callback=fallback_manager.handle_failure)
session.on_retry(callback=fallback_manager.prepare_retry)
session.on_success(callback=fallback_manager.record_success)
```

## Configuration Options

| Configuration | Type | Default | Description |
|---------------|------|---------|-------------|
| `fallback_enabled` | `bool` | `True` | Enable/disable fallback system |
| `max_retry_attempts` | `int` | `3` | Global retry limit |
| `fallback_timeout_ms` | `int` | `30000` | Timeout for fallback operations |
| `capture_screenshots` | `bool` | `True` | Screenshot on failure for debugging |
| `verbose_logging` | `bool` | `False` | Detailed fallback logging |

## Usage Patterns

### Basic Usage with SDK

```python
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session)
    agent.start(task="Navigate and extract data", url="https://example.com")
    
    # Fallback system handles failures automatically
    status = agent.status()
```

Source: [packages/notte-sdk/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/README.md)

### Custom Fallback Handler

```python
class CustomFallbackHandler:
    def handle_failure(self, fallback_action: FallbackAction) -> RecoveryResult:
        # Custom logic for specific error types
        if fallback_action.error_type == "captcha":
            return RecoveryResult(strategy="custom_captcha_solver")
        return RecoveryResult(strategy="default")
```

## Error Reporting and Monitoring

The fallback system generates structured error reports containing:

| Field | Description |
|-------|-------------|
| `session_id` | Unique identifier for the session |
| `agent_id` | Identifier of the agent that failed |
| `error_type` | Classification of the failure |
| `error_message` | Human-readable error description |
| `playwright_code` | Relevant browser automation code |
| `html_element` | DOM element context when available |
| `timestamp` | ISO 8601 timestamp of the failure |
| `retry_history` | Array of previous retry attempts |

## Best Practices

1. **Enable Verbose Logging During Development**: Set `verbose_logging=True` to capture detailed fallback behavior
2. **Configure Appropriate Timeouts**: Match `fallback_timeout_ms` to your expected operation durations
3. **Monitor Retry Counts**: Track `retry_count` to identify persistent issues
4. **Preserve Error Context**: Always include `metadata` for effective debugging
5. **Test Fallback Paths**: Regularly validate fallback behavior under failure conditions

## Related Components

| Component | Purpose |
|-----------|---------|
| `FalcoPrompt` | Agent instruction and action generation |
| `SessionManager` | Browser session lifecycle |
| `CaptchaSolveAction` | Specialized captcha resolution |
| `ScrapeAction` | Data extraction with fallback support |
| `ErrorProcessing` | Low-level error handling |

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/prompt.py)

---

<a id='sessions'></a>

## Browser Sessions

### Related Pages

Related topics: [Actions and Browser Controls](#actions-controls), [Vaults and Credential Management](#vaults-credentials)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-browser/src/notte_browser/session.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/session.py)
- [packages/notte-browser/src/notte_browser/captcha.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/captcha.py)
- [packages/notte-browser/src/notte_browser/controller.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/controller.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)
- [packages/notte-core/src/notte_core/actions/actions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/actions/actions.py)
- [packages/notte-mcp/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-mcp/README.md)
- [packages/notte-sdk/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/README.md)
</details>

# Browser Sessions

## Overview

Browser Sessions in Notte provide a managed environment for automating web interactions through a cloud-based browser infrastructure. Sessions encapsulate the state of a browser instance, allowing developers to navigate websites, interact with elements, extract data, and handle complex web automation tasks programmatically.

The session system abstracts away the complexities of browser automation, providing a high-level API for:
- Navigating to URLs and managing browser tabs
- Observing page elements and available actions
- Executing automated interactions (clicks, form fills, scrolling)
- Scraping structured and unstructured data from web pages
- Solving CAPTCHA challenges automatically

Source: [packages/notte-sdk/README.md]()

## Architecture

```mermaid
graph TD
    A[NotteClient] --> B[Session]
    B --> C[Browser Controller]
    C --> D[Cloud Browser Instance]
    E[Actions] --> B
    B --> F[Observations]
    B --> G[Scrape Results]
    H[Captcha Handler] --> C
```

### Core Components

| Component | Package | Responsibility |
|-----------|---------|----------------|
| `Session` | `notte-sdk` | High-level API for session management |
| `BrowserController` | `notte-browser` | Low-level browser control and state |
| `CaptchaSolver` | `notte-browser` | Automatic CAPTCHA resolution |
| `ActionExecutor` | `notte-core` | Action execution and validation |

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](), [packages/notte-browser/src/notte_browser/controller.py]()

## Session Lifecycle

### Starting a Session

Sessions can be initialized using the context manager pattern for automatic cleanup:

```python
from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://www.example.com")
    # Perform actions...
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:30-40]()

### Session States

```mermaid
stateDiagram-v2
    [*] --> Idle: Client initialized
    Idle --> Active: Session started
    Active --> Active: Actions executed
    Active --> Stopping: stop() called
    Stopping --> Stopped: Cleanup complete
    Stopped --> [*]: Context exited
```
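
The lifecycle in the diagram can be sketched as a small transition table. This is an illustrative model only: the `SessionState` enum and the `advance` helper are hypothetical names, not part of the Notte SDK, and the allowed transitions are copied from the diagram rather than from the source code.

```python
from enum import Enum


class SessionState(Enum):
    IDLE = "idle"
    ACTIVE = "active"
    STOPPING = "stopping"
    STOPPED = "stopped"


# Allowed transitions, copied from the state diagram (hypothetical model).
TRANSITIONS = {
    SessionState.IDLE: {SessionState.ACTIVE},
    SessionState.ACTIVE: {SessionState.ACTIVE, SessionState.STOPPING},
    SessionState.STOPPING: {SessionState.STOPPED},
    SessionState.STOPPED: set(),
}


def advance(current: SessionState, target: SessionState) -> SessionState:
    """Return target if the diagram allows current -> target, else raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

In this model a stopped session cannot be reactivated, so a new session has to be created instead.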

### Stopping a Session

When a session stops, cookies are automatically saved if configured:

```python
def stop(self) -> None:
    if self._cookie_file is not None:
        try:
            cookies = self.get_cookies()
            create_or_append_cookies_to_file(self._cookie_file, cookies)
        except Exception as e:
            logger.error(f"🍪 Error saving cookies to {self._cookie_file}: {e}")
    self.response = self.client.stop(session_id=self.session_id)
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:50-65]()

## Session Configuration

### Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `timeout_minutes` | `int` | `5` | Session timeout in minutes |
| `proxies` | `bool` | `True` | Enable proxy support |
| `cookie_file` | `str` | `None` | Path to cookie persistence file |
| `open_viewer` | `bool` | `False` | Open visual browser viewer |

### Creating a Configured Session

```python
with client.Session(timeout_minutes=10, open_viewer=True) as session:
    status = session.status()
    session.viewer()
```

Source: [packages/notte-sdk/README.md]()

## Page Interaction

### Observation

The `observe()` method retrieves interactive elements on the current page. Notte supports two perception modes:

#### Fast Perception
A quick, simple page scan for basic element identification:

```python
obs = session.observe(perception_type='fast')
```

#### Deep Perception
LLM-powered analysis for richer action space generation:

```python
obs = session.observe(perception_type='deep')
print(obs.space.description)
```

#### Focused Observation
Use `instructions` to narrow the action space to a specific intent:

```python
actions = session.observe(instructions="Fill the email input")
print(actions[0].model_dump())
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:90-110]()

### Element Identification

Elements are identified using structured IDs with the format `<role_first_letter><index>[:]`:

| Role Letter | Element Type |
|-------------|--------------|
| `I` | Input fields (textbox, select, checkbox) |
| `B` | Buttons |
| `L` | Links |
| `F` | Figures and images |
| `O` | Options in select elements |
| `M` | Miscellaneous (modals, dialogs) |

Example: the element ID `I2` identifies the input field at index 2.

Source: [packages/notte-agent/src/notte_agent/gufo/system.md]()
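
To make the ID convention concrete, the sketch below splits an ID such as `I2` into its role and index. The `parse_element_id` helper and the short role names are hypothetical and exist only for illustration; the letter-to-role mapping comes from the table above.

```python
import re

# Role letters from the table above; the short role names are illustrative.
ROLE_BY_LETTER = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}

_ID_PATTERN = re.compile(r"([IBLFOM])(\d+)")


def parse_element_id(element_id: str) -> tuple[str, int]:
    """Split an ID such as 'I2' into (role, index); raise on anything else."""
    match = _ID_PATTERN.fullmatch(element_id)
    if match is None:
        raise ValueError(f"not a valid element ID: {element_id!r}")
    letter, index = match.groups()
    return ROLE_BY_LETTER[letter], int(index)
```

For example, `parse_element_id("B10")` yields the role `"button"` and the index `10`.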

## Browser Actions

### Action Types

Notte provides comprehensive browser automation actions:

#### Navigation Actions

| Action | Description | Parameters |
|--------|-------------|------------|
| `goto` | Navigate to a URL | `url: str` |
| `goto_new_tab` | Open URL in new tab | `url: str` |
| `close_tab` | Close current tab | - |
| `scroll` | Scroll the page | `direction: str`, `amount: int` |
| `scroll_to` | Scroll to element | `id: str` |

Source: [packages/notte-core/src/notte_core/actions/actions.py]()

#### Interaction Actions

| Action | Description | Parameters |
|--------|-------------|------------|
| `click` | Click an element | `id: str` |
| `fill` | Fill input field | `id: str`, `value: str` |
| `select` | Select option | `id: str`, `option: str` |
| `check` | Check checkbox | `id: str` |
| `press` | Press keyboard key | `key: str` |

Source: [packages/notte-core/src/notte_core/actions/actions.py]()

#### Data Extraction Actions

```python
# Scrape entire page
markdown = session.scrape()

# Scrape with instructions
result = session.scrape(instructions="Extract title and price")

# Scrape only images
session.scrape(only_images=True)

# Structured scraping with JSON schema
session.scrape(response_format={"type": "object", "properties": {...}})
```

Source: [packages/notte-core/src/notte_core/actions/actions.py:30-60]()

### Executing Actions

```python
# Get observations
obs = session.observe()

# Sample and execute an action
action = obs.space.sample(type='click')
result = session.execute(action)
assert result.success
```

Source: [packages/notte-sdk/README.md]()

## CAPTCHA Handling

Notte includes automatic CAPTCHA solving capabilities:

```python
session.execute(type="captcha_solve", captcha_type="recaptcha")
session.execute(type="captcha_solve")  # Auto-detect
```

### Supported CAPTCHA Types

| Type | Description |
|------|-------------|
| `recaptcha` | Google reCAPTCHA |
| `hcaptcha` | hCaptcha |
| `image` | Image-based CAPTCHA |
| `text` | Text-based CAPTCHA |
| `auth0` | Auth0 CAPTCHA |
| `cloudflare` | Cloudflare CAPTCHA |
| `datadome` | DataDome CAPTCHA |
| `arkose labs` | Arkose Labs CAPTCHA |
| `geetest` | Geetest CAPTCHA |
| `press&hold` | Press and hold challenge |

Source: [packages/notte-core/src/notte_core/actions/actions.py:70-95]()

## MCP Server Integration

Notte sessions can be accessed via the Model Context Protocol (MCP):

### Available Tools

| Tool | Description |
|------|-------------|
| `notte_new_session` | Start a new cloud browser session |
| `notte_list_sessions` | List all active sessions |
| `notte_stop_session` | Stop the current session |
| `notte_observe` | Observe elements on current page |
| `notte_screenshot` | Take a screenshot |
| `notte_scrape` | Extract structured data |
| `notte_step` | Execute an action |

Source: [packages/notte-mcp/README.md]()

### Server Setup

```bash
pip install notte-mcp
export NOTTE_API_KEY="your-api-key"
python -m notte_mcp.server
```

Source: [packages/notte-mcp/README.md]()

## Cookie Management

When `cookie_file` is set, a session saves its cookies on exit, which supports authenticated workflows:

```python
with client.Session(cookie_file="cookies.json") as session:
    session.execute(type="goto", url="https://example.com/login")
    # Login once - cookies saved automatically on exit
```

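On subsequent runs with the same `cookie_file`, the saved cookies are loaded automatically. The two helpers used by `stop()` have roughly these shapes (bodies omitted):

```python
def get_cookies(self) -> dict:
    """Return the cookies of the current browser session."""
    ...

def create_or_append_cookies_to_file(cookie_file: str, cookies: dict) -> None:
    """Write cookies to cookie_file, appending if the file already exists."""
    ...
```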

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:45-65]()
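
How appending to a cookie file might work can be sketched with the standard library alone. The `append_cookies` function below is a hypothetical stand-in, not the SDK's `create_or_append_cookies_to_file`. Both the cookie shape (dicts with `name` and `domain` keys) and the merge rule are assumptions made for illustration.

```python
import json
from pathlib import Path


def append_cookies(cookie_file: str, cookies: list[dict]) -> list[dict]:
    """Merge cookies into a JSON file, creating the file if it is missing."""
    path = Path(cookie_file)
    existing = json.loads(path.read_text()) if path.exists() else []
    # Assumed merge rule: a newer cookie replaces an older one that has
    # the same (name, domain) pair.
    merged = {(c.get("name"), c.get("domain")): c for c in existing + cookies}
    result = list(merged.values())
    path.write_text(json.dumps(result, indent=2))
    return result
```

Keying on `(name, domain)` keeps the file from growing with stale duplicates when the same login is repeated across runs.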

## Error Handling

### Session Errors

```python
try:
    with client.Session() as session:
        session.execute(type="goto", url="https://example.com")
except ValueError as e:
    # Session not started
    print(f"Error: {e}")
except RuntimeError as e:
    # Session failed to close
    print(f"Error: {e}")
```

### Action Execution Errors

```python
# With raise_on_failure=False, failures are reported on the result instead of raised
result = session.execute(action, raise_on_failure=False)
if not result.success:
    print(f"Action failed: {result.error}")
```

### Graceful Shutdown

The context manager ensures proper cleanup even on exceptions:

```python
with client.Session() as session:
    try:
        # Perform actions
        session.execute(type="goto", url="https://example.com")
    except Exception as e:
        logger.error(f"Session error: {e}")
    finally:
        # Cleanup happens automatically
        pass
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:50-70]()

## Best Practices

### Resource Management
- Always use the `with` statement for automatic session cleanup
- Set appropriate `timeout_minutes` based on task complexity
- Enable `open_viewer=True` for debugging complex interactions

### Performance Optimization
- Use `perception_type='fast'` for simple, quick operations
- Use `perception_type='deep'` when LLM interpretation is needed
- Filter observations with `instructions` to reduce processing overhead

### Reliability
- Implement retry logic for flaky network conditions
- Handle CAPTCHAs proactively using the built-in solver
- Persist cookies for authenticated workflows
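
The retry advice above can be sketched as a small generic helper with exponential backoff. `with_retries` is a hypothetical name, not something the Notte SDK provides. In practice the wrapped callable would be a closure around a call such as `session.execute(...)`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call fn until it succeeds, sleeping base_delay * 2**n between tries."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")
```

Catching every `Exception` is deliberately broad for a sketch. Real code would usually retry only on errors known to be transient.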

### Security
- Store API keys in environment variables
- Never commit cookie files with sensitive credentials to version control
- Rotate proxy configurations periodically

## API Reference

### NotteClient.Session

| Method | Description |
|--------|-------------|
| `__enter__()` | Start session |
| `__exit__()` | Stop session |
| `execute()` | Execute browser action |
| `observe()` | Get page elements |
| `scrape()` | Extract page data |
| `status()` | Get session status |
| `viewer()` | Open visual viewer |
| `cdp_url()` | Get CDP connection URL |

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py]()

---

<a id='actions-controls'></a>

## Actions and Browser Controls

### Related Pages

Related topics: [Browser Sessions](#sessions), [Agent Core System](#agent-core)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [packages/notte-core/src/notte_core/actions/actions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/actions/actions.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/sessions.py)
- [packages/notte-agent/src/notte_agent/falco/prompt.py](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/prompt.py)
- [packages/notte-agent/src/notte_agent/gufo/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/gufo/system.md)
- [packages/notte-agent/src/notte_agent/falco/system.md](https://github.com/nottelabs/notte/blob/main/packages/notte-agent/src/notte_agent/falco/system.md)
</details>

# Actions and Browser Controls

## Overview

Actions and Browser Controls form the core interaction layer of the Notte system, enabling AI agents to navigate, interact with, and extract data from web pages. The action system provides a structured, type-safe mechanism for executing browser operations through a unified API.

Actions in Notte are classified into two primary categories:

1. **Browser Actions** - High-level navigation and tab management operations
2. **Interaction Actions** - Element-level interactions such as clicking, filling forms, and solving captchas

Source: [packages/notte-core/src/notte_core/actions/actions.py:1-100]()

## Action Hierarchy

Notte implements a class-based action system using Python's type system for validation and safety. All actions inherit from base classes that define common behavior.

```mermaid
graph TD
    A[BaseAction] --> B[BrowserAction]
    A --> C[InteractionAction]
    
    B --> D[GotoAction]
    B --> E[GotoNewTabAction]
    B --> F[CloseTabAction]
    
    C --> G[ClickAction]
    C --> H[FillAction]
    C --> I[ScrapeAction]
    C --> J[CaptchaSolveAction]
    C --> K[FormFillAction]
```

### Base Classes

| Class | Purpose | Key Properties |
|-------|---------|----------------|
| `BaseAction` | Abstract base for all actions | `type`, `description`, `param` |
| `BrowserAction` | Navigation and tab operations | `execution_message()` |
| `InteractionAction` | Element-level interactions | `id`, `text_label` |

Source: [packages/notte-core/src/notte_core/actions/actions.py:1-200]()
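
The idea of a `type` field selecting a concrete action class can be sketched with plain dataclasses. This is a simplified model under stated assumptions: the real classes live in `notte_core` and are not reproduced here, and the `build_action` helper and `ACTION_TYPES` registry are hypothetical names.

```python
from dataclasses import dataclass
from typing import ClassVar, Optional


@dataclass
class BaseAction:
    type: ClassVar[str]  # discriminator, set by each subclass


@dataclass
class GotoAction(BaseAction):
    type: ClassVar[str] = "goto"
    url: str


@dataclass
class ClickAction(BaseAction):
    type: ClassVar[str] = "click"
    id: str
    text_label: Optional[str] = None


# Hypothetical registry mapping the type string to its class.
ACTION_TYPES = {cls.type: cls for cls in (GotoAction, ClickAction)}


def build_action(type: str, **params) -> BaseAction:
    """Turn execute-style keyword arguments into a typed action object."""
    try:
        action_cls = ACTION_TYPES[type]
    except KeyError:
        raise ValueError(f"unknown action type: {type!r}") from None
    return action_cls(**params)
```

With this shape, a call such as `build_action(type="goto", url="https://example.com")` fails early when a required parameter is missing, because the dataclass constructor rejects the call.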

## Browser Actions

Browser Actions manage page navigation and tab lifecycle. These actions operate at the browser level rather than the page element level.

### GotoAction

Navigates the current tab to a specified URL.

```python
session.execute(type="goto", url="https://www.example.com")
```

| Property | Type | Description |
|----------|------|-------------|
| `type` | `Literal["goto"]` | Action type identifier |
| `url` | `str` | Target URL to navigate to |
| `param` | `ActionParameter` | Parameter definition for LLM tooling |

Source: [packages/notte-core/src/notte_core/actions/actions.py:50-80]()

### GotoNewTabAction

Opens a URL in a new browser tab. The action returns immediately without waiting for navigation completion.

```python
session.execute(type="goto_new_tab", url="https://www.example.com")
```

| Property | Type | Description |
|----------|------|-------------|
| `type` | `Literal["goto_new_tab"]` | Action type identifier |
| `url` | `str` | Target URL to open in new tab |

Source: [packages/notte-core/src/notte_core/actions/actions.py:82-115]()

### CloseTabAction

Closes the currently active browser tab.

```python
session.execute(type="close_tab")
```

Source: [packages/notte-core/src/notte_core/actions/actions.py:117-140]()

## Interaction Actions

Interaction Actions target specific page elements, identified by the element IDs that a page observation returns. The agent must therefore observe the page first to obtain valid element identifiers.

### ClickAction

Simulates a mouse click on a page element.

```python
session.execute(type="click", id="submit-button")
session.execute(type="click", id="B1", text_label="Submit Form")
```

| Property | Type | Description |
|----------|------|-------------|
| `type` | `Literal["click"]` | Action type identifier |
| `id` | `str` | Element identifier from page observation |
| `text_label` | `str \| None` | Optional text label for logging |

Source: [packages/notte-core/src/notte_core/actions/actions.py:150-180]()

### FillAction

Fills an input field with a specified value. By default, the field is cleared before filling.

```python
session.execute(type="fill", id="email-input", value="user@example.com")
session.execute(type="fill", id="name-input", value="John Doe", clear_before_fill=False)
```

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| `id` | `str` | - | Element identifier |
| `value` | `str \| ValueWithPlaceholder` | - | Value to fill |
| `clear_before_fill` | `bool` | `True` | Clear field before filling |
| `text_label` | `str \| None` | `None` | Descriptive label |

The `ValueWithPlaceholder` type allows for complex fill values with embedded placeholders that may be resolved dynamically.

Source: [packages/notte-core/src/notte_core/actions/actions.py:182-230]()

### FormFillAction

Supports batch filling of multiple form fields in a single action, improving efficiency for multi-field forms.

```python
form_values = {
    "address1": "123 Main St",
    "city": "San Francisco",
    "state": "CA",
}
session.execute(type="form_fill", value=form_values)
```

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py:50-70]()

### CaptchaSolveAction

Attempts to solve captcha challenges detected on the page.

```python
session.execute(type="captcha_solve", captcha_type="recaptcha")
```

**Supported Captcha Types:**
- `recaptcha` - Google reCAPTCHA
- `hcaptcha` - hCaptcha
- Image verification challenges

> **Critical Rule:** Agents must never click on captcha elements directly. When any captcha is detected, the `captcha_solve` action must be used exclusively.

Source: [packages/notte-agent/src/notte_agent/gufo/system.md:40-60]()

### ScrapeAction

Extracts structured data from the current page based on natural language instructions.

```python
session.execute(
    type="scrape",
    instructions="Extract the search results from the Google search page"
)
```

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py:80-90]()

## Action Execution Flow

The following diagram illustrates how actions flow through the Notte system from agent decision to browser execution:

```mermaid
sequenceDiagram
    participant Agent
    participant SDK as Notte SDK
    participant API as Notte API
    participant Browser as Browser Engine
    
    Agent->>SDK: session.execute(type="click", id="B1")
    SDK->>API: POST /sessions/{id}/execute
    API->>Browser: playwright.click("#B1")
    Browser-->>API: Success/Failure
    API-->>SDK: ExecutionResult
    SDK-->>Agent: ExecutionResult
    
    Note over Agent,Browser: Action parameters validated<br/>at SDK layer
```

### Session-Level Execution

Actions are executed within a `Session` context that maintains browser state and provides observation capabilities:

```python
from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    # Execute actions
    session.execute(type="goto", url="https://example.com")

    # Observe available actions
    obs = session.observe()
    print(obs.space.description)
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-60]()

### Perception Types

When observing available actions, the system supports different perception depths:

| Perception Type | Description | Use Case |
|-----------------|-------------|----------|
| `fast` | Simple page perception | Quick action queries |
| `deep` | LLM-powered element formatting | Rich, structured action space |

```python
# Fast perception for quick queries
obs = session.observe(perception_type="fast")

# Deep perception for comprehensive action space
obs = session.observe(perception_type="deep")
```

The `instructions` parameter can narrow the action space to a specific intent:

```python
actions = session.observe(instructions="Fill the email input")
```

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:40-70]()

## Action Registry

The `ActionRegistry` manages available actions and generates JSON schemas for agent tooling:

```python
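from typing import Any

# Simplified excerpt; BaseTool is defined elsewhere in the Notte packages
class ActionRegistry:
    def __init__(self, tools: list[BaseTool] | None = None) -> None:
        self.tools = tools or []

    def get_schema(self, action_cls: type) -> dict[str, Any]:
        """Generate the JSON schema for an action class."""
        ...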
```

The registry processes each tool to extract action descriptions and parameter schemas, enabling dynamic action space generation for agents.

Source: [packages/notte-agent/src/notte_agent/falco/prompt.py:1-50]()

## Element ID Format

Interactive elements are referenced using a structured ID format that encodes element type and position:

| Prefix | Element Type | Example |
|--------|--------------|---------|
| `I` | Input fields (textbox, select, checkbox) | `I1`, `I2` |
| `B` | Buttons | `B1`, `B2` |
| `L` | Links | `L1`, `L2` |
| `F` | Figures and images | `F1` |
| `O` | Options in select elements | `O1` |
| `M` | Miscellaneous (modals, dialogs) | `M1` |

> **Important:** Element IDs can and will change at each page observation. Agents must not cache or assume ID persistence across steps.

Source: [packages/notte-agent/src/notte_agent/gufo/system.md:10-30]()
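
Because IDs are not stable, one defensive pattern is to look an element up again by its visible label after every observation. The sketch below uses plain dicts as a stand-in for observed elements. The `id` and `text` keys and the `find_id_by_label` helper are assumptions for illustration, not the SDK's observation model.

```python
def find_id_by_label(elements: list[dict], label: str) -> str:
    """Return the current ID of the element whose text matches label."""
    wanted = label.strip().lower()
    for element in elements:
        if element["text"].strip().lower() == wanted:
            return element["id"]
    raise LookupError(f"no element labelled {label!r} in this observation")


# Two observations of the same page: the Submit button's ID has shifted.
first_observation = [{"id": "B1", "text": "Submit"}, {"id": "L1", "text": "Help"}]
second_observation = [{"id": "B1", "text": "Cancel"}, {"id": "B2", "text": "Submit"}]
```

Reusing `B1` from the first observation would click Cancel in the second one, which is why the lookup is repeated at each step.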

## Action Response Format

All action executions return an `ExecutionResult` containing:

| Field | Type | Description |
|-------|------|-------------|
| `success` | `bool` | Whether the action succeeded |
| `message` | `str` | Human-readable result description |
| `error` | `str \| None` | Error details if failed |

When `raise_on_failure` is set, execution will raise an exception on failure; otherwise, the result is returned with error information included.
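
The two failure-handling modes can be sketched with a stand-in result type. The dataclass mirrors the fields in the table above. The `ActionFailedError` exception, the `finalize` helper, and the default of `raise_on_failure=False` are hypothetical choices for this sketch, not the SDK's behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionResult:
    success: bool
    message: str
    error: Optional[str] = None


class ActionFailedError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def finalize(result: ExecutionResult, raise_on_failure: bool = False) -> ExecutionResult:
    """Raise on a failed result when asked to; otherwise hand it back as-is."""
    if raise_on_failure and not result.success:
        raise ActionFailedError(result.error or result.message)
    return result
```

Callers that pass `raise_on_failure=True` can use ordinary `try`/`except` blocks, while the default path requires checking `result.success` explicitly.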

## Best Practices

1. **Always observe before acting** - Use `session.observe()` to get current element IDs before executing interaction actions
2. **Handle captchas properly** - Use `captcha_solve` when captchas are detected; never click captcha elements directly
3. **Validate IDs per step** - Element IDs change between observations; never assume ID stability
4. **Use batch operations** - Prefer `FormFillAction` over multiple `FillAction` calls for multi-field forms
5. **Set appropriate perception types** - Use `fast` for quick checks, `deep` when comprehensive element understanding is needed

---

<a id='vaults-credentials'></a>

## Vaults and Credential Management

### Related Pages

Related topics: [Agent Personas](#personas), [Browser Sessions](#sessions)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-browser/src/notte_browser/vault.py](https://github.com/nottelabs/notte/blob/main/packages/notte-browser/src/notte_browser/vault.py)
- [packages/notte-core/src/notte_core/credentials/base.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/credentials/base.py)
- [packages/notte-core/src/notte_core/credentials/types.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/credentials/types.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/vaults.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/vaults.py)
- [packages/notte-sdk/src/notte_sdk/endpoints/personas.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/personas.py)
- [packages/notte-integrations/src/notte_integrations/credentials/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-integrations/src/notte_integrations/credentials/README.md)
</details>

# Vaults and Credential Management

## Overview

Notte's Vaults and Credential Management system provides a secure, centralized mechanism for storing, retrieving, and automatically injecting website credentials during browser automation tasks. The system eliminates the need for agents to manually handle authentication credentials, reducing security risks and improving automation reliability.

The vault architecture spans multiple layers:

| Layer | Package | Purpose |
|-------|---------|---------|
| Core Types | `notte-core` | Defines credential data models and base classes |
| Browser Integration | `notte-browser` | Handles credential injection during page interactions |
| SDK API | `notte-sdk` | Provides REST API endpoints for vault operations |
| External Integrations | `notte-integrations` | Supports third-party vault solutions (e.g., HashiCorp Vault) |

## Architecture

```mermaid
graph TD
    A[NotteClient] --> B[VaultsClient]
    B --> C[Cloud Vault API]
    C --> D[NotteVault]
    
    E[Agent Session] --> F[Browser Session]
    F --> G[NotteBrowser Vault]
    G --> D
    
    H[Persona] --> I[Vault Association]
    I --> D
    
    J[HashiCorpVault] --> K[External Vault Server]
    
    D -.-> L[Credential Replacement]
    L --> F
```

## Credential Types

### Base Credential Model

The credential system is built on a flexible data model defined in `notte-core`. Credentials are identified by URL and support multiple authentication fields:

```python
# Simplified from packages/notte-core/src/notte_core/credentials/types.py
from pydantic import BaseModel

class Credential(BaseModel):
    url: str                    # Target website URL
    username: str | None = None # Username or email
    email: str | None = None    # Email address
    password: str | None = None # Password
    totp_secret: str | None = None # TOTP 2FA secret
    notes: str | None = None    # Additional notes
    metadata: dict | None = None # Custom metadata
```
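
For context on the `totp_secret` field, the sketch below shows how a time-based one-time code is derived from a shared secret under RFC 6238 (HMAC-SHA1 variant). It illustrates the standard algorithm only. How Notte itself generates or submits 2FA codes is not shown in the source, and the `totp` function name is an assumption.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given timestamp."""
    counter = unix_time // step  # number of elapsed time steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)
```

Authenticator secrets are usually distributed as Base32 text, so a caller would typically decode the stored value with `base64.b32decode` before passing it in.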

### Credential Actions

When an agent needs to authenticate, the system creates a `CredentialAction` that describes the required credential field:

```python
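# From packages/notte-core/src/notte_core/credentials/types.py
from typing import Literal
from pydantic import BaseModel

# LocatorAttributes is defined elsewhere in notte-core
class CredentialAction(BaseModel):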
    url: str                          # Target URL
    action: Literal["fill", "verify"] # Operation type
    field: Literal["username", "password", "email", "totp"] # Required field
    locator: LocatorAttributes | None = None # DOM element context
```

## Vault Implementation

### Browser Vault (`notte-browser`)

The browser-side vault (`NotteBrowserVault`) manages credential lifecycle within browser sessions:

```python
# From packages/notte-browser/src/notte_browser/vault.py
class NotteBrowserVault:
    def __init__(
        self,
        vault_id: str | None,
        api_key: str | None,
        server_url: str | None,
        verbose: bool = False,
    ) -> None:
        self.vault_id = vault_id
        self.vault: NotteVault | None = None
        self._api_key = api_key
        self._server_url = server_url
```

**Key Methods:**

| Method | Purpose | Source |
|--------|---------|--------|
| `request_credentials()` | Retrieves credentials from vault for current URL | [vault.py:1-100]() |
| `replace_credentials()` | Injects credentials into action attributes | [vault.py:1-100]() |
| `add_credentials()` | Stores new credentials in vault | [vault.py:1-100]() |
| `generate_password()` | Creates secure random passwords | [vault.py:1-100]() |
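
What secure random password generation can look like is sketched below using Python's `secrets` module. This is not the vault's `generate_password()` implementation. The default length, the symbol set, and the rule of one character per class are assumptions chosen for the example.

```python
import secrets
import string

# Assumed character classes; real password policies vary by site.
_CLASSES = [string.ascii_lowercase, string.ascii_uppercase, string.digits, "!@#$%^&*-_"]


def generate_password(length: int = 20) -> str:
    """Build a random password containing at least one char of each class."""
    if length < len(_CLASSES):
        raise ValueError(f"length must be at least {len(_CLASSES)}")
    chars = [secrets.choice(cls) for cls in _CLASSES]
    alphabet = "".join(_CLASSES)
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]
    secrets.SystemRandom().shuffle(chars)  # avoid a predictable class order
    return "".join(chars)
```

The `secrets` module draws from the operating system's cryptographic random source, which is why it is preferred over `random` for credentials.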

### Credential Replacement Flow

When executing form-fill actions, the vault automatically replaces credential placeholders:

```mermaid
sequenceDiagram
    participant Agent
    participant Session
    participant Vault
    participant Browser
    
    Agent->>Session: Execute FormFillAction
    Session->>Vault: request_credentials(url)
    Vault->>Session: Return Credential
    Session->>Browser: Replace placeholders with actual values
    Browser->>Website: Submit filled form
```

The replacement logic in `session.py` demonstrates this:

```python
# From packages/notte-browser/src/notte_browser/session.py
if locator is not None:
    attrs = LocatorAttributes(
        type=await locator.get_attribute("type"),
        autocomplete=await locator.get_attribute("autocomplete"),
        outerHTML=await locator.evaluate("el => el.outerHTML"),
    )
return await self.vault.replace_credentials(action, attrs, snapshot)
```
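
The replacement step can be illustrated with a dictionary-based sketch. The `{{field}}` placeholder syntax and the `replace_placeholders` helper are hypothetical. The source does not show the placeholder format Notte uses, so this only conveys the idea of resolving placeholders just before the form is filled.

```python
def replace_placeholders(form_values: dict[str, str], credential: dict[str, str]) -> dict[str, str]:
    """Swap values such as '{{password}}' for the matching credential field."""
    resolved = {}
    for field, value in form_values.items():
        if value.startswith("{{") and value.endswith("}}"):
            key = value[2:-2].strip()
            if key not in credential:
                raise KeyError(f"credential has no field {key!r}")
            resolved[field] = credential[key]
        else:
            resolved[field] = value  # ordinary values pass through unchanged
    return resolved
```

Resolving placeholders this late means the agent's planning steps only ever see the placeholder, never the secret itself.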

## SDK API Endpoints

### VaultsClient

The `VaultsClient` provides programmatic access to vault operations:

```python
# From packages/notte-sdk/src/notte_sdk/endpoints/vaults.py
class VaultsClient:
    CREATE_VAULT = "vaults"
    
    @track_usage("cloud.vault.create")
    def create(self, **data: Unpack[VaultCreateRequestDict]) -> Vault:
        """Create a new vault"""
        
    def get(self, vault_id: str) -> str:
        """Retrieve vault by ID"""
        
    @track_usage("cloud.vault.credentials.add")
    def add_or_update_credentials(
        self, vault_id: str, **data: Unpack[AddCredentialsRequestDict]
    ) -> AddCredentialsResponse:
        """Add or update credentials in vault"""
```

### API Request Models

| Model | Fields | Purpose |
|-------|--------|---------|
| `VaultCreateRequest` | `name`, `description` | Create new vault |
| `AddCredentialsRequest` | `url`, `username`, `email`, `password`, `totp_secret` | Add credential entry |
| `AddCredentialsResponse` | `id`, `url`, `created_at` | Response confirmation |

## Persona Integration

Personas can be associated with vaults for persistent identity management:

```python
# From packages/notte-sdk/src/notte_sdk/endpoints/personas.py
class NottePersona:
    def _get_vault(self) -> NotteVault | None:
        """Get vault associated with this persona"""
        if self.info.vault_id is None:
            return None
        return NotteVault(self.info.vault_id, _client=self.vault_client)
    
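    def add_credentials(self, url: str) -> None:
        """Add credentials to the persona's vault"""
        vault = self._get_vault()
        if vault is None:
            # Guard added in this excerpt: without a vault there is nowhere to store credentials
            raise ValueError("Persona has no associated vault")
        password = vault.generate_password()
        vault.add_credentials(url, email=self.info.email, password=password)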
```

## External Vault Integration

### HashiCorp Vault

Notte supports integration with HashiCorp Vault for enterprise credential management:

```python
# From packages/notte-integrations/src/notte_integrations/credentials/README.md
from notte_agent.main import Agent
from notte_integrations.credentials.hashicorp.vault import HashiCorpVault
import os

vault = HashiCorpVault(
    url=os.getenv("VAULT_URL"),
    token=os.getenv("VAULT_DEV_ROOT_TOKEN_ID")
)

vault.add_credentials(
    url="https://x.com",
    username=os.getenv("TWITTER_USERNAME"),
    password=os.getenv("TWITTER_PASSWORD")
)

agent = Agent(vault=vault)
```

**Setup Requirements:**

1. Start HashiCorp Vault server:
   ```bash
   cd packages/notte-integrations/src/notte_integrations/credentials/hashicorp
   docker-compose --env-file ../../../../../.env up
   ```

2. Configure environment variables:
   ```bash
   VAULT_URL=http://0.0.0.0:8200
   VAULT_DEV_ROOT_TOKEN_ID=<your-token>
   ```

## Error Handling

The vault system handles credential-related errors gracefully:

```python
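# Simplified from packages/notte-core/src/notte_core/errors/processing.py
class VaultCredentialError(NotteBaseError):
    def __init__(self, error_message: str) -> None:
        dev_message = "Unexpected error while requesting credentials from vault"
        # The excerpt omits how the agent- and user-facing messages are built;
        # reusing error_message here is a placeholder assumption.
        agent_message = user_message = error_message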
        super().__init__(
            agent_message=agent_message,
            user_message=user_message,
            dev_message=dev_message,
        )
```

**Common Error Scenarios:**

| Error | Cause | Resolution |
|-------|-------|------------|
| `CredentialNotFoundError` | No credentials stored for URL | Add credentials via SDK or console |
| `VaultCredentialError` | Vault unavailable or API failure | Check API key and network connectivity |
| `FieldMismatchError` | Vault lacks required credential field | Ensure credential has all required fields |

## Usage Patterns

### Basic Agent with Vault

```python
from notte_sdk import NotteClient

client = NotteClient()

with client.Vault() as vault:
    vault.add_credentials(
        url="https://x.com",
        email="user@example.com",
        password="secure-password"
    )
    
    with client.Session() as session:
        agent = client.Agent(session=session, vault=vault, max_steps=10)
        response = agent.run(
            task="go to twitter; login and go to my messages",
        )
```

### Auth Vault Agent Example

The `auth-vault-agent` example follows the same pattern, attaching a vault to an agent that runs in a visible Chrome session:

```python
# From examples/auth-vault-agent/README.md
with client.Vault() as vault:
    with client.Session(browser_type="chrome", open_viewer=True) as session:
        agent = client.Agent(session=session, vault=vault, max_steps=10)
        response = agent.run(
            task="go to twitter; login and go to my messages",
        )
```

## Security Considerations

1. **Credential Encryption**: Vaults store credentials encrypted on Notte's servers
2. **API Key Authentication**: All vault operations require valid API key
3. **Persona Isolation**: Each persona has its own vault for identity separation
4. **Session Cookies**: When a `cookie_file` is configured, session cookies are saved on stop, so authenticated state can be reused alongside vault credentials

Source: [packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-100]()

## Summary

Notte's Vaults and Credential Management system provides:

- **Secure Storage**: Centralized credential repository with encryption
- **Automatic Injection**: Seamless credential replacement during browser automation
- **Multi-Provider Support**: Built-in Notte vaults and HashiCorp Vault integration
- **Persona Association**: Per-identity credential isolation
- **API-First Design**: Full programmatic control via SDK

---

<a id='personas'></a>

## Agent Personas

### Related Pages

Related topics: [Vaults and Credential Management](#vaults-credentials), [Browser Sessions](#sessions)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [packages/notte-sdk/src/notte_sdk/endpoints/personas.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/personas.py)
- [packages/notte-core/src/notte_core/credentials/base.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/credentials/base.py)
- [packages/notte-sdk/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/README.md)
</details>

# Agent Personas

Agent Personas provide digital identities that can be attached to Agent instances, enabling automated browser interactions with unique credentials, phone numbers, and automated 2FA handling.

## Overview

Agent Personas are specialized identity objects that provide your AI agents with realistic digital identities for web automation tasks. They solve the common challenge of websites requiring phone number verification, email confirmation, or 2FA codes by providing automated handling of these verification steps.

Source: [packages/notte-sdk/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/README.md)

## Key Features

| Feature | Description |
|---------|-------------|
| **Email Addresses** | Unique email addresses associated with the persona |
| **Phone Numbers** | Optional phone numbers for SMS verification (configurable) |
| **Automated 2FA** | Automatic handling of two-factor authentication codes |
| **Identity Persistence** | Persona persists across session boundaries |

## Architecture

```mermaid
graph TD
    A[NotteClient] --> B[Persona]
    B --> C[Email Identity]
    B --> D[Phone Identity]
    B --> E[2FA Handler]
    F[Agent] --> B
    G[Session] --> F
    H[Vault] --> E
```
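
The relationships in the diagram can be read as plain object composition. The sketch below is not the SDK's implementation: `VaultSketch`, `PersonaSketch`, `AgentSketch`, and all of their fields are hypothetical names, used only to illustrate that an agent holds a reference to a persona, which bundles an email identity, an optional phone identity, and a vault.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VaultSketch:
    """Hypothetical stand-in for a credential vault."""
    secrets: Dict[str, str]


@dataclass
class PersonaSketch:
    """Hypothetical stand-in for a persona: email, optional phone, vault."""
    email: str
    vault: VaultSketch
    phone_number: Optional[str] = None  # None mirrors create_phone_number=False

    def identity_channels(self) -> List[str]:
        # Channels on which this persona could receive verification messages.
        channels = ["email"]
        if self.phone_number is not None:
            channels.append("sms")
        return channels


@dataclass
class AgentSketch:
    """Hypothetical stand-in for an agent bound to a session and a persona."""
    session_id: str
    persona: PersonaSketch


persona = PersonaSketch(email="agent@example.com", vault=VaultSketch(secrets={}))
agent = AgentSketch(session_id="session-1", persona=persona)
print(agent.persona.identity_channels())
```

Keeping the phone number optional in the sketch mirrors the `create_phone_number` flag described in this section: an email-only persona simply has one verification channel fewer.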

## Usage Pattern

Personas are used as context managers alongside Sessions to provide identity context for agent operations:

```python
from notte_sdk import NotteClient

client = NotteClient()

with client.Persona(create_phone_number=False) as persona:
    with client.Session(browser_type="chrome", open_viewer=True) as session:
        agent = client.Agent(session=session, persona=persona, max_steps=15)
        response = agent.run(
            task="Open the Google form and RSVP yes with your name",
            url="https://forms.google.com/your-form-url",
        )
print(response.answer)
```

Source: [packages/notte-sdk/README.md](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/README.md)

## Configuration Options

### Persona Initialization Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `create_phone_number` | bool | `True` | Whether to provision a phone number for this persona |

### Agent Configuration with Persona

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | Session | Active browser session |
| `persona` | Persona | Digital identity to attach |
| `max_steps` | int | Maximum steps for task completion |

## Credential Management

Personas integrate with Notte's credential vault system for secure storage and retrieval of authentication credentials:

```mermaid
sequenceDiagram
    participant A as Agent
    participant P as Persona
    participant V as Vault
    participant W as Website
    
    A->>P: Request 2FA code
    P->>V: Retrieve credential
    V-->>P: Credential data
    P->>W: Submit 2FA code
```

Source: [packages/notte-core/src/notte_core/credentials/base.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/credentials/base.py)
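
The sequence in the diagram can be traced in a few lines of code. This is a minimal sketch under stated assumptions: `FakeVault`, `FakePersona`, `FakeWebsite`, and the `"2fa_code"` key are hypothetical stand-ins rather than SDK classes, and the real persona may obtain codes differently (for example from SMS or email).

```python
from typing import Dict


class FakeVault:
    """Hypothetical vault: maps a site URL to stored credential data."""

    def __init__(self, credentials: Dict[str, Dict[str, str]]) -> None:
        self._credentials = credentials

    def retrieve(self, url: str) -> Dict[str, str]:
        return self._credentials[url]


class FakeWebsite:
    """Hypothetical website that accepts exactly one known 2FA code."""

    def __init__(self, expected_code: str) -> None:
        self._expected_code = expected_code

    def submit_2fa(self, code: str) -> bool:
        return code == self._expected_code


class FakePersona:
    """Hypothetical persona: fetches a code from the vault and submits it."""

    def __init__(self, vault: FakeVault) -> None:
        self._vault = vault

    def handle_2fa(self, url: str, website: FakeWebsite) -> bool:
        credential = self._vault.retrieve(url)  # Persona -> Vault -> Persona
        return website.submit_2fa(credential["2fa_code"])  # Persona -> Website


url = "https://secure-site.com"
vault = FakeVault({url: {"2fa_code": "123456"}})
persona = FakePersona(vault)

# The agent's request to the persona corresponds to calling handle_2fa.
accepted = persona.handle_2fa(url, FakeWebsite(expected_code="123456"))
print("2FA accepted:", accepted)
```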

## API Endpoints

The Personas functionality is exposed through the SDK's endpoint interface:

| Endpoint | Purpose |
|----------|---------|
| `personas.create()` | Create a new persona with identity credentials |
| `personas.list()` | List available personas |
| `personas.get()` | Retrieve persona details |
| `personas.delete()` | Remove a persona |

Source: [packages/notte-sdk/src/notte_sdk/endpoints/personas.py](https://github.com/nottelabs/notte/blob/main/packages/notte-sdk/src/notte_sdk/endpoints/personas.py)
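
To make the lifecycle implied by the table concrete, the sketch below walks through create, get, list, and delete against an in-memory stand-in. `InMemoryPersonas` is hypothetical and is not the `notte_sdk` client; its argument names and the `persona_id` and `has_phone` fields are assumptions chosen for illustration, so check `personas.py` for the real signatures.

```python
import uuid
from typing import Dict, List


class InMemoryPersonas:
    """Hypothetical in-memory stand-in; not the notte_sdk client."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, object]] = {}

    def create(self, create_phone_number: bool = True) -> Dict[str, object]:
        # `persona_id` and `has_phone` are illustrative field names.
        persona_id = uuid.uuid4().hex
        record: Dict[str, object] = {
            "persona_id": persona_id,
            "has_phone": create_phone_number,
        }
        self._store[persona_id] = record
        return record

    def get(self, persona_id: str) -> Dict[str, object]:
        return self._store[persona_id]

    def delete(self, persona_id: str) -> None:
        del self._store[persona_id]

    def list(self) -> List[Dict[str, object]]:
        return [*self._store.values()]


personas = InMemoryPersonas()
created = personas.create(create_phone_number=False)
persona_id = str(created["persona_id"])

fetched = personas.get(persona_id)
count_before = len(personas.list())
personas.delete(persona_id)
count_after = len(personas.list())
print(fetched["has_phone"], count_before, count_after)
```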

## Use Cases

### Form Filling with Identity

When completing web forms that require personal information:

```python
# Assumes `client = NotteClient()` as in the Usage Pattern example.
with client.Persona() as persona:
    with client.Session() as session:
        agent = client.Agent(session=session, persona=persona)
        agent.run(
            task="Complete the registration form with your details",
            url="https://example.com/register"
        )
```

### 2FA-Protected Actions

For websites requiring two-factor authentication:

```python
# Assumes `client = NotteClient()` as in the Usage Pattern example.
with client.Persona(create_phone_number=True) as persona:
    with client.Session() as session:
        agent = client.Agent(session=session, persona=persona, max_steps=20)
        agent.run(
            task="Log in to your account and download your data",
            url="https://secure-site.com/dashboard"
        )
```

### Anonymous Browsing

When phone verification is not needed:

```python
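# Assumes `client = NotteClient()` as in the Usage Pattern example.
with client.Persona(create_phone_number=False) as persona:
    with client.Session() as session:
        # The persona provides an email identity but no phone number.
        agent = client.Agent(session=session, persona=persona)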
```

## Best Practices

1. **Resource Management**: Always use Personas within context managers (`with` statement) to ensure proper cleanup
2. **Phone Number Provisioning**: Disable phone number creation (`create_phone_number=False`) when only email identity is needed to reduce costs
3. **Session Coordination**: Pair Persona usage with Session context for proper browser automation
4. **Step Limits**: Set appropriate `max_steps` values when using Personas with complex multi-step workflows
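
The first and third practices rest on how nested `with` blocks behave: cleanup runs in reverse order even when the body raises. The sketch below demonstrates this with a hypothetical `managed_resource` helper built on the standard library's `contextlib`; it stands in for personas and sessions and does not call the SDK.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []


@contextmanager
def managed_resource(name: str) -> Iterator[str]:
    """Hypothetical resource that records its start and cleanup in `events`."""
    events.append(f"start {name}")
    try:
        yield name
    finally:
        # Runs even if the body raises, which is the point of using `with`.
        events.append(f"stop {name}")


try:
    with managed_resource("persona") as persona:
        with managed_resource("session") as session:
            # Simulate a failure in the middle of an agent run.
            raise RuntimeError("agent step failed")
except RuntimeError:
    pass

print(events)
```

The inner resource is released before the outer one, so a session opened inside a persona block is closed before the persona itself is cleaned up.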

## Error Handling

When persona-related operations fail, the SDK provides structured error responses:

```python
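import logging

logger = logging.getLogger(__name__)

# Assumes `client = NotteClient()` as in the Usage Pattern example.
try:
    with client.Persona() as persona:
        ...  # persona-dependent operations go here
except Exception as e:
    # Persona errors are handled through the NotteBaseError hierarchy;
    # catching Exception is a broad fallback.
    logger.error("Persona operation failed: %s", e)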
```

Source: [packages/notte-core/src/notte_core/errors/processing.py](https://github.com/nottelabs/notte/blob/main/packages/notte-core/src/notte_core/errors/processing.py)

---


## Doramagic Pitfall Log

Project: nottelabs/notte

Summary: 13 potential pitfalls were found, 0 of which are high/blocking. Highest priority: installation pitfall, source evidence: v1.8.8.

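## 1. Installation pitfall · Source evidence: v1.8.8

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified installation-related issue in this project: v1.8.8
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_5224b46e2312471c976888e8a2f8f4a6 | https://github.com/nottelabs/notte/releases/tag/v1.8.8 | Unverified usage condition surfaced by source type github_release.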

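## 2. Configuration pitfall · Source evidence: v1.8.13

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified configuration-related issue in this project: v1.8.13
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_56505da48cd74758ab7a9a1d82092e18 | https://github.com/nottelabs/notte/releases/tag/v1.8.13 | Unverified usage condition surfaced by source type github_release.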

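## 3. Configuration pitfall · Source evidence: v1.8.14

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified configuration-related issue in this project: v1.8.14
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_8f78a2591cea4f5aad8bfa0cb0ea15a0 | https://github.com/nottelabs/notte/releases/tag/v1.8.14 | Unverified usage condition surfaced by source type github_release.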

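## 4. Configuration pitfall · Source evidence: v1.8.15

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified configuration-related issue in this project: v1.8.15
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_f611f1d0dd2440af8e84e9544ab1bdb6 | https://github.com/nottelabs/notte/releases/tag/v1.8.15 | Unverified usage condition surfaced by source type github_release.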

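## 5. Configuration pitfall · Source evidence: v1.8.6

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified configuration-related issue in this project: v1.8.6
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_0ed3c86ff7a34e5896a4daeb75552d4d | https://github.com/nottelabs/notte/releases/tag/v1.8.6 | Unverified usage condition surfaced by source type github_release.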

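## 6. Configuration pitfall · Source evidence: v1.8.9

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified configuration-related issue in this project: v1.8.9
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_6195971390e8494f88083c862aef7eb6 | https://github.com/nottelabs/notte/releases/tag/v1.8.9 | Unverified usage condition surfaced by source type github_release.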

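## 7. Capability pitfall · Capability assessment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- Impact on users: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Turn the assumption into a downstream verification checklist.
- Safeguard: Assumptions must become verification items; they must not be stated as fact before verification results exist.
- Evidence: capability.assumptions | github_repo:900152988 | https://github.com/nottelabs/notte | README/documentation is current enough for a first validation pass.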

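## 8. Runtime pitfall · Source evidence: v1.8.7

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence indicates an unverified runtime-related issue in this project: v1.8.7
- Impact on users: May raise the cost of trial use for new users and of production integration.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not inflate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_ab102ac99c2441118475601c34ab0ee9 | https://github.com/nottelabs/notte/releases/tag/v1.8.7 | Unverified usage condition surfaced by source type github_release.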

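## 9. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: `last_activity_observed` is not recorded.
- Impact on users: New, abandoned, and active projects become indistinguishable, which lowers confidence in the recommendation.
- Suggested check: Add signals from recent GitHub commits, releases, and issue/PR responses.
- Safeguard: While maintenance activity is unknown, the recommendation must not be labeled high-trust.
- Evidence: evidence.maintainer_signals | github_repo:900152988 | https://github.com/nottelabs/notte | last_activity_observed missing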

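## 10. Security/permissions pitfall · Downstream validation found risk items

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: Downstream validation has already requested a review; this must not be downplayed on the page.
- Suggested check: Enter the security/permissions governance review queue.
- Safeguard: While downstream risks exist, the review/recommendation must remain downgraded.
- Evidence: downstream_validation.risk_items | github_repo:900152988 | https://github.com/nottelabs/notte | no_demo; severity=medium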

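## 11. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Record the risk in the boundary card and confirm whether manual review is needed.
- Safeguard: Scoring risks must be recorded in the boundary card, not kept only as an internal score.
- Evidence: risks.scoring_risks | github_repo:900152988 | https://github.com/nottelabs/notte | no_demo; severity=medium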

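## 12. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- Impact on users: Users cannot tell whether anyone maintains the project when they run into a problem.
- Suggested check: Sample recent issues/PRs to determine whether they go unaddressed for long periods.
- Safeguard: While issue/PR responsiveness is unknown, the maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:900152988 | https://github.com/nottelabs/notte | issue_or_pr_quality=unknown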

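## 13. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- Impact on users: Install commands and documentation may lag behind the code, raising the chance that users hit pitfalls.
- Suggested check: Confirm that the latest release/tag matches the README install commands.
- Safeguard: While release cadence is unknown or stale, the install instructions must note that they may have drifted.
- Evidence: evidence.maintainer_signals | github_repo:900152988 | https://github.com/nottelabs/notte | release_recency=unknown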

<!-- canonical_name: nottelabs/notte; human_manual_source: deepwiki_human_wiki -->
