notte Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

notte


Introduction to Notte

Related topics: Quickstart Guide, System Architecture


Notte is a software suite for internet-native agentic systems. It provides a framework for building, deploying, and managing AI agents that interact with web content programmatically. The project is developed by Notte Labs, Inc. and licensed under the Server Side Public License v1.0.

Sources: README.md

Overview

Notte transforms the internet into a structured, navigable space where each website becomes an accessible map for intelligent agents. The technology enables AI systems to interpret and interact with web content with precision, creating a programmatic layer over the web.

The suite offers multiple capabilities:

Web Scraping: Extract structured data from any website
Browser Automation: Navigate and interact with web pages programmatically
Agent Framework: Build AI agents that can perform complex web tasks
Session Management: Maintain stateful browsing sessions with cookie handling

Sources: README.md, packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Architecture

The Notte architecture consists of several interconnected packages:

graph TD
    A[Client SDK] --> B[notte-core]
    A --> C[notte-llm]
    A --> D[notte-agent]
    B --> E[Actions Module]
    B --> F[Error Handling]
    C --> G[Prompts]
    C --> H[Data Extraction]
    D --> I[Falco Agent]
    D --> J[Gufo Agent]

Core Packages

Package	Purpose
`notte-core`	Core actions, error handling, and browser interaction primitives
`notte-llm`	LLM integration, prompts for document analysis, data extraction, and action generation
`notte-agent`	Agent implementations (Falco, Gufo) with validation and execution logic
`notte-sdk`	Python SDK for easy integration and API consumption

Sources: packages/notte-core/src/notte_core/actions/actions.py, packages/notte-core/src/notte_core/errors/processing.py

Browser Actions

Notte provides a comprehensive set of browser actions that agents can execute. Actions are defined as typed classes with execution messages and parameter validation.

Action	Description	Parameters
`goto`	Navigate to a URL	`url: str`
`goto_new_tab`	Open URL in a new tab	`url: str`
`close_tab`	Close the current tab	None

Example Usage:

session.execute(type="goto", url="https://console.notte.cc")
session.execute(type="goto_new_tab", url="https://example.com")
session.execute(type="close_tab")

Sources: packages/notte-core/src/notte_core/actions/actions.py:1-100

Session Management

The SDK provides a session-based interface for browser automation. Sessions maintain state across multiple interactions and support cookie persistence.

Session Lifecycle

graph LR
    A[Start Session] --> B[Execute Actions]
    B --> C[Observe State]
    C --> B
    B --> D[Stop Session]
    D --> E[Save Cookies]

Basic Session Usage

from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://www.notte.cc")
    obs = session.observe()

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Observation Types

Sessions support two perception modes for observing page state:

Mode	Description	Use Case
`fast`	Simple page perception for quick queries	Basic element detection
`deep`	LLM-powered formatting for rich action spaces	Complex interactions

# Fast observation
obs = session.observe(perception_type='fast')

# Deep observation for LLM-ready action space
obs = session.observe(perception_type='deep')
print(obs.space.description)

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Sessions automatically handle cookie persistence:

client = NotteClient(cookie_file="cookies.json")
with client.Session() as session:
    # Cookies are loaded on start and saved on stop
    session.execute(type="goto", url="https://example.com")

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Agent System

Notte includes sophisticated agent implementations for autonomous web navigation.

Action Identification System

Agents identify interactive elements using a structured ID system:

Prefix	Element Type	Examples
`I`	Input fields	Textboxes, selects, checkboxes
`B`	Buttons	Clickable buttons
`L`	Links	Hypertext links
`F`	Figures/Images	Visual elements
`O`	Select options	Dropdown options
`M`	Miscellaneous	Modals, dialogs

ID Format: <role_first_letter><index>[:] (e.g., B1, I2, L3:button)

Note: IDs can change at each step. Agents must not assume IDs persist across observations.
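
The ID scheme above can be sketched as a small parser. This is a minimal illustration, assuming an ID is one prefix letter, a numeric index, and an optional `:suffix`; `ELEMENT_PREFIXES`, `ElementId`, and `parse_element_id` are hypothetical names and not part of the Notte API.

```python
import re
from typing import NamedTuple, Optional

# Prefix-to-element-type mapping, read off the table above.
ELEMENT_PREFIXES = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}

# Assumed grammar: one prefix letter, a numeric index, optional ":suffix".
_ID_PATTERN = re.compile(r"^([IBLFOM])(\d+)(?::(.+))?$")


class ElementId(NamedTuple):
    prefix: str
    index: int
    suffix: Optional[str]

    @property
    def element_type(self) -> str:
        return ELEMENT_PREFIXES[self.prefix]


def parse_element_id(raw: str) -> ElementId:
    """Split an ID such as 'B1' or 'L3:button' into its parts."""
    match = _ID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"not a valid element ID: {raw!r}")
    prefix, index, suffix = match.groups()
    return ElementId(prefix, int(index), suffix)
```

Because IDs can change between observations, a parsed ID is only meaningful for the observation it came from.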

Sources: packages/notte-agent/src/notte_agent/gufo/system.md

CAPTCHA Handling

Agents have built-in CAPTCHA detection and handling:

Never interact directly with CAPTCHA elements
Use the captcha_solve action when detection occurs
Supported types: reCAPTCHA, hCaptcha, image verification, checkbox verification

{
  "action": "captcha_solve",
  "captcha_type": "recaptcha"
}

Sources: packages/notte-agent/src/notte_agent/gufo/system.md

Validation System

The agent framework includes a validation pipeline:

graph TD
    A[Execute Action] --> B[Validate Output]
    B --> C{Has Observations?}
    C -->|No| D[Return Error]
    C -->|Yes| E[LLM Validation]
    E --> F{Is Valid?}
    F -->|Yes| G[Return Success]
    F -->|No| H[Return Failure]

The validator uses vision models when available to verify action outcomes against expected results.
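
The flowchart above can be sketched in a few lines. This is an illustration only: `ValidationResult`, `validate_output`, and the `llm_judge` callback are hypothetical names standing in for whatever `validator.py` actually does, and the LLM call is replaced by a plain function argument.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class ValidationResult:
    success: bool
    reason: str


def validate_output(
    task: str,
    output: str,
    observations: Sequence[str],
    llm_judge: Callable[[str, str, Sequence[str]], bool],
) -> ValidationResult:
    """Follow the flowchart: require observations, then defer to a judge."""
    if not observations:
        # "Has Observations? -> No" branch
        return ValidationResult(False, "no observations available to validate against")
    if llm_judge(task, output, observations):
        # "Is Valid? -> Yes" branch
        return ValidationResult(True, "output accepted by validator")
    return ValidationResult(False, "output rejected by validator")
```

Passing the judge as a callable keeps the control flow testable without a model; a real judge would send the task, output, and observations (and a screenshot, when a vision model is available) to an LLM.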

Sources: packages/notte-agent/src/notte_agent/common/validator.py

Data Extraction

Notte provides structured data extraction capabilities through LLM-powered document analysis.

Document Analysis Pipeline

Stage	Description
Analysis	Identify sections, content types, and structured data
Category	Classify document type (search-results, item, other)
Extraction	Transform content into structured format

Output Format

Extracted data is organized into two sections:

<document-analysis>: Logical breakdown of the document structure
<data-extraction>: Structured Markdown output with tables and lists

Example Categories:

Category	Use Case
`search-results`	Google Flights, search engine results
`item`	Recipe pages, product details
`other`	General content (Allrecipes homepage)

Sources: packages/notte-llm/src/notte_llm/prompts/document-category/base/user.md

API Integration

REST API Endpoint

curl -X POST 'https://api.notte.cc/scrape' \
  -H 'Authorization: Bearer <NOTTE-API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://notte.cc",
    "only_main_content": false
  }'
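
For callers that prefer not to shell out to cURL, the same request can be assembled with the Python standard library. This is a sketch of the request shown above, nothing more: `build_scrape_request` is a hypothetical helper, and it only builds the request object without sending it.

```python
import json
import urllib.request

SCRAPE_URL = "https://api.notte.cc/scrape"


def build_scrape_request(
    api_key: str, url: str, only_main_content: bool = False
) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    body = json.dumps({"url": url, "only_main_content": only_main_content}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is a matter of passing the result to `urllib.request.urlopen`; the shape of the response body is not described in this section, so it is not parsed here.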

SDK Client Usage

from notte_sdk import NotteClient
from pydantic import BaseModel

client = NotteClient()

# Basic scraping
response = client.scrape(
    url="https://notte.cc",
    scrape_links=True,
    only_main_content=True
)

# Structured scraping
class Article(BaseModel):
    title: str
    content: str
    date: str

response = client.scrape(
    url="https://example.com/blog",
    response_format=Article,
    instructions="Extract only the title, date and content of the articles"
)

Sources: README.md

Personas

Notte supports persona-based operations for enhanced privacy and automation:

import notte

persona = notte.Persona("<your-persona-id>")
sms = persona.sms(only_unread=True)

Available Operations

Method	Description
`sms()`	Retrieve SMS messages for the persona
`create_number()`	Create a phone number
`delete_number()`	Delete the persona's phone number

Sources: packages/notte-sdk/src/notte_sdk/endpoints/personas.py

Error Handling

Notte defines a comprehensive error hierarchy for different failure scenarios:

Core Error Classes

Error	Description
`InvalidA11yTreeType`	Invalid accessibility tree type
`InvalidA11yChildrenError`	Invalid child element count
`InvalidPlaceholderError`	Unhandled placeholder in vault
`ScrapeFailedError`	Structured data extraction failure

All errors provide developer advice and user-facing messages for appropriate handling.

Sources: packages/notte-core/src/notte_core/errors/processing.py

Search Demo

Notte provides a live search demonstration using MCP server integration:

Demo URL: https://search.notte.cc/
Features: Real-time search in LLM chatbots leveraging the scraping endpoint

Sources: README.md

License and Citation

This project is licensed under the Server Side Public License v1.0 (SSPL-1.0).

For academic or commercial use, cite as:

@software{notte2025,
  author = {Pinto, Andrea and Giordano, Lucas and {nottelabs-team}},
  title = {Notte: Software suite for internet-native agentic systems},
  url = {https://github.com/nottelabs/notte},
  year = {2025},
  publisher = {GitHub},
  license = {SSPL-1.0},
  version = {1.4.4}
}

Sources: README.md

Quickstart Guide

The Quickstart Guide provides a streamlined path for developers to begin using the Notte SDK within minutes. It covers environment configuration, SDK initialization, and the fundamental workflows for web scraping and browser automation.

Prerequisites

Before starting, ensure your development environment meets the following requirements:

Requirement	Minimum Version	Notes
Python	3.10+	Required for type annotations and modern async features
pip	21.0+	For package installation

Environment Setup

1. Obtain API Credentials

2. Configure Environment Variables

Create a .env file in your project root with the following variables:

NOTTE_API_KEY=your_api_key_here
NOTTE_API_URL=https://api.notte.cc  # Optional, defaults to this value
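
A minimal sketch of how an application might read these two variables, assuming they have already been exported into the process environment (loading the `.env` file itself is left out). `NotteSettings` and `load_settings` are hypothetical names, not SDK functions.

```python
import os
from typing import Mapping, NamedTuple

DEFAULT_API_URL = "https://api.notte.cc"


class NotteSettings(NamedTuple):
    api_key: str
    api_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> NotteSettings:
    """Read NOTTE_API_KEY (required) and NOTTE_API_URL (optional) from env."""
    api_key = env.get("NOTTE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("NOTTE_API_KEY is not set")
    # Fall back to the documented default when the URL is missing or empty.
    api_url = env.get("NOTTE_API_URL", "").strip() or DEFAULT_API_URL
    return NotteSettings(api_key, api_url)
```

Failing early on a missing key gives a clearer error than a rejected request later on.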

Sources: .env.example:1-2

3. Install the SDK

Install the Notte SDK using pip:

pip install notte

For additional providers or extras, install from the project root:

pip install -e ".[providers]"

Sources: pyproject.toml:1-50

Basic Usage

SDK Client Initialization

Initialize the Notte client using environment variables or direct configuration:

from notte_sdk import NotteClient

# Using environment variables (recommended)
client = NotteClient()

# Or with explicit parameters
client = NotteClient(
    api_key="your_api_key",
    base_url="https://api.notte.cc"
)

Simple Web Scraping

Perform basic webpage scraping with minimal configuration:

response = client.scrape(
    url="https://notte.cc",
    scrape_links=True,
    only_main_content=True
)
print(response.content)

Sources: examples/quickstart.py:1-20

Structured Data Extraction

Extract structured data using Pydantic models for type-safe responses:

from pydantic import BaseModel

class Article(BaseModel):
    title: str
    content: str
    date: str

response = client.scrape(
    url="https://example.com/blog",
    response_format=Article,
    instructions="Extract only the title, date and content of the articles"
)

Sources: README.md:1-30

Session-Based Automation

For complex interactions requiring multiple steps, use the Session API:

with client.Session() as session:
    # Navigate to a page
    session.execute(type="goto", url="https://example.com")
    
    # Observe available actions
    obs = session.observe(perception_type="deep")
    
    # Execute form filling or clicking actions
    session.execute(type="click", id="B1")

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-50

API Reference

Client Configuration Parameters

Parameter	Type	Required	Default	Description
`api_key`	`str`	Yes	-	Your Notte API key
`base_url`	`str`	No	`https://api.notte.cc`	Base URL for API requests
`timeout`	`int`	No	`60`	Request timeout in seconds

Scrape Parameters

Parameter	Type	Required	Default	Description
`url`	`str`	Yes	-	Target URL to scrape
`only_main_content`	`bool`	No	`True`	Exclude navbars and footers
`scrape_links`	`bool`	No	`True`	Include hyperlinks in response
`response_format`	`BaseModel`	No	`None`	Pydantic model for structured output
`instructions`	`str`	No	`None`	Natural language extraction instructions

Workflow Diagram

graph TD
    A[Start] --> B[Install SDK]
    B --> C[Configure API Key]
    C --> D{Use Case}
    D -->|Simple Scrape| E[client.scrape]
    D -->|Structured Data| F[Define BaseModel]
    F --> G[client.scrape with response_format]
    D -->|Complex Automation| H[Create Session]
    H --> I[Observe Actions]
    I --> J[Execute Actions]
    J --> K[Return Results]
    E --> L[End]
    G --> L
    K --> L

cURL Alternative

For environments without Python, use the REST API directly:

curl -X POST 'https://api.notte.cc/scrape' \
  -H 'Authorization: Bearer <NOTTE-API-KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://notte.cc",
    "only_main_content": false
  }'

Sources: README.md:40-50

Next Steps

Review the Setup Documentation for advanced configuration
Explore the Examples Directory for complete use cases
Check the Agent Documentation for browser automation with AI agents

Sources: .env.example:1-2

System Architecture

Related topics: Introduction to Notte, Agent Core System, Browser Sessions


Notte is a software suite designed for internet-native agentic systems, providing a comprehensive infrastructure for browser automation, web interaction, and AI-driven document processing. The architecture follows a modular design pattern with distinct layers for browser control, agent orchestration, LLM integration, and SDK accessibility.

Overview

The Notte system is composed of five primary packages that work together to enable autonomous web interaction:

Package	Purpose
`notte-core`	Core utilities, error handling, and shared data structures
`notte-browser`	Browser interaction and CDP integration layer
`notte-agent`	AI agent orchestration (Gufo and Falco subsystems)
`notte-llm`	LLM prompt management and document processing
`notte-sdk`	Public SDK for client applications

High-Level Architecture

graph TD
    Client[Client Application]
    SDK[notte-sdk]
    Agent[notte-agent]
    Browser[notte-browser]
    LLM[notte-llm]
    Core[notte-core]
    CDP[Chrome DevTools Protocol]
    Remote[Remote Browser]

    Client --> SDK
    SDK --> Agent
    SDK --> LLM
    Agent --> Browser
    Browser --> CDP
    CDP --> Remote
    LLM --> Core
    Core --> Browser

Core Package (`notte-core`)

The notte-core package provides foundational components used across all other packages, including error handling, accessibility tree processing, and placeholder management.

Error Handling Architecture

The error system is built on a hierarchical class structure rooted in NotteBaseError:

# Source: packages/notte-core/src/notte_core/errors/processing.py
class NotteBaseError(Exception):
    def __init__(self, agent_message, user_message, dev_message): ...

Error Categories:

Error Class	Purpose
`InvalidInternalCheckError`	Internal validation failures with developer guidance
`InvalidA11yTreeType`	Unsupported accessibility tree format
`InvalidA11yChildrenError`	Accessibility tree structure violations
`InvalidPlaceholderError`	Vault placeholder resolution failures
`ScrapeFailedError`	Structured data extraction failures

Placeholder System

The placeholder system enables secure credential management through a vault mechanism. When an action requires sensitive data, the system substitutes placeholders that are resolved at runtime.

# Source: packages/notte-core/src/notte_core/errors/processing.py
class InvalidPlaceholderError(NotteBaseError):
    def __init__(self, placeholder: str) -> None:
        dev_message = f"The placeholder {placeholder} is not handled by your current vault."
        agent_message = f"Could not perform action with value {placeholder}. Try picking a different value"
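
The three-message design shown in the excerpts above can be sketched as follows. These are stand-in classes written for illustration, not the classes in `notte-core`: `SketchNotteError`, `SketchPlaceholderError`, and `message_for` are hypothetical, and the `user_message` wording is an assumption because the excerpt does not show it.

```python
class SketchNotteError(Exception):
    """Stand-in for NotteBaseError: one failure, three audiences."""

    def __init__(self, agent_message: str, user_message: str, dev_message: str) -> None:
        super().__init__(dev_message)
        self.agent_message = agent_message
        self.user_message = user_message
        self.dev_message = dev_message


class SketchPlaceholderError(SketchNotteError):
    def __init__(self, placeholder: str) -> None:
        super().__init__(
            agent_message=(
                f"Could not perform action with value {placeholder}. "
                "Try picking a different value"
            ),
            user_message="A required credential is not available.",  # assumed wording
            dev_message=f"The placeholder {placeholder} is not handled by your current vault.",
        )


def message_for(error: SketchNotteError, audience: str) -> str:
    """Pick the message that matches who will read it."""
    return {
        "agent": error.agent_message,
        "user": error.user_message,
        "dev": error.dev_message,
    }[audience]
```

Keeping the messages separate lets the agent receive a recovery hint while the developer sees the underlying cause.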

SDK Package (`notte-sdk`)

The SDK provides the primary interface for client applications to interact with the Notte system. It exposes session management, observation capabilities, and action execution through a Pythonic API.

Session Management

Sessions are the fundamental unit of work in Notte, representing a single browser session with associated state and context.

# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
class Session:
    def __init__(self, ..., timeout_minutes: int = ...):
        self.response = None
        self._cookie_file = None

Session Lifecycle:

State	Description
`created`	Session object instantiated
`started`	`client.start()` called, `session_id` available
`active`	Browser operations in progress
`stopped`	`client.stop()` called, session terminated

Observation System

The observation system retrieves the current state of the webpage and available interactive elements:

# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
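def observe(self, *, perception_type: str | None = None, instructions: str | None = None, **data):
    # Fall back to the session default when no perception type is given
    if perception_type is None:
        perception_type = self.default_perception_type
    return self.client.page.observe(
        session_id=self.session_id, perception_type=perception_type, instructions=instructions, **data
    )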

Perception Types:

Type	Description	Use Case
`fast`	Simple page perception for quick queries	Default, rapid action space generation
`deep`	LLM-powered element formatting	Complex pages requiring structured analysis

Action Execution

Actions are executed through the unified execute() method with type-based dispatch:

# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
def execute(self, *, raise_on_failure: bool | None = None, **kwargs: Unpack[FormFillActionDict]) -> ExecutionResult: ...

Sessions automatically persist cookies to file for session continuity:

# Source: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py
if self._cookie_file is not None:
    cookies = self.get_cookies()
    create_or_append_cookies_to_file(self._cookie_file, cookies)
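
What a create-or-append helper might look like is sketched below. This is not the SDK's `create_or_append_cookies_to_file`; `append_cookies_to_file` is a hypothetical stand-in, and both the JSON-list file format and the de-duplication by `(name, domain, path)` are assumptions.

```python
import json
from pathlib import Path
from typing import Any


def append_cookies_to_file(cookie_file: str, cookies: list[dict[str, Any]]) -> None:
    """Create the file if missing, otherwise merge new cookies into it."""
    path = Path(cookie_file)
    existing: list[dict[str, Any]] = []
    if path.exists():
        existing = json.loads(path.read_text(encoding="utf-8"))
    # Key on (name, domain, path) so a refreshed cookie replaces the stale one.
    merged = {(c.get("name"), c.get("domain"), c.get("path")): c for c in existing}
    for cookie in cookies:
        merged[(cookie.get("name"), cookie.get("domain"), cookie.get("path"))] = cookie
    path.write_text(json.dumps(list(merged.values()), indent=2), encoding="utf-8")
```

Merging rather than blindly appending keeps the file from growing with duplicate entries across sessions.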

Agent Package (`notte-agent`)

The agent package contains two distinct agent subsystems: Gufo and Falco. Both subsystems handle browser automation but use different prompt strategies and action registries.

Agent Subsystem Comparison

Aspect	Gufo	Falco
System Prompt	`gufo/system.md`	`falco/system.md`
Element Format	Markdown with backticks	`id[:]<type>text</type>`
Tools	Configurable via `BaseTool`	Configurable via `BaseTool`
Action Registry	Custom implementation	`ActionRegistry` class

Element Identification System

Both agents use a consistent element identification scheme for interactive elements:

Prefix	Element Type	Examples
`I`	Input fields	Textbox, select, checkbox, radio
`B`	Buttons	Submit, clickable elements
`L`	Links	Hypertext navigation
`F`	Figures/Images	Visual content
`O`	Options	Select dropdown items
`M`	Miscellaneous	Modals, dialogs, overlays

{
  "id": "I1",
  "type": "input",
  "label": "email",
  "value": "user@example.com"
}

Gufo Agent System

The Gufo agent (packages/notte-agent/src/notte_agent/gufo/system.md) operates through structured JSON commands:

{
  "actions": [{"type": "click", "id": "B1"}],
  "reasoning": "User wants to submit the form"
}

Falco Agent System

The Falco agent (packages/notte-agent/src/notte_agent/falco/prompt.py) uses a prompt-based approach with configurable tools:

# Source: packages/notte-agent/src/notte_agent/falco/prompt.py
class FalcoPrompt(BasePrompt):
    def __init__(
        self,
        prompt_file: Path | None = None,
        tools: list[BaseTool] | None = None,
    ) -> None:
        self.action_registry: ActionRegistry = ActionRegistry(tools)

CAPTCHA Handling

Both agents implement strict CAPTCHA detection and handling:

# Source: packages/notte-agent/src/notte_agent/gufo/system.md
# CAPTCHA HANDLING - CRITICAL RULES:
# - NEVER click on captcha elements directly
# - NEVER use "click", "type", or any other action on captcha elements
# - If detected, use ONLY the "captcha_solve" action

Action Examples

Form Filling:

// Source: packages/notte-agent/src/notte_agent/falco/prompt.py
{
  "type": "form_fill",
  "value": {
    "address1": "<my address>",
    "city": "<my city>",
    "state": "<my state>"
  }
}

Navigation and Extraction:

{
  "type": "scrape",
  "instructions": "Extract the search results from the page"
}

LLM Package (`notte-llm`)

The LLM package manages prompts and processing for document analysis, categorization, and structured data extraction.

Prompt Categories

Category	Purpose	Output Format
`document-category`	Classify web documents	`<document-category>type</document-category>`
`data-extraction`	Extract structured data	Markdown with sections
`action-listing`	List available actions	JSON action array
`extract-without-json-schema`	LLM-native extraction	Structured JSON

Document Categorization

Documents are classified into categories for downstream processing:

Category	Description	Example
`search-results`	Search engine results page	Google search
`item`	Individual item/product page	Recipe, product listing
`other`	Uncategorized content	General pages

<!-- Source: packages/notte-llm/src/notte_llm/prompts/document-category/base/user.md -->
<document-category>other</document-category>

Data Extraction Templates

The system supports multiple extraction formats:

Template	Sections	Use Case
`two_sections`	`<document-analysis>`, `<data-extraction>`	Standard extraction
`all_data`	Analysis + detailed extraction	Comprehensive data
`user.md` (base)	Custom format	Flexible extraction

Structured Output Generation

<!-- Source: packages/notte-llm/src/notte_llm/prompts/data-extraction/user.md -->
<document-analysis>
Found X menus, Y text elements, Z interactive elements
[Analysis content...]
</document-analysis>
<data-extraction>
[Extracted data in Markdown format...]
</data-extraction>
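
Downstream code has to split this output back into its two sections. A minimal sketch, assuming each tag appears at most once and is never nested; `extract_section` and `split_extraction_output` are hypothetical helper names.

```python
import re
from typing import Dict, Optional


def extract_section(text: str, tag: str) -> Optional[str]:
    """Return the stripped body of <tag>...</tag>, or None if absent."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


def split_extraction_output(text: str) -> Dict[str, Optional[str]]:
    """Separate the analysis from the extracted Markdown."""
    return {
        "analysis": extract_section(text, "document-analysis"),
        "data": extract_section(text, "data-extraction"),
    }
```

Returning `None` for a missing section lets the caller decide whether an incomplete LLM response should be retried or rejected.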

Browser Package (`notte-browser`)

The browser package provides the low-level interface to the Chrome DevTools Protocol (CDP) for controlling headless browsers.

Key Responsibilities

Page navigation and loading
Element interaction (click, type, scroll)
Screenshot capture
Accessibility tree generation
Cookie management

CDP Integration

# Source: packages/notte-sdk/README.md
from patchright.sync_api import sync_playwright
from notte_sdk import NotteClient

# Create the client that the session below is opened from
notte = NotteClient()
with notte.Session() as session:
    # Browser operations via CDP
    _ = session.execute(type="goto", url="https://example.com")

Data Flow Architecture

graph LR
    User[User Request]
    SDK[SDK Session]
    Agent[Agent Processor]
    LLM[LLM Processing]
    Browser[Browser Engine]
    CDP[CDP Commands]
    
    User --> SDK
    SDK --> Agent
    Agent --> LLM
    LLM --> Agent
    Agent --> Browser
    Browser --> CDP
    CDP --> Browser
    Browser --> Agent
    Agent --> SDK
    SDK --> User

Session State Machine

graph TD
    Init[Session Created] --> Start[client.start]
    Start --> Active[Session Active]
    Active --> Observe[session.observe]
    Active --> Execute[session.execute]
    Observe --> Active
    Execute --> Active
    Active --> Stop[client.stop]
    Stop --> End[Session Ended]
    
    Start -->|Error| Error[Error State]
    Error -->|Retry| Start
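
The state machine above can be written down as a transition table. This is an illustrative sketch: `SessionState`, `TRANSITIONS`, and `advance` are hypothetical names, and the `ERROR` state is taken from the diagram rather than from the SDK.

```python
from enum import Enum


class SessionState(Enum):
    CREATED = "created"
    STARTED = "started"
    ACTIVE = "active"
    STOPPED = "stopped"
    ERROR = "error"


# Allowed transitions, read off the diagram; ERROR can only retry the start.
TRANSITIONS = {
    SessionState.CREATED: {SessionState.STARTED},
    SessionState.STARTED: {SessionState.ACTIVE, SessionState.ERROR},
    SessionState.ACTIVE: {SessionState.ACTIVE, SessionState.STOPPED},
    SessionState.ERROR: {SessionState.STARTED},
    SessionState.STOPPED: set(),
}


def advance(current: SessionState, target: SessionState) -> SessionState:
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

The `ACTIVE -> ACTIVE` edge covers the observe and execute loops, which return to the active state after each call.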

Configuration Options

SDK Initialization

Parameter	Type	Default	Description
`timeout_minutes`	`int`	-	Session timeout
`open_viewer`	`bool`	`False`	Open browser viewer
`proxies`	`dict`	`None`	Proxy configuration
`_cookie_file`	`str`	`None`	Cookie persistence file

Perception Configuration

Parameter	Type	Description
`perception_type`	`str`	`fast` or `deep`
`instructions`	`str`	Natural language filtering

Error Handling Flow

graph TD
    Action[Action Request] --> Validate{Validation}
    Validate -->|Pass| Execute
    Validate -->|Fail| InvalidError[InvalidInternalCheckError]
    
    Execute --> CDP[CDP Call]
    CDP -->|Success| Result
    CDP -->|Failure| ScrapeError[ScrapeFailedError]
    
    Placeholder{Placeholder Check}
    Result --> Placeholder
    Placeholder -->|Found| PlaceholderError[InvalidPlaceholderError]
    Placeholder -->|None| Complete

Integration Examples

Basic Session Usage

# Source: packages/notte-sdk/README.md
from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://example.com")
    obs = session.observe()
    action = obs.space.sample(type='click')
    result = session.execute(action)

Agent Deployment

# Source: packages/notte-sdk/README.md
from notte_sdk import NotteClient

notte = NotteClient()
with notte.Session(open_viewer=True) as session:
    agent = notte.Agent(session=session)
    agent.start(
        task="Summarize the content of the page",
        url="https://www.google.com"
    )

Summary

The Notte architecture provides a robust, layered approach to browser automation:

Core Layer (notte-core): Provides shared utilities, error handling, and base data structures
Browser Layer (notte-browser): Abstracts Chrome DevTools Protocol for browser control
Agent Layer (notte-agent): Implements AI-driven automation with Gufo and Falco subsystems
LLM Layer (notte-llm): Manages document processing and prompt engineering
SDK Layer (notte-sdk): Exposes the complete API to client applications

The modular design allows each layer to be used independently or in combination, enabling flexible deployment scenarios from simple web scraping to complex autonomous agent workflows.

Source: https://github.com/nottelabs/notte / Human Manual

Agent Core System

Related topics: Structured Output, Agent Fallback System, Browser Sessions


The Agent Core System is the central orchestration layer in Notte that enables AI agents to autonomously navigate and interact with web pages. It provides a unified interface for browser automation tasks, handling perception of web elements, action execution, and conversation management between the agent and web content.

Architecture Overview

The Agent Core System consists of multiple layered components that work together to enable autonomous web interaction.

graph TD
    A[Agent Client] --> B[Agent Core]
    B --> C[Browser Session]
    C --> D[Web Page]
    B --> E[Perception Module]
    B --> F[Conversation Module]
    E --> G[Action Registry]
    G --> H[Falco Actions]
    G --> I[Gufo Actions]

Agent Types

Notte supports multiple agent implementations, each designed for specific automation scenarios. The agent type determines the underlying action execution engine and available capabilities.

Agent Type	Description	Use Case
`falco`	Standard browser automation agent	General web interaction, form filling, navigation
`gufo`	Advanced automation with stealth features	CAPTCHA handling, proxy rotation, anti-detection

Sources: packages/notte-core/src/notte_core/agent_types.py:1-50

Core Components

Agent Base Class

The base Agent class provides the primary interface for task execution and state management.

class Agent:
    def __init__(
        self,
        session: Session,
        vault: Vault | None = None,
        max_steps: int = 10,
        agent_type: AgentType = AgentType.FALCO,
    ) -> None: ...

    def run(self, task: str) -> AgentResponse: ...
    def start(self, task: str, url: str | None = None) -> None: ...
    def status(self) -> AgentStatus: ...
    def stop(self) -> None: ...

Key Parameters:

Parameter	Type	Default	Description
`session`	`Session`	Required	Browser session for interaction
`vault`	`Vault | None`	`None`	Secure credential storage
`max_steps`	`int`	`10`	Maximum action steps before termination
`agent_type`	`AgentType`	`FALCO`	Backend engine selection

Sources: packages/notte-agent/src/notte_agent/agent.py:1-100

Perception Module

The perception module analyzes web page structure and generates interactive action spaces for the agent.

class Perception:
    def observe(
        self,
        perception_type: str = 'fast',
        instructions: str | None = None
    ) -> ObservationResponse

Perception Types:

Type	Description	Performance
`fast`	Simple page element parsing	Low latency
`deep`	LLM-powered element formatting	Higher accuracy, slower

The perception system generates element IDs using a role-based prefix system:

I - Input fields (textbox, select, checkbox)
B - Buttons
L - Links
F - Figures/Images
O - Select options
M - Miscellaneous elements (modals, dialogs)

Sources: packages/notte-agent/src/notte_agent/common/perception.py:1-80

Conversation Module

Manages the dialogue history between the agent and web content, maintaining context across multiple interaction steps.

class Conversation:
    def __init__(self, system_prompt: str) -> None: ...
    def add_user_message(self, content: str) -> None: ...
    def add_agent_message(self, content: str) -> None: ...
    def get_messages(self) -> list[Message]: ...

The conversation system tracks:

User task requests
Agent reasoning and decisions
Action execution results
Page observations

Sources: packages/notte-agent/src/notte_agent/common/conversation.py:1-60

Action System

Action Types

The agent supports a comprehensive set of browser automation actions through a registry pattern.

graph LR
    A[Agent Decision] --> B[Action Registry]
    B --> C[FormFillAction]
    B --> D[ClickAction]
    B --> E[ScrapeAction]
    B --> F[CaptchaSolveAction]
    B --> G[GotoAction]

Action	Description	Parameters
`goto`	Navigate to URL	`url: str`
`click`	Click element	`id: str`
`form_fill`	Fill form fields	`value: dict[str, str]`
`scrape`	Extract structured data	`instructions: str`
`captcha_solve`	Solve CAPTCHA	`captcha_type: str`

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py:1-150

Action Registry

The ActionRegistry maintains available actions and their schemas, enabling dynamic action discovery.

class ActionRegistry:
    def __init__(self, tools: list[BaseTool])
    def get_action_schemas(self) -> list[ActionSchema]
    def register(self, action_cls: type[BaseTool]) -> None
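
The registry pattern behind this interface can be sketched as below. This is not the `ActionRegistry` from `notte-agent`: `ToolSpec` and `SketchActionRegistry` are hypothetical stand-ins, and the schema layout is an assumption modelled on JSON Schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToolSpec:
    """Hypothetical stand-in for a tool: a name plus its parameter schema."""
    name: str
    parameters: Dict[str, dict] = field(default_factory=dict)


class SketchActionRegistry:
    def __init__(self, tools: Optional[List[ToolSpec]] = None) -> None:
        self._tools: Dict[str, ToolSpec] = {}
        for tool in tools or []:
            self.register(tool)

    def register(self, tool: ToolSpec) -> None:
        """Add a tool, refusing duplicates so action names stay unambiguous."""
        if tool.name in self._tools:
            raise ValueError(f"action already registered: {tool.name}")
        self._tools[tool.name] = tool

    def get_action_schemas(self) -> List[dict]:
        """Describe every registered action in a JSON-schema-like form."""
        return [
            {"name": tool.name, "type": "object", "properties": tool.parameters}
            for tool in self._tools.values()
        ]
```

Because schemas are generated from whatever is registered, adding a tool makes it visible to the agent's prompt without editing the prompt itself.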

Supported Action Formats

Actions are serialized using JSON schema format for agent consumption:

{
  "type": "object",
  "properties": {
    "id": {"type": "string", "description": "Element identifier"},
    "value": {"type": "string", "description": "Action value"}
  }
}

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py:30-80

Agent Implementations

Falco Agent

The default agent implementation using standard Playwright-based browser automation.

class FalcoAgent(BaseAgent):
    def __init__(self, tools: list[BaseTool] | None = None)
    def execute(self, action: dict) -> ExecutionResult

Features:

Standard form filling
Click-based navigation
Basic scraping operations
Simple CAPTCHA detection

Sources: packages/notte-agent/src/notte_agent/falco/agent.py:1-100

Gufo Agent

Advanced agent with stealth capabilities for bypassing detection systems.

class GufoAgent(BaseAgent):
    def __init__(self, tools: list[BaseTool] | None = None)
    def execute(self, action: dict) -> ExecutionResult

Features:

Automatic CAPTCHA solving
Proxy rotation support
User agent spoofing
Cookie management

Sources: packages/notte-agent/src/notte_agent/gufo/agent.py:1-100

Session Integration

Agents operate within browser sessions that provide the execution environment.

with client.Session(browser_type="chrome", open_viewer=True) as session:
    agent = client.Agent(session=session, max_steps=15)
    response = agent.run(
        task="Navigate to the form and submit with sample data",
        url="https://example.com/form"
    )

Session Configuration

Parameter	Type	Default	Description
`browser_type`	`str`	`"chrome"`	Browser engine
`open_viewer`	`bool`	`False`	Display browser window
`timeout_minutes`	`int`	`5`	Session timeout
`proxies`	`bool | str`	`False`	Proxy configuration
`solve_captchas`	`bool`	`False`	Auto-CAPTCHA solving

Workflow Execution

sequenceDiagram
    participant User
    participant Agent
    participant Session
    participant Browser
    User->>Agent: run(task)
    Agent->>Session: observe()
    Session->>Browser: Get page state
    Browser-->>Session: Page elements
    Session-->>Agent: Observation
    Agent->>Agent: Plan action
    Agent->>Session: execute(action)
    Session->>Browser: Perform action
    Browser-->>Session: Result
    Session-->>Agent: ExecutionResult
    Agent->>Agent: Check completion
    Note over Agent,Browser: Loop until task complete or max_steps

Error Handling

The system provides structured error handling for various failure scenarios.

Error Class	Description	Resolution
`InvalidPlaceholderError`	Vault credential unavailable	Select alternative value
`ScrapeFailedError`	Data extraction failed	Retry with different instructions
`InvalidA11yTreeType`	Unknown accessibility tree type	Check code implementation
`InvalidA11yChildrenError`	Element hierarchy mismatch	Verify page structure

Sources: packages/notte-core/src/notte_core/errors/processing.py:1-100

SDK Usage Example

from notte_sdk import NotteClient

client = NotteClient()

# Basic agent usage
with client.Session(open_viewer=True) as session:
    agent = client.Agent(session=session, max_steps=10)
    response = agent.run(
        task="Find the search box and search for 'python tutorials'",
        url="https://www.google.com"
    )
    print(response.answer)

# With persona (digital identity)
with client.Persona(create_phone_number=True) as persona:
    with client.Session(browser_type="chrome") as session:
        agent = client.Agent(session=session, persona=persona, max_steps=15)
        response = agent.run(
            task="Complete the registration form",
            url="https://example.com/register"
        )

Best Practices

Set appropriate max_steps - Balance between task completion and resource usage
Use fast perception for simple tasks, deep perception for complex page analysis
Implement vault storage for reusable credentials across sessions
Handle errors gracefully - Check execution results before proceeding
Use stealth features when bypassing detection is required

Sources: packages/notte-core/src/notte_core/agent_types.py:1-50

Structured Output

Structured Output enables Notte to extract web page content and return it as typed, structured data using Pydantic models. This feature bridges the gap between unstructured web content and programmatic data processing, allowing developers to define expected output schemas and receive validated, type-safe data.

Overview

Notte's Structured Output system consists of two complementary approaches:

Schema-Based Extraction: Uses dynamically generated JSON schemas from Pydantic models
Natural Language Extraction: Uses instructions-based extraction without strict schema enforcement

The system leverages Pydantic for schema definition, ensuring type safety and validation at the application layer. When a response_format is provided, Notte generates a corresponding JSON Schema that guides the LLM in producing correctly structured output.

Sources: packages/notte-core/src/notte_core/utils/pydantic_schema.py

Architecture

graph TD
    A[User Request] --> B[Define Pydantic Model]
    B --> C[response_format Parameter]
    C --> D{Schema Type}
    D -->|With Schema| E[Generate JSON Schema]
    D -->|Without Schema| F[Instructions-Only Extraction]
    E --> G[Prompt Engineering]
    F --> G
    G --> H[LLM Processing]
    H --> I[Output Validation]
    I --> J[Typed Response]

Core Components

Pydantic Schema Generation

The pydantic_schema.py module provides utilities for converting Python Pydantic models into JSON schemas that can be consumed by LLMs.

Function	Purpose
`model_to_json_schema()`	Converts Pydantic model class to JSON schema
`validate_response()`	Validates LLM output against expected schema
`extract_structured_data()`	Extracts and parses structured data from response

Sources: packages/notte-core/src/notte_core/utils/pydantic_schema.py

Scraping Schema

The schema.py module defines the scraping configuration and response handling for structured output.

Parameter	Type	Default	Description
`response_format`	`type[BaseModel]`	`None`	Pydantic model for structured output
`instructions`	`str`	`None`	Natural language extraction instructions
`raise_on_failure`	`bool`	`True`	Raise exception on extraction failure

Sources: packages/notte-browser/src/notte_browser/scraping/schema.py

Usage Patterns

Basic Structured Extraction

from notte_sdk import NotteClient
from pydantic import BaseModel

class Product(BaseModel):
    title: str
    price: str
    description: str | None = None

client = NotteClient()
data = client.scrape(
    url="https://example.com/product",
    response_format=Product,
    instructions="Extract the product title, price, and description"
)

Sources: packages/notte-sdk/src/notte_sdk/client.py

Session-Based Extraction

from notte_sdk import NotteClient
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    content: str
    date: str

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://example.com/blog")
    data = session.scrape(
        response_format=Article,
        instructions="Extract the title, date and content of the articles"
    )

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Error Handling

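from notte_sdk import NotteClient
from notte_core.errors import ScrapeFailedError  # raised when raise_on_failure=True
from pydantic import BaseModel

class Product(BaseModel):
    title: str
    price: str

client = NotteClient()
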
# With raise_on_failure=False, returns StructuredData wrapper
result = client.scrape(
    url="https://example.com",
    response_format=Product,
    raise_on_failure=False
)

if not result.success:
    print(f"Extraction failed: {result.error}")
else:
    data = result.data

Sources: packages/notte-core/src/notte_core/errors/processing.py

Prompt Engineering

JSON Schema Generation Prompt

The generate-json-schema/system.md template guides the LLM in producing valid JSON output conforming to a specified schema. This prompt includes:

Success examples showing correct JSON output
Failure examples demonstrating invalid output handling
Timestamp context for time-sensitive extraction

Today is: {{timestamp}}

Transform the following document into structured JSON output based on the provided user request:

Sources: packages/notte-llm/src/notte_llm/prompts/generate-json-schema/system.md

Extraction Without Schema

The extract-without-json-schema/system.md template provides an alternative approach for natural language extraction:


Example of a valid output if you cannot answer the user request:

This approach allows flexible extraction when strict schema conformance is not required.

Sources: packages/notte-llm/src/notte_llm/prompts/extract-without-json-schema/system.md

Scrape Action Configuration

The underlying scrape action provides granular control over extraction:

session.execute(type="scrape", only_images=True)  # Scrape only images
session.execute(type="scrape", response_format={"type": "object", "properties": {...}})  # With JSON schema

Action Parameter	Description
`instructions`	Natural language instructions for extraction
`only_main_content`	Exclude navbars, footers (default: `True`)
`selector`	Playwright selector to scope extraction
`only_images`	Extract images only
`scrape_links`	Include links in output (default: `True`)
`scrape_images`	Include image data

Sources: packages/notte-core/src/notte_core/actions/actions.py

Workflow

sequenceDiagram
    participant User
    participant SDK as NotteClient
    participant LLM as LLM Engine
    participant API as Notte API

    User->>SDK: scrape(url, response_format=Model)
    SDK->>LLM: Generate JSON Schema from Model
    SDK->>API: POST /scrape with schema + instructions
    API->>LLM: Process page with schema
    LLM-->>API: Structured JSON response
    API-->>SDK: Validated response
    SDK-->>User: Typed Model instance

Best Practices

Define Clear Schemas: Use descriptive field names and include type annotations
Provide Contextual Instructions: Give the LLM context about what to extract
Handle Optional Fields: Use | None for fields that may not always be present
Validate Output: Enable raise_on_failure=True for production use
Scope Extraction: Use selectors when extracting from specific page regions

Error Handling

Error Type	Cause	Resolution
`ScrapeFailedError`	Extraction validation failed	Check instructions and schema compatibility
`LLMParsingError`	Malformed JSON in response	Ensure schema is properly generated
`InvalidPlaceholderError`	Missing credential reference	Configure vault for required credentials

Sources: packages/notte-core/src/notte_core/errors/processing.py

Sources: packages/notte-core/src/notte_core/utils/pydantic_schema.py

Agent Fallback System

Related topics: Agent Core System, Actions and Browser Controls

Overview

The Agent Fallback System in Notte provides a robust mechanism for handling agent execution failures and gracefully recovering from errors during browser automation tasks. When an agent encounters execution issues—whether due to captcha detection, action failures, or other runtime errors—the fallback system intercepts these failures, analyzes the error context, and determines the appropriate recovery strategy.

The system operates across two primary layers: the SDK layer (notte_sdk) and the agent layer (notte_agent), ensuring consistent error handling and recovery whether agents are executed locally or through the Notte API.

Architecture

graph TD
    A[Agent Execution] --> B{Action Execution Success?}
    B -->|Yes| C[Continue Normal Flow]
    B -->|No| D[Fallback System Triggered]
    D --> E{Error Type Classification}
    E -->|Captcha| F[Captcha Handler]
    E -->|Action Failure| G[Retry Strategy]
    E -->|Fatal Error| H[Graceful Degradation]
    F --> I[Attempt Resolution]
    G --> J[Apply Retry Policy]
    H --> K[Report to User]
    I --> C
    J --> C
    K --> L[Session Cleanup]

Core Components

FallbackAction Class

The FallbackAction class represents the fundamental unit of fallback handling. It encapsulates the error context and provides structured data for downstream processing.

Property	Type	Description
`error_type`	`str`	Classification of the error (e.g., "captcha", "action_failure")
`error_message`	`str`	Detailed description of the failure
`metadata`	`dict`	Additional context including HTML element data, playwright code, and execution state
`retry_count`	`int`	Number of retry attempts performed
`timestamp`	`datetime`	When the error occurred

Sources: packages/notte-agent/src/notte_agent/agent_fallback.py

AgentFallbackManager

The AgentFallbackManager serves as the central coordinator for fallback operations, managing the lifecycle of error handling and recovery strategies.

sequenceDiagram
    participant Agent as Agent
    participant FallbackManager as FallbackManager
    participant RecoveryStrategy as RecoveryStrategy
    participant Session as Session
    
    Agent->>FallbackManager: Register failure context
    FallbackManager->>FallbackManager: Classify error type
    FallbackManager->>RecoveryStrategy: Select appropriate strategy
    RecoveryStrategy->>Session: Apply recovery action
    Session-->>Agent: Resume or terminate

Key Methods

Method	Purpose
`register_failure()`	Records a new failure event in the fallback system
`classify_error()`	Determines the error category based on error characteristics
`select_recovery_strategy()`	Chooses the optimal recovery approach
`apply_recovery()`	Executes the chosen recovery mechanism
`should_retry()`	Evaluates whether additional attempts are warranted

Sources: packages/notte-sdk/src/notte_sdk/agent_fallback.py

Error Classification

The fallback system categorizes failures into distinct types, each with tailored handling strategies:

Error Type	Description	Default Recovery
`captcha`	CAPTCHA or verification challenges detected	Initiate captcha solving flow
`action_failure`	Interactive element action failed	Retry with modified selectors
`navigation_error`	Page navigation or URL resolution failure	Retry with exponential backoff
`timeout_error`	Operation exceeded time limits	Extend timeout and retry
`invalid_state`	Agent reached an inconsistent state	Reset to known good state
`fatal_error`	Unrecoverable error requiring termination	Graceful session cleanup

Recovery Strategies

1. Captcha Resolution Strategy

When the system detects CAPTCHA challenges, it automatically triggers the captcha solving mechanism.

# Captcha detection triggers automatic resolution
if error_type == "captcha":
    captcha_type = detect_captcha_type(metadata)
    captcha_action = CaptchaSolveAction(captcha_type=captcha_type)
    result = session.execute(captcha_action)

Sources: packages/notte-agent/src/notte_agent/gufo/system.md

2. Retry with Backoff Strategy

For transient failures, the system implements configurable retry logic with exponential backoff:

Parameter	Default	Description
`max_retries`	`3`	Maximum retry attempts
`base_delay`	`1000`	Initial delay in milliseconds
`backoff_factor`	`2.0`	Multiplier for each retry
`jitter`	`True`	Randomization to prevent thundering herd

3. Graceful Degradation Strategy

When recovery is not possible, the system performs controlled cleanup:

# Graceful termination sequence
try:
    session.stop()
except Exception as e:
    logger.warning(f"Session cleanup warning: {e}")
finally:
    fallback_manager.record_final_state(error_context)

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Integration with Session Management

The fallback system integrates tightly with the Notte session lifecycle:

graph LR
    A[Session Start] --> B[Agent Initialization]
    B --> C[Task Execution]
    C --> D{Success?}
    D -->|Yes| E[Complete Task]
    D -->|No| F[Fallback Check]
    F --> G{Retryable?}
    G -->|Yes| H[Apply Recovery]
    G -->|No| I[Log Failure]
    H --> C
    I --> J[Session Cleanup]
    E --> J

Session Callback Integration

The SDK exposes callback hooks for fallback integration:

session.on_failure(callback=fallback_manager.handle_failure)
session.on_retry(callback=fallback_manager.prepare_retry)
session.on_success(callback=fallback_manager.record_success)

Configuration Options

Configuration	Type	Default	Description
`fallback_enabled`	`bool`	`True`	Enable/disable fallback system
`max_retry_attempts`	`int`	`3`	Global retry limit
`fallback_timeout_ms`	`int`	`30000`	Timeout for fallback operations
`capture_screenshots`	`bool`	`True`	Screenshot on failure for debugging
`verbose_logging`	`bool`	`False`	Detailed fallback logging

Usage Patterns

Basic Usage with SDK

from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session)
    agent.start(task="Navigate and extract data", url="https://example.com")
    
    # Fallback system handles failures automatically
    status = agent.status()

Sources: packages/notte-sdk/README.md

Custom Fallback Handler

class CustomFallbackHandler:
    def handle_failure(self, fallback_action: FallbackAction) -> RecoveryResult:
        # Custom logic for specific error types
        if fallback_action.error_type == "captcha":
            return RecoveryResult(strategy="custom_captcha_solver")
        return RecoveryResult(strategy="default")

Error Reporting and Monitoring

The fallback system generates structured error reports containing:

Field	Description
`session_id`	Unique identifier for the session
`agent_id`	Identifier of the agent that failed
`error_type`	Classification of the failure
`error_message`	Human-readable error description
`playwright_code`	Relevant browser automation code
`html_element`	DOM element context when available
`timestamp`	ISO 8601 timestamp of the failure
`retry_history`	Array of previous retry attempts

Best Practices

Enable Verbose Logging During Development: Set verbose_logging=True to capture detailed fallback behavior
Configure Appropriate Timeouts: Match fallback_timeout_ms to your expected operation durations
Monitor Retry Counts: Track retry_count to identify persistent issues
Preserve Error Context: Always include metadata for effective debugging
Test Fallback Paths: Regularly validate fallback behavior under failure conditions

Component	Purpose
`FalcoPrompt`	Agent instruction and action generation
`SessionManager`	Browser session lifecycle
`CaptchaSolveAction`	Specialized captcha resolution
`ScrapeAction`	Data extraction with fallback support
`ErrorProcessing`	Low-level error handling

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py

Sources: packages/notte-agent/src/notte_agent/agent_fallback.py

Browser Sessions

Related topics: Actions and Browser Controls, Vaults and Credential Management

Overview

Browser Sessions in Notte provide a managed environment for automating web interactions through a cloud-based browser infrastructure. Sessions encapsulate the state of a browser instance, allowing developers to navigate websites, interact with elements, extract data, and handle complex web automation tasks programmatically.

The session system abstracts away the complexities of browser automation, providing a high-level API for:

Navigating to URLs and managing browser tabs
Observing page elements and available actions
Executing automated interactions (clicks, form fills, scrolling)
Scraping structured and unstructured data from web pages
Solving CAPTCHA challenges automatically

Sources: packages/notte-sdk/README.md

Architecture

graph TD
    A[NotteClient] --> B[Session]
    B --> C[Browser Controller]
    C --> D[Cloud Browser Instance]
    E[Actions] --> B
    B --> F[Observations]
    B --> G[Scrape Results]
    H[Captcha Handler] --> C

Core Components

Component	Package	Responsibility
`Session`	`notte-sdk`	High-level API for session management
`BrowserController`	`notte-browser`	Low-level browser control and state
`CaptchaSolver`	`notte-browser`	Automatic CAPTCHA resolution
`ActionExecutor`	`notte-core`	Action execution and validation

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py, packages/notte-browser/src/notte_browser/controller.py

Session Lifecycle

Starting a Session

Sessions can be initialized using the context manager pattern for automatic cleanup:

from notte_sdk import NotteClient

client = NotteClient()
with client.Session() as session:
    session.execute(type="goto", url="https://www.example.com")
    # Perform actions...

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:30-40

Session States

stateDiagram-v2
    [*] --> Idle: Client initialized
    Idle --> Active: Session started
    Active --> Active: Actions executed
    Active --> Stopping: stop() called
    Stopping --> Stopped: Cleanup complete
    Stopped --> [*]: Context exited
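The state diagram can be expressed as a transition table. The sketch below is an illustration only: `SessionState`, `ALLOWED`, and `transition` are hypothetical names, and the SDK may track session state differently.

```python
from enum import Enum


class SessionState(Enum):
    IDLE = "idle"
    ACTIVE = "active"
    STOPPING = "stopping"
    STOPPED = "stopped"


# Each state maps to the set of states it may move to, per the diagram.
ALLOWED = {
    SessionState.IDLE: {SessionState.ACTIVE},
    SessionState.ACTIVE: {SessionState.ACTIVE, SessionState.STOPPING},
    SessionState.STOPPING: {SessionState.STOPPED},
    SessionState.STOPPED: set(),
}


def transition(current: SessionState, target: SessionState) -> SessionState:
    """Return the new state, or raise if the diagram forbids the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target


state = SessionState.IDLE
for nxt in (SessionState.ACTIVE, SessionState.ACTIVE,
            SessionState.STOPPING, SessionState.STOPPED):
    state = transition(state, nxt)
```

A stopped session has no outgoing transitions, which matches the guidance to create a new session rather than reuse one that has exited.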

Stopping a Session

When a session stops, cookies are automatically saved if configured:

def stop(self) -> None:
    if self._cookie_file is not None:
        try:
            cookies = self.get_cookies()
            create_or_append_cookies_to_file(self._cookie_file, cookies)
        except Exception as e:
            logger.error(f"🍪 Error saving cookies to {self._cookie_file}: {e}")
    self.response = self.client.stop(session_id=self.session_id)

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:50-65

Session Configuration

Configuration Parameters

Parameter	Type	Default	Description
`timeout_minutes`	`int`	`5`	Session timeout in minutes
`proxies`	`bool`	`True`	Enable proxy support
`cookie_file`	`str`	`None`	Path to cookie persistence file
`open_viewer`	`bool`	`False`	Open visual browser viewer

Creating a Configured Session

with client.Session(timeout_minutes=10, open_viewer=True) as session:
    status = session.status()
    session.viewer()

Sources: packages/notte-sdk/README.md

Page Interaction

Observation

The observe() method retrieves interactive elements on the current page. Notte supports two perception modes:

Fast Perception

A quick, simple page scan for basic element identification:

obs = session.observe(perception_type='fast')

Deep Perception

LLM-powered analysis for richer action space generation:

obs = session.observe(perception_type='deep')
print(obs.space.description)

Focused Observation

Use instructions to narrow the action space to a specific intent:

actions = session.observe(instructions="Fill the email input")
print(actions[0].model_dump())

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:90-110

Element Identification

Elements are identified by structured IDs of the form <role_first_letter><index>, where the letter encodes the element's role:

Role Letter	Element Type
`I`	Input fields (textbox, select, checkbox)
`B`	Buttons
`L`	Links
`F`	Figures and images
`O`	Options in select elements
`M`	Miscellaneous (modals, dialogs)

Example: the element ID I2 refers to the input field at index 2.

Sources: packages/notte-agent/src/notte_agent/gufo/system.md
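Splitting an element ID into its role and index is a small parsing exercise. The helper below is a sketch based only on the role letters listed above; `parse_element_id`, `ElementId`, and the role labels in `ROLE_BY_LETTER` are hypothetical names, and real IDs may carry extra detail that this pattern does not cover.

```python
import re
from typing import NamedTuple

# Role letters from the table above, mapped to short labels.
ROLE_BY_LETTER = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}


class ElementId(NamedTuple):
    role: str
    index: int


_ID_PATTERN = re.compile(r"^([IBLFOM])(\d+)$")


def parse_element_id(raw: str) -> ElementId:
    """Split an ID such as 'I2' into its role and numeric index."""
    match = _ID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"not a valid element id: {raw!r}")
    letter, index = match.groups()
    return ElementId(ROLE_BY_LETTER[letter], int(index))
```

Rejecting anything that does not match the pattern catches a common mistake early: passing a CSS id or selector where an observation ID is expected.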

Browser Actions

Action Types

Notte provides comprehensive browser automation actions:

Navigation Actions

Action	Description	Parameters
`goto`	Navigate to a URL	`url: str`
`goto_new_tab`	Open URL in new tab	`url: str`
`close_tab`	Close current tab	-
`scroll`	Scroll the page	`direction: str`, `amount: int`
`scroll_to`	Scroll to element	`id: str`

Sources: packages/notte-core/src/notte_core/actions/actions.py

Interaction Actions

Action	Description	Parameters
`click`	Click an element	`id: str`
`fill`	Fill input field	`id: str`, `value: str`
`select`	Select option	`id: str`, `option: str`
`check`	Check checkbox	`id: str`
`press`	Press keyboard key	`key: str`

Sources: packages/notte-core/src/notte_core/actions/actions.py

Data Extraction Actions

# Scrape entire page
markdown = session.scrape()

# Scrape with instructions
result = session.scrape(instructions="Extract title and price")

# Scrape only images
session.scrape(only_images=True)

# Structured scraping with JSON schema
session.scrape(response_format={"type": "object", "properties": {...}})

Sources: packages/notte-core/src/notte_core/actions/actions.py:30-60

Executing Actions

# Get observations
obs = session.observe()

# Sample and execute an action
action = obs.space.sample(type='click')
result = session.execute(action)
assert result.success

Sources: packages/notte-sdk/README.md

CAPTCHA Handling

Notte includes automatic CAPTCHA solving capabilities:

session.execute(type="captcha_solve", captcha_type="recaptcha")
session.execute(type="captcha_solve")  # Auto-detect

Supported CAPTCHA Types

Type	Description
`recaptcha`	Google reCAPTCHA
`hcaptcha`	hCaptcha
`image`	Image-based CAPTCHA
`text`	Text-based CAPTCHA
`auth0`	Auth0 CAPTCHA
`cloudflare`	Cloudflare CAPTCHA
`datadome`	DataDome CAPTCHA
`arkose labs`	Arkose Labs CAPTCHA
`geetest`	Geetest CAPTCHA
`press&hold`	Press and hold challenge

Sources: packages/notte-core/src/notte_core/actions/actions.py:70-95

MCP Server Integration

Notte sessions can be accessed via the Model Context Protocol (MCP):

Available Tools

Tool	Description
`notte_new_session`	Start a new cloud browser session
`notte_list_sessions`	List all active sessions
`notte_stop_session`	Stop the current session
`notte_observe`	Observe elements on current page
`notte_screenshot`	Take a screenshot
`notte_scrape`	Extract structured data
`notte_step`	Execute an action

Sources: packages/notte-mcp/README.md

Server Setup

pip install notte-mcp
export NOTTE_API_KEY="your-api-key"
python -m notte_mcp.server

Sources: packages/notte-mcp/README.md

Cookie Persistence

When a cookie_file is configured, sessions persist cookies for authenticated workflows:

with client.Session(cookie_file="cookies.json") as session:
    session.execute(type="goto", url="https://example.com/login")
    # Login once - cookies saved automatically on exit

On subsequent runs, cookies are loaded from the same file. Two helpers back this behavior (signatures abbreviated, matching their use in stop()):

def get_cookies(self) -> dict:
    """Return the cookies held by the current session."""
    ...

def create_or_append_cookies_to_file(cookie_file, cookies) -> None:
    """Write cookies to cookie_file, creating the file or appending to it."""
    ...

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:45-65

Error Handling

Session Errors

try:
    with client.Session() as session:
        session.execute(type="goto", url="https://example.com")
except ValueError as e:
    # Session not started
    print(f"Error: {e}")
except RuntimeError as e:
    # Session failed to close
    print(f"Error: {e}")

Action Execution Errors

# With raise_on_failure=False the call returns a result instead of raising
result = session.execute(action, raise_on_failure=False)
if not result.success:
    print(f"Action failed: {result.error}")

Graceful Shutdown

The context manager ensures proper cleanup even on exceptions:

with client.Session() as session:
    try:
        # Perform actions
        session.execute(type="goto", url="https://example.com")
    except Exception as e:
        logger.error(f"Session error: {e}")
    finally:
        # Cleanup happens automatically
        pass

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:50-70

Best Practices

Resource Management

Always use the with statement for automatic session cleanup
Set appropriate timeout_minutes based on task complexity
Enable open_viewer=True for debugging complex interactions

Performance Optimization

Use perception_type='fast' for simple, quick operations
Use perception_type='deep' when LLM interpretation is needed
Filter observations with instructions to reduce processing overhead

Reliability

Implement retry logic for flaky network conditions
Handle CAPTCHAs proactively using the built-in solver
Persist cookies for authenticated workflows

Security

Store API keys in environment variables
Never commit cookie files with sensitive credentials to version control
Rotate proxy configurations periodically

API Reference

NotteClient.Session

Method	Description
`__enter__()`	Start session
`__exit__()`	Stop session
`execute()`	Execute browser action
`observe()`	Get page elements
`scrape()`	Extract page data
`status()`	Get session status
`viewer()`	Open visual viewer
`cdp_url()`	Get CDP connection URL

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py

Sources: packages/notte-sdk/README.md

Actions and Browser Controls

Related topics: Browser Sessions, Agent Core System

Overview

Actions and Browser Controls form the core interaction layer of the Notte system, enabling AI agents to navigate, interact with, and extract data from web pages. The action system provides a structured, type-safe mechanism for executing browser operations through a unified API.

Actions in Notte are classified into two primary categories:

Browser Actions - High-level navigation and tab management operations
Interaction Actions - Element-level interactions such as clicking, filling forms, and solving captchas

Sources: packages/notte-core/src/notte_core/actions/actions.py:1-100

Action Hierarchy

Notte implements a class-based action system using Python's type system for validation and safety. All actions inherit from base classes that define common behavior.

graph TD
    A[BaseAction] --> B[BrowserAction]
    A --> C[InteractionAction]
    
    B --> D[GotoAction]
    B --> E[GotoNewTabAction]
    B --> F[CloseTabAction]
    
    C --> G[ClickAction]
    C --> H[FillAction]
    C --> I[ScrapeAction]
    C --> J[CaptchaSolveAction]
    C --> K[FormFillAction]

Base Classes

Class	Purpose	Key Properties
`BaseAction`	Abstract base for all actions	`type`, `description`, `param`
`BrowserAction`	Navigation and tab operations	`execution_message()`
`InteractionAction`	Element-level interactions	`id`, `text_label`

Sources: packages/notte-core/src/notte_core/actions/actions.py:1-200

Browser Actions

Browser Actions manage page navigation and tab lifecycle. These actions operate at the browser level rather than the page element level.

GotoAction

Navigates the current tab to a specified URL.

session.execute(type="goto", url="https://www.example.com")

Property	Type	Description
`type`	`Literal["goto"]`	Action type identifier
`url`	`str`	Target URL to navigate to
`param`	`ActionParameter`	Parameter definition for LLM tooling

Sources: packages/notte-core/src/notte_core/actions/actions.py:50-80

GotoNewTabAction

Opens a URL in a new browser tab. The action returns immediately without waiting for navigation completion.

session.execute(type="goto_new_tab", url="https://www.example.com")

Property	Type	Description
`type`	`Literal["goto_new_tab"]`	Action type identifier
`url`	`str`	Target URL to open in new tab

Sources: packages/notte-core/src/notte_core/actions/actions.py:82-115

CloseTabAction

Closes the currently active browser tab.

session.execute(type="close_tab")

Sources: packages/notte-core/src/notte_core/actions/actions.py:117-140

Interaction Actions

Interaction Actions target specific page elements, identified by the element IDs returned from page observation. The agent must therefore observe the page first to obtain valid identifiers.

ClickAction

Simulates a mouse click on a page element.

session.execute(type="click", id="submit-button")
session.execute(type="click", id="B1", text_label="Submit Form")

Property	Type	Description
`type`	`Literal["click"]`	Action type identifier
`id`	`str`	Element identifier from page observation
`text_label`	`str | None`	Optional text label for logging

Sources: packages/notte-core/src/notte_core/actions/actions.py:150-180

FillAction

Fills an input field with a specified value. By default, the field is cleared before filling.

session.execute(type="fill", id="email-input", value="user@example.com")
session.execute(type="fill", id="name-input", value="John Doe", clear_before_fill=False)

Property	Type	Default	Description
`id`	`str`	-	Element identifier
`value`	`str | ValueWithPlaceholder`	-	Value to fill
`clear_before_fill`	`bool`	`True`	Clear field before filling
`text_label`	`str | None`	`None`	Descriptive label

The ValueWithPlaceholder type allows for complex fill values with embedded placeholders that may be resolved dynamically.

Sources: packages/notte-core/src/notte_core/actions/actions.py:182-230

FormFillAction

Supports batch filling of multiple form fields in a single action, improving efficiency for multi-field forms.

form_values = {
    "address1": "123 Main St",
    "city": "San Francisco",
    "state": "CA",
}
session.execute(type="form_fill", value=form_values)

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py:50-70

CaptchaSolveAction

Attempts to solve captcha challenges detected on the page.

session.execute(type="captcha_solve", captcha_type="recaptcha")

Supported Captcha Types:

recaptcha - Google reCAPTCHA
hcaptcha - hCaptcha
Image verification challenges

Critical Rule: Agents must never click on captcha elements directly. When any captcha is detected, the captcha_solve action must be used exclusively.

Sources: packages/notte-agent/src/notte_agent/gufo/system.md:40-60

ScrapeAction

Extracts structured data from the current page based on natural language instructions.

session.execute(
    type="scrape",
    instructions="Extract the search results from the Google search page"
)

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py:80-90

Action Execution Flow

The following diagram illustrates how actions flow through the Notte system from agent decision to browser execution:

sequenceDiagram
    participant Agent
    participant SDK as Notte SDK
    participant API as Notte API
    participant Browser as Browser Engine
    
    Agent->>SDK: session.execute(type="click", id="B1")
    SDK->>API: POST /sessions/{id}/execute
    API->>Browser: playwright.click("#B1")
    Browser-->>API: Success/Failure
    API-->>SDK: ExecutionResult
    SDK-->>Agent: ExecutionResult
    
    Note over Agent,Browser: Action parameters validated<br/>at SDK layer
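
The note in the diagram says parameters are validated at the SDK layer before a request is sent. A minimal sketch of that idea follows. The `REQUIRED_PARAMS` table and `validate_action` helper are hypothetical and inferred only from the examples in this section; they are not the SDK's actual schema or API.

```python
from typing import Any

# Hypothetical table of required parameters per action type, inferred from
# the examples in this section. It is not the SDK's real schema.
REQUIRED_PARAMS: dict[str, set[str]] = {
    "goto_new_tab": {"url"},
    "close_tab": set(),
    "click": {"id"},
    "fill": {"id", "value"},
    "form_fill": {"value"},
    "scrape": {"instructions"},
}


def validate_action(**params: Any) -> dict[str, Any]:
    """Check an action payload locally before it would be sent to the API."""
    action_type = params.get("type")
    if action_type not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action type: {action_type!r}")
    missing = REQUIRED_PARAMS[action_type] - set(params)
    if missing:
        raise ValueError(f"{action_type} is missing: {sorted(missing)}")
    return params
```

Rejecting a malformed payload locally avoids a round trip to the API for errors that can be detected without a browser.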

Session-Level Execution

Actions are executed within a Session context that maintains browser state and provides observation capabilities:

from notte_sdk import NotteClient

client = NotteClient(api_key="your-api-key")

with client.Session() as session:
    # Execute actions
    session.execute(type="goto", url="https://example.com")

    # Observe available actions
    obs = session.observe()
    print(obs.space.description)

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-60

Perception Types

When observing available actions, the system supports different perception depths:

Perception Type	Description	Use Case
`fast`	Simple page perception	Quick action queries
`deep`	LLM-powered element formatting	Rich, structured action space

# Fast perception for quick queries
obs = session.observe(perception_type="fast")

# Deep perception for comprehensive action space
obs = session.observe(perception_type="deep")

The instructions parameter can narrow the action space to a specific intent:

actions = session.observe(instructions="Fill the email input")

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:40-70

Action Registry

The ActionRegistry manages available actions and generates JSON schemas for agent tooling:

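from typing import Any

# BaseTool comes from the Notte packages; its import path is not shown in this excerpt.
class ActionRegistry:
    def __init__(self, tools: list[BaseTool] | None = None) -> None:
        self.tools = tools or []

    def get_schema(self, action_cls: type) -> dict[str, Any]:
        # Generates JSON schema for action class
        ...  # body omitted in this excerpt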

The registry processes each tool to extract action descriptions and parameter schemas, enabling dynamic action space generation for agents.

Sources: packages/notte-agent/src/notte_agent/falco/prompt.py:1-50

Element ID Format

Interactive elements are referenced using a structured ID format that encodes element type and position:

Prefix	Element Type	Example
`I`	Input fields (textbox, select, checkbox)	`I1`, `I2`
`B`	Buttons	`B1`, `B2`
`L`	Links	`L1`, `L2`
`F`	Figures and images	`F1`
`O`	Options in select elements	`O1`
`M`	Miscellaneous (modals, dialogs)	`M1`

Important: Element IDs can and will change at each page observation. Agents must not cache or assume ID persistence across steps.

Sources: packages/notte-agent/src/notte_agent/gufo/system.md:10-30
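
The prefix table above can be turned into a small parser that rejects anything that does not look like an observed element ID. This is an illustrative sketch: `parse_element_id` and the type names in `ELEMENT_PREFIXES` are hypothetical helpers, not part of the Notte packages.

```python
import re

# Prefix-to-type mapping taken from the table above; the type names on the
# right are labels chosen for this sketch.
ELEMENT_PREFIXES = {
    "I": "input",
    "B": "button",
    "L": "link",
    "F": "figure",
    "O": "option",
    "M": "misc",
}

_ID_PATTERN = re.compile(r"([IBLFOM])(\d+)")


def parse_element_id(element_id: str) -> tuple[str, int]:
    """Split an ID such as 'B1' into its element type and numeric index."""
    match = _ID_PATTERN.fullmatch(element_id)
    if match is None:
        raise ValueError(f"not a valid element ID: {element_id!r}")
    prefix, index = match.groups()
    return ELEMENT_PREFIXES[prefix], int(index)
```

A check like this catches stale or hand-written identifiers early, but it cannot tell whether an ID is still valid for the current page; only a fresh observation can.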

Action Response Format

All action executions return an ExecutionResult containing:

Field	Type	Description
`success`	`bool`	Whether the action succeeded
`message`	`str`	Human-readable result description
`error`	`str | None`	Error details if failed

When raise_on_failure is set, execution will raise an exception on failure; otherwise, the result is returned with error information included.
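
The two failure modes can be sketched with a stand-in result type. The `ExecutionResult` dataclass below only mirrors the three fields in the table, and `ActionFailedError` and `handle_result` are hypothetical names for this illustration; the exception the SDK actually raises is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionResult:
    """Stand-in with the three fields listed in the table above."""
    success: bool
    message: str
    error: Optional[str] = None


class ActionFailedError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def handle_result(result: ExecutionResult, raise_on_failure: bool) -> ExecutionResult:
    """Raise on a failed result when requested, otherwise hand it back."""
    if not result.success and raise_on_failure:
        raise ActionFailedError(result.error or result.message)
    return result
```

Raising suits scripts that should stop at the first failed step; returning the result suits agents that inspect `error` and retry with a different action.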

Best Practices

Always observe before acting - Use session.observe() to get current element IDs before executing interaction actions
Handle captchas properly - Use captcha_solve when captchas are detected; never click captcha elements directly
Validate IDs per step - Element IDs change between observations; never assume ID stability
Use batch operations - Prefer FormFillAction over multiple FillAction calls for multi-field forms
Set appropriate perception types - Use fast for quick checks, deep when comprehensive element understanding is needed

Sources: packages/notte-core/src/notte_core/actions/actions.py:1-100
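
The first and third practices amount to looking up the element ID from a fresh observation immediately before each interaction. The sketch below uses a `FakeSession` stand-in so it runs without a browser. The label-to-ID mapping returned by its `observe()` is a hypothetical shape chosen for the example, not the structure of a real Notte observation.

```python
class FakeSession:
    """Stand-in session whose element IDs shift on every observation."""

    def __init__(self) -> None:
        self._observations = 0
        self.executed = []

    def observe(self) -> dict:
        # Hypothetical shape: a mapping from visible label to element ID.
        self._observations += 1
        return {"Submit": f"B{self._observations}"}

    def execute(self, **action) -> None:
        # Record the action instead of driving a real browser.
        self.executed.append(action)


def click_by_label(session, label: str) -> str:
    """Observe first, then click using the ID from that fresh observation."""
    ids = session.observe()
    element_id = ids[label]
    session.execute(type="click", id=element_id)
    return element_id
```

Because the helper never stores an ID between calls, a second click on the same label picks up whatever ID the new observation assigns.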

Vaults and Credential Management

Related topics: Agent Personas, Browser Sessions

Overview

Notte's Vaults and Credential Management system provides a secure, centralized mechanism for storing, retrieving, and automatically injecting website credentials during browser automation tasks. The system eliminates the need for agents to manually handle authentication credentials, reducing security risks and improving automation reliability.

The vault architecture spans multiple layers:

Layer	Package	Purpose
Core Types	`notte-core`	Defines credential data models and base classes
Browser Integration	`notte-browser`	Handles credential injection during page interactions
SDK API	`notte-sdk`	Provides REST API endpoints for vault operations
External Integrations	`notte-integrations`	Supports third-party vault solutions (e.g., HashiCorp Vault)

Architecture

graph TD
    A[NotteClient] --> B[VaultsClient]
    B --> C[Cloud Vault API]
    C --> D[NotteVault]
    
    E[Agent Session] --> F[Browser Session]
    F --> G[NotteBrowser Vault]
    G --> D
    
    H[Persona] --> I[Vault Association]
    I --> D
    
    J[HashiCorpVault] --> K[External Vault Server]
    
    D -.-> L[Credential Replacement]
    L --> F

Credential Types

Base Credential Model

The credential system is built on a flexible data model defined in notte-core. Credentials are identified by URL and support multiple authentication fields:

# Simplified from packages/notte-core/src/notte_core/credentials/types.py
class Credential(BaseModel):
    url: str                    # Target website URL
    username: str | None = None # Username or email
    email: str | None = None    # Email address
    password: str | None = None # Password
    totp_secret: str | None = None # TOTP 2FA secret
    notes: str | None = None    # Additional notes
    metadata: dict | None = None # Custom metadata
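
Because credentials are identified by URL, a lookup needs some rule for deciding when two URLs refer to the same site. The sketch below reduces a URL to a bare hostname. This matching rule and the `credential_key` helper are assumptions made for illustration; how Notte actually matches URLs is not described in this manual.

```python
from urllib.parse import urlparse


def credential_key(url: str) -> str:
    """Reduce a URL to a lowercase hostname without a leading 'www.'."""
    # Accept bare hosts such as "github.com" by adding a scheme first.
    parsed = urlparse(url if "://" in url else f"https://{url}")
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


# Hypothetical in-memory store keyed by the normalized hostname.
store = {credential_key("https://github.com"): {"email": "user@example.com"}}
found = store.get(credential_key("https://www.github.com/login"))
```

With this rule, a credential saved for `https://github.com` is also found when the agent lands on `https://www.github.com/login`.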

Credential Actions

When an agent needs to authenticate, the system creates a CredentialAction that describes the required credential field:

# From packages/notte-core/src/notte_core/credentials/types.py
class CredentialAction(BaseModel):
    url: str                          # Target URL
    action: Literal["fill", "verify"] # Operation type
    field: Literal["username", "password", "email", "totp"] # Required field
    locator: LocatorAttributes | None = None # DOM element context

Vault Implementation

Browser Vault (`notte-browser`)

The browser-side vault (NotteBrowserVault) manages credential lifecycle within browser sessions:

# From packages/notte-browser/src/notte_browser/vault.py
class NotteBrowserVault:
    def __init__(
        self,
        vault_id: str | None,
        api_key: str | None,
        server_url: str | None,
        verbose: bool = False,
    ) -> None:
        self.vault_id = vault_id
        self.vault: NotteVault | None = None
        self._api_key = api_key
        self._server_url = server_url

Key Methods:

Method	Purpose	Source
`request_credentials()`	Retrieves credentials from vault for current URL	vault.py:1-100
`replace_credentials()`	Injects credentials into action attributes	vault.py:1-100
`add_credentials()`	Stores new credentials in vault	vault.py:1-100
`generate_password()`	Creates secure random passwords	vault.py:1-100
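
As an illustration of what secure random password generation can look like, the sketch below uses Python's `secrets` module. It is not the implementation behind `generate_password()` in `vault.py`; the default length and the character-class policy are assumptions.

```python
import secrets
import string


def generate_password(length: int = 20) -> str:
    """Build a random password containing every character class at least once."""
    classes = [
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    ]
    if length < len(classes):
        raise ValueError("length is too short to include every character class")
    alphabet = "".join(classes)
    # One guaranteed character per class, then fill the rest from the full alphabet.
    chars = [secrets.choice(chars_in_class) for chars_in_class in classes]
    chars += [secrets.choice(alphabet) for _ in range(length - len(classes))]
    # Shuffle with a CSPRNG so the guaranteed characters are not always first.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)
```

The `secrets` module draws from the operating system's cryptographic random source, which is why it is preferred over `random` for anything credential-related.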

Credential Replacement Flow

When executing form-fill actions, the vault automatically replaces credential placeholders:

sequenceDiagram
    participant Agent
    participant Session
    participant Vault
    participant Browser
    
    Agent->>Session: Execute FormFillAction
    Session->>Vault: request_credentials(url)
    Vault->>Session: Return Credential
    Session->>Browser: Replace placeholders with actual values
    Browser->>Website: Submit filled form

The replacement logic in session.py demonstrates this:

# From packages/notte-browser/src/notte_browser/session.py
if locator is not None:
    attrs = LocatorAttributes(
        type=await locator.get_attribute("type"),
        autocomplete=await locator.get_attribute("autocomplete"),
        outerHTML=await locator.evaluate("el => el.outerHTML"),
    )
return await self.vault.replace_credentials(action, attrs, snapshot)
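
The replacement step itself can be pictured as a substitution over the form values. In this sketch the placeholder tokens (`{{email}}`, `{{password}}`) and the `replace_placeholders` helper are hypothetical; the tokens Notte uses and the logic inside `replace_credentials()` are not shown in this manual.

```python
# Hypothetical placeholder tokens mapped to credential field names.
PLACEHOLDERS = {
    "{{email}}": "email",
    "{{password}}": "password",
}


def replace_placeholders(form_values: dict, credential: dict) -> dict:
    """Return a copy of form_values with placeholder tokens resolved."""
    resolved = {}
    for field, value in form_values.items():
        key = PLACEHOLDERS.get(value)
        if key is None:
            # Ordinary values pass through unchanged.
            resolved[field] = value
        elif key in credential:
            resolved[field] = credential[key]
        else:
            raise KeyError(f"credential has no {key!r} field for form field {field!r}")
    return resolved
```

Keeping the real values out of the action until this final step means the agent and its prompts only ever see the placeholder tokens.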

SDK API Endpoints

VaultsClient

The VaultsClient provides programmatic access to vault operations:

# From packages/notte-sdk/src/notte_sdk/endpoints/vaults.py
class VaultsClient:
    CREATE_VAULT = "vaults"
    
    @track_usage("cloud.vault.create")
    def create(self, **data: Unpack[VaultCreateRequestDict]) -> Vault:
        """Create a new vault"""
        
    def get(self, vault_id: str) -> str:
        """Retrieve vault by ID"""
        
    @track_usage("cloud.vault.credentials.add")
    def add_or_update_credentials(
        self, vault_id: str, **data: Unpack[AddCredentialsRequestDict]
    ) -> AddCredentialsResponse:
        """Add or update credentials in vault"""

API Request Models

Model	Fields	Purpose
`VaultCreateRequest`	`name`, `description`	Create new vault
`AddCredentialsRequest`	`url`, `username`, `email`, `password`, `totp_secret`	Add credential entry
`AddCredentialsResponse`	`id`, `url`, `created_at`	Response confirmation

Persona Integration

Personas can be associated with vaults for persistent identity management:

# From packages/notte-sdk/src/notte_sdk/endpoints/personas.py
class NottePersona:
    def _get_vault(self) -> NotteVault | None:
        """Get vault associated with this persona"""
        if self.info.vault_id is None:
            return None
        return NotteVault(self.info.vault_id, _client=self.vault_client)
    
    def add_credentials(self, url: str) -> None:
        """Add credentials to the persona's vault"""
        vault = self._get_vault()
        password = vault.generate_password()
        vault.add_credentials(url, email=self.info.email, password=password)

External Vault Integration

HashiCorp Vault

Notte supports integration with HashiCorp Vault for enterprise credential management:

# From packages/notte-integrations/src/notte_integrations/credentials/README.md
from notte_agent.main import Agent
from notte_integrations.credentials.hashicorp.vault import HashiCorpVault
import os

vault = HashiCorpVault(
    url=os.getenv("VAULT_URL"),
    token=os.getenv("VAULT_DEV_ROOT_TOKEN_ID")
)

vault.add_credentials(
    url="https://x.com",
    username=os.getenv("TWITTER_USERNAME"),
    password=os.getenv("TWITTER_PASSWORD")
)

agent = Agent(vault=vault)

Setup Requirements:

Start HashiCorp Vault server:

cd packages/notte-integrations/src/notte_integrations/credentials/hashicorp
docker-compose --env-file ../../../../../.env up

Configure environment variables:

VAULT_URL=http://0.0.0.0:8200
VAULT_DEV_ROOT_TOKEN_ID=<your-token>

Error Handling

The vault system handles credential-related errors gracefully:

# From packages/notte-core/src/notte_core/errors/processing.py
class VaultCredentialError(NotteBaseError):
    def __init__(self, error_message: str) -> None:
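        dev_message = "Unexpected error while requesting credentials from vault"
        # agent_message and user_message are not defined in this excerpt;
        # they must be assigned before the call below.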
        super().__init__(
            agent_message=agent_message,
            user_message=user_message,
            dev_message=dev_message,
        )

Common Error Scenarios:

Error	Cause	Resolution
`CredentialNotFoundError`	No credentials stored for URL	Add credentials via SDK or console
`VaultCredentialError`	Vault unavailable or API failure	Check API key and network connectivity
`FieldMismatchError`	Vault lacks required credential field	Ensure credential has all required fields

Usage Patterns

Basic Agent with Vault

from notte_sdk import NotteClient

client = NotteClient()

with client.Vault() as vault:
    vault.add_credentials(
        url="https://x.com",
        email="user@example.com",
        password="secure-password"
    )
    
    with client.Session() as session:
        agent = client.Agent(session=session, vault=vault, max_steps=10)
        response = agent.run(
            task="go to twitter; login and go to my messages",
        )

Auth Vault Agent Example

The auth-vault-agent example demonstrates vault-backed authentication in a visible Chrome session:

# From examples/auth-vault-agent/README.md
with client.Vault() as vault:
    with client.Session(browser_type="chrome", open_viewer=True) as session:
        agent = client.Agent(session=session, vault=vault, max_steps=10)
        response = agent.run(
            task="go to twitter; login and go to my messages",
        )

Security Considerations

Credential Encryption: Vaults store credentials encrypted on Notte's servers
API Key Authentication: All vault operations require valid API key
Persona Isolation: Each persona has its own vault for identity separation
Session Cookies: Cookies are automatically saved and restored with vault credentials

Sources: packages/notte-sdk/src/notte_sdk/endpoints/sessions.py:1-100

Summary

Notte's Vaults and Credential Management system provides:

Secure Storage: Centralized credential repository with encryption
Automatic Injection: Seamless credential replacement during browser automation
Multi-Provider Support: Built-in Notte vaults and HashiCorp Vault integration
Persona Association: Per-identity credential isolation
API-First Design: Full programmatic control via SDK

Source: https://github.com/nottelabs/notte / Human Manual

Agent Personas

Related topics: Vaults and Credential Management, Browser Sessions

Agent Personas provide digital identities that can be attached to Agent instances, enabling automated browser interactions with unique credentials, phone numbers, and automated 2FA handling.

Overview

Agent Personas are specialized identity objects that provide your AI agents with realistic digital identities for web automation tasks. They solve the common challenge of websites requiring phone number verification, email confirmation, or 2FA codes by providing automated handling of these verification steps.

Sources: packages/notte-sdk/README.md

Key Features

Feature	Description
Email Addresses	Unique email addresses associated with the persona
Phone Numbers	Optional phone numbers for SMS verification (configurable)
Automated 2FA	Automatic handling of two-factor authentication codes
Identity Persistence	Persona persists across session boundaries

Architecture

graph TD
    A[NotteClient] --> B[Persona]
    B --> C[Email Identity]
    B --> D[Phone Identity]
    B --> E[2FA Handler]
    F[Agent] --> B
    G[Session] --> F
    H[Vault] --> E

Usage Pattern

Personas are used as context managers alongside Sessions to provide identity context for agent operations:

from notte_sdk import NotteClient

client = NotteClient()

with client.Persona(create_phone_number=False) as persona:
    with client.Session(browser_type="chrome", open_viewer=True) as session:
        agent = client.Agent(session=session, persona=persona, max_steps=15)
        response = agent.run(
            task="Open the Google form and RSVP yes with your name",
            url="https://forms.google.com/your-form-url",
        )
print(response.answer)

Sources: packages/notte-sdk/README.md

Configuration Options

Persona Initialization Parameters

Parameter	Type	Default	Description
`create_phone_number`	bool	`True`	Whether to provision a phone number for this persona

Agent Configuration with Persona

Parameter	Type	Description
`session`	Session	Active browser session
`persona`	Persona	Digital identity to attach
`max_steps`	int	Maximum steps for task completion

Credential Management

Personas integrate with Notte's credential vault system for secure storage and retrieval of authentication credentials:

sequenceDiagram
    participant A as Agent
    participant P as Persona
    participant V as Vault
    participant W as Website
    
    A->>P: Request 2FA code
    P->>V: Retrieve credential
    V-->>P: Credential data
    P->>W: Submit 2FA code

Sources: packages/notte-core/src/notte_core/credentials/base.py
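
When the stored credential carries a TOTP secret, the code submitted in the last step of the diagram can be derived locally. The sketch below implements the standard TOTP algorithm (RFC 6238, HMAC-SHA1) with the standard library. Whether Notte computes codes this way internally is an assumption, and the `totp` helper is not part of the SDK.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, digits=6, step=30):
    """Compute an RFC 6238 TOTP code from a base32-encoded secret."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if for_time is None else for_time
    # The moving factor is the number of whole time steps since the epoch.
    counter = int(now // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Codes are only valid for one time step (30 seconds by default), so the code should be computed immediately before it is submitted.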

API Endpoints

The Personas functionality is exposed through the SDK's endpoint interface:

Endpoint	Purpose
`personas.create()`	Create a new persona with identity credentials
`personas.list()`	List available personas
`personas.get()`	Retrieve persona details
`personas.delete()`	Remove a persona

Sources: packages/notte-sdk/src/notte_sdk/endpoints/personas.py

Use Cases

Form Filling with Identity

When completing web forms that require personal information:

with client.Persona() as persona:
    with client.Session() as session:
        agent = client.Agent(session=session, persona=persona)
        agent.run(
            task="Complete the registration form with your details",
            url="https://example.com/register"
        )

2FA-Protected Actions

For websites requiring two-factor authentication:

with client.Persona(create_phone_number=True) as persona:
    with client.Session() as session:
        agent = client.Agent(session=session, persona=persona, max_steps=20)
        agent.run(
            task="Login to your account and download your data",
            url="https://secure-site.com/dashboard"
        )

Anonymous Browsing

When phone verification is not needed:

with client.Persona(create_phone_number=False) as persona:
    agent = client.Agent(persona=persona)
    # Persona provides email without phone number

Best Practices

Resource Management: Always use Personas within context managers (with statement) to ensure proper cleanup
Phone Number Provisioning: Disable phone number creation (create_phone_number=False) when only email identity is needed to reduce costs
Session Coordination: Pair Persona usage with Session context for proper browser automation
Step Limits: Set appropriate max_steps values when using Personas with complex multi-step workflows

Error Handling

When persona-related operations fail, the SDK provides structured error responses:

import logging

logger = logging.getLogger(__name__)

try:
    with client.Persona() as persona:
        ...  # persona-backed operations go here
except Exception as e:
    # Persona errors are handled through NotteBaseError hierarchy
    logger.error(f"Persona operation failed: {e}")

Sources: packages/notte-core/src/notte_core/errors/processing.py

Sources: packages/notte-sdk/README.md

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 13 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: v1.8.8

Severity: medium
Finding: Installation risk is backed by a source signal: v1.8.8. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.8

2. Configuration risk: v1.8.13

Severity: medium
Finding: Configuration risk is backed by a source signal: v1.8.13. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.13

3. Configuration risk: v1.8.14

Severity: medium
Finding: Configuration risk is backed by a source signal: v1.8.14. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.14

4. Configuration risk: v1.8.15

Severity: medium
Finding: Configuration risk is backed by a source signal: v1.8.15. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.15

5. Configuration risk: v1.8.6

Severity: medium
Finding: Configuration risk is backed by a source signal: v1.8.6. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.6

6. Configuration risk: v1.8.9

Severity: medium
Finding: Configuration risk is backed by a source signal: v1.8.9. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.9

7. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | github_repo:900152988 | https://github.com/nottelabs/notte | README/documentation is current enough for a first validation pass.

8. Project risk: v1.8.7

Severity: medium
Finding: Project risk is backed by a source signal: v1.8.7. Treat it as a review item until the current version is checked.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/nottelabs/notte/releases/tag/v1.8.7

9. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:900152988 | https://github.com/nottelabs/notte | last_activity_observed missing

10. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | github_repo:900152988 | https://github.com/nottelabs/notte | no_demo; severity=medium

11. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.scoring_risks | github_repo:900152988 | https://github.com/nottelabs/notte | no_demo; severity=medium

12. Maintenance risk: issue_or_pr_quality=unknown

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:900152988 | https://github.com/nottelabs/notte | issue_or_pr_quality=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using notte with real data or production workflows.

v1.8.16 - github / github_release
v1.8.15 - github / github_release
v1.8.14 - github / github_release
v1.8.13 - github / github_release
v1.8.11 - github / github_release
v1.8.10 - github / github_release
v1.8.9 - github / github_release
v1.8.8 - github / github_release
v1.8.7 - github / github_release
v1.8.6 - github / github_release
README/documentation is current enough for a first validation pass. - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
