Doramagic Project Pack · Human Manual

browserable

Open source and self-hostable browser automation library for AI agents

Overview, Architecture & Getting Started

Related topics: AI Agents, Prompts & LLM Integration, REST API, JavaScript SDK & Custom Functions, Deployment, Configuration & Troubleshooting

What is Browserable

Browserable is an open-source platform that lets AI agents drive a real web browser to complete user tasks. The repository bundles the agents, an orchestration layer, a browser automation service, a JavaScript SDK, and a desktop-style Admin UI into a single monorepo that can be launched with the npx browserable CLI (README context referenced in community issue #8).

The defining capability is the BROWSER_AGENT, declared in tasks/agents/browserable.js, which exposes four primitives — open_new_tab, read_tab, act_on_tab, and extract_from_tab — and is documented to "figure out how to procure a remote browser session + perform tasks like clicking, typing, etc on it." Higher-level agents (such as a Google Sheets agent) are explicitly preferred over the browser agent when a targeted tool is available.

System Architecture

Browserable is a polyglot monorepo. The Admin UI ships as an Electron-forge desktop application (see make scripts in ui/package.json), the SDK is published as browserable-js on npm (sdk/browserable-js/package.json), and the runtime is split across multiple long-running services that communicate over HTTP and Postgres-backed message logs.

flowchart LR
    User["User / Client App"] --> SDK["browserable-js SDK"]
    User --> CLI["npx browserable CLI"]
    User --> UI["Admin UI (Electron)"]
    CLI --> Tasks["tasks service<br/>(Jarvis orchestrator + agents)"]
    UI --> Tasks
    SDK --> Tasks
    Tasks --> DB[("Postgres<br/>message_logs")]
    Tasks --> Browser["browser service<br/>(Playwright session)"]
    Tasks --> LLM["OpenAI-compatible LLM<br/>(gemini, gpt-4o, claude, deepseek, qwen)"]
    Browser -->|screenshots / DOM| Tasks
    LLM -->|tool calls| Tasks

Core Components

| Component | Source | Responsibility |
| --- | --- | --- |
| Jarvis orchestrator | tasks/agents/jarvis.js | Splits a user request into a flow of sub-tasks, schedules node loopers, and aggregates results into a structured outputGenerated object via richOutputPrompt.js |
| BaseAgent | tasks/agents/base.js | Provides shared lifecycle helpers (_action_end, error reporting) for every concrete agent |
| BROWSER_AGENT | tasks/agents/browserable.js | The browser-driving agent; emits agent, user, and debug log segments, and persists screenshots after each chunk |
| Action / extract prompts | actionPrompts.js, extractPrompts.js | JSON-only prompts that ask the LLM to choose doAction, skipSection, or actionCompleted |
| Admin UI shell | ui/src/containers/FlowContainer.jsx, ui/src/routes/NotFound.jsx | Renders the live run timeline, message logs, screenshots, and code/markdown payloads |
| JavaScript SDK | sdk/browserable-js/src/types.ts | Typed client (BrowserableConfig, Task, TaskRunStatus, TaskRunGifResult) that wraps the REST API |

Agent Execution Loop

The browser agent calls a shared callOpenAICompatibleLLMWithRetry helper with a fallback chain of gemini-2.0-flash, deepseek-chat, gpt-4o-mini, claude-3-5-haiku, and qwen-plus (see tasks/agents/browserable.js). For each chunk of a long page it: scrolls, waits for a settled DOM, takes a screenshot, asks the LLM to extract structured content, and recursively calls textExtractHelper until the schema is satisfied or the run is no longer active. A separate refine step (buildRefineExtractedContentPrompt) consolidates the chunked extractions before returning.
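The per-chunk loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in tasks/agents/browserable.js: `extractChunked`, `ChunkDeps`, and every callback name are hypothetical, the page, LLM, and run-state operations are injected as plain functions, and a simple loop stands in for the recursion the real helper uses.

```typescript
// Hypothetical shapes for illustration; none of these names come from the repository.
type Extraction = Record<string, unknown>;

interface ChunkDeps {
  totalChunks: number;
  // Scroll to the chunk, wait for the DOM to settle, and capture it.
  captureChunk: (index: number) => Promise<string>;
  // Ask the LLM to extract structured content from one chunk.
  extract: (chunk: string, soFar: Extraction[]) => Promise<Extraction>;
  // True once the collected extractions satisfy the requested schema.
  isSatisfied: (soFar: Extraction[]) => boolean;
  // Stand-in for the "is the run still active" check.
  isRunActive: () => Promise<boolean>;
  // Consolidates the per-chunk extractions into one result.
  refine: (soFar: Extraction[]) => Promise<Extraction>;
}

interface ChunkResult {
  completed: boolean;
  output?: Extraction;
  message?: string;
}

async function extractChunked(deps: ChunkDeps): Promise<ChunkResult> {
  const collected: Extraction[] = [];
  for (let i = 0; i < deps.totalChunks; i++) {
    // Stop early if the run was cancelled between chunks.
    if (!(await deps.isRunActive())) {
      return { completed: false, message: "Run is no longer active" };
    }
    const chunk = await deps.captureChunk(i);
    collected.push(await deps.extract(chunk, collected));
    // Stop scrolling as soon as the schema is satisfied.
    if (deps.isSatisfied(collected)) break;
  }
  return { completed: true, output: await deps.refine(collected) };
}
```

Injecting the dependencies keeps the control flow (cancel check, capture, extract, early exit, refine) visible without tying the sketch to Playwright or a specific LLM client.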

Getting Started

Browserable is distributed primarily through the npx browserable CLI. Per community issue #8, the CLI supports a --help flag and a down subcommand to tear the stack down; the canonical npx browserable flow boots the Docker Compose stack defined under deployment/.

A typical first run:

  1. Ensure Docker and a working node/npm are on PATH. (See the troubleshooting note below for the common WSL failure mode.)
  2. Run npx browserable from any directory; the CLI pulls and starts the Admin UI, the tasks service, the browser service, and Postgres.
  3. Open the Admin UI; the dashboard route renders the live run timeline implemented in ui/src/containers/FlowContainer.jsx.
  4. Author a task in natural language — Jarvis breaks it into nodes, dispatches the BROWSER_AGENT for browser work, and writes structured output back to the message log (tasks/agents/jarvis.js).
  5. From a separate Node project, install the SDK:
npm install browserable-js

The SDK exposes the typed surface from sdk/browserable-js/src/types.ts (CreateTaskOptions, TaskRunStatus, WaitForRunOptions) and is the recommended integration path for headless automation.

Common Setup Issues

The community has surfaced three recurring first-run failure modes that are worth documenting up front:

  • "Initial setup is in progress" hang (issue #6). The maintainer confirmed this almost always means the tasks service did not come up healthy. Run docker ps to find the unhealthy container, then use docker exec -it on the browserable container to inspect its logs and identify the cause.
  • /usr/bin/env: 'node --no-warnings': No such file or directory (issue #20). Reported on Ubuntu WSL when node is installed through fnm/nvm shims. The quoted name in the error shows that /usr/bin/env received node --no-warnings as a single program name rather than as a command plus a flag. The workaround noted for this issue is to run npx through node explicitly, e.g. node $(which npx) browserable, or to ensure node is on a stable PATH.
  • LLM provider coverage. Groq and OpenRouter are tracked as roadmap: planned (issue #5); Ollama/local LLMs are roadmap: requests (issue #9). Until first-class support lands, issue #3 documents the manual workaround: replace https://api.openai.com/v1/chat/completions with the provider's OpenAI-compatible endpoint in the source.

See Also

  • Custom Tools / Functions guide (closed in issue #10)
  • Task GIF generation via REST API and JS SDK (issue #13)
  • Local browser support (issue #4)
  • Troubleshooting documentation (issue #15)

Source: https://github.com/browserable/browserable / Human Manual

AI Agents, Prompts & LLM Integration

Related topics: Overview, Architecture & Getting Started, REST API, JavaScript SDK & Custom Functions

Overview

Browserable is an open-source browser automation library for AI agents that currently reaches 90.4% on the Web Voyager benchmarks (Source: README.md). The system is built around a multi-agent architecture in which each agent is responsible for a distinct capability (browser interaction, LLM passthrough, orchestration, research), and every LLM-backed step is driven by structured prompts that produce JSON-shaped tool calls.

The "AI Agents, Prompts & LLM Integration" subsystem therefore covers three concerns:

  1. The agent classes that define what each agent can do and how it terminates.
  2. The prompt library that instructs the underlying LLM at each step (extraction, action selection, routing, output formatting).
  3. The LLM integration layer that talks to OpenAI-compatible endpoints with a model-fallback list per use case.

Agent Architecture

All agents extend a shared BaseAgent defined in tasks/agents/base.js. The base class provides three reusable action handlers:

| Action | Purpose |
| --- | --- |
| error | Logs an irrecoverable error to the user and debug logs and calls errorAtNode (Source: tasks/agents/base.js). |
| end | Writes final output and reasoning markdown to the user log and closes the node as completed (Source: tasks/agents/base.js). |
| getBaseActions | Inherited by all subclasses to register custom actions. |

Four concrete agent implementations ship in the repository:

  • BrowserableAgent (CODE = "BROWSER_AGENT"): the headline browser automation agent. It exposes four high-level actions (open_new_tab, read_tab, act_on_tab, and extract_from_tab) and is described in its system prompt as the agent to use "if the user explicitly asks you to do something on the browser" (Source: tasks/agents/browserable.js). It always confirms results with a read_tab after each act_on_tab.
  • GenerativeAgent (CODE = "GENERATIVE_AGENT"): a thin wrapper that passes a task string to an LLM and returns an output string. The system prompt explicitly notes it is a "simple vanilla dumb agent" intended for trivial text-to-text calls (Source: tasks/agents/generative.js).
  • JarvisAgent: the orchestrator/router that decides which sub-agent handles a row of a data table; its prompts live in tasks/prompts/agents/jarvis/ and are exported through index.js (Source: tasks/prompts/agents/jarvis/index.js).
  • DeepResearchAgent: drives multi-step web research, including SERP processing prompts such as buildProcessSerpsPrompt (Source: tasks/prompts/agents/deepresearch/processSerpsPrompt.js).

flowchart TB
    User[User / SDK Caller] --> Tasks[tasks service]
    Tasks --> Jarvis[JarvisAgent<br/>orchestrator]
    Jarvis -->|route sub-task| Browserable[BrowserableAgent<br/>BROWSER_AGENT]
    Jarvis -->|route sub-task| Gen[GenerativeAgent<br/>GENERATIVE_AGENT]
    Jarvis -->|route sub-task| Research[DeepResearchAgent]
    Browserable --> LLM[(OpenAI-compatible<br/>LLM endpoint)]
    Gen --> LLM
    Jarvis --> LLM
    Research --> LLM
    LLM -->|tool/function call JSON| Agent[Selected Agent]
    Agent -->|updateNodeUserLog<br/>endNode| Tasks

LLM Integration

The LLM layer is OpenAI-compatible and accepts a list of candidate models that are tried in order, enabling graceful fallback. Two example call sites illustrate the pattern:

  • Refining extracted page content uses the cascade ["gemini-2.0-flash", "deepseek-chat", "gpt-4o-mini", "claude-3-5-haiku", "qwen-plus"] with max_attempts: 3 (Source: tasks/agents/browserable.js).
  • Deciding the next Playwright action uses ["gemini-2.0-flash", "deepseek-chat", "claude-3-5-sonnet", "gpt-4o", "qwen-plus"] with the same retry budget (Source: tasks/agents/browserable.js).

Every call carries a metadata object with runId, nodeId, agentCode, usecase, flowId, accountId, and threadId, which the orchestrator uses to attribute logs back to the right node. Callers can short-circuit a long-running run by checking jarvis.isRunActive({ runId, flowId }) between LLM calls; if the run was cancelled, the agent returns early with completed: false and a descriptive message (Source: tasks/agents/browserable.js).
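The ordered model list and retry budget can be pictured with a sketch like the one below. It is an illustration only: `callWithFallback` and `ModelCall` are hypothetical names, the real callOpenAICompatibleLLMWithRetry signature is not shown in this manual, and the sketch assumes the attempt budget applies per model, which the source does not confirm.

```typescript
// Metadata fields as listed above; the interface name itself is made up for this sketch.
interface CallMetadata {
  runId: string;
  nodeId: string;
  agentCode: string;
  usecase: string;
  flowId: string;
  accountId: string;
  threadId: string;
}

// Hypothetical transport: sends one request to one model and returns its text output.
type ModelCall = (model: string, metadata: CallMetadata) => Promise<string>;

async function callWithFallback(
  models: string[],
  maxAttempts: number, // assumption: attempts are counted per model
  metadata: CallMetadata,
  call: ModelCall,
): Promise<{ model: string; output: string }> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return { model, output: await call(model, metadata) };
      } catch (err) {
        lastError = err; // remember the failure and keep trying
      }
    }
  }
  // Every model exhausted its attempts; surface the most recent failure.
  throw lastError;
}
```

Returning the model name alongside the output makes it easy to log which entry in the cascade actually answered.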

Community notes on LLM providers

  • The default deployment targets OpenAI directly; the maintainers have noted that Groq and OpenRouter can be enabled by replacing https://api.openai.com/v1/chat/completions with the provider's OpenAI-compatible URL (Source: issues #3, #5).
  • Local LLM support (Ollama) is on the roadmap as a request (Source: issue #9).

Prompt System

Prompts are colocated with their agents under tasks/prompts/agents/<agent-name>/ and exported as builder functions that return OpenAI-style messages arrays. Major prompt modules include:

  • extractPrompts.js: buildExtractLLMPrompt instructs the LLM to print exact text from a rendered webpage or DOM slice and to emit JSON with a justification field. It is sensitive to whether the input is a text rendering or a raw DOM list (Source: tasks/prompts/agents/browserable/extractPrompts.js).
  • actionPrompts.js: defines the function-calling schema for doAction, skipSection, and actionCompleted. The LLM must emit exactly one of these three JSON shapes, each carrying a reason plus optional Playwright method/args/element (Source: tasks/prompts/agents/browserable/actionPrompts.js).
  • richOutputPrompt.js: assembles the final structured answer for a user. It enforces a hard ceiling of 4000 words for the entire outputGenerated object and reminds the model to honor per-field word limits (Source: tasks/prompts/agents/jarvis/richOutputPrompt.js).
  • datatablePrompts.js: describes how Jarvis decomposes a user request into rows, delegates each row to a sub-agent, and merges results back. It codifies the work_on_subtask_before_deciding action code used when a row's prerequisites are missing (Source: tasks/prompts/agents/jarvis/datatablePrompts.js).
  • processSerpsPrompt.js: directs the deep-research model to extract at most eight unique, dense learnings and three follow-up questions from SERP content (Source: tasks/prompts/agents/deepresearch/processSerpsPrompt.js).
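To make the three-shape contract from actionPrompts.js concrete, here is one way a caller might validate the model's reply. The wire format is an assumption: the `type` discriminator, the field names, and the `parseDecision` helper are all hypothetical, and the real schema in actionPrompts.js may be laid out differently.

```typescript
// Assumed shapes: a "type" discriminator plus a "reason" string on every decision.
type AgentDecision =
  | { type: "doAction"; reason: string; method: string; args: unknown[]; element: string | null }
  | { type: "skipSection"; reason: string }
  | { type: "actionCompleted"; reason: string };

function parseDecision(raw: string): AgentDecision {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("decision must be a JSON object");
  }
  const obj = value as Record<string, unknown>;
  const reason = obj["reason"];
  if (typeof reason !== "string") {
    throw new Error("decision is missing a reason");
  }
  switch (obj["type"]) {
    case "doAction": {
      const method = obj["method"];
      if (typeof method !== "string") {
        throw new Error("doAction requires a Playwright method name");
      }
      // args and element are optional in the description above, so default them.
      const rawArgs = obj["args"];
      const args: unknown[] = Array.isArray(rawArgs) ? rawArgs : [];
      const rawElement = obj["element"];
      const element = typeof rawElement === "string" ? rawElement : null;
      return { type: "doAction", reason, method, args, element };
    }
    case "skipSection":
      return { type: "skipSection", reason };
    case "actionCompleted":
      return { type: "actionCompleted", reason };
    default:
      throw new Error("unknown decision type");
  }
}
```

Modelling the reply as a discriminated union means downstream code can switch on the decision kind and let the compiler check that every case is handled.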

Common Failure Modes and Workarounds

  • /usr/bin/env: 'node --no-warnings': No such file or directory when running npx browserable inside WSL/Ubuntu with a fnm multishell. The shebang's node --no-warnings reaches /usr/bin/env as one unsplit string, so env searches for a binary with that literal name; the workaround is to launch npx from a regular login shell where node resolves to a single path (for example, which node returning /run/user/0/fnm_multishells/.../node confirms the environment quirk) (Source: issue #20).
  • Stuck "Initial setup is in progress" on the admin UI. The maintainers recommend docker ps to check for an unhealthy tasks service, then docker exec -it browserable ... to inspect logs; the frontend is waiting for the backend health check to succeed (Source: issue #6).
  • No Groq / OpenRouter entry in the Admin UI. There is no first-class toggle yet, so users must point the OpenAI chat-completions URL at a compatible endpoint by editing the source (Source: issues #3, #5).

See Also

  • Task Runs & Status Polling: covers TaskRunStatus and the WaitForRunOptions poll loop (Source: sdk/browserable-js/src/types.ts).
  • Custom Tools & Functions: the public guide for registering user-defined tool calls alongside the built-in agent actions (Source: issue #10).
  • Local Browser & Deployment: running the browserable Docker stack and CLI (npx browserable --help, npx browserable down) for local browser support (Source: issues #4, #8).
  • Troubleshooting: covers the Initial setup is in progress symptom and other deployment pitfalls (Source: issue #15).

REST API, JavaScript SDK & Custom Functions

Related topics: Overview, Architecture & Getting Started, AI Agents, Prompts & LLM Integration

Overview

Browserable exposes its browser-automation capabilities through a REST API, a typed JavaScript/TypeScript SDK, and an extension point for custom tools/functions. Together these form the developer-facing surface for programmatically creating tasks, polling run status, retrieving run results, generating task GIFs, and extending the built-in BROWSER_AGENT with user-defined functions.

The project positions itself as open-source and self-hostable. The README directs users to a hosted REST endpoint and a JS SDK guide (README.md). The SDK is published as browserable-js (sdk/browserable-js/package.json) and depends on axios ^1.6.7 for HTTP transport.

A bundled example project at sdk/examples/js-sdk-test (sdk/examples/js-sdk-test/README.md) demonstrates the typical user flow against a local API at http://localhost:2003/api/v1.

REST API Surface

The JavaScript SDK is a thin wrapper over the REST API, so the SDK method list effectively documents the supported endpoints. The SDK methods are typed in sdk/browserable-js/src/types.ts and demonstrated in sdk/browserable-js/README.md.

| SDK Method | Purpose | Returns |
| --- | --- | --- |
| createTask({ task, agent, triggers }) | Submit a new task for the default or specified agent. | { taskId } |
| listTasks({ page, limit }) | List tasks for the authenticated account. | Paginated Task[] |
| getTaskRunStatus(taskId, runId?) | Poll the lifecycle state of a run. | TaskRunStatus |
| getTaskRunResult(taskId, runId?) | Fetch final output of a run. | TaskRunResult |
| getTaskRunGif(taskId, runId) | Retrieve a rendered GIF of the run. | TaskRunGifResult |
| stopRun(taskId, runId?) | Cancel a running task. | API envelope |
| waitForRun(taskId, options?) | Block-poll until status is terminal. | Final status |
| getUserProfile() | Return the authenticated user. | Profile object |
| listBrowsers() | Enumerate registered browser providers. | Browser list |

All responses share a common envelope ApiResponse<T> defined in sdk/browserable-js/src/types.ts, with success: boolean, optional data, optional error, and pagination fields (total, page, limit).
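As a rough picture of that envelope, the sketch below reconstructs the type from the description above rather than copying it from types.ts. The `error` field is assumed to be a plain string, and `unwrap` is a hypothetical convenience helper, not part of the SDK.

```typescript
// Reconstructed from the prose description; the real definition may differ in detail.
interface ApiResponse<T> {
  success: boolean;
  data?: T;
  error?: string; // assumption: the error is a plain message string
  total?: number;
  page?: number;
  limit?: number;
}

// Hypothetical helper: return the payload, or throw with the server's error message.
function unwrap<T>(response: ApiResponse<T>): T {
  if (!response.success || response.data === undefined) {
    throw new Error(response.error ?? "request failed without an error message");
  }
  return response.data;
}
```

A helper like this keeps call sites short: a caller can write `unwrap(response).taskId` instead of checking `success` and `data` at every step.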

sequenceDiagram
    participant App as Caller (SDK / curl)
    participant API as REST API
    participant Tasks as tasks service
    participant Agent as BROWSER_AGENT
    App->>API: POST /tasks (createTask)
    API->>Tasks: enqueue
    Tasks->>Agent: schedule node
    Agent-->>Tasks: status updates
    App->>API: GET /tasks/:id/runs/:runId (getTaskRunStatus)
    App->>API: GET /tasks/:id/runs/:runId/result
    App->>API: GET /tasks/:id/runs/:runId/gif

Task run lifecycle values shown in the type definitions are scheduled | running | completed | error (sdk/browserable-js/src/types.ts). The GIF endpoint mirrors this with pending | completed | error and a url field for the rendered asset.

The example test harness iterates the lifecycle by listing tasks and creating a browser session (sdk/examples/js-sdk-test/README.md), confirming the practical order of calls a developer is expected to make.

JavaScript SDK

The SDK is implemented in TypeScript and built with tsc (sdk/browserable-js/package.json). It exports a Browserable class initialized with an API key and optional baseURL (sdk/browserable-js/README.md). The default base URL points to the hosted service; the example project overrides it to http://localhost:2003/api/v1 (sdk/examples/js-sdk-test/README.md).

import { Browserable } from 'browserable-js';

const browserable = new Browserable({
  apiKey: 'your-api-key',
});

const { data } = await browserable.createTask({
  task: 'Visit example.com and extract all links',
  agent: 'BROWSER_AGENT',
});

createTask accepts an optional agent selector, allowing callers to route a task to a specific agent implementation. The built-in BROWSER_AGENT is defined in tasks/agents/browserable.js with the constant this.CODE = "BROWSER_AGENT".

The SDK ships convenience helpers for long-running runs. waitForRun accepts pollInterval (default 1000 ms) and timeout (default 300000 ms) and an optional onStatusChange callback (sdk/browserable-js/src/types.ts). Combined with stopRun, this lets a caller implement cancellation, progress streaming, and dead-letter handling entirely from JavaScript.
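The polling behaviour can be approximated with the sketch below. It is not the SDK's implementation: `pollUntilTerminal` is a hypothetical stand-in, `getStatus` would wrap a call such as getTaskRunStatus, the defaults mirror the values quoted above, and treating completed and error as the terminal states is an assumption.

```typescript
type RunStatus = "scheduled" | "running" | "completed" | "error";

interface WaitOptions {
  pollInterval?: number; // milliseconds, defaults to 1000
  timeout?: number; // milliseconds, defaults to 300000
  onStatusChange?: (status: RunStatus) => void;
}

async function pollUntilTerminal(
  getStatus: () => Promise<RunStatus>,
  options: WaitOptions = {},
): Promise<RunStatus> {
  const pollInterval = options.pollInterval ?? 1000;
  const timeout = options.timeout ?? 300000;
  const deadline = Date.now() + timeout;
  let previous: RunStatus | undefined;
  for (;;) {
    const status = await getStatus();
    // Notify the caller only when the status actually changes.
    if (status !== previous) {
      options.onStatusChange?.(status);
      previous = status;
    }
    // Assumption: "completed" and "error" are the terminal states.
    if (status === "completed" || status === "error") return status;
    // Give up if the next poll would land after the deadline.
    if (Date.now() + pollInterval > deadline) {
      throw new Error(`run did not finish within ${timeout} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollInterval));
  }
}
```

A caller that wants cancellation on timeout could catch the thrown error and then call stopRun for the same task.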

Custom Functions

Custom tools/functions are the extension point of BROWSER_AGENT. They are reached through the customFunctions and end actions on the base agent (tasks/agents/base.js). The base agent's _action_end writes a user-visible "Agent completed." message along with output and reasoning markdown, and signals node completion via jarvis.endNode(...).

For browser tasks, the BROWSER_AGENT is built on top of a stable set of LLM-driven actions defined in tasks/agents/browserable.js:

  • open_new_tab — opens a URL in a fresh browser session and returns a list of tabs.
  • read_tab — converts a tab's HTML to markdown for the LLM context.
  • act_on_tab — performs a click, type, or other interaction, verified by a vision-capable model.
  • extract_from_tab — runs schema-guided extraction over DOM or text, refining the result through buildRefineExtractedContentPrompt.

The refinement step is the natural place to plug in custom functions: the prompt builder in tasks/prompts/agents/browserable/extractPrompts.js accepts instructions, schema, previouslyExtractedContent, and domElements, which can be populated by user-defined helpers. Action selection itself is driven by tasks/prompts/agents/browserable/actionPrompts.js, which constrains the LLM to emit doAction, skipSection, or actionCompleted JSON — a contract that custom function authors can rely on when registering new callable tools.

Community note: Custom tools/functions V1 is documented as live, and https://docs.browserable.ai/guides/custom-functions is the canonical guide referenced from the issue tracker. Task GIFs are also exposed via both the REST API and JS SDK (see getTaskRunGif) (sdk/browserable-js/src/types.ts).

LLM Provider Configuration

Both the REST API and the SDK ultimately call the same LLM abstraction. The Admin UI exposes the underlying provider list in ui/src/containers/SettingsContainer.jsx, which collects openai, claude, and gemini API keys and stores them under userApiKeys on the account. Browser-side providers (hyperBrowser, steel) are stored under userBrowserApiKeys.

The agent layer also falls back between models on errors. The refinement call in tasks/agents/browserable.js iterates over gemini-2.0-flash, deepseek-chat, gpt-4o-mini, claude-3-5-haiku, and qwen-plus through callOpenAICompatibleLLMWithRetry, which means any provider offering an OpenAI-compatible endpoint (e.g. Groq) can be wired in by changing the upstream URL. The maintainers point to this workaround in the issue tracker for Groq/OpenRouter support.

Community note: Local LLMs (Ollama) and full Groq/OpenRouter support remain open roadmap items. Until they land, the supported configuration path is via the Admin UI providers list, with OpenAI-compatible endpoints supported through code-level URL substitution.

Deployment, Configuration & Troubleshooting

Related topics: Overview, Architecture & Getting Started, REST API, JavaScript SDK & Custom Functions

Browserable is shipped as a self-hostable, Docker-based platform for running AI browser agents. The repository contains three runtime layers that must come up together: the admin UI (React/electron-forge dashboard), the tasks service (the Node.js agent runtime that drives Playwright sessions and calls LLMs), and the data layer (Postgres/Supabase + S3-compatible storage). This page covers the supported deployment paths, the configuration surface you need to know about, and the most common failure modes reported by users on GitHub.

Deployment Paths

Quick Start: `npx browserable`

The README documents npx browserable as the fastest onboarding path. It bootstraps a local Docker Compose stack and opens the admin UI on http://localhost:2001, where you enter your LLM and remote-browser API keys to begin running tasks Source: [README.md:18-32]. This is the recommended path for new users and matches the down and --help commands requested in issue #8.

Manual Docker Compose Deployment

For contributors and self-hosters, the canonical path is:

  1. Install Docker and Docker Compose.
  2. Clone the repository and cd deployment.
  3. Start the dev stack:

     ```bash
     docker-compose -f docker-compose.dev.yml up
     ```

  4. Open the admin dashboard at http://localhost:2001 to set your LLM and remote-browser keys Source: [README.md:34-50].

Under the hood, the deployment folder orchestrates several compose files. The base stack (deployment/docker-compose.dev.yml) brings up the UI, the tasks service, and the agent worker. Optional companion files (deployment/supabase-docker/docker-compose.yml and docker-compose.s3.yml) provide the local Postgres/Supabase backend and an S3-compatible object store used for screenshots and run GIFs.

High-Level Architecture

flowchart LR
    User([Operator]) --> Admin[Admin UI<br/>localhost:2001]
    Admin -->|API keys & config| Tasks[Tasks Service<br/>Node.js agents]
    Tasks -->|Playwright CDP| Browser[(Remote Browser<br/>e.g. BrowserBase)]
    Tasks -->|Chat completions| LLM[(LLM Provider<br/>OpenAI-compatible)]
    Tasks -->|Run state & logs| DB[(Postgres / Supabase)]
    Tasks -->|Screenshots & GIFs| S3[(S3-compatible store)]

Configuration

API Keys

All sensitive credentials are entered through the admin UI after the stack is up; they are persisted in the database rather than only in .env files. The two credential classes you must provide are:

  • LLM API key — used by the callOpenAICompatibleLLMWithRetry helper, which fans out across a default model roster such as gemini-2.0-flash, deepseek-chat, gpt-4o-mini, claude-3-5-haiku, and qwen-plus Source: [tasks/agents/browserable.js:33-49].
  • Remote browser key — used by browserService.getPlaywrightBrowser() to acquire a Playwright connectUrl and sessionId for each run Source: [tasks/agents/browserable.js:91-101].

Environment Variables

The most commonly referenced environment variables live in deployment/.env and are documented in docs/development/environment-variables.md. The key categories are summarized below.

| Category | Purpose | Example |
| --- | --- | --- |
| Database | Postgres / Supabase connection for the tasks service | DATABASE_URL, SUPABASE_URL |
| Object storage | S3-compatible endpoint for screenshots and task-run GIFs | S3_ENDPOINT, S3_BUCKET, S3_ACCESS_KEY, S3_SECRET_KEY |
| LLM | OpenAI-compatible base URL and key (overrideable in code) | OPENAI_API_KEY, OPENAI_BASE_URL |
| Browser | Remote browser provider credentials | BROWSERBASE_API_KEY, BROWSERBASE_CONNECT_URL |
| Ports | UI and API ports | 2001 (UI), 2003 (REST API) |

The REST API port is also reflected in the JS SDK example project, which overrides the SDK base URL to http://localhost:2003/api/v1 Source: [sdk/examples/js-sdk-test/README.md:7-9].
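A small validation step can catch missing settings before the stack starts. The sketch below is illustrative only: the variable names are the examples from the table above and may not match the real deployment/.env, `loadConfig` and `StackConfig` are hypothetical, and the default OpenAI base URL is an assumption.

```typescript
interface StackConfig {
  databaseUrl: string;
  s3Endpoint: string;
  s3Bucket: string;
  openAiBaseUrl: string;
  uiPort: number;
  apiPort: number;
}

// Takes the environment as a plain map so the function is easy to test.
function loadConfig(env: Record<string, string | undefined>): StackConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      throw new Error(`missing required environment variable ${name}`);
    }
    return value;
  };
  return {
    databaseUrl: required("DATABASE_URL"),
    s3Endpoint: required("S3_ENDPOINT"),
    s3Bucket: required("S3_BUCKET"),
    // Assumption: fall back to the public OpenAI endpoint when no override is set.
    openAiBaseUrl: env["OPENAI_BASE_URL"] ?? "https://api.openai.com/v1",
    uiPort: 2001,
    apiPort: 2003,
  };
}
```

In a Node process the function would typically be called as `loadConfig(process.env)`; failing fast with the variable name in the message makes an unhealthy container easier to diagnose.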

Using Non-OpenAI Providers (Groq, OpenRouter, Ollama)

The admin UI / Docker compose flow is wired to OpenAI today. For Groq, OpenRouter, or any other OpenAI-compatible endpoint, the maintainers' guidance in issue #3 is to replace the OpenAI chat-completions URL in code with the provider's OpenAI-compatible URL while reusing the same key plumbing [Source: GitHub issue #3]. A native admin-UI selector for Groq and OpenRouter is tracked as planned work in issue #5. Local LLMs such as Ollama follow the same pattern and are tracked in issue #9.

JavaScript SDK

A published SDK ships as browserable-js (v1.0.1) and exposes a typed BrowserableConfig (apiKey, baseURL), CreateTaskOptions, TaskRunStatus, and a WaitForRunOptions helper with pollInterval, timeout, and onStatusChange callback Source: [sdk/browserable-js/package.json:1-25, sdk/browserable-js/src/types.ts:1-50].

Troubleshooting Common Issues

"Initial setup is in progress" never resolves

This typically means the tasks service has not finished booting. Diagnose it from the host:

docker ps                  # look for unhealthy / restarting tasks container
docker exec -it browserable-<container> ...

[Source: GitHub issue #6]. If the tasks container is unhealthy, check its logs and the DATABASE_URL / S3 reachability before retrying.

`npx browserable` fails with `node --no-warnings` not found

The CLI shebang uses env node --no-warnings. On WSL/Ubuntu with shell shims such as fnm_multishells, /usr/bin/env treats the whole string node --no-warnings as a single binary name instead of splitting it into a command and a flag, producing /usr/bin/env: 'node --no-warnings': No such file or directory [Source: GitHub issue #20]. Workarounds reported in the thread include invoking npx browserable from a shell where node resolves through a wrapper that supports -S-style option splitting, or running the manual Docker Compose path instead.
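The failure is easier to see with a small model of how Linux builds the argument list from a shebang line: the interpreter path is split off, and the rest of the line is handed over as one argument. The function below is a hypothetical illustration of that rule, not Browserable code, and the shebang string is inferred from the error message.

```typescript
// Models the argv the kernel passes to the interpreter for a script with a shebang:
// [interpreter, rest-of-line-as-ONE-argument, scriptPath].
function shebangArgv(line: string, scriptPath: string): string[] {
  if (line.slice(0, 2) !== "#!") throw new Error("not a shebang line");
  const body = line.slice(2).trim();
  const firstSpace = body.search(/\s/);
  // No whitespace means there is only an interpreter and no argument.
  if (firstSpace === -1) return [body, scriptPath];
  const interpreter = body.slice(0, firstSpace);
  const singleArg = body.slice(firstSpace).trim(); // deliberately not split further
  return [interpreter, singleArg, scriptPath];
}
```

For `#!/usr/bin/env node --no-warnings`, env receives `node --no-warnings` as a single argument and looks for a program with that exact name. On systems whose env supports the -S option, a shebang of the form `#!/usr/bin/env -S node --no-warnings` lets env split the string itself.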

Custom LLM endpoint not being picked up

If the admin UI rejects a non-OpenAI key or silently falls back, remember that the URL and key are read from the OpenAI-compatible client directly. Confirm the base URL was updated in code, then restart the tasks container so the change is loaded [Source: GitHub issue #3].

Agent runs that hang or fail mid-flow

The agent runtime short-circuits gracefully when a run is cancelled via jarvis.isRunActive() checks in both the text-extraction and DOM-extraction helpers Source: [tasks/agents/browserable.js:7-25, tasks/agents/browserable.js:309-340]. If a run stays in running indefinitely, inspect the agent and debug logs surfaced through the admin UI; these correspond to updateNodeAgentLog and updateNodeDebugLog calls in base.js and jarvis.js.

"Task is not active" or "Tab with ID ... not found"

These errors are raised when a run has been stopped between scheduling and execution, or when a requested tabId no longer matches a live Playwright page. The extractHelper and textExtractHelper both wrap their work in isRunActive guards and return a structured failure rather than throwing Source: [tasks/agents/browserable.js:255-275]. Re-run the task from the admin UI; if the error persists, the browser session likely lost its CDP connection and a new session must be provisioned.
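The "structured failure rather than throwing" pattern can be sketched as a small wrapper. This is a hedged illustration: `withRunGuard` and `HelperResult` are hypothetical names, the real helpers in tasks/agents/browserable.js perform their checks inline, and the message strings are borrowed from the error names above only as examples.

```typescript
interface HelperResult<T> {
  completed: boolean;
  output?: T;
  message?: string;
}

// Runs `work` only while the run is active, and converts any thrown error
// into a result object instead of letting it propagate.
async function withRunGuard<T>(
  isRunActive: () => Promise<boolean>,
  work: () => Promise<T>,
): Promise<HelperResult<T>> {
  if (!(await isRunActive())) {
    return { completed: false, message: "Task is not active" };
  }
  try {
    return { completed: true, output: await work() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { completed: false, message };
  }
}
```

Because the caller always gets an object with `completed`, the orchestrator can log the message and close the node cleanly without a try/catch around every helper call.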

A consolidated troubleshooting reference lives at docs/development/troubleshooting.mdx (issue #15, live). For unresolved issues, the maintainers track new reports in the GitHub issue tracker as the public roadmap (issue #7).

Doramagic Pitfall Log

Found 8 structured pitfall items, none of them high or blocking. Top priority: installation risk, which requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/browserable/browserable/issues/20

2. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.host_targets | https://github.com/browserable/browserable

3. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/browserable/browserable

4. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable

5. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/browserable/browserable

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/browserable/browserable

7. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable

8. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page).

Review before install: open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using browserable with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence