Doramagic Project Pack · Human Manual
browserable
Open source and self-hostable browser automation library for AI agents
Overview, Architecture & Getting Started
Related topics: AI Agents, Prompts & LLM Integration, REST API, JavaScript SDK & Custom Functions, Deployment, Configuration & Troubleshooting
What is Browserable
Browserable is an open-source platform that lets AI agents drive a real web browser to complete user tasks. The repository bundles the agents, an orchestration layer, a browser automation service, a JavaScript SDK, and a desktop-style Admin UI into a single monorepo that can be launched with the npx browserable CLI (README context referenced in community issue #8).
The defining capability is the BROWSER_AGENT, declared in tasks/agents/browserable.js, which exposes four primitives — open_new_tab, read_tab, act_on_tab, and extract_from_tab — and is documented to "figure out how to procure a remote browser session + perform tasks like clicking, typing, etc on it." Higher-level agents (such as a Google Sheets agent) are explicitly preferred over the browser agent when a targeted tool is available.
System Architecture
Browserable is a polyglot monorepo. The Admin UI ships as an Electron-forge desktop application (see make scripts in ui/package.json), the SDK is published as browserable-js on npm (sdk/browserable-js/package.json), and the runtime is split across multiple long-running services that communicate over HTTP and Postgres-backed message logs.
flowchart LR
User["User / Client App"] --> SDK["browserable-js SDK"]
User --> CLI["npx browserable CLI"]
User --> UI["Admin UI (Electron)"]
CLI --> Tasks["tasks service<br/>(Jarvis orchestrator + agents)"]
UI --> Tasks
SDK --> Tasks
Tasks --> DB[("Postgres<br/>message_logs")]
Tasks --> Browser["browser service<br/>(Playwright session)"]
Tasks --> LLM["OpenAI-compatible LLM<br/>(gemini, gpt-4o, claude, deepseek, qwen)"]
Browser -->|screenshots / DOM| Tasks
LLM -->|tool calls| Tasks
Core Components
| Component | Source | Responsibility |
|---|---|---|
| Jarvis orchestrator | tasks/agents/jarvis.js | Splits a user request into a flow of sub-tasks, schedules node loopers, and aggregates results into a structured outputGenerated object via richOutputPrompt.js |
| BaseAgent | tasks/agents/base.js | Provides shared lifecycle helpers (_action_end, error reporting) for every concrete agent |
| BROWSER_AGENT | tasks/agents/browserable.js | The browser-driving agent; emits agent, user, and debug log segments, and persists screenshots after each chunk |
| Action / extract prompts | actionPrompts.js, extractPrompts.js | JSON-only prompts that ask the LLM to choose doAction, skipSection, or actionCompleted |
| Admin UI shell | ui/src/containers/FlowContainer.jsx, ui/src/routes/NotFound.jsx | Renders the live run timeline, message logs, screenshots, and code/markdown payloads |
| JavaScript SDK | sdk/browserable-js/src/types.ts | Typed client (BrowserableConfig, Task, TaskRunStatus, TaskRunGifResult) that wraps the REST API |
Agent Execution Loop
The browser agent calls a shared callOpenAICompatibleLLMWithRetry helper with a fallback chain of gemini-2.0-flash, deepseek-chat, gpt-4o-mini, claude-3-5-haiku, and qwen-plus (see tasks/agents/browserable.js). For each chunk of a long page it: scrolls, waits for a settled DOM, takes a screenshot, asks the LLM to extract structured content, and recursively calls textExtractHelper until the schema is satisfied or the run is no longer active. A separate refine step (buildRefineExtractedContentPrompt) consolidates the chunked extractions before returning.
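The fallback chain described above can be pictured as a loop over candidate models with a bounded retry budget. The sketch below is not the repository's `callOpenAICompatibleLLMWithRetry`: `callWithFallback` and the `callModel` callback are hypothetical names, and giving each model its own retry budget (rather than one budget for the whole chain) is an assumption.

```javascript
// Hypothetical sketch of a model-fallback loop; not the repository's helper.
// `callModel(model)` is a caller-supplied async function that performs the request.
async function callWithFallback({ models, callModel, maxAttempts = 3 }) {
  const errors = [];
  for (const model of models) {
    // Assumption: each model gets its own retry budget before moving on.
    for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
      try {
        const result = await callModel(model);
        return { model, attempt, result };
      } catch (err) {
        const message = err && err.message ? err.message : String(err);
        errors.push({ model, attempt, message });
      }
    }
  }
  // Every model exhausted its budget: surface the full attempt history.
  const failure = new Error("All candidate models failed");
  failure.attempts = errors;
  throw failure;
}
```

A caller would pass the model list from the text (for example `["gemini-2.0-flash", "deepseek-chat"]`) and a `callModel` function that wraps its own HTTP client.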
Getting Started
Browserable is distributed primarily through the npx browserable CLI. Per community issue #8, the CLI supports a --help flag and a down subcommand to tear the stack down; the canonical npx browserable flow boots the Docker Compose stack defined under deployment/.
A typical first run:
- Ensure Docker and a working `node`/`npm` are on `PATH`. (See the troubleshooting note below for the common WSL failure mode.)
- Run `npx browserable` from any directory; the CLI pulls and starts the Admin UI, the `tasks` service, the browser service, and Postgres.
- Open the Admin UI; the dashboard route renders the live run timeline implemented in ui/src/containers/FlowContainer.jsx.
- Author a task in natural language: Jarvis breaks it into nodes, dispatches the `BROWSER_AGENT` for browser work, and writes structured output back to the message log (tasks/agents/jarvis.js).
- From a separate Node project, install the SDK:
npm install browserable-js
The SDK exposes the typed surface from sdk/browserable-js/src/types.ts (CreateTaskOptions, TaskRunStatus, WaitForRunOptions) and is the recommended integration path for headless automation.
Common Setup Issues
The community has surfaced three recurring first-run failure modes that are worth documenting up front:
- "Initial setup is in progress" hang (issue #6). The maintainer confirmed this almost always means the `tasks` service did not come up healthy. `docker ps` will surface the unhealthy container; run `docker exec -it browserable`, then inspect the logs to identify the cause.
- `/usr/bin/env: 'node --no-warnings': No such file or directory` (issue #20). Reported on Ubuntu WSL when `node` resolves through `fnm`/`nvm` shims that the shebang cannot locate. The workaround is to invoke the CLI with an absolute path to `node`, e.g. `node $(which npx) browserable`, or to ensure `node` is on a stable `PATH`.
- LLM provider coverage. Groq and OpenRouter are tracked as `roadmap: planned` (issue #5); Ollama/local LLMs are `roadmap: requests` (issue #9). Until first-class support lands, issue #3 documents the manual workaround: replace `https://api.openai.com/v1/chat/completions` with the provider's OpenAI-compatible endpoint in the source.
Source: https://github.com/browserable/browserable / Human Manual
AI Agents, Prompts & LLM Integration
Related topics: Overview, Architecture & Getting Started, REST API, JavaScript SDK & Custom Functions
Overview
Browserable is an open-source browser automation library for AI agents that currently reaches 90.4% on the Web Voyager benchmarks (Source: README.md). The system is built around a multi-agent architecture in which each agent is responsible for a distinct capability (browser interaction, LLM passthrough, orchestration, research), and every LLM-backed step is driven by structured prompts that produce JSON-shaped tool calls.
The "AI Agents, Prompts & LLM Integration" subsystem therefore covers three concerns:
- The agent classes that define what each agent can do and how it terminates.
- The prompt library that instructs the underlying LLM at each step (extraction, action selection, routing, output formatting).
- The LLM integration layer that talks to OpenAI-compatible endpoints with a model-fallback list per use case.
Agent Architecture
All agents extend a shared BaseAgent defined in tasks/agents/base.js. The base class provides three reusable action handlers:
| Action | Purpose |
|---|---|
| `error` | Logs an irrecoverable error to user/debug logs and calls `errorAtNode` (Source: tasks/agents/base.js). |
| `end` | Writes final output and reasoning markdown to the user log and closes the node as completed (Source: tasks/agents/base.js). |
| `getBaseActions` | Inherited by all subclasses to register custom actions. |
Four concrete agent implementations ship in the repository:
- BrowserableAgent (`CODE = "BROWSER_AGENT"`): the headline browser automation agent. It exposes four high-level actions (`open_new_tab`, `read_tab`, `act_on_tab`, and `extract_from_tab`) and is described in its system prompt as the agent to use "if the user explicitly asks you to do something on the browser" (Source: tasks/agents/browserable.js). It always confirms results with a `read_tab` after each `act_on_tab`.
- GenerativeAgent (`CODE = "GENERATIVE_AGENT"`): a thin wrapper that passes a `task` string to an LLM and returns an `output` string. The system prompt explicitly notes it is a "simple vanilla dumb agent" intended for trivial text-to-text calls (Source: tasks/agents/generative.js).
- JarvisAgent: the orchestrator/router that decides which sub-agent handles a row of a data table; its prompts live in `tasks/prompts/agents/jarvis/` and are exported through `index.js` (Source: tasks/prompts/agents/jarvis/index.js).
- DeepResearchAgent: drives multi-step web research, including SERP processing prompts such as `buildProcessSerpsPrompt` (Source: tasks/prompts/agents/deepresearch/processSerpsPrompt.js).
flowchart TB
User[User / SDK Caller] --> Tasks[tasks service]
Tasks --> Jarvis[JarvisAgent<br/>orchestrator]
Jarvis -->|route sub-task| Browserable[BrowserableAgent<br/>BROWSER_AGENT]
Jarvis -->|route sub-task| Gen[GenerativeAgent<br/>GENERATIVE_AGENT]
Jarvis -->|route sub-task| Research[DeepResearchAgent]
Browserable --> LLM[(OpenAI-compatible<br/>LLM endpoint)]
Gen --> LLM
Jarvis --> LLM
Research --> LLM
LLM -->|tool/function call JSON| Agent[Selected Agent]
Agent -->|updateNodeUserLog<br/>endNode| Tasks
LLM Integration
The LLM layer is OpenAI-compatible and accepts a list of candidate models that are tried in order, enabling graceful fallback. Two example call sites illustrate the pattern:
- Refining extracted page content uses the cascade `["gemini-2.0-flash", "deepseek-chat", "gpt-4o-mini", "claude-3-5-haiku", "qwen-plus"]` with `max_attempts: 3` (Source: tasks/agents/browserable.js).
- Deciding the next Playwright action uses `["gemini-2.0-flash", "deepseek-chat", "claude-3-5-sonnet", "gpt-4o", "qwen-plus"]` with the same retry budget (Source: tasks/agents/browserable.js).
Every call carries a metadata object with `runId`, `nodeId`, `agentCode`, `usecase`, `flowId`, `accountId`, and `threadId`, which the orchestrator uses to attribute logs back to the right node. Callers can short-circuit a long-running run by checking `jarvis.isRunActive({ runId, flowId })` between LLM calls; if the run was cancelled, the agent returns early with `completed: false` and a descriptive message (Source: tasks/agents/browserable.js).
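The early-return behavior can be sketched as a step runner that checks run liveness before each step. This is an illustration only: `runSteps` is a hypothetical name, the `isRunActive` parameter stands in for `jarvis.isRunActive({ runId, flowId })`, and the message text is invented.

```javascript
// Hypothetical sketch: stop a multi-step job as soon as the run is cancelled.
// `isRunActive` stands in for jarvis.isRunActive({ runId, flowId }).
async function runSteps({ steps, isRunActive, runId, flowId }) {
  const results = [];
  for (const step of steps) {
    // Check before every potentially expensive step, such as an LLM call.
    const active = await isRunActive({ runId, flowId });
    if (!active) {
      // Return a structured result instead of throwing.
      return { completed: false, message: "Run is no longer active", results };
    }
    results.push(await step());
  }
  return { completed: true, results };
}
```

Returning a structured `{ completed: false, ... }` object keeps partial results available to the caller, which matches the non-throwing style the text describes.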
Community notes on LLM providers
- The default deployment targets OpenAI directly; the maintainers have noted that Groq and OpenRouter can be enabled by replacing `https://api.openai.com/v1/chat/completions` with the provider's OpenAI-compatible URL (Source: issues #3, #5).
- Local LLM support (Ollama) is on the roadmap as a request (Source: issue #9).
Prompt System
Prompts are colocated with their agents under tasks/prompts/agents/<agent-name>/ and exported as builder functions that return OpenAI-style messages arrays. Major prompt modules include:
- `extractPrompts.js`: `buildExtractLLMPrompt` instructs the LLM to print exact text from a rendered webpage or DOM slice and to emit JSON with a `justification` field. It is sensitive to whether the input is a text rendering or a raw DOM list (Source: tasks/prompts/agents/browserable/extractPrompts.js).
- `actionPrompts.js`: defines the function-calling schema for `doAction`, `skipSection`, and `actionCompleted`. The LLM must emit exactly one of these three JSON shapes, each carrying a `reason` plus optional Playwright `method`/`args`/`element` (Source: tasks/prompts/agents/browserable/actionPrompts.js).
- `richOutputPrompt.js`: assembles the final structured answer for a user. It enforces a hard ceiling of 4000 words for the entire `outputGenerated` object and reminds the model to honor per-field word limits (Source: tasks/prompts/agents/jarvis/richOutputPrompt.js).
- `datatablePrompts.js`: describes how Jarvis decomposes a user request into rows, delegates each row to a sub-agent, and merges results back. It codifies the `work_on_subtask_before_deciding` action code used when a row's prerequisites are missing (Source: tasks/prompts/agents/jarvis/datatablePrompts.js).
- `processSerpsPrompt.js`: directs the deep-research model to extract at most eight unique, dense learnings and three follow-up questions from SERP content (Source: tasks/prompts/agents/deepresearch/processSerpsPrompt.js).
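The three-shape contract described for `actionPrompts.js` lends itself to a small validation step before any Playwright call is made. The sketch below is hypothetical: `parseActionReply` is not a repository function, the `action` key used to name the shape is an assumption (the source only names the three shapes and the `reason`, `method`, `args`, and `element` fields), and requiring `method` for `doAction` is likewise an assumption.

```javascript
// Hypothetical validator for the three reply shapes named above.
// Assumption: the reply carries its shape name under an `action` key.
const ALLOWED_ACTIONS = ["doAction", "skipSection", "actionCompleted"];

function parseActionReply(rawText) {
  let reply;
  try {
    reply = JSON.parse(rawText);
  } catch (err) {
    return { ok: false, error: "Reply is not valid JSON" };
  }
  if (reply === null || typeof reply !== "object" || Array.isArray(reply)) {
    return { ok: false, error: "Reply must be a JSON object" };
  }
  if (!ALLOWED_ACTIONS.includes(reply.action)) {
    return { ok: false, error: "Unknown action: " + String(reply.action) };
  }
  if (typeof reply.reason !== "string" || reply.reason.length === 0) {
    return { ok: false, error: "Missing reason" };
  }
  // Assumption: only doAction needs a method; args and element stay optional.
  if (reply.action === "doAction" && typeof reply.method !== "string") {
    return { ok: false, error: "doAction requires a method" };
  }
  return { ok: true, reply };
}
```

Rejecting malformed replies early means a retry or model fallback can be triggered without touching the browser session.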
Common Failure Modes and Workarounds
- `/usr/bin/env: 'node --no-warnings': No such file or directory` when running `npx browserable` inside WSL/Ubuntu with an `fnm` multishell. On Linux, the text after `/usr/bin/env` in a shebang is passed as a single argument, so `env` looks for a binary literally named `node --no-warnings`; the workaround is to launch `npx` from a regular login shell where `node` resolves to a single path (for example, `which node` returning `/run/user/0/fnm_multishells/.../node` confirms the environment quirk) (Source: issue #20).
- Stuck "Initial setup is in progress" on the admin UI. The maintainers recommend `docker ps` to check for an unhealthy `tasks` service, then `docker exec -it browserable ...` to inspect logs; the frontend is waiting for the backend health check to succeed (Source: issue #6).
- No Groq / OpenRouter entry in the Admin UI. There is no first-class toggle yet, so users must point the OpenAI base URL at a compatible endpoint by editing the configuration (Source: issues #3, #5).
See Also
- Task Runs & Status Polling: covers `TaskRunStatus` and the `WaitForRunOptions` poll loop (Source: sdk/browserable-js/src/types.ts).
- Custom Tools & Functions: the public guide for registering user-defined tool calls alongside the built-in agent actions (Source: issue #10).
- Local Browser & Deployment: running the `browserable` Docker stack and CLI (`npx browserable --help`, `npx browserable down`) for local browser support (Source: issues #4, #8).
- Troubleshooting: covers the `Initial setup is in progress` symptom and other deployment pitfalls (Source: issue #15).
REST API, JavaScript SDK & Custom Functions
Related topics: Overview, Architecture & Getting Started, AI Agents, Prompts & LLM Integration
Overview
Browserable exposes its browser-automation capabilities through a REST API, a typed JavaScript/TypeScript SDK, and an extension point for custom tools/functions. Together these form the developer-facing surface for programmatically creating tasks, polling run status, retrieving run results, generating task GIFs, and extending the built-in BROWSER_AGENT with user-defined functions.
The project positions itself as open-source and self-hostable. The README directs users to a hosted REST endpoint and a JS SDK guide (README.md). The SDK is published as browserable-js (sdk/browserable-js/package.json) and depends on axios ^1.6.7 for HTTP transport.
A bundled example project at sdk/examples/js-sdk-test (sdk/examples/js-sdk-test/README.md) demonstrates the typical user flow against a local API at http://localhost:2003/api/v1.
REST API Surface
The JavaScript SDK is a thin wrapper over the REST API, so the SDK method list effectively documents the supported endpoints. The SDK methods are typed in sdk/browserable-js/src/types.ts and demonstrated in sdk/browserable-js/README.md.
| SDK Method | Purpose | Returns |
|---|---|---|
| `createTask({ task, agent, triggers })` | Submit a new task for the default or specified agent. | `{ taskId }` |
| `listTasks({ page, limit })` | List tasks for the authenticated account. | Paginated `Task[]` |
| `getTaskRunStatus(taskId, runId?)` | Poll the lifecycle state of a run. | `TaskRunStatus` |
| `getTaskRunResult(taskId, runId?)` | Fetch final output of a run. | `TaskRunResult` |
| `getTaskRunGif(taskId, runId)` | Retrieve a rendered GIF of the run. | `TaskRunGifResult` |
| `stopRun(taskId, runId?)` | Cancel a running task. | API envelope |
| `waitForRun(taskId, options?)` | Block-poll until status is terminal. | Final status |
| `getUserProfile()` | Return the authenticated user. | Profile object |
| `listBrowsers()` | Enumerate registered browser providers. | Browser list |
All responses share a common envelope ApiResponse<T> defined in sdk/browserable-js/src/types.ts, with success: boolean, optional data, optional error, and pagination fields (total, page, limit).
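Since every call returns the same envelope, callers often want one place that converts it into either data or an exception. The helper below is a hypothetical sketch, not part of `browserable-js`; `unwrapResponse` and the `pagination` grouping are invented here, while the field names (`success`, `data`, `error`, `total`, `page`, `limit`) follow the description above.

```javascript
// Hypothetical helper: turn an ApiResponse-style envelope into data or an error.
function unwrapResponse(envelope) {
  if (!envelope || envelope.success !== true) {
    const reason = envelope && envelope.error ? envelope.error : "Unknown API error";
    throw new Error(String(reason));
  }
  const { data, total, page, limit } = envelope;
  const result = { data };
  // Pagination fields are optional, so group them only when at least one is present.
  if (total !== undefined || page !== undefined || limit !== undefined) {
    result.pagination = { total, page, limit };
  }
  return result;
}
```

With this in place, a list call and a single-item call can share the same error handling, and only list results carry a `pagination` object.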
sequenceDiagram
participant App as Caller (SDK / curl)
participant API as REST API
participant Tasks as tasks service
participant Agent as BROWSER_AGENT
App->>API: POST /tasks (createTask)
API->>Tasks: enqueue
Tasks->>Agent: schedule node
Agent-->>Tasks: status updates
App->>API: GET /tasks/:id/runs/:runId (getTaskRunStatus)
App->>API: GET /tasks/:id/runs/:runId/result
App->>API: GET /tasks/:id/runs/:runId/gif
Task run lifecycle values shown in the type definitions are `scheduled | running | completed | error` (sdk/browserable-js/src/types.ts). The GIF endpoint mirrors this with `pending | completed | error` and a `url` field for the rendered asset.
The example test harness iterates the lifecycle by listing tasks and creating a browser session (sdk/examples/js-sdk-test/README.md), confirming the practical order of calls a developer is expected to make.
JavaScript SDK
The SDK is implemented in TypeScript and built with tsc (sdk/browserable-js/package.json). It exports a Browserable class initialized with an API key and optional baseURL (sdk/browserable-js/README.md). The default base URL points to the hosted service; the example project overrides it to http://localhost:2003/api/v1 (sdk/examples/js-sdk-test/README.md).
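import { Browserable } from 'browserable-js';

// Create a client; pass baseURL as well when targeting a self-hosted API.
const browserable = new Browserable({
  apiKey: 'your-api-key',
});

// Submit a task and route it to the built-in browser agent.
const { data } = await browserable.createTask({
  task: 'Visit example.com and extract all links',
  agent: 'BROWSER_AGENT',
});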
createTask accepts an optional agent selector, allowing callers to route a task to a specific agent implementation. The built-in BROWSER_AGENT is defined in tasks/agents/browserable.js with the constant this.CODE = "BROWSER_AGENT".
The SDK ships convenience helpers for long-running runs. waitForRun accepts pollInterval (default 1000 ms) and timeout (default 300000 ms) and an optional onStatusChange callback (sdk/browserable-js/src/types.ts). Combined with stopRun, this lets a caller implement cancellation, progress streaming, and dead-letter handling entirely from JavaScript.
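To make the polling options concrete, here is a minimal sketch of a `waitForRun`-style loop. It is not the SDK's implementation: `pollUntilTerminal` and the `getStatus` callback are hypothetical, and treating `completed` and `error` as the only terminal states is an assumption based on the lifecycle values listed earlier. The defaults mirror the documented `pollInterval` and `timeout` values.

```javascript
// Hypothetical sketch of a waitForRun-style poll loop; not the SDK's implementation.
// `getStatus` is a caller-supplied async function returning a status string.
const TERMINAL_STATUSES = new Set(["completed", "error"]);

async function pollUntilTerminal(getStatus, options = {}) {
  const { pollInterval = 1000, timeout = 300000, onStatusChange } = options;
  const deadline = Date.now() + timeout;
  let lastStatus;
  while (true) {
    const status = await getStatus();
    // Notify only on transitions, not on every poll.
    if (status !== lastStatus) {
      lastStatus = status;
      if (onStatusChange) onStatusChange(status);
    }
    if (TERMINAL_STATUSES.has(status)) return status;
    // Give up if the next poll would land after the deadline.
    if (Date.now() + pollInterval > deadline) {
      throw new Error("Timed out waiting for run; last status: " + status);
    }
    await new Promise((resolve) => setTimeout(resolve, pollInterval));
  }
}
```

On timeout a caller could follow up with a cancel request (the SDK's `stopRun`) so that the run does not keep consuming a browser session.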
Custom Functions
Custom tools/functions are the extension point of BROWSER_AGENT. They are reached through the customFunctions and end actions on the base agent (tasks/agents/base.js). The base agent's _action_end writes a user-visible "Agent completed." message along with output and reasoning markdown, and signals node completion via jarvis.endNode(...).
For browser tasks, the BROWSER_AGENT is built on top of a stable set of LLM-driven actions defined in tasks/agents/browserable.js:
- `open_new_tab`: opens a URL in a fresh browser session and returns a list of tabs.
- `read_tab`: converts a tab's HTML to markdown for the LLM context.
- `act_on_tab`: performs a click, type, or other interaction, verified by a vision-capable model.
- `extract_from_tab`: runs schema-guided extraction over DOM or text, refining the result through `buildRefineExtractedContentPrompt`.
The refinement step is the natural place to plug in custom functions: the prompt builder in tasks/prompts/agents/browserable/extractPrompts.js accepts instructions, schema, previouslyExtractedContent, and domElements, which can be populated by user-defined helpers. Action selection itself is driven by tasks/prompts/agents/browserable/actionPrompts.js, which constrains the LLM to emit doAction, skipSection, or actionCompleted JSON — a contract that custom function authors can rely on when registering new callable tools.
Community note: Custom tools/functions V1 is documented as live, and `https://docs.browserable.ai/guides/custom-functions` is the canonical guide referenced from the issue tracker. Task GIFs are also exposed via both the REST API and JS SDK (see `getTaskRunGif`) (sdk/browserable-js/src/types.ts).
LLM Provider Configuration
Both the REST API and the SDK ultimately call the same LLM abstraction. The Admin UI exposes the underlying provider list in ui/src/containers/SettingsContainer.jsx, which collects openai, claude, and gemini API keys and stores them under userApiKeys on the account. Browser-side providers (hyperBrowser, steel) are stored under userBrowserApiKeys.
The agent layer also falls back across models on errors. The refinement call in tasks/agents/browserable.js iterates over gemini-2.0-flash, deepseek-chat, gpt-4o-mini, claude-3-5-haiku, and qwen-plus through callOpenAICompatibleLLMWithRetry, which means any provider offering an OpenAI-compatible endpoint (e.g. Groq) can be wired in by changing the upstream URL. The maintainers point to this workaround in the issue tracker for Groq/OpenRouter support.
Community note: Local LLMs (Ollama) and full Groq/OpenRouter support remain open roadmap items. Until they land, the supported configuration path is via the Admin UI providers list, with OpenAI-compatible endpoints supported through code-level URL substitution.
See Also
- README.md: project overview, links to hosted REST docs and JS SDK guide.
- sdk/browserable-js/README.md: full SDK reference and examples.
- tasks/agents/base.js: base agent lifecycle, `end` action, error reporting.
- tasks/agents/browserable.js: `BROWSER_AGENT` actions and LLM fallback chain.
- ui/src/containers/SettingsContainer.jsx: provider and browser API key configuration.
Deployment, Configuration & Troubleshooting
Related topics: Overview, Architecture & Getting Started, REST API, JavaScript SDK & Custom Functions
Browserable is shipped as a self-hostable, Docker-based platform for running AI browser agents. The repository contains three runtime layers that must come up together: the admin UI (React/electron-forge dashboard), the tasks service (the Node.js agent runtime that drives Playwright sessions and calls LLMs), and the data layer (Postgres/Supabase + S3-compatible storage). This page covers the supported deployment paths, the configuration surface you need to know about, and the most common failure modes reported by users on GitHub.
Deployment Paths
Quick Start: `npx browserable`
The README documents npx browserable as the fastest onboarding path. It bootstraps a local Docker Compose stack and opens the admin UI on http://localhost:2001, where you enter your LLM and remote-browser API keys to begin running tasks Source: [README.md:18-32]. This is the recommended path for new users and matches the down and --help commands requested in issue #8.
Manual Docker Compose Deployment
For contributors and self-hosters, the canonical path is:
- Install Docker and Docker Compose.
- Clone the repository and `cd deployment`.
- Start the dev stack:
```bash
docker-compose -f docker-compose.dev.yml up
```
- Open the admin dashboard at `http://localhost:2001` to set your LLM and remote-browser keys Source: [README.md:34-50].
Under the hood, the deployment folder orchestrates several compose files. The base stack (deployment/docker-compose.dev.yml) brings up the UI, the tasks service, and the agent worker. Optional companion files (deployment/supabase-docker/docker-compose.yml and docker-compose.s3.yml) provide the local Postgres/Supabase backend and an S3-compatible object store used for screenshots and run GIFs.
High-Level Architecture
flowchart LR
User([Operator]) --> Admin[Admin UI<br/>localhost:2001]
Admin -->|API keys & config| Tasks[Tasks Service<br/>Node.js agents]
Tasks -->|Playwright CDP| Browser[(Remote Browser<br/>e.g. BrowserBase)]
Tasks -->|Chat completions| LLM[(LLM Provider<br/>OpenAI-compatible)]
Tasks -->|Run state & logs| DB[(Postgres / Supabase)]
Tasks -->|Screenshots & GIFs| S3[(S3-compatible store)]
Configuration
API Keys
All sensitive credentials are entered through the admin UI after the stack is up; they are persisted in the database rather than only in .env files. The two credential classes you must provide are:
- LLM API key: used by the `callOpenAICompatibleLLMWithRetry` helper, which fans out across a default model roster such as `gemini-2.0-flash`, `deepseek-chat`, `gpt-4o-mini`, `claude-3-5-haiku`, and `qwen-plus` Source: [tasks/agents/browserable.js:33-49].
- Remote browser key: used by `browserService.getPlaywrightBrowser()` to acquire a Playwright `connectUrl` and `sessionId` for each run Source: [tasks/agents/browserable.js:91-101].
Environment Variables
The most commonly referenced environment variables live in deployment/.env and are documented in docs/development/environment-variables.md. The key categories are summarized below.
| Category | Purpose | Example |
|---|---|---|
| Database | Postgres / Supabase connection for the tasks service | DATABASE_URL, SUPABASE_URL |
| Object storage | S3-compatible endpoint for screenshots and task-run GIFs | S3_ENDPOINT, S3_BUCKET, S3_ACCESS_KEY, S3_SECRET_KEY |
| LLM | OpenAI-compatible base URL and key (overrideable in code) | OPENAI_API_KEY, OPENAI_BASE_URL |
| Browser | Remote browser provider credentials | BROWSERBASE_API_KEY, BROWSERBASE_CONNECT_URL |
| Ports | UI and API ports | 2001 (UI), 2003 (REST API) |
The REST API base URL is also reflected in the JS SDK examples, which target http://localhost:2003/api/v1 by default Source: [sdk/examples/js-sdk-test/README.md:7-9].
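Because a missing variable often only shows up later as an unhealthy container, a small pre-flight check can be useful. The sketch below is hypothetical: `findMissingEnv` is not part of the repository, and the variable names are only the examples from the table above, so they may not match the names your version of deployment/.env actually uses.

```javascript
// Hypothetical pre-flight check. The names below are the examples from the table
// above and may differ from what the services actually read.
const EXAMPLE_REQUIRED_VARS = ["DATABASE_URL", "S3_ENDPOINT", "S3_BUCKET", "OPENAI_API_KEY"];

function findMissingEnv(env, required = EXAMPLE_REQUIRED_VARS) {
  // Treat unset and blank values the same way.
  return required.filter((name) => !env[name] || String(env[name]).trim() === "");
}
```

In a Node script this would typically be called as `findMissingEnv(process.env)` before starting the stack, printing the returned names if the list is not empty.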
Using Non-OpenAI Providers (Groq, OpenRouter, Ollama)
The admin UI / Docker compose flow is wired to OpenAI today. For Groq, OpenRouter, or any other OpenAI-compatible endpoint, the maintainers' guidance in issue #3 is to replace the OpenAI chat-completions URL in code with the provider's OpenAI-compatible URL while reusing the same key plumbing [Source: GitHub issue #3]. A native admin-UI selector for Groq and OpenRouter is tracked as planned work in issue #5. Local LLMs such as Ollama follow the same pattern and are tracked in issue #9.
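The URL substitution works because OpenAI-compatible providers accept the same request shape at a different base URL. The sketch below only builds such a request and does not send it; `buildChatCompletionRequest` is a hypothetical helper rather than repository code, and the base URL, key, and model you pass are your provider's values.

```javascript
// Hypothetical sketch: build an OpenAI-style chat-completions request for any base URL.
// It only constructs the request; sending it (for example with fetch) is left to the caller.
function buildChatCompletionRequest({ baseUrl, apiKey, model, messages }) {
  // Drop trailing slashes so ".../v1/" and ".../v1" produce the same URL.
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Passing `https://api.openai.com/v1` yields the default URL quoted in issue #3; swapping in another provider's base URL changes only the `url` field, which is why the key plumbing can be reused.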
JavaScript SDK
A published SDK ships as browserable-js (v1.0.1) and exposes a typed BrowserableConfig (apiKey, baseURL), CreateTaskOptions, TaskRunStatus, and a WaitForRunOptions helper with pollInterval, timeout, and onStatusChange callback Source: [sdk/browserable-js/package.json:1-25, sdk/browserable-js/src/types.ts:1-50].
Troubleshooting Common Issues
"Initial setup is in progress" never resolves
This typically means the tasks service has not finished booting. Diagnose it from the host:
docker ps # look for unhealthy / restarting tasks container
docker exec -it browserable-<container> ...
[Source: GitHub issue #6]. If the tasks container is unhealthy, check its logs and the DATABASE_URL / S3 reachability before retrying.
`npx browserable` fails with `node --no-warnings` not found
The CLI shebang uses env node --no-warnings. On WSL/Ubuntu with shell shims such as fnm_multishells, /usr/bin/env expands the whole string as a single binary name, producing /usr/bin/env: 'node --no-warnings': No such file or directory [Source: GitHub issue #20]. Workarounds reported in the thread include invoking npx browserable from a shell where node resolves through a wrapper that supports -S-style option splitting, or running the manual Docker Compose path instead.
Custom LLM endpoint not being picked up
If the admin UI rejects a non-OpenAI key or silently falls back, remember that the URL and key are read from the OpenAI-compatible client directly. Confirm the base URL was updated in code, then restart the tasks container so the change is loaded [Source: GitHub issue #3].
Agent runs that hang or fail mid-flow
The agent runtime short-circuits gracefully when a run is cancelled via jarvis.isRunActive() checks in both the text-extraction and DOM-extraction helpers Source: [tasks/agents/browserable.js:7-25, tasks/agents/browserable.js:309-340]. If a run stays in running indefinitely, inspect the agent and debug logs surfaced through the admin UI; these correspond to updateNodeAgentLog and updateNodeDebugLog calls in base.js and jarvis.js.
"Task is not active" or "Tab with ID ... not found"
These errors are raised when a run has been stopped between scheduling and execution, or when a requested tabId no longer matches a live Playwright page. The extractHelper and textExtractHelper both wrap their work in isRunActive guards and return a structured failure rather than throwing Source: [tasks/agents/browserable.js:255-275]. Re-run the task from the admin UI; if the error persists, the browser session likely lost its CDP connection and a new session must be provisioned.
A consolidated troubleshooting reference lives at docs/development/troubleshooting.mdx (issue #15, live). For unresolved issues, the maintainers track new reports in the GitHub issue tracker as the public roadmap (issue #7).
See Also
- README.md — quick start and architecture overview
- docs/development/environment-variables.md — full env-var reference
- docs/development/troubleshooting.mdx — detailed troubleshooting guide
- sdk/browserable-js — JavaScript SDK source
- tasks/agents/browserable.js — browser agent implementation
- tasks/agents/base.js — base agent lifecycle
- tasks/agents/jarvis.js — orchestrator / run scheduling
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 8 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/browserable/browserable/issues/20
2. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.host_targets | https://github.com/browserable/browserable
3. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/browserable/browserable
4. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable
5. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/browserable/browserable
6. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/browserable/browserable
7. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable
8. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/browserable/browserable
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using browserable with real data or production workflows.
- Groq and OpenRouter support - github / github_issue
- Initial Setup - github / github_issue
- npx browserable returns error `/usr/bin/env: ‘node --no-warnings’: No su - github / github_issue
- Login/ User input via API & Admin UI - github / github_issue
- npx browserable additional commands - github / github_issue
- Task GIFs - github / github_issue
- Custom tools/ functions - github / github_issue
- Local browser support - github / github_issue
- Troubleshooting documentation - github / github_issue
- Public Roadmap - github / github_issue
- Web voyager benchmark - release blog post - github / github_issue
- Evals - github / github_issue
Source: Project Pack community evidence and pitfall evidence