Add Browser Env Integration by filip-michalsky · Pull Request #732 · PrimeIntellect-ai/verifiers

filip-michalsky · 2026-01-15T11:46:19Z

Description

Adds BrowserEnv - a unified browser automation integration for the verifiers library supporting two operational modes:

DOM Mode (mode="dom")

Uses the Stagehand Python SDK for natural language browser control
Tools: navigate, observe, act, extract - Stagehand's AI-driven primitives
Ideal for tasks that benefit from semantic understanding of page elements

CUA Mode (mode="cua")

Vision-based primitives for Computer Use Agent workflows
Tools: click, double_click, type_text, keypress, scroll, goto, back, forward, wait, screenshot
Automatic sandbox deployment - CUA server is deployed automatically to sandbox containers
Three execution modes (fastest to most flexible):
1. Pre-built image (default): Uses deepdream19/cua-server:latest for ~5-10s startup
2. Binary upload: Builds and uploads custom server version (~30-60s startup)
3. Manual server: Connect to locally running CUA server for development

Both modes support local browser execution or Browserbase cloud infrastructure.

What's included:

verifiers/envs/integrations/browser_env/ - Core integration (BrowserEnv, DOMMode, CUAMode, CUASandboxMode)
assets/templates/browserbase/cua/ - TypeScript CUA server with Docker build/runtime configs
environments/browser_dom_example/ - Minimal DOM mode example
environments/browser_cua_example/ - Minimal CUA mode example
New [browser] extra: uv add 'verifiers[browser]'
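
Because the integration ships behind an optional extra, calling code may want to fail with a clear message when the extra is not installed. The following is a minimal sketch of that idea, not code from this PR: `require_extra` and `has_module` are hypothetical helpers, and the assumption that the Stagehand SDK is importable as `stagehand` is ours.

```python
import importlib.util


def has_module(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def require_extra(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional dependency is missing.

    Hypothetical helper for illustration; not part of verifiers.
    """
    if not has_module(module_name):
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: uv add 'verifiers[{extra}]'"
        )


if __name__ == "__main__":
    # "stagehand" as the import name is an assumption.
    print("stagehand available:", has_module("stagehand"))
```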

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the browserbase/ namespace.

Type of Change

New feature (non-breaking change which adds functionality)

Testing

# DOM mode
prime eval run browser-dom-example -m openai/gpt-4o-mini

# CUA mode (pre-built image - default, fastest)
prime eval run browser-cua-example -m openai/gpt-4o-mini

# CUA mode (binary upload - custom server)
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_prebuilt_image": false}'

# CUA mode (manual server for development)
cd assets/templates/browserbase/cua && pnpm dev  # In separate terminal
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_sandbox": false}'

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

CUA Server Deployment Options:

| Mode | Flag | Startup | Use Case |
| --- | --- | --- | --- |
| Pre-built image (default) | None | ~5-10s | Production |
| Binary upload | `use_prebuilt_image=false` | ~30-60s | Custom server |
| Manual server | `use_sandbox=false` | Instant | Development |
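
The deployment options above can be read as a small decision rule over two flags. The sketch below models that rule only. It assumes both flags default to true and that `use_sandbox=false` takes precedence over `use_prebuilt_image`; neither is confirmed by the PR text. `CUADeployment` and `select_cua_deployment` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CUADeployment:
    """One row of the deployment table (illustrative only)."""
    name: str
    startup: str
    use_case: str


def select_cua_deployment(
    use_sandbox: bool = True, use_prebuilt_image: bool = True
) -> CUADeployment:
    """Map the two flags to a deployment option.

    Assumed precedence: a manual server wins over the image setting.
    """
    if not use_sandbox:
        return CUADeployment("manual server", "instant", "development")
    if not use_prebuilt_image:
        return CUADeployment("binary upload", "~30-60s", "custom server")
    return CUADeployment("pre-built image", "~5-10s", "production")


if __name__ == "__main__":
    print(select_cua_deployment())
    print(select_cua_deployment(use_prebuilt_image=False))
    print(select_cua_deployment(use_sandbox=False))
```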

Future work:

Additional benchmark environments available on Prime Hub under browserbase/ org


---

## Bugbot Summary (updated)

> [!NOTE]
> **Medium Risk**
> Introduces a new browser automation integration that spins up remote sandboxes, runs external HTTP servers, and adds new optional dependencies; failures could impact environment lifecycle/cleanup and test stability despite being largely additive.
> 
> **Overview**
> Adds a new `BrowserEnv` integration with two modes: **DOM mode** routes natural-language `navigate/observe/act/extract` calls through the Stagehand Python SDK, while **CUA mode** exposes coordinate-based primitives (`click`, `scroll`, `type_text`, etc.) backed by a CUA server.
> 
> CUA mode gains **automatic sandbox deployment** (including an optional prebuilt Docker image path vs binary upload vs local server), screenshot capture/retention controls, retry/health-check handling, and a backwards-compatible `CUASandboxMode` alias. The PR also vendors a TypeScript Fastify CUA server template (with Docker build/runtime + SEA binary scripts) and adds two example environments (`browser_dom_example`, `browser_cua_example`) plus docs, tests, and a new `[browser]` optional dependency group.
> 
> Additionally updates tool execution to allow tools to return structured multipart content (lists) instead of forcing `str(...)`, and adjusts import tests to tolerate missing optional dependency imports.
> 
> Written by Cursor Bugbot for commit 907fddd8bd0a847196c1fdbe9dc325eaf37a8e43.
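
The tool-execution change mentioned in the summary (passing lists through instead of forcing `str(...)`) can be illustrated roughly as follows. This is a sketch under assumptions: `normalize_tool_result` is a hypothetical name, and the content-part shape shown in the usage lines follows the common chat-completions convention rather than anything quoted from the PR.

```python
from typing import Any, Dict, List, Union

ToolContent = Union[str, List[Dict[str, Any]]]


def normalize_tool_result(result: Any) -> ToolContent:
    """Pass multipart content (a list of dict parts) through unchanged.

    Anything else is converted to a string, which matches the older
    always-stringify behaviour for plain return values.
    """
    if isinstance(result, list) and all(isinstance(p, dict) for p in result):
        return result
    return str(result)


if __name__ == "__main__":
    # Hypothetical screenshot-style result with a text part and an image part.
    parts = [
        {"type": "text", "text": "Clicked (120, 340)"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ]
    print(normalize_tool_result(parts))
    print(normalize_tool_result(42))
```

Keeping image parts as structured content lets a vision model see the screenshot instead of a stringified Python list.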

CLAassistant · 2026-01-15T11:46:34Z

All committers have signed the CLA.

cursor · 2026-01-28T15:22:50Z

verifiers/envs/integrations/browser_env/modes/cua_mode.py

+        }
+
+        # Thread-safe locks (shared)
+        self._thread_lock = threading.Lock()


Redundant threading lock in async-only context

Low Severity

_thread_lock (a threading.Lock) wraps an asyncio.Lock in _get_http_client, which is an async method that will only be called from async contexts. The threading lock is unnecessary since the async lock already provides sufficient protection for concurrent coroutines. Additionally, holding a synchronous lock while awaiting an async lock is an anti-pattern that can cause blocking issues.

Additional Locations (1)

verifiers/envs/integrations/browser_env/modes/cua_mode.py#L300-L305
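
The point of this review comment can be sketched with a lazily created shared client guarded by an `asyncio.Lock` alone. This is an illustration, not the PR's code: `FakeClient` and `ClientHolder` are hypothetical stand-ins, and only the method name `_get_http_client` comes from the comment.

```python
import asyncio
from typing import Optional


class FakeClient:
    """Stand-in for an HTTP client; counts how many instances were created."""

    created = 0

    def __init__(self) -> None:
        FakeClient.created += 1


class ClientHolder:
    """Lazily creates one shared client, guarded only by an asyncio.Lock."""

    def __init__(self) -> None:
        self._client: Optional[FakeClient] = None
        self._client_lock = asyncio.Lock()

    async def _get_http_client(self) -> FakeClient:
        # The async lock serializes coroutines on the same event loop,
        # so no threading.Lock is needed around it.
        async with self._client_lock:
            if self._client is None:
                await asyncio.sleep(0)  # simulate async setup work
                self._client = FakeClient()
            return self._client


async def demo() -> int:
    holder = ClientHolder()
    clients = await asyncio.gather(*(holder._get_http_client() for _ in range(10)))
    assert all(c is clients[0] for c in clients)
    return FakeClient.created


if __name__ == "__main__":
    print("clients created:", asyncio.run(demo()))
```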

cdreetz · 2026-01-29T08:52:39Z

@willccbb wanna take a look? everything for dom should be good to go

idk how to get ty check to pass tho? checked merged prs and they pass with warnings but this one fails with warnings

cursor · 2026-01-29T15:46:16Z

assets/templates/browserbase/cua/types.ts

+  browserbaseProjectId?: string;
+  viewport?: Viewport;
+  proxies?: boolean;
+}


CUA mode silently ignores advanced_stealth parameter

Medium Severity

The advanced_stealth parameter is accepted by BrowserEnv and CUAMode in Python, which includes it in session_config as browserSettings.advancedStealth. However, the TypeScript CUA server's SessionCreateRequest type doesn't define browserSettings, and sessionManager.ts constructs browserSettings with only viewport, ignoring any advancedStealth setting. Users enabling advanced_stealth=True in CUA mode expect Browserbase's anti-bot detection mode, but it has no effect. DOM mode correctly handles this parameter.

Additional Locations (1)

assets/templates/browserbase/cua/sessionManager.ts#L40-L53
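
The behaviour described here can be modelled in a few lines. The real server is TypeScript; the Python below only mirrors the logic for illustration, and both function names are hypothetical. The first function reproduces the reported drop, the second shows one way the requested settings could be merged with the viewport.

```python
from typing import Any, Dict, Optional


def viewport_only_settings(
    viewport: Dict[str, int], requested: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Model of the reported behaviour: requested settings are discarded."""
    return {"viewport": viewport}


def merged_settings(
    viewport: Dict[str, int], requested: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """One possible fix: keep the caller's settings and add the viewport."""
    settings = dict(requested or {})
    settings["viewport"] = viewport
    return settings


if __name__ == "__main__":
    vp = {"width": 1024, "height": 768}
    req = {"advancedStealth": True}
    print(viewport_only_settings(vp, req))  # advancedStealth is lost
    print(merged_settings(vp, req))         # advancedStealth is preserved
```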

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-01-30T00:04:03Z

environments/browser_dom_example/browser_dom_example.py

+
+
+def load_environment(
+    project_id: str,


DOM example requires project_id but documentation omits it

High Severity

The load_environment function requires project_id: str as a mandatory positional argument with no default value. However, the README shows usage without passing this argument (prime eval run browser-dom-example -m gpt-4.1-mini ...), and the docstring example shows env = load_environment() with no arguments. Running the example as documented will raise a TypeError. The CUA example correctly uses optional parameters with defaults, but the DOM example is inconsistent.

Additional Locations (1)

environments/browser_dom_example/browser_dom_example.py#L131-L132
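
One way to make the documented no-argument call work is to treat `project_id` as optional and fall back to an environment variable. The sketch below assumes the variable is named `BROWSERBASE_PROJECT_ID`; `resolve_project_id` is a hypothetical helper, and the PR does not say which fix was actually adopted.

```python
import os
from typing import Optional


def resolve_project_id(
    project_id: Optional[str] = None,
    env_var: str = "BROWSERBASE_PROJECT_ID",  # assumed variable name
) -> str:
    """Return the explicit project_id, else the environment value.

    Raises ValueError with an actionable message if neither is set.
    """
    resolved = project_id or os.environ.get(env_var)
    if not resolved:
        raise ValueError(
            f"Pass project_id or set the {env_var} environment variable."
        )
    return resolved


if __name__ == "__main__":
    os.environ.setdefault("BROWSERBASE_PROJECT_ID", "proj_example")
    print(resolve_project_id())
    print(resolve_project_id("proj_explicit"))
```

A `load_environment(project_id: str | None = None)` signature could call such a helper first, so both the README command and the docstring example run without a `TypeError`.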

cursor · 2026-01-30T00:04:03Z

verifiers/envs/integrations/browser_env/modes/cua_mode.py

+            "proxies": proxies,
+            "browserSettings": {"advancedStealth": advanced_stealth}
+            if advanced_stealth
+            else None,


CUA mode advanced_stealth parameter is silently ignored

Medium Severity

When advanced_stealth=True is passed to BrowserEnv in CUA mode, the Python code sends browserSettings: {"advancedStealth": true} in the session config. However, the TypeScript CUA server's SessionCreateRequest interface doesn't include browserSettings, and sessionManager.ts constructs its own browserSettings object containing only viewport settings. The advancedStealth setting is silently discarded, so users expecting anti-bot detection will not have it enabled.

Additional Locations (1)

assets/templates/browserbase/cua/sessionManager.ts#L44-L52
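
On the Python side, the snippet above sends `"browserSettings": None` when stealth is off. A variant that omits the key entirely is sketched below as a design option, not as the PR's code; `build_session_config` is a hypothetical name, and the server would still need to honour `browserSettings` for the flag to take effect.

```python
from typing import Any, Dict


def build_session_config(
    proxies: bool = False, advanced_stealth: bool = False
) -> Dict[str, Any]:
    """Build the session payload, adding browserSettings only when needed."""
    config: Dict[str, Any] = {"proxies": proxies}
    if advanced_stealth:
        config["browserSettings"] = {"advancedStealth": True}
    return config


if __name__ == "__main__":
    print(build_session_config())
    print(build_session_config(proxies=True, advanced_stealth=True))
```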

ruff precommit

cedc25b

filip-michalsky changed the title ~~ruff precommit~~ Add Browser Env Integration Jan 15, 2026

filip-michalsky and others added 7 commits January 15, 2026 10:11

smoke test env

7280e65

simplify smoke test

9cfd1c6

delete datasets from wrong place

c93bca2

restructure

8a978c5

Remove gaia, mind2web, and webvoyager environment folders

3d6d559

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

increment examples

5db1d16

update readme and auto start cua server

e3024b8

filip-michalsky marked this pull request as ready for review January 16, 2026 16:02

cursor bot reviewed Jan 16, 2026

environments/AGENTS.md · Outdated · Resolved

filip-michalsky added 3 commits January 16, 2026 12:25

add env check

01eac47

update readme

79e04d0

update agents md

0affb42

cursor bot reviewed Jan 16, 2026

environments/AGENTS.md · Resolved

verifiers/envs/integrations/browser_env/browser_env.py · Resolved

verifiers/envs/integrations/browser_env/modes/cua_mode.py · Resolved

fix tests

49f5b95

cursor bot reviewed Jan 16, 2026

tests/test_envs.py · Resolved

update tests

813eca4

cursor bot reviewed Jan 16, 2026

environments/mind2web/mind2web.py · Outdated · Resolved

Remove gaia, webvoyager, mind2web from tracking

906a836

cursor bot reviewed Jan 16, 2026

verifiers/envs/integrations/browser_env/browser_env.py · Outdated · Resolved

verifiers/envs/integrations/browser_env/modes/dom_mode.py · Outdated · Resolved

filip-michalsky and others added 8 commits January 16, 2026 16:31

make bugbot happier

e688da6

Fm/browser sandbox env (#3)

64096e6

* use remote sandbox env for cua mode * update tests * remote cua in sandbox * Fm/browser add binary (#4) * binary * update * fix non binary execution * fix ruff * update default flags

move cua server to assets

278ae78

update readmes

3ef51bb

Merge branch 'main' into fm/add-browser-env

be0762f

DRY modes

40f5b44

Merge branch 'fm/add-browser-env' of https://github.com/filip-michals…

38d8ae1

…ky/verifiers into fm/add-browser-env

fix act in dom mode-small dict schema issue

5dd0649

cursor bot reviewed Jan 26, 2026

assets/templates/browserbase/cua/package.json · Outdated · Resolved

filip-michalsky added 3 commits January 27, 2026 13:24

update README to recommend max turns as 50 in examples

71899f5

update assets

310d8c9

proxy bug

3a6a365

cursor bot reviewed Jan 28, 2026

verifiers/envs/integrations/browser_env/modes/base.py · Resolved

assets/templates/browserbase/cua/sessionManager.ts · Outdated · Resolved

remove duplicate code

ddf8fb4

cursor bot reviewed Jan 28, 2026

verifiers/envs/integrations/browser_env/modes/cua_mode.py · Resolved

environments/browser_cua_example/browser_cua_example.py · Resolved

assets/templates/browserbase/cua/sessionManager.ts · Outdated · Resolved

add advanced stealth flag

311164c

cursor bot reviewed Jan 28, 2026

verifiers/envs/integrations/browser_env/modes/cua_mode.py · Outdated · Resolved

verifiers/envs/integrations/browser_env/browser_env.py · Outdated · Resolved

filip-michalsky added 2 commits January 28, 2026 09:07

update logging

9f4bb13

make bugbot happy

732cb07

cursor bot reviewed Jan 28, 2026

remove system prompts from browser_env

9288b11

cursor bot reviewed Jan 29, 2026

verifiers/envs/integrations/browser_env/browser_env.py · Outdated · Resolved

remove references to sys prompts and tests for sys prpompts

d253e1d

cursor bot reviewed Jan 29, 2026

verifiers/envs/integrations/browser_env/__init__.py · Resolved

add prompt to examples

d96787d

cursor bot reviewed Jan 29, 2026

verifiers/envs/integrations/browser_env/modes/dom_mode.py · Resolved

local browser config only local

0373e01

cursor bot reviewed Jan 29, 2026

streamlining for call_tool fix + ty

907fddd

cursor bot reviewed Jan 30, 2026

willccbb merged commit c1e1a5d into PrimeIntellect-ai:main Jan 30, 2026
6 checks passed

mikasenghaas mentioned this pull request Jan 30, 2026

always cast tool call output to str #805

Closed

13 tasks
