replace operator agent with base of new agent #1014
Merged
70 commits
bcbe2e4 restore agent integrations (sameelarif)
3719911 working build (sameelarif)
8efdd36 Merge branch 'main' into sarif/stg-519-mcp-and-tools-support (sameelarif)
e75a850 deps (sameelarif)
e3c7697 ignore inference summary (sameelarif)
088c51f better tool calling in operator and oai (sameelarif)
e16dfd0 example integrations (sameelarif)
c32ebbf handle malformed args from LLM (sameelarif)
52348c6 fix "none" tool choice handling (sameelarif)
ab7742c merge (sameelarif)
b581063 mcp docs (sameelarif)
0382bae Merge branch 'main' into sarif/stg-519-mcp-and-tools-support (sameelarif)
5e3a7f3 basic implementation (tkattkat)
74fb597 move tools to own folder + add screenshot filterning (tkattkat)
36dbb00 add accessability tool + context handling for it (tkattkat)
b3a0138 add fill form tool + agent to eval runner (tkattkat)
db4e31e remove operator handler (tkattkat)
664e90b update naming of the computer use agent handler (tkattkat)
3fe546c add type guard (tkattkat)
96c0508 update system (tkattkat)
7b7adb9 add scroll tool (tkattkat)
803642c update act tool (tkattkat)
14196a6 remove comments (tkattkat)
722f951 remove operator type (tkattkat)
a04a0ee update fillform (tkattkat)
91d9b69 add llms thinking to stagehand logs (tkattkat)
0232b1e update fillform, messageprocessing, and logs (tkattkat)
66aba99 remove refresh tool + timestamp on aria tree tool (tkattkat)
ba8c94e update scroll tool + system prompt (tkattkat)
7dee006 update goto tool (tkattkat)
42e7805 add pascal case to cua handler definition (tkattkat)
ad1719b update wait tool (tkattkat)
06279ac add "execution model" (tkattkat)
2eab512 update aria tree tool (tkattkat)
fe75f60 update param names / types (tkattkat)
f225d9e update task completed in actions (tkattkat)
e5048b5 update instruction handling (tkattkat)
b1c78fe update agent text to log level 1 (tkattkat)
a51f812 update result.message to contain all "reasoning" text throughout agen… (tkattkat)
a5f615d Merge branch 'main' into agent-revamp (tkattkat)
2ea5c69 update screenshot quality (tkattkat)
63e7977 Merge branch 'mcp-tools-support' into agent-revamp (tkattkat)
75ccd81 implement mcp tools to new agent (tkattkat)
eee3389 add changeset (tkattkat)
d74eba7 Merge branch 'main' into agent-revamp (tkattkat)
4a0345f Merge main into agent-revamp (tkattkat)
f7cb8c9 change back args (tkattkat)
2f7be48 Merge remote-tracking branch 'origin/main' into agent-revamp (tkattkat)
f8ca451 add inference time (tkattkat)
7bbcb78 add comment on system prompt (tkattkat)
6cbfec6 add trycatch and change zod (tkattkat)
31bae67 pass stagehand page instead of page (tkattkat)
95fcecb move get languate model to llmclient for proper typing while using el… (tkattkat)
d82c6c3 fallback to iframes true on iframes (tkattkat)
5ad60ab make logic cleaner (tkattkat)
3158586 use stagehandpage instead of page (tkattkat)
7927f39 remove screenshot console logs & use logger for extract (tkattkat)
9c7f393 add back warning when not using provider/model format (tkattkat)
6e2e3ec add docs for agent (tkattkat)
a08ac8d Merge branch 'main' into agent-revamp (tkattkat)
7f0f11d update to use act instead of observe (tkattkat)
786b139 update copy on variable (tkattkat)
220b37f Merge branch 'agent-revamp' of https://github.com/browserbase/stageha… (tkattkat)
f00222b remove closing page from close tool (tkattkat)
001cc4f Merge branch 'main' into agent-revamp (miguelg719)
77961b1 update init stagehand and sf library card eval (tkattkat)
6008fc7 add new model to task config (tkattkat)
07211cc update extract prompt (tkattkat)
f86955c add changeset (tkattkat)
ed42209 add url note, and remove optional from examples (tkattkat)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| "@browserbasehq/stagehand": patch | ||
| --- | ||
|
|
||
| Replace operator handler with base of new agent |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| "@browserbasehq/stagehand": patch | ||
| --- | ||
|
|
||
| replace operator agent with scaffold for new stagehand agent |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| import { tool } from "ai"; | ||
| import { z } from "zod/v3"; | ||
| import { StagehandPage } from "../../StagehandPage"; | ||
|
|
||
| export const createActTool = ( | ||
| stagehandPage: StagehandPage, | ||
| executionModel?: string, | ||
| ) => | ||
| tool({ | ||
| description: "Perform an action on the page (click, type)", | ||
| parameters: z.object({ | ||
| action: z.string() | ||
| .describe(`Describe what to click or type in a short, specific phrase that mentions the element type. | ||
| Examples: | ||
| - "click the Login button" | ||
| - "click the language dropdown" | ||
| - type "John" into the first name input | ||
| - type "Doe" into the last name input`), | ||
| }), | ||
| execute: async ({ action }) => { | ||
| try { | ||
| let result; | ||
| if (executionModel) { | ||
| result = await stagehandPage.page.act({ | ||
| action, | ||
| modelName: executionModel, | ||
| }); | ||
| } else { | ||
| result = await stagehandPage.page.act(action); | ||
| } | ||
| const isIframeAction = result.action === "an iframe"; | ||
|
|
||
| if (isIframeAction) { | ||
| const fallback = await stagehandPage.page.act( | ||
| executionModel | ||
| ? { action, modelName: executionModel, iframes: true } | ||
| : { action, iframes: true }, | ||
| ); | ||
| return { | ||
| success: fallback.success, | ||
| action: fallback.action, | ||
| isIframe: true, | ||
| }; | ||
| } | ||
|
|
||
| return { | ||
| success: result.success, | ||
| action: result.action, | ||
| isIframe: false, | ||
| }; | ||
| } catch (error) { | ||
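| // `error` is `unknown` in a catch clause, so narrow it before reading `.message` | ||
| return { | ||
| success: false, | ||
| error: error instanceof Error ? error.message : String(error), | ||
| }; | ||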
| } | ||
| }, | ||
| }); | ||
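
The iframe fallback in the act tool above can be read as a small retry pattern: run the action once, and only if the result reports `"an iframe"` run it again with `iframes: true`. The sketch below isolates that control flow so it can be exercised without a browser. The `ActResult`, `ActOptions`, and `ActFn` shapes and the `stubAct` function are hypothetical stand-ins for `stagehandPage.page.act`, not the library's real types.

```typescript
// Hypothetical minimal shapes; the real Stagehand types carry more fields.
interface ActResult {
  success: boolean;
  action: string;
}

interface ActOptions {
  action: string;
  modelName?: string;
  iframes?: boolean;
}

type ActFn = (options: ActOptions) => Promise<ActResult>;

// Runs the action once and retries with iframes enabled only when the
// first attempt reports that it targeted an iframe.
async function actWithIframeFallback(
  act: ActFn,
  action: string,
  executionModel?: string,
): Promise<ActResult & { isIframe: boolean }> {
  const base: ActOptions = executionModel
    ? { action, modelName: executionModel }
    : { action };

  const first = await act(base);
  if (first.action !== "an iframe") {
    return { ...first, isIframe: false };
  }

  const retry = await act({ ...base, iframes: true });
  return { ...retry, isIframe: true };
}

// Stub standing in for stagehandPage.page.act, for illustration only.
// It records each call and "fails" until iframes are enabled.
const calls: ActOptions[] = [];
const stubAct: ActFn = async (options) => {
  calls.push(options);
  return options.iframes
    ? { success: true, action: "clicked the Login button" }
    : { success: false, action: "an iframe" };
};
```

Building the options object once and spreading it into the retry keeps the two calls in sync, so `modelName` cannot be set on one attempt and forgotten on the other.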
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| import { tool } from "ai"; | ||
| import { z } from "zod/v3"; | ||
| import { StagehandPage } from "../../StagehandPage"; | ||
|
|
||
| export const createAriaTreeTool = (stagehandPage: StagehandPage) => | ||
| tool({ | ||
| description: | ||
| "Gets the accessibility (ARIA) tree from the current page. This is useful for understanding the page structure and accessibility features. It should provide full context of what is on the page.", | ||
| parameters: z.object({}), | ||
| execute: async () => { | ||
| const { page_text } = await stagehandPage.page.extract(); | ||
| const pageUrl = stagehandPage.page.url(); | ||
|
|
||
| let content = page_text; | ||
| // Token budget; tokens are estimated at roughly 4 characters each | ||
| const MAX_TOKENS = 70000; | ||
|
|
||
| const estimatedTokens = Math.ceil(content.length / 4); | ||
|
|
||
| if (estimatedTokens > MAX_TOKENS) { | ||
| const maxCharacters = MAX_TOKENS * 4; | ||
| content = | ||
| content.substring(0, maxCharacters) + | ||
| "\n\n[CONTENT TRUNCATED: Exceeded 70,000 token limit]"; | ||
|
||
| } | ||
|
|
||
| return { | ||
| content, | ||
| pageUrl, | ||
| }; | ||
| }, | ||
| experimental_toToolResultContent: (result) => { | ||
| const content = typeof result === "string" ? result : result.content; | ||
| return [{ type: "text", text: `Accessibility Tree:\n${content}` }]; | ||
| }, | ||
| }); | ||
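
The truncation step in the aria tree tool estimates tokens as roughly one per four characters and cuts the text once the estimate exceeds the budget. A minimal standalone sketch of that heuristic follows; the function name `truncateToTokenBudget` and the `CHARS_PER_TOKEN` constant are illustrative names, not part of the PR.

```typescript
// Rough heuristic used in the diff above: about 4 characters per token.
const CHARS_PER_TOKEN = 4;

// Returns the text unchanged when its estimated token count fits the budget,
// otherwise cuts it at maxTokens * 4 characters and appends a notice.
function truncateToTokenBudget(
  text: string,
  maxTokens: number,
): { content: string; truncated: boolean } {
  const estimatedTokens = Math.ceil(text.length / CHARS_PER_TOKEN);
  if (estimatedTokens <= maxTokens) {
    return { content: text, truncated: false };
  }

  const maxCharacters = maxTokens * CHARS_PER_TOKEN;
  return {
    content:
      text.substring(0, maxCharacters) +
      `\n\n[CONTENT TRUNCATED: Exceeded ${maxTokens} token limit]`,
    truncated: true,
  };
}

// Example: a 100-character string is estimated at 25 tokens, so a budget of
// 10 tokens keeps only the first 40 characters.
const truncatedExample = truncateToTokenBudget("a".repeat(100), 10);
const untouchedExample = truncateToTokenBudget("a".repeat(40), 10);
```

Keeping the token budget and the character conversion as separate names avoids comparing a token estimate against a value labelled as characters.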
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| import { tool } from "ai"; | ||
| import { z } from "zod/v3"; | ||
|
|
||
| export const createCloseTool = () => | ||
| tool({ | ||
| description: "Complete the task and close", | ||
| parameters: z.object({ | ||
| reasoning: z.string().describe("Summary of what was accomplished"), | ||
| taskComplete: z | ||
| .boolean() | ||
| .describe("Whether the task was completed successfully"), | ||
| }), | ||
| execute: async ({ reasoning, taskComplete }) => { | ||
| return { success: true, reasoning, taskComplete }; | ||
| }, | ||
| }); |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,104 @@ | ||
| import { tool } from "ai"; | ||
| import { z } from "zod/v3"; | ||
| import { StagehandPage } from "../../StagehandPage"; | ||
| import { LogLine } from "@/types/log"; | ||
|
|
||
| /** | ||
| * Evaluates a Zod schema string and returns the actual Zod schema | ||
| * Uses the Function constructor with `z` passed in as its only argument. | ||
| * Note that this executes the string as JavaScript; it is not a sandbox. | ||
| */ | ||
| function evaluateZodSchema( | ||
| schemaStr: string, | ||
| logger?: (message: LogLine) => void, | ||
| ): z.ZodTypeAny { | ||
| try { | ||
| // Create a function that returns the evaluated schema | ||
| // We pass z as a parameter to make it available in the evaluated context | ||
| const schemaFunction = new Function("z", `return ${schemaStr}`); | ||
| return schemaFunction(z); | ||
| } catch (error) { | ||
| logger?.({ | ||
| category: "extract", | ||
| message: `Failed to evaluate schema string, using z.any(): ${error}`, | ||
| level: 1, | ||
| auxiliary: { | ||
| error: { | ||
| value: String(error), | ||
| type: "string", | ||
| }, | ||
| }, | ||
| }); | ||
| return z.any(); | ||
| } | ||
| } | ||
|
|
||
| export const createExtractTool = ( | ||
| stagehandPage: StagehandPage, | ||
| executionModel?: string, | ||
| logger?: (message: LogLine) => void, | ||
| ) => | ||
| tool({ | ||
| description: `Extract structured data from the current page based on a provided schema. | ||
|
|
||
| USAGE GUIDELINES: | ||
| - Keep schemas MINIMAL - only include fields essential for the task | ||
| - IMPORTANT: only use this if explicitly asked for structured output. In most scenarios, you should use the aria tree tool over this. | ||
| - If you need to extract a link, make sure the type definition follows the format of z.string().url() | ||
| EXAMPLES: | ||
| 1. Extract a single value: | ||
| instruction: "extract the product price" | ||
| schema: "z.object({ price: z.number()})" | ||
|
|
||
| 2. Extract multiple fields: | ||
| instruction: "extract product name and price" | ||
| schema: "z.object({ name: z.string(), price: z.number() })" | ||
|
|
||
|
||
| 3. Extract arrays: | ||
| instruction: "extract all product names and prices" | ||
| schema: "z.object({ products: z.array(z.object({ name: z.string(), price: z.number() })) })"`, | ||
| parameters: z.object({ | ||
| instruction: z | ||
| .string() | ||
| .describe( | ||
| "Clear instruction describing what data to extract from the page", | ||
| ), | ||
| schema: z | ||
| .string() | ||
| .describe( | ||
| 'Zod schema as a string (e.g., "z.object({ price: z.number() })")', | ||
| ), | ||
| }), | ||
| execute: async ({ instruction, schema }) => { | ||
| try { | ||
| // Evaluate the schema string to get the actual Zod schema | ||
| const zodSchema = evaluateZodSchema(schema, logger); | ||
|
|
||
| // Ensure we have a ZodObject | ||
| const schemaObject = | ||
| zodSchema instanceof z.ZodObject | ||
| ? zodSchema | ||
| : z.object({ result: zodSchema }); | ||
|
|
||
| // Extract with the schema - only pass modelName if executionModel is explicitly provided | ||
| const result = await stagehandPage.page.extract({ | ||
| instruction, | ||
| schema: schemaObject, | ||
| ...(executionModel && { modelName: executionModel }), | ||
| }); | ||
|
|
||
| return { | ||
| success: true, | ||
| data: result, | ||
| timestamp: Date.now(), | ||
| }; | ||
| } catch (error) { | ||
| const errorMessage = | ||
| error instanceof Error ? error.message : String(error); | ||
| return { | ||
| success: false, | ||
| error: `Failed to extract data: ${errorMessage}`, | ||
| timestamp: Date.now(), | ||
| }; | ||
| } | ||
| }, | ||
| }); | ||
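
The extract tool has two steps that are easy to miss in the diff: the schema string is evaluated with `z` injected as the only binding, falling back to a permissive schema when evaluation fails, and any non-object result is wrapped under a `result` key. The sketch below shows both steps without depending on zod. `fakeZ`, `FakeSchema`, `evaluateWithBinding`, and `ensureObject` are hypothetical names invented for this illustration; `fakeZ` only mimics the shape of a few `z` builders.

```typescript
// Evaluates `expr` with a single named binding in scope and returns `fallback`
// if construction or evaluation throws. The Function constructor executes
// arbitrary JavaScript, so this is only appropriate for trusted input.
function evaluateWithBinding<T>(
  expr: string,
  bindingName: string,
  binding: unknown,
  fallback: T,
): T {
  try {
    const fn = new Function(bindingName, `return ${expr}`);
    return fn(binding) as T;
  } catch {
    return fallback;
  }
}

// Tiny stand-in for zod's `z`, purely for illustration.
type FakeSchema = { kind: string; shape?: Record<string, unknown> };
const fakeZ = {
  string: (): FakeSchema => ({ kind: "string" }),
  number: (): FakeSchema => ({ kind: "number" }),
  object: (shape: Record<string, unknown>): FakeSchema => ({ kind: "object", shape }),
};
const anySchema: FakeSchema = { kind: "any" };

// Mirrors the wrapping step: non-object schemas are nested under `result`.
function ensureObject(schema: FakeSchema): FakeSchema {
  return schema.kind === "object" ? schema : fakeZ.object({ result: schema });
}

// A well-formed schema string evaluates to an object schema.
const parsedSchema = evaluateWithBinding<FakeSchema>(
  "z.object({ price: z.number() })",
  "z",
  fakeZ,
  anySchema,
);

// A malformed string throws a SyntaxError inside the helper and falls back.
const fallbackSchema = evaluateWithBinding<FakeSchema>(
  "z.object({ price: ",
  "z",
  fakeZ,
  anySchema,
);
```

Because the fallback is itself not an object schema, it goes through the same wrapping step as any other scalar schema, which is why the diff checks `instanceof z.ZodObject` after evaluation rather than before.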
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| import { tool } from "ai"; | ||
| import { z } from "zod/v3"; | ||
| import { StagehandPage } from "../../StagehandPage"; | ||
|
|
||
| export const createFillFormTool = ( | ||
| stagehandPage: StagehandPage, | ||
| executionModel?: string, | ||
| ) => | ||
| tool({ | ||
| description: `📝 FORM FILL - SPECIALIZED MULTI-FIELD INPUT TOOL | ||
|
|
||
| CRITICAL: Use this for ANY form with 2+ input fields (text inputs, textareas, etc.) | ||
|
|
||
| WHY THIS TOOL EXISTS: | ||
| • Forms are the #1 use case for multi-field input | ||
| • Optimized specifically for input/textarea elements | ||
| • 4-6x faster than individual typing actions | ||
|
|
||
| Use fillForm: Pure form filling (inputs, textareas only) | ||
|
|
||
|
|
||
| MANDATORY USE CASES (always use fillForm for these): | ||
| Registration forms: name, email, password fields | ||
| Contact forms: name, email, message fields | ||
| Checkout forms: address, payment info fields | ||
| Profile updates: multiple user data fields | ||
| Search filters: multiple criteria inputs | ||
|
|
||
|
|
||
|
|
||
| PARAMETER DETAILS: | ||
| • fields: Array of { action, value } objects. | ||
| – action: short description of where to type (e.g. "type 'john@example.com' into the email input"). | ||
| – value: the exact text to enter. | ||
| `, | ||
| parameters: z.object({ | ||
| fields: z | ||
| .array( | ||
| z.object({ | ||
| action: z | ||
| .string() | ||
| .describe( | ||
| 'Description of the typing action, e.g. "type foo into the bar field"', | ||
| ), | ||
| value: z.string().describe("Text to type into the target field"), | ||
|
||
| }), | ||
| ) | ||
| .min(1, "Provide at least one field to fill"), | ||
| }), | ||
|
|
||
| execute: async ({ fields }) => { | ||
| const instruction = `Return observation results for the following actions: ${fields | ||
| .map((field) => field.action) | ||
| .join(", ")}`; | ||
|
|
||
| const observeResults = executionModel | ||
| ? await stagehandPage.page.observe({ | ||
| instruction, | ||
| modelName: executionModel, | ||
| }) | ||
| : await stagehandPage.page.observe(instruction); | ||
|
|
||
| const completedActions = []; | ||
| for (const result of observeResults) { | ||
| const action = await stagehandPage.page.act(result); | ||
| completedActions.push(action); | ||
| } | ||
|
|
||
| return { success: true, actions: completedActions }; | ||
| }, | ||
| }); | ||
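
The fill form tool works in two phases: one `observe` call that plans every field at once, then a sequential `act` call per observed target. The sketch below reproduces that flow against a stub so the ordering is visible. `PageLike`, `ObserveTarget`, `FillActResult`, `fillFormSketch`, and `stubPage` are hypothetical names for this illustration, and the shapes are simplified. One deliberate difference from the diff, which always returns `success: true`: this sketch reports success only when every action succeeded.

```typescript
interface FormField {
  action: string;
  value: string;
}

// Hypothetical, simplified shapes for illustration only.
interface ObserveTarget {
  selector: string;
  description: string;
}

interface FillActResult {
  success: boolean;
  action: string;
}

interface PageLike {
  observe(instruction: string): Promise<ObserveTarget[]>;
  act(target: ObserveTarget): Promise<FillActResult>;
}

// One observe call plans all fields; the resulting targets are then acted on
// one at a time, in order.
async function fillFormSketch(page: PageLike, fields: FormField[]) {
  const instruction = `Return observation results for the following actions: ${fields
    .map((field) => field.action)
    .join(", ")}`;

  const targets = await page.observe(instruction);

  const actions: FillActResult[] = [];
  for (const target of targets) {
    actions.push(await page.act(target));
  }

  // Assumption of this sketch: success reflects the individual actions.
  return { success: actions.every((a) => a.success), actions };
}

// Stub page that records what it was asked to do.
const observedInstructions: string[] = [];
const actedSelectors: string[] = [];
const stubPage: PageLike = {
  observe: async (instruction) => {
    observedInstructions.push(instruction);
    return [
      { selector: "#first-name", description: "first name input" },
      { selector: "#last-name", description: "last name input" },
    ];
  },
  act: async (target) => {
    actedSelectors.push(target.selector);
    return { success: true, action: `filled ${target.description}` };
  },
};
```

Awaiting each `act` inside the loop keeps the fields filled in the order the observe step returned them, which matters when one input reveals or validates the next.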