Stagehand agent improvements #1094

tkattkat · 2025-09-23T22:41:17Z

why

This PR enhances the Stagehand agent with model routing, an expanded toolset, and more robust context management to improve performance and reliability across different LLM providers.

what changed

Model Routing

Model-specific tool filtering: Tools are now dynamically included/excluded based on the model being used
Anthropic-optimized toolset: When using Claude models with storeActions: false, enables specialized tools for better performance
Custom system prompts: Different system prompts are applied based on the model to optimize behavior
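
The routing rule can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ToolSet` type and the function name `filterToolsByModel` are hypothetical, while the tool names and the two branches (Claude with storeActions: false keeps everything except fillForm; every other case drops the coordinate-based tools and fillFormVision) follow the sequence diagram in the review further down this thread.

```typescript
// Hypothetical registry shape: tool name -> tool implementation (opaque here).
type ToolSet = Record<string, unknown>;

// Tools that are only exposed to Claude models running with storeActions: false.
const ANTHROPIC_ONLY_TOOLS = ["click", "type", "dragAndDrop", "clickAndHold", "fillFormVision"];

// Assumption: a model counts as Anthropic when its name starts with "claude".
function isAnthropicModel(modelName: string): boolean {
  return modelName.startsWith("claude");
}

// Returns a new tool set with the tools that do not apply to this model removed.
function filterToolsByModel(modelName: string, tools: ToolSet, storeActions: boolean): ToolSet {
  const useAnthropicTools = isAnthropicModel(modelName) && storeActions === false;
  const excluded = new Set(useAnthropicTools ? ["fillForm"] : ANTHROPIC_ONLY_TOOLS);
  return Object.fromEntries(Object.entries(tools).filter(([name]) => !excluded.has(name)));
}
```

Building the full set first and filtering afterwards keeps the routing decision in one place, so adding a model-specific tool only means adding its name to one list.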

New Tools Added

Anthropic-Specific Tools (enabled when storeActions: false)

clickAndHold: Performs click-and-hold actions at precise coordinates
type: Coordinate-based typing
click: Precise coordinate-based clicking
dragAndDrop: Drag-and-drop functionality

Model-Agnostic Tools

think: Allows the agent to reason through problems before acting
keys: Keyboard input handling for complex key combinations
search: Web search capability (auto-enabled when EXA_API_KEY is provided)
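
A minimal sketch of the auto-enable behavior for search, assuming the tools live in a plain object keyed by name. The `AgentTool` shape, the function name, and the descriptions are hypothetical; the environment is passed in explicitly (rather than reading process.env directly) only to keep the sketch easy to test.

```typescript
// Hypothetical, simplified tool shape used only for this illustration.
type AgentTool = { description: string };

// Builds the model-agnostic tools; `search` is added only when an Exa key is set.
function createModelAgnosticTools(env: Record<string, string | undefined>): Record<string, AgentTool> {
  const tools: Record<string, AgentTool> = {
    think: { description: "Reason through the problem before acting" },
    keys: { description: "Send keyboard input, including key combinations" },
  };
  if (env.EXA_API_KEY) {
    tools.search = { description: "Search the web" };
  }
  return tools;
}
```

Because the tool is absent from the object entirely when the key is missing, any prompt text derived from the tool names would omit it as well.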

Enhanced Context Management

Image optimization: Automatically removes old images, keeping only the 2 most recent
A11y tree management: Maintains only the 2 most recent accessibility trees
Checkpointing: Creates conversation summaries every 25 tool calls
Token-based summarization: When context exceeds 120,000 tokens, automatically summarizes content
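
The pruning and threshold rules above can be sketched like this. The message shape, function names, and the four-characters-per-token estimate are assumptions made for illustration; only the constants (2 most recent, every 25 tool calls, 120,000 tokens) come from the description.

```typescript
// Hypothetical message shape; the real types are not shown in this thread.
type Part = { kind: "text" | "image" | "a11yTree"; content: string };
type Message = { role: "user" | "assistant" | "tool"; parts: Part[] };

const KEEP_RECENT = 2;
const CHECKPOINT_INTERVAL = 25;
const TOKEN_LIMIT = 120_000;

// Replaces all but the `keep` most recent parts of `kind` with a short text stub.
// Walks newest-first and returns new arrays, leaving the input untouched.
function pruneOldParts(messages: Message[], kind: "image" | "a11yTree", keep = KEEP_RECENT): Message[] {
  let seen = 0;
  const out: Message[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const parts: Part[] = [];
    for (let j = messages[i].parts.length - 1; j >= 0; j--) {
      const part = messages[i].parts[j];
      if (part.kind === kind && ++seen > keep) {
        parts.unshift({ kind: "text", content: `[${kind} removed]` });
      } else {
        parts.unshift(part);
      }
    }
    out.unshift({ ...messages[i], parts });
  }
  return out;
}

// Assumption: a rough estimate of about four characters per token.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((n, m) => n + m.parts.reduce((k, p) => k + p.content.length, 0), 0);
  return Math.ceil(chars / 4);
}

// Decides which compression step, if any, should run before the next model call.
function nextAction(toolCallCount: number, messages: Message[]): "summarize" | "checkpoint" | "none" {
  if (estimateTokens(messages) > TOKEN_LIMIT) return "summarize";
  if (toolCallCount > 0 && toolCallCount % CHECKPOINT_INTERVAL === 0) return "checkpoint";
  return "none";
}
```

In this sketch the token check takes priority over the checkpoint interval; whether the PR orders them the same way is not stated in the description.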

Enhanced Type Safety

Discriminated union types: AgentToolCall and AgentToolResult provide complete type safety
Tool-specific typing: Each tool has strongly typed parameters and return values
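
As an illustration of what a discriminated union buys here, consider the sketch below. The three variants and their field names are hypothetical and much smaller than the real AgentToolCall; the point is only that switching on the discriminant narrows the argument type for each tool.

```typescript
// Hypothetical subset of tool calls; field names are illustrative only.
type AgentToolCall =
  | { toolName: "click"; args: { x: number; y: number } }
  | { toolName: "type"; args: { x: number; y: number; text: string } }
  | { toolName: "think"; args: { reasoning: string } };

// Inside each case, `call.args` has exactly the shape declared for that tool.
function describeCall(call: AgentToolCall): string {
  switch (call.toolName) {
    case "click":
      return `click at (${call.args.x}, ${call.args.y})`;
    case "type":
      return `type "${call.args.text}" at (${call.args.x}, ${call.args.y})`;
    case "think":
      return `think: ${call.args.reasoning}`;
    default: {
      // Compile-time exhaustiveness check: adding a variant without a case fails here.
      const unreachable: never = call;
      return unreachable;
    }
  }
}
```

With this pattern, accessing `call.args.text` in the "click" branch is a compile error rather than a runtime `undefined`.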

test plan

Tested locally
Tested on Browserbase
Tested with and without an Exa key to ensure the search tool appears in the prompt and tool list only when the key is present
Tested with Claude 4 and non-Claude models to ensure the system prompt and tools are routed correctly for the model in use

changeset-bot · 2025-09-23T22:41:20Z

🦋 Changeset detected

Latest commit: a7bf3a7

The changes in this PR will be included in the next version bump.

greptile-apps

Greptile Overview

Summary

This PR significantly enhances the Stagehand agent with sophisticated model routing, expanded toolset, and robust context management capabilities. The implementation intelligently adapts tools and system prompts based on the LLM provider and configuration.

Key Changes:

Smart Model Routing: Tools are dynamically filtered based on the model being used. When using Claude models with storeActions: false, specialized coordinate-based tools (click, type, dragAndDrop, clickAndHold) are enabled for better performance, while other models use the generic act tool
Enhanced Toolset: Added think for reasoning, keys for keyboard input, search for web searches (when EXA_API_KEY is available), and Anthropic-optimized tools for precise interactions
Advanced Context Management: Implements multi-level compression with image optimization, A11y tree management, and intelligent checkpointing every 25 tool calls. Token-based summarization kicks in at 120,000 tokens to maintain performance
Type Safety: Strong typing with discriminated unions for AgentToolCall and AgentToolResult, plus tool-specific parameter validation

The architecture demonstrates thoughtful design with proper separation of concerns, robust error handling, and performance optimizations. The model routing logic ensures optimal tool selection while maintaining backward compatibility.

Confidence Score: 4/5

This PR is safe to merge with minimal risk
Score reflects well-architected changes with comprehensive testing mentioned, though some minor issues exist like the parameter description error in dragAndDrop tool
Pay close attention to lib/agent/tools/dragAndDrop.ts for the parameter description fix

Important Files Changed

File Analysis

Filename	Score	Overview
lib/handlers/stagehandAgentHandler.ts	4/5	Core agent handler with model routing and tool creation logic - well structured
lib/prompt.ts	4/5	System prompt generation with model-specific routing and tool filtering - complex but solid
lib/agent/tools/index.ts	4/5	Tool creation and filtering logic with proper model routing - clean implementation
lib/agent/tools/dragAndDrop.ts	3/5	Drag and drop tool with coordinate-based interaction - has parameter description issue
lib/agent/contextManager/contextManager.ts	4/5	Complex context management with compression, checkpointing, and summarization - sophisticated implementation
types/agent.ts	5/5	Type definitions with new AgentOptions.storeActions property - clean type additions

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Handler as StagehandAgentHandler
    participant Prompt as buildStagehandAgentSystemPrompt
    participant Tools as createAgentTools
    participant Filter as filterToolsByModelName
    participant Context as ContextManager
    participant Wrapper as modelWrapper
    participant LLM as LLMClient

    Client->>Handler: execute(options)
    Handler->>Handler: Extract storeActions from options
    
    Handler->>Prompt: buildStagehandAgentSystemPrompt(url, modelName, instruction, storeActions)
    Prompt->>Prompt: Detect if Anthropic model (modelName.startsWith("claude"))
    Prompt->>Prompt: Check useAnthropicCustomizations = isAnthropic && storeActions === false
    alt useAnthropicCustomizations = true
        Prompt-->>Handler: Return prompt with click, type, dragAndDrop tools
    else useAnthropicCustomizations = false
        Prompt-->>Handler: Return prompt with act tool (no click/type/dragAndDrop)
    end
    
    Handler->>Tools: createAgentTools(stagehand, {mainModel, storeActions})
    Tools->>Tools: Create all tool instances
    note over Tools: EXA_API_KEY check for search tool
    Tools->>Filter: filterToolsByModelName(mainModel, tools, storeActions)
    
    alt isAnthropic && storeActions === false
        Filter->>Filter: Keep all tools except fillForm
        Filter-->>Tools: Return Anthropic-optimized toolset
    else Other models or storeActions = true
        Filter->>Filter: Remove dragAndDrop, clickAndHold, click, type, fillFormVision
        Filter-->>Tools: Return standard toolset
    end
    
    Tools-->>Handler: Return filtered tools
    
    Handler->>Context: new ContextManager(logger)
    Handler->>Wrapper: modelWrapper(llmClient, contextManager, sessionId)
    Wrapper->>Wrapper: Wrap model with context processing middleware
    Wrapper-->>Handler: Return wrapped model
    
    Handler->>LLM: generateText({model: wrappedModel, system: systemPrompt, tools})
    LLM-->>Handler: Return result with actions
    Handler-->>Client: Return AgentResult
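
The prompt branch in the diagram can be sketched as below. This is a hedged illustration only: the function name, the options object, the `hasSearch` flag, and all prompt wording are invented for the example, and the real buildStagehandAgentSystemPrompt takes positional arguments as shown in the diagram. What is taken from the review is the condition (a model name starting with "claude" combined with storeActions === false) and which interaction tools each branch advertises.

```typescript
// Hypothetical options object for the sketch.
interface PromptOptions {
  url: string;
  modelName: string;
  instruction: string;
  storeActions: boolean;
  hasSearch: boolean;
}

// Builds a short system prompt whose tool list matches the routed toolset.
function buildSystemPromptSketch(opts: PromptOptions): string {
  const useAnthropicCustomizations =
    opts.modelName.startsWith("claude") && opts.storeActions === false;
  // Claude without stored actions gets coordinate-based tools; everything else gets `act`.
  const interactionTools = useAnthropicCustomizations ? ["click", "type", "dragAndDrop"] : ["act"];
  const toolNames = [...interactionTools, "think", "keys", ...(opts.hasSearch ? ["search"] : [])];
  return [
    `You are a web automation agent. Current page: ${opts.url}`,
    `Task: ${opts.instruction}`,
    `Available tools: ${toolNames.join(", ")}`,
  ].join("\n");
}
```

Deriving the tool list in the prompt from the same condition used for filtering keeps the prompt from mentioning tools the model cannot call.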

36 files reviewed, 1 comment

lib/agent/tools/dragAndDrop.ts

tkattkat added 27 commits September 22, 2025 13:16

initial commit

13b0603

update agent types

c097fba

update logger type

ae974a5

update log levels

4b26bf1

remove logger helper, and use inline

d6434d6

extract changes

a4b277e

remove unnecessary return values from tools

0f376a2

clean up action handler types

5801b85

remove aria tree caching

9ad0e6d

move system prompt to prompt.ts

5dc0d1d

Merge remote-tracking branch 'origin/main' into stagehand-agent-impro…

edce0cc

…vements

remove unnecessary type casting

b2742dd

remove more unnecessary type casting

ab1ec2d

update aria tool

01a7de3

update hasSearch

47de9ea

improve tool return values

2a5a5b6

remove unnecessary params

4ff715b

add store actions flag

4760b07

Merge remote-tracking branch 'origin/main' into stagehand-agent-impro…

44be5e3

…vements

merge changes

a8eddcc

update goto

61cede9

temp remove store actions flag

a76407d

add execution model to eval runner

c321896

add store actions flag

8a33ebc

update prompt

843b60d

replace any types with proper typing

793c3d1

remove execution model

6754ead

tkattkat added 2 commits September 23, 2025 15:49

add exa dependency

4fd7bca

update docs

8bcf453

changeset

4b3239d

tkattkat marked this pull request as ready for review September 24, 2025 00:40

greptile-apps bot reviewed Sep 24, 2025

lib/agent/tools/dragAndDrop.ts (outdated)

update prompt on drag and drop

887e5d5

tkattkat marked this pull request as draft September 24, 2025 18:00

tkattkat added 6 commits September 24, 2025 11:38

temp disable tool filtering

7fbbbf6

temp disable model routing

966d92b

add back model routing

529f226

add move

f8fdd5c

prompt changes

0125390

update prompts

a7bf3a7
