Collaborator

@quanru quanru commented Oct 17, 2025

Summary

This PR refactors the Midscene API by renaming key methods for improved clarity and consistency:

  • aiAction() → aiAct(): more concise and clear naming
  • logScreenshot() → recordToReport(): better reflects the method's purpose

Breaking Changes

⚠️ Breaking Change: The following methods have been renamed:

  • aiAction() is now aiAct()
  • logScreenshot() is now recordToReport()

Backward Compatibility: The old aiAction() method is kept as a deprecated wrapper to maintain backward compatibility. Users should migrate to aiAct() when convenient.
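The deprecated-wrapper pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in packages/core/src/agent/agent.ts: the class name SketchAgent and its calls field are hypothetical, and the real methods may take additional options.

```typescript
// Hypothetical stand-in for the real agent class; only the rename pattern matters here.
class SketchAgent {
  // Records prompts so the delegation can be observed (illustration only).
  readonly calls: string[] = [];

  // New method name introduced by this PR.
  async aiAct(prompt: string): Promise<string> {
    this.calls.push(prompt);
    return `acted: ${prompt}`;
  }

  /**
   * @deprecated Use aiAct() instead.
   */
  async aiAction(prompt: string): Promise<string> {
    // The old name simply forwards to the new one, so behavior stays identical.
    return this.aiAct(prompt);
  }
}
```

With a JSDoc @deprecated tag, editors that understand TypeScript typically render calls to aiAction() with a strikethrough, which nudges users toward aiAct() without breaking existing code.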

Changes

Core Changes

  • ✨ Renamed aiAction() to aiAct() in packages/core/src/agent/agent.ts
  • ✨ Renamed logScreenshot() to recordToReport() in agent implementation
  • 🔄 Added deprecation wrapper for aiAction() with JSDoc warning
  • 🧪 Updated all test files to use new method names

Web Integration

  • 🔧 Updated Playwright fixture to support new method names
  • 🔧 Added aiAct and recordToReport fixture types
  • 🧪 Updated integration tests

Documentation

  • 📝 Updated all English documentation (15 files)
  • 📝 Updated all Chinese documentation (16 files)
  • 📝 Updated README.md and README.zh.md with new examples
  • 📝 Updated code examples in blog posts and guides

Migration Guide

Before:

await aiAction('click the submit button');
await logScreenshot('After submission');

After:

await aiAct('click the submit button');
await recordToReport('After submission');
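For larger codebases, the rename above could be applied mechanically. The helper below is a hypothetical sketch and not something shipped with Midscene: migrateSource and RENAMES are made-up names, and a plain regex pass like this will also rewrite matches inside strings or comments, so the result should be reviewed.

```typescript
// Hypothetical migration helper (not part of Midscene): rewrites calls to the old names.
const RENAMES: Record<string, string> = {
  aiAction: 'aiAct',
  logScreenshot: 'recordToReport',
};

function migrateSource(source: string): string {
  // \b keeps longer identifiers such as aiActionCount untouched;
  // the lookahead restricts the rewrite to call sites like aiAction( ... ).
  return source.replace(
    /\b(aiAction|logScreenshot)\b(?=\s*\()/g,
    (name) => RENAMES[name] ?? name,
  );
}
```

For example, migrateSource("await aiAction('click the submit button');") yields the "After" form shown above, while identifiers that merely contain the old names are left alone.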

Impact

  • Users: Need to update method names in their code (or continue using deprecated aiAction())
  • Documentation: All examples now use the new naming convention
  • Tests: All passing with new method names

Test Plan

  • ✅ All existing unit tests pass
  • ✅ All integration tests updated and passing
  • ✅ Documentation examples verified

🤖 Generated with Claude Code

Copilot AI and others added 3 commits October 17, 2025 17:12
* Initial plan

* fix(cli): allow duplicate YAML files in config.yaml

Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>

* fix(cli): deep clone YAML script to prevent mutation issues

* fix(yaml): prevent mutation of flowItem by creating a new object for processing

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>
Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
….x (#1325)

* refactor(core): remove non-OpenAI SDK support and upgrade to OpenAI 6.x

This commit removes support for Anthropic SDK and Azure OpenAI, simplifying
the codebase to use only the standard OpenAI SDK with OpenAI-style APIs.

Changes:
- Remove Anthropic SDK (@anthropic-ai/sdk) dependency
- Remove Azure OpenAI specific code and @azure/identity dependency
- Remove langsmith wrapper support
- Remove proxy agent support (https-proxy-agent, socks-proxy-agent)
- Upgrade OpenAI SDK from 4.81.0 to 6.3.0
- Simplify createChatClient function to only create standard OpenAI clients
- Remove 'style' parameter from createChatClient return type
- Remove all Anthropic-specific message handling code
- Add openai 6.3.0 as devDependency to @midscene/shared

Benefits:
- Cleaner, more maintainable codebase
- Reduced dependencies (removed 5 packages)
- All AI providers can now be accessed through OpenAI-compatible APIs

Breaking Changes:
- Anthropic SDK mode no longer supported
- Azure OpenAI specific configuration removed
- MIDSCENE_LANGSMITH_DEBUG no longer supported
- httpAgent/socksProxy removed from createChatClient

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(core): model provider documentation and remove Azure and Anthropic configurations

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(core): add proxy support for OpenAI client with HTTP and SOCKS configurations

* feat(core): add qwen-vl specific configuration for high resolution images

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This change ensures that Planning functionality only supports vision
language models (VL mode) and removes DOM-based planning support.

Changes:
- Add validation in ModelConfigManager.getModelConfig() to require
  VL mode for Planning intent
- Remove DOM mode logic from llm-planning.ts (describeUserPage,
  markupImageForLLM)
- Simplify image processing to only support VL mode paths
- Add comprehensive JSDoc documentation for Planning VL mode
  requirement
- Add 6 new unit tests covering Planning VL mode validation in both
  isolated and normal modes
- Fix existing tests to provide VL mode for Planning intent

Breaking Change:
- Planning without VL mode configured will now throw an error with
  clear instructions
- Error message includes all supported VL modes and configuration
  examples
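A validation of the kind described in this commit might look roughly like the sketch below. It is an illustration under assumptions, not the actual ModelConfigManager.getModelConfig() code: the function name, the SketchModelConfig shape, and the way supported modes are passed in are all hypothetical.

```typescript
// Hypothetical config shape; the real model config has more fields.
interface SketchModelConfig {
  modelName: string;
  vlMode?: string;
}

// Throws when the Planning intent is requested without a VL mode configured.
function assertPlanningHasVlMode(
  intent: string,
  config: SketchModelConfig,
  supportedVlModes: string[],
): void {
  if (intent === 'planning' && !config.vlMode) {
    throw new Error(
      'Planning requires a vision language model (VL mode). ' +
        `Supported VL modes: ${supportedVlModes.join(', ')}`,
    );
  }
}
```

Listing the supported modes in the error message, as the commit describes, lets users fix their configuration without having to look up the documentation.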

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>

netlify bot commented Oct 17, 2025

Deploy Preview for midscene failed.

Name Link
🔨 Latest commit 80a2c97
🔍 Latest deploy log https://app.netlify.com/projects/midscene/deploys/68f222aba4619800083f99ca

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ quanru
❌ Copilot

Collaborator Author

quanru commented Oct 17, 2025

Closing this PR to restructure as a feature branch workflow

@quanru quanru closed this Oct 17, 2025