refactor(1.0): expose action space into mcp #1481

yuyutaotao · 2025-11-19T04:39:43Z

No description provided.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-11-19T04:43:41Z

packages/mcp/src/index.ts

 async function runServer() {
-  server.server.registerCapabilities({
-    resources: {},
-    logging: {},
-  });
-
-  // Server capabilities are now properly registered
+  midsceneManager = new MidsceneManager();

-  // Register resource handlers BEFORE connecting
-  server.server.setRequestHandler(
-    ListResourcesRequestSchema,
-    handleListResources,
-  );
-  server.server.setRequestHandler(
-    ReadResourceRequestSchema,
-    handleReadResource,
-  );
+  // Initialize tools asynchronously (independent of server)
+  await midsceneManager.initTools();


Re-register MCP resource handlers

In runServer (lines 20‑24) the server now only creates a MidsceneManager, calls initTools, and connects the transport; it no longer calls server.server.setRequestHandler for ListResourcesRequestSchema, ReadResourceRequestSchema, or SetLevelRequestSchema. As a result, MCP clients can no longer call resources/list or resources/read, or update the logging level. The SDK will reject those requests with “method not found,” breaking previously working features such as fetching console logs or saved screenshots. Please wire handleListResources, handleReadResource, and the logging handler back up before calling server.connect.
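
To illustrate the ordering concern, here is a minimal sketch. `MiniServer` is a hypothetical stand-in written for this example and is not the MCP SDK's server class; it models only the behavior described in the comment, namely that a request with no registered handler is rejected with "method not found". The handler names mirror those in the diff, but their bodies are placeholders.

```typescript
// Hypothetical stand-in for an MCP-style server (NOT the real SDK API).
type Handler = (params: unknown) => Promise<unknown>;

class MiniServer {
  private handlers = new Map<string, Handler>();
  private connected = false;

  setRequestHandler(method: string, handler: Handler): void {
    this.handlers.set(method, handler);
  }

  async connect(): Promise<void> {
    this.connected = true;
  }

  // Dispatches a request; unknown methods are rejected.
  async request(method: string, params: unknown = {}): Promise<unknown> {
    if (!this.connected) throw new Error('not connected');
    const handler = this.handlers.get(method);
    if (!handler) throw new Error(`method not found: ${method}`);
    return handler(params);
  }
}

// Placeholder bodies; the real handlers are the ones referenced in
// packages/mcp/src/index.ts.
const handleListResources: Handler = async () => ({ resources: [] });
const handleReadResource: Handler = async () => ({ contents: [] });

async function runServer(registerResources: boolean): Promise<MiniServer> {
  const server = new MiniServer();
  if (registerResources) {
    // Register handlers BEFORE connecting, as the original code did.
    server.setRequestHandler('resources/list', handleListResources);
    server.setRequestHandler('resources/read', handleReadResource);
  }
  await server.connect();
  return server;
}
```

In this sketch, `runServer(false)` corresponds to the state the review describes: the server connects, but any `resources/list` request fails because nothing was registered for it.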


chatgpt-codex-connector · 2025-11-19T04:43:41Z

packages/mcp/src/midscene.ts

+    // Prepare dynamic action space tools
+    const agent = await this.ensureAgent();
+    const actionSpace = await agent.getActionSpace();
+    const actionTools = this.prepareActionSpaceToolDefinitions(actionSpace);


Avoid requiring a live browser during startup

initTools() now unconditionally calls this.ensureAgent()/agent.getActionSpace() (lines 422‑425) before the MCP server even starts listening. ensureAgent immediately tries to connect to Chrome/ADB. When the browser extension or device is not yet available, which is a common deployment scenario, the promise rejects with “Unable to establish Bridge mode connection…” and the whole server exits before it can accept requests. Previously the agent was created lazily on the first tool invocation, so the server could start independently. Please defer agent creation (and action space discovery) until a tool actually runs, or handle connection failures so the server can continue to start.
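
One possible shape for the deferred approach is sketched below. `Agent`, `LazyAgentHolder`, and `discoverActionsSafely` are hypothetical names introduced for illustration and are not taken from the PR; the sketch assumes only that connecting can fail and that an empty tool list at startup is acceptable.

```typescript
// Hypothetical agent shape; only what this sketch needs.
interface Agent {
  getActionSpace(): Promise<string[]>;
}

class LazyAgentHolder {
  private agentPromise?: Promise<Agent>;
  private readonly connect: () => Promise<Agent>;

  constructor(connect: () => Promise<Agent>) {
    this.connect = connect;
  }

  // Connects on first use and reuses the same promise afterwards.
  ensureAgent(): Promise<Agent> {
    if (!this.agentPromise) {
      this.agentPromise = this.connect().catch((err) => {
        // Drop the cached failure so a later call can retry.
        this.agentPromise = undefined;
        throw err;
      });
    }
    return this.agentPromise;
  }
}

// Startup path: attempt discovery, but never let a failed connection
// abort server startup.
async function discoverActionsSafely(holder: LazyAgentHolder): Promise<string[]> {
  try {
    const agent = await holder.ensureAgent();
    return await agent.getActionSpace();
  } catch {
    // No browser or device yet; actions can be resolved on first invocation.
    return [];
  }
}
```

The design choice here is to cache the promise rather than the agent, so concurrent callers share one connection attempt, while clearing the cache on failure allows a retry once the browser or device becomes available.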


chatgpt-codex-connector · 2025-11-19T04:43:41Z

packages/mcp/src/midscene.ts

+    const tools = actionSpace.map((action) => ({
+      name: action.name,
+      description: `Ask Midscene (a helper that can understand natural language and perform actions) to perform the action "${action.name}", this action is defined as follows: ${action.description || 'No description provided'}.`,
+      schema: {
+        instruction: z


Preserve documented MCP tool names

Dynamic tool registration now uses the raw action.name from the action space as the MCP tool name (lines 372‑376). For the default web/Android devices those names are simple verbs like Tap, Scroll, Input, etc., which no longer match the canonical midscene_aiTap, midscene_aiScroll, … names still exported from packages/mcp/src/tools.ts and referenced in docs/tests. As soon as this change ships, existing clients calling midscene_aiTap, midscene_aiScroll, etc. will receive “Tool not found.” Either keep the original names when exposing the action space, or update the contract, docs, and tests accordingly; otherwise the MCP API is silently broken.
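
A rough sketch of the "keep the original names" option follows. The helper names (`toLegacyToolName`, `buildToolDefinitions`, `findTool`) and the alias field are hypothetical, and the prefixing rule is an assumption inferred from the two names cited above (midscene_aiTap, midscene_aiScroll); the actual mapping in packages/mcp/src/tools.ts may differ.

```typescript
// Hypothetical helper: derives a legacy-style tool name from an action name.
// The "midscene_ai" prefix follows the names cited in the review; the rule
// itself is an assumption.
function toLegacyToolName(actionName: string): string {
  return `midscene_ai${actionName}`;
}

interface ToolDefinition {
  name: string;
  aliases: string[];
}

// Exposes each action under its legacy name and keeps the raw action name
// as an alias, so both old and new callers can resolve the tool.
function buildToolDefinitions(actionNames: string[]): ToolDefinition[] {
  return actionNames.map((actionName) => ({
    name: toLegacyToolName(actionName),
    aliases: [actionName],
  }));
}

function findTool(
  tools: ToolDefinition[],
  requested: string,
): ToolDefinition | undefined {
  return tools.find(
    (tool) => tool.name === requested || tool.aliases.indexOf(requested) !== -1,
  );
}
```

With this arrangement a lookup for `midscene_aiTap` and a lookup for `Tap` resolve to the same definition, which would let existing clients keep working while the raw action names are introduced.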


netlify · 2025-11-21T16:11:38Z

✅ Deploy Preview for midscene ready!

Name	Link
🔨 Latest commit	`703569e`
🔍 Latest deploy log	https://app.netlify.com/projects/midscene/deploys/6923c8157e4bed0008387cc7
😎 Deploy Preview	https://deploy-preview-1481--midscene.netlify.app

chatgpt-codex-connector · 2025-11-23T16:13:42Z

packages/shared/src/env/parse-model-config.ts

+  const legacyModelFamily = useLegacyLogic
+    ? legacyConfigToModelFamily(provider)
+    : undefined;
+
+  const modelFamilyRaw = provider[keys.modelFamily] || legacyModelFamily;


Planning model ignores VL family when override is set

When a dedicated planning model is provided (e.g., via MIDSCENE_PLANNING_MODEL_NAME or an isolated modelConfig map), the new parser only looks up keys.modelFamily, which for planning is the literal string "THERE_IS_NO_MODEL_FAMILY_FOR_PLANNING". As a result, vlMode is always undefined for planning overrides even if MIDSCENE_MODEL_FAMILY is set, so VL-specific preprocessing (padding and bbox filling in plan() at packages/core/src/ai-model/llm-planning.ts:40-59) is skipped for Qwen/UI‑TARS models. This is a regression: planning configs previously inherited the global model family, and without it element localization can degrade for dedicated planning models.
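
The fallback being asked for could look roughly like the sketch below. `resolveModelFamily` and the `Provider` type are hypothetical and not part of parse-model-config.ts; the two string constants are the key names quoted in the comment above, and the family value used in the usage note is a placeholder rather than a real setting.

```typescript
type Provider = Record<string, string | undefined>;

// Sentinel quoted in the review: planning has no family key of its own.
const NO_PLANNING_FAMILY_KEY = 'THERE_IS_NO_MODEL_FAMILY_FOR_PLANNING';
// Global key named in the review.
const GLOBAL_FAMILY_KEY = 'MIDSCENE_MODEL_FAMILY';

// Hypothetical resolver: intent-specific value first, then the legacy-derived
// value, then the global family as a last resort.
function resolveModelFamily(
  provider: Provider,
  familyKey: string,
  legacyModelFamily?: string,
): string | undefined {
  return (
    provider[familyKey] || legacyModelFamily || provider[GLOBAL_FAMILY_KEY]
  );
}
```

For example, with a provider of `{ MIDSCENE_MODEL_FAMILY: 'example-vl' }`, calling `resolveModelFamily(provider, NO_PLANNING_FAMILY_KEY)` falls through to the global value instead of returning undefined. Whether an isolated modelConfig map should also inherit the global family is a design decision this sketch does not settle.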


yuyutaotao added 3 commits November 19, 2025 12:26

refactor(mcp): redefine mcp server for Midscene

b5f01d2

refactor(mcp): redefine mcp server for Midscene

ad485f7

refactor(mcp): redefine mcp server for Midscene

76a0c86

chatgpt-codex-connector bot reviewed Nov 19, 2025


yuyutaotao added 6 commits November 19, 2025 12:47

Merge branch '1.0' into feat/mcp-1.0

4733d7a

refactor(mcp): redefine mcp server for Midscene

2c1714d

refactor(mcp): redefine mcp server for Midscene

99723e9

Merge branch '1.0' into feat/mcp-1.0

08d452d

refactor(core): split planning and locator

2d08b30

fix(core): test cases

d037f83

yuyutaotao marked this pull request as draft November 20, 2025 06:26

yuyutaotao added 4 commits November 20, 2025 15:22

chore(core): add assertion plan into action space

7ab61de

Merge branch '1.0' into feat/mcp-1.0

1d41c2e

fix(core): ci

5c71a61

refactor(core): model config callback

e7c3b73

yuyutaotao and others added 14 commits November 22, 2025 00:15

Merge branch '1.0' into feat/mcp-1.0

52afbd5

chore(core): fix lint

f387fae

chore(core): fix lint

d0d7420

test(shared): update env unit tests (#1494)

e5e0f0a

chore(core): change signature of model config

987047e

chore(core): fix lint

e815a0b

fix(core): lint

3f6d2b2

fix(core): mcp unit test

d1cf073

chore(core): update test cases

6c4b668

fix(core): ci

1824f77

chore(core): update test cases

6d79f32

chore(core): update docs

ce65f0c

fix(core): ci test

a0d9112

chore(core): update docs

94f6e5a

yuyutaotao marked this pull request as ready for review November 23, 2025 16:07

chatgpt-codex-connector bot reviewed Nov 23, 2025


yuyutaotao added 6 commits November 24, 2025 09:45

chore(core): update deps

9ea9485

chore(core): fix lint

dfc98a7

chore(core): fix building error

f48969d

fix(core): building error

5acbef5

chore(core): update prompt

148772f

fix(core): building error

703569e

quanru approved these changes Nov 24, 2025


yuyutaotao merged commit ab07218 into 1.0 Nov 24, 2025
10 checks passed

yuyutaotao deleted the feat/mcp-1.0 branch November 24, 2025 03:21
