|
1 | 1 | # Midscene Python |
2 | 2 |
|
3 | | -Midscene Python 是一个基于 AI 的自动化框架,支持 Web 和 Android 平台的 UI 自动化操作。 |
| 3 | +Midscene Python is an AI-driven framework for UI automation on Web and Android. |
4 | 4 |
|
5 | | -## 概述 |
| 5 | +## Overview |
6 | 6 |
|
7 | | -Midscene Python 提供全面的 UI 自动化能力,具有以下核心特性: |
| 7 | +Midscene Python provides the following core UI automation features: |
8 | 8 |
|
9 | | -- **自然语言驱动**:使用自然语言描述自动化任务 |
10 | | -- **多平台支持**:支持 Web(Selenium/Playwright)和 Android(ADB) |
11 | | -- **AI 模型集成**:支持 GPT-4V、Qwen2.5-VL、Gemini 等多种视觉语言模型 |
12 | | -- **可视化调试**:提供详细的执行报告和调试信息 |
13 | | -- **缓存机制**:智能缓存提升执行效率 |
| 9 | +- **Natural Language Driven**: Describe automation tasks using natural language |
| 10 | +- **Multi-platform Support**: Supports Web (Selenium/Playwright) and Android (ADB) |
| 11 | +- **AI Model Integration**: Supports multiple vision-language models such as GPT-4V, Qwen2.5-VL, and Gemini |
| 12 | +- **Visual Debugging**: Provides detailed execution reports and debugging information |
| 13 | +- **Caching Mechanism**: Intelligent caching to improve execution efficiency |
14 | 14 |
|
15 | | -## 项目架构 |
| 15 | +## Project Architecture |
16 | 16 |
|
17 | 17 | ``` |
18 | 18 | midscene-python/ |
19 | | -├── midscene/ # 核心框架 |
20 | | -│ ├── core/ # 核心框架 |
21 | | -│ │ ├── agent/ # Agent系统 |
22 | | -│ │ ├── insight/ # AI推理引擎 |
23 | | -│ │ ├── ai_model/ # AI模型集成 |
24 | | -│ │ ├── yaml/ # YAML脚本执行器 |
25 | | -│ │ └── types.py # 核心类型定义 |
26 | | -│ ├── web/ # Web集成 |
27 | | -│ │ ├── selenium/ # Selenium集成 |
28 | | -│ │ ├── playwright/ # Playwright集成 |
29 | | -│ │ └── bridge/ # Bridge模式 |
30 | | -│ ├── android/ # Android集成 |
31 | | -│ │ ├── device.py # 设备管理 |
| 19 | +├── midscene/ # Core framework |
| 20 | +│ ├── core/ # Core framework |
| 21 | +│ │ ├── agent/ # Agent system |
| 22 | +│ │ ├── insight/ # AI inference engine |
| 23 | +│ │ ├── ai_model/ # AI model integration |
| 24 | +│ │ ├── yaml/ # YAML script executor |
| 25 | +│ │ └── types.py # Core type definitions |
| 26 | +│ ├── web/ # Web integration |
| 27 | +│ │ ├── selenium/ # Selenium integration |
| 28 | +│ │ ├── playwright/ # Playwright integration |
| 29 | +│ │ └── bridge/ # Bridge mode |
| 30 | +│ ├── android/ # Android integration |
| 31 | +│ │ ├── device.py # Device management |
32 | 32 | │ │ └── agent.py # Android Agent |
33 | | -│ ├── cli/ # 命令行工具 |
34 | | -│ ├── mcp/ # MCP协议支持 |
35 | | -│ ├── shared/ # 共享工具 |
36 | | -│ └── visualizer/ # 可视化报告 |
37 | | -├── examples/ # 示例代码 |
38 | | -├── tests/ # 测试用例 |
39 | | -└── docs/ # 文档 |
| 33 | +│ ├── cli/ # Command line tools |
| 34 | +│ ├── mcp/ # MCP protocol support |
| 35 | +│ ├── shared/ # Shared utilities |
| 36 | +│ └── visualizer/ # Visual reports |
| 37 | +├── examples/ # Example code |
| 38 | +├── tests/ # Test cases |
| 39 | +└── docs/ # Documentation |
40 | 40 | ``` |
41 | 41 |
|
42 | | -## 技术栈 |
| 42 | +## Tech Stack |
43 | 43 |
|
44 | | -- **Python 3.9+**:核心运行环境 |
45 | | -- **Pydantic**:数据验证和序列化 |
46 | | -- **Selenium/Playwright**:Web 自动化 |
47 | | -- **OpenCV/Pillow**:图像处理 |
48 | | -- **HTTPX/AIOHTTP**:HTTP 客户端 |
49 | | -- **Typer**:CLI 框架 |
50 | | -- **Loguru**:日志记录 |
| 44 | +- **Python 3.9+**: Core runtime environment |
| 45 | +- **Pydantic**: Data validation and serialization |
| 46 | +- **Selenium/Playwright**: Web automation |
| 47 | +- **OpenCV/Pillow**: Image processing |
| 48 | +- **HTTPX/AIOHTTP**: HTTP clients |
| 49 | +- **Typer**: CLI framework |
| 50 | +- **Loguru**: Logging |
51 | 51 |
|
52 | | -## 快速开始 |
| 52 | +## Quick Start |
53 | 53 |
|
54 | | -### 安装 |
| 54 | +### Installation |
55 | 55 |
|
56 | 56 | ```bash |
57 | 57 | pip install midscene-python |
58 | 58 | ``` |
59 | 59 |
|
60 | | -### 基础用法 |
| 60 | +### Basic Usage |
61 | 61 |
|
62 | 62 | ```python |
63 | 63 | from midscene import Agent |
64 | 64 | from midscene.web import SeleniumWebPage |
65 | 65 |
|
66 | | -# 创建 Web Agent |
| 66 | +# Create a Web Agent |
67 | 67 | with SeleniumWebPage.create() as page: |
68 | 68 | agent = Agent(page) |
69 | 69 |
|
70 | | - # 使用自然语言进行自动化操作 |
71 | | - await agent.ai_action("点击登录按钮") |
72 | | - await agent.ai_action("输入用户名 'test@example.com'") |
73 | | - await agent.ai_action("输入密码 'password123'") |
74 | | - await agent.ai_action("点击提交按钮") |
| 70 | + # Perform automation operations using natural language |
| 71 | + await agent.ai_action("Click the login button") |
| 72 | + await agent.ai_action("Enter username 'test@example.com'") |
| 73 | + await agent.ai_action("Enter password 'password123'") |
| 74 | + await agent.ai_action("Click the submit button") |
75 | 75 |
|
76 | | - # 数据提取 |
77 | | - user_info = await agent.ai_extract("提取用户个人信息") |
| 76 | + # Data extraction |
| 77 | + user_info = await agent.ai_extract("Extract user personal information") |
78 | 78 |
|
79 | | - # 断言验证 |
80 | | - await agent.ai_assert("页面显示欢迎信息") |
| 79 | + # Assertion verification |
| 80 | + await agent.ai_assert("Page displays welcome message") |
81 | 81 | ``` |
82 | 82 |
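The `await` calls in the Basic Usage snippet are only valid inside a coroutine, so the snippet as shown needs an `async def` wrapper and an event loop to run. The sketch below shows that wrapping with the standard library only. `StubAgent` and `login_flow` are hypothetical stand-ins invented for illustration; the only assumption about the real `Agent` is what the README already shows, namely that `ai_action` and `ai_assert` are awaitable.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for an agent; it only records the prompts it receives."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    async def ai_action(self, prompt: str) -> None:
        await asyncio.sleep(0)  # yield to the event loop, as a real I/O call would
        self.calls.append(("action", prompt))

    async def ai_assert(self, prompt: str) -> None:
        await asyncio.sleep(0)
        self.calls.append(("assert", prompt))


async def login_flow(agent: StubAgent) -> list[tuple[str, str]]:
    # Awaited calls run one after another, in the order written.
    await agent.ai_action("Click the login button")
    await agent.ai_action("Enter username 'test@example.com'")
    await agent.ai_assert("Page displays welcome message")
    return agent.calls


if __name__ == "__main__":
    # asyncio.run creates the event loop and drives the coroutine to completion.
    recorded = asyncio.run(login_flow(StubAgent()))
    print(len(recorded))
```

With the real package, the body of `login_flow` would presumably sit inside the `with SeleniumWebPage.create() as page:` block from the README; that placement is an assumption and is not verified here.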
|
83 | | -## 主要特性 |
| 83 | +## Key Features |
84 | 84 |
|
85 | | -### 🤖 AI 驱动的自动化 |
| 85 | +### 🤖 AI-Driven Automation |
86 | 86 |
|
87 | | -使用自然语言描述操作,AI 自动理解并执行: |
| 87 | +Describe an operation in natural language and the AI interprets and executes it: |
88 | 88 |
|
89 | 89 | ```python |
90 | | -await agent.ai_action("在搜索框中输入'Python教程'并搜索") |
| 90 | +await agent.ai_action("Enter 'Python tutorial' in the search box and search") |
91 | 91 | ``` |
92 | 92 |
|
93 | | -### 🔍 智能元素定位 |
| 93 | +### 🔍 Intelligent Element Location |
94 | 94 |
|
95 | | -支持多种定位策略,自动选择最优方案: |
| 95 | +Supports multiple locator strategies and automatically selects the most suitable one: |
96 | 96 |
|
97 | 97 | ```python |
98 | | -element = await agent.ai_locate("登录按钮") |
| 98 | +element = await agent.ai_locate("Login button") |
99 | 99 | ``` |
100 | 100 |
|
101 | | -### 📊 数据提取 |
| 101 | +### 📊 Data Extraction |
102 | 102 |
|
103 | | -从页面提取结构化数据: |
| 103 | +Extract structured data from the page: |
104 | 104 |
|
105 | 105 | ```python |
106 | 106 | products = await agent.ai_extract({ |
107 | 107 | "products": [ |
108 | | - {"name": "产品名称", "price": "价格", "rating": "评分"} |
| 108 | + {"name": "Product Name", "price": "Price", "rating": "Rating"} |
109 | 109 | ] |
110 | 110 | }) |
111 | 111 | ``` |
112 | 112 |
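The schema passed to `ai_extract` above is a nested dict whose leaves are text descriptions of the fields to extract. The README does not document the shape of the return value. Assuming it mirrors the schema, a small stdlib check like the one below could guard downstream code. `matches_schema` and the sample data are hypothetical and not part of the library.

```python
from typing import Any


def matches_schema(schema: Any, value: Any) -> bool:
    """Return True if `value` has the same nested dict/list shape as `schema`."""
    if isinstance(schema, dict):
        # Every schema key must be present, and its value must match recursively.
        return isinstance(value, dict) and all(
            key in value and matches_schema(sub, value[key])
            for key, sub in schema.items()
        )
    if isinstance(schema, list):
        if not isinstance(value, list):
            return False
        # The first schema element describes every item; an empty schema list
        # places no constraint on the items.
        return not schema or all(matches_schema(schema[0], item) for item in value)
    return True  # leaf: the schema holds a description, so any value is accepted


SCHEMA = {"products": [{"name": "Product Name", "price": "Price", "rating": "Rating"}]}

# Hypothetical extraction result, made up for illustration.
sample = {"products": [{"name": "Kettle", "price": "19.99", "rating": "4.5"}]}

if __name__ == "__main__":
    print(matches_schema(SCHEMA, sample))
```

The check looks only at structure. It does not validate types or values at the leaves, which a Pydantic model would be better suited for.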
|
113 | | -### ✅ 智能断言 |
| 113 | +### ✅ Intelligent Assertions |
114 | 114 |
|
115 | | -AI 理解页面状态,进行智能断言: |
| 115 | +The AI interprets the page state and checks natural-language assertions against it: |
116 | 116 |
|
117 | 117 | ```python |
118 | | -await agent.ai_assert("用户已成功登录") |
| 118 | +await agent.ai_assert("User has successfully logged in") |
119 | 119 | ``` |
120 | 120 |
|
121 | | -## 许可证 |
| 121 | +## 📝 Credits |
| 122 | + |
| 123 | +Thanks to the Midscene project (https://github.com/web-infra-dev/midscene) for inspiration and technical reference. |
| 124 | + |
| 125 | +## License |
122 | 126 |
|
123 | 127 | MIT License |