Commit 207d34a

Python51888 authored and gitee-org committed
!4 Added Chinese and English READMEs
Merge pull request !4 from Python51888/mdpy
2 parents 307cbf5 + 0592a51 commit 207d34a

3 files changed: 198 additions and 64 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -157,3 +157,6 @@ bin-release/
157157
# should NOT be excluded as they contain compiler settings and other important
158158
# information for Eclipse / Flash Builder.
159159
>>>>>>> 2a066347ae84a69f9986cffe451aeae1a5364b10
160+
161+
# YoYo AI version control directory
162+
.yoyo/
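
The context lines of this hunk include `>>>>>>> 2a066347ae84a69f9986cffe451aeae1a5364b10`, which looks like a leftover merge-conflict marker inside `.gitignore`. A minimal sketch of a check that could flag such lines before merging is shown here. The function name `find_conflict_markers` is hypothetical and not part of this repository, and the pattern only covers the three standard markers Git writes by default.

```python
import re

# Git writes conflict markers as seven identical characters at line start:
# "<<<<<<< <ref>", "=======" and ">>>>>>> <ref>".
_MARKER = re.compile(r"^(<{7} |={7}$|>{7} )")


def find_conflict_markers(text: str) -> list[int]:
    """Return the 1-based numbers of lines that start with a conflict marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _MARKER.match(line)
    ]
```

Running it over the text of each changed file, for example in a pre-commit hook, would report the line to clean up.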

README.md

Lines changed: 68 additions & 64 deletions
@@ -1,123 +1,127 @@
11
# Midscene Python
22

3-
Midscene Python 是一个基于 AI 的自动化框架,支持 Web 和 Android 平台的 UI 自动化操作。
3+
Midscene Python is an AI-based automation framework that supports UI automation operations on Web and Android platforms.
44

5-
## 概述
5+
## Overview
66

7-
Midscene Python 提供全面的 UI 自动化能力,具有以下核心特性:
7+
Midscene Python provides comprehensive UI automation capabilities with the following core features:
88

9-
- **自然语言驱动**:使用自然语言描述自动化任务
10-
- **多平台支持**:支持 WebSelenium/Playwright)和 AndroidADB
11-
- **AI 模型集成**:支持 GPT-4VQwen2.5-VL、Gemini 等多种视觉语言模型
12-
- **可视化调试**:提供详细的执行报告和调试信息
13-
- **缓存机制**:智能缓存提升执行效率
9+
- **Natural Language Driven**: Describe automation tasks using natural language
10+
- **Multi-platform Support**: Supports Web (Selenium/Playwright) and Android (ADB)
11+
- **AI Model Integration**: Supports multiple vision-language models such as GPT-4V, Qwen2.5-VL, and Gemini
12+
- **Visual Debugging**: Provides detailed execution reports and debugging information
13+
- **Caching Mechanism**: Intelligent caching to improve execution efficiency
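
The README does not say how the caching mechanism works. Purely as an illustration of the idea, the sketch below reuses a previously computed result when the same instruction is issued against the same screenshot. The class `PlanCache`, its method names, and the choice of cache key are all assumptions made for this example, not midscene-python's actual implementation.

```python
import hashlib
from typing import Callable, Dict


class PlanCache:
    """Hypothetical in-memory cache; not midscene-python's real mechanism."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str, screenshot: bytes) -> str:
        # Hash the instruction together with the screenshot bytes so that a
        # changed page produces a different key.
        digest = hashlib.sha256()
        digest.update(prompt.encode("utf-8"))
        digest.update(b"\x00")  # separator between prompt and image bytes
        digest.update(screenshot)
        return digest.hexdigest()

    def get_or_compute(
        self, prompt: str, screenshot: bytes, compute: Callable[[], str]
    ) -> str:
        """Return the cached result, calling `compute` only on a miss."""
        key = self._key(prompt, screenshot)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute()
        self._store[key] = result
        return result
```

In this sketch `compute` stands for the expensive model call; repeating an instruction on an unchanged page skips it.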
1414

15-
## 项目架构
15+
## Project Architecture
1616

1717
```
1818
midscene-python/
19-
├── midscene/ # 核心框架
20-
│ ├── core/ # 核心框架
21-
│ │ ├── agent/ # Agent系统
22-
│ │ ├── insight/ # AI推理引擎
23-
│ │ ├── ai_model/ # AI模型集成
24-
│ │ ├── yaml/ # YAML脚本执行器
25-
│ │ └── types.py # 核心类型定义
26-
│ ├── web/ # Web集成
27-
│ │ ├── selenium/ # Selenium集成
28-
│ │ ├── playwright/ # Playwright集成
29-
│ │ └── bridge/ # Bridge模式
30-
│ ├── android/ # Android集成
31-
│ │ ├── device.py # 设备管理
19+
├── midscene/ # Core framework
20+
│ ├── core/ # Core framework
21+
│ │ ├── agent/ # Agent system
22+
│ │ ├── insight/ # AI inference engine
23+
│ │ ├── ai_model/ # AI model integration
24+
│ │ ├── yaml/ # YAML script executor
25+
│ │ └── types.py # Core type definitions
26+
│ ├── web/ # Web integration
27+
│ │ ├── selenium/ # Selenium integration
28+
│ │ ├── playwright/ # Playwright integration
29+
│ │ └── bridge/ # Bridge mode
30+
│ ├── android/ # Android integration
31+
│ │ ├── device.py # Device management
3232
│ │ └── agent.py # Android Agent
33-
│ ├── cli/ # 命令行工具
34-
│ ├── mcp/ # MCP协议支持
35-
│ ├── shared/ # 共享工具
36-
│ └── visualizer/ # 可视化报告
37-
├── examples/ # 示例代码
38-
├── tests/ # 测试用例
39-
└── docs/ # 文档
33+
│ ├── cli/ # Command line tools
34+
│ ├── mcp/ # MCP protocol support
35+
│ ├── shared/ # Shared utilities
36+
│ └── visualizer/ # Visual reports
37+
├── examples/ # Example code
38+
├── tests/ # Test cases
39+
└── docs/ # Documentation
4040
```
4141

42-
## 技术栈
42+
## Tech Stack
4343

44-
- **Python 3.9+**:核心运行环境
45-
- **Pydantic**:数据验证和序列化
46-
- **Selenium/Playwright**Web 自动化
47-
- **OpenCV/Pillow**:图像处理
48-
- **HTTPX/AIOHTTP**HTTP 客户端
49-
- **Typer**CLI 框架
50-
- **Loguru**:日志记录
44+
- **Python 3.9+**: Core runtime environment
45+
- **Pydantic**: Data validation and serialization
46+
- **Selenium/Playwright**: Web automation
47+
- **OpenCV/Pillow**: Image processing
48+
- **HTTPX/AIOHTTP**: HTTP client
49+
- **Typer**: CLI framework
50+
- **Loguru**: Logging
5151

52-
## 快速开始
52+
## Quick Start
5353

54-
### 安装
54+
### Installation
5555

5656
```bash
5757
pip install midscene-python
5858
```
5959

60-
### 基础用法
60+
### Basic Usage
6161

6262
```python
6363
from midscene import Agent
6464
from midscene.web import SeleniumWebPage
6565

66-
# 创建 Web Agent
66+
# Create a Web Agent
6767
with SeleniumWebPage.create() as page:
6868
    agent = Agent(page)
6969

70-
    # 使用自然语言进行自动化操作
71-
    await agent.ai_action("点击登录按钮")
72-
    await agent.ai_action("输入用户名 'test@example.com'")
73-
    await agent.ai_action("输入密码 'password123'")
74-
    await agent.ai_action("点击提交按钮")
70+
    # Perform automation operations using natural language
71+
    await agent.ai_action("Click the login button")
72+
    await agent.ai_action("Enter username 'test@example.com'")
73+
    await agent.ai_action("Enter password 'password123'")
74+
    await agent.ai_action("Click the submit button")
7575

76-
    # 数据提取
77-
    user_info = await agent.ai_extract("提取用户个人信息")
76+
    # Data extraction
77+
    user_info = await agent.ai_extract("Extract user personal information")
7878

79-
    # 断言验证
80-
    await agent.ai_assert("页面显示欢迎信息")
79+
    # Assertion verification
80+
    await agent.ai_assert("Page displays welcome message")
8181
```
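
The Quick Start snippet uses `await` at the top level, which Python only accepts inside a coroutine or an async-aware REPL. A minimal sketch of one way to drive the same sequence from a plain script is given here. `StubAgent` is a hypothetical stand-in that only records calls, so the example runs without midscene-python, a browser, or a model; the method names mirror those in the README, but their real signatures and return types are assumptions.

```python
import asyncio
from typing import Dict, List, Tuple


class StubAgent:
    """Hypothetical stand-in for midscene's Agent; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    async def ai_action(self, prompt: str) -> None:
        self.calls.append(("ai_action", prompt))

    async def ai_extract(self, prompt: str) -> Dict[str, str]:
        self.calls.append(("ai_extract", prompt))
        return {"prompt": prompt}  # placeholder, not real model output

    async def ai_assert(self, prompt: str) -> None:
        self.calls.append(("ai_assert", prompt))


async def login_flow(agent: StubAgent) -> Dict[str, str]:
    """Same steps as the README snippet, wrapped in a coroutine."""
    await agent.ai_action("Click the login button")
    await agent.ai_action("Enter username 'test@example.com'")
    await agent.ai_action("Enter password 'password123'")
    await agent.ai_action("Click the submit button")
    user_info = await agent.ai_extract("Extract user personal information")
    await agent.ai_assert("Page displays welcome message")
    return user_info


def run_login_flow() -> Tuple[StubAgent, Dict[str, str]]:
    """Synchronous entry point: asyncio.run drives the coroutine to completion."""
    agent = StubAgent()
    user_info = asyncio.run(login_flow(agent))
    return agent, user_info
```

With the real library, the body of `login_flow` would presumably sit inside the `with SeleniumWebPage.create() as page:` block from the README.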
8282

83-
## 主要特性
83+
## Key Features
8484

85-
### 🤖 AI 驱动的自动化
85+
### 🤖 AI-Driven Automation
8686

87-
使用自然语言描述操作,AI 自动理解并执行:
87+
Describe operations in natural language; the AI interprets and executes them:
8888

8989
```python
90-
await agent.ai_action("在搜索框中输入'Python教程'并搜索")
90+
await agent.ai_action("Enter 'Python tutorial' in the search box and search")
9191
```
9292

93-
### 🔍 智能元素定位
93+
### 🔍 Intelligent Element Location
9494

95-
支持多种定位策略,自动选择最优方案:
95+
Supports multiple location strategies and automatically selects the optimal solution:
9696

9797
```python
98-
element = await agent.ai_locate("登录按钮")
98+
element = await agent.ai_locate("Login button")
9999
```
100100

101-
### 📊 数据提取
101+
### 📊 Data Extraction
102102

103-
从页面提取结构化数据:
103+
Extract structured data from the page:
104104

105105
```python
106106
products = await agent.ai_extract({
107107
    "products": [
108-
        {"name": "产品名称", "price": "价格", "rating": "评分"}
108+
        {"name": "Product Name", "price": "Price", "rating": "Rating"}
109109
    ]
110110
})
111111
```
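
The query above describes the desired result as a nested template. The README does not state what `ai_extract` returns, so the sketch below assumes, hypothetically, that the result mirrors the template's structure, and shows one way to check that before using the data. `matches_shape` and the sample values are illustrative only.

```python
from typing import Any


def matches_shape(template: Any, data: Any) -> bool:
    """Return True if `data` has the nesting and keys that `template` asks for."""
    if isinstance(template, dict):
        if not isinstance(data, dict):
            return False
        # Every key requested in the template must be present in the data.
        return all(
            key in data and matches_shape(value, data[key])
            for key, value in template.items()
        )
    if isinstance(template, list):
        if not isinstance(data, list):
            return False
        if not template:
            return True
        # A one-element list template describes every item of the result.
        return all(matches_shape(template[0], item) for item in data)
    # Leaf values in the template are descriptions; any value is accepted.
    return True


query = {
    "products": [
        {"name": "Product Name", "price": "Price", "rating": "Rating"}
    ]
}
```

A check like `matches_shape(query, products)` after the call would catch a response that is missing a field before downstream code indexes into it.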
112112

113-
### 智能断言
113+
### Intelligent Assertions
114114

115-
AI 理解页面状态,进行智能断言:
115+
AI understands page state and performs intelligent assertions:
116116

117117
```python
118-
await agent.ai_assert("用户已成功登录")
118+
await agent.ai_assert("User has successfully logged in")
119119
```
120120

121-
## 许可证
121+
### 📝 Credits
122+
123+
Thanks to the Midscene project (https://github.com/web-infra-dev/midscene) for inspiration and technical references.
124+
125+
## License
122126

123127
MIT License

README.zh.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
1+
# Midscene Python
2+
3+
Midscene Python 是一个基于 AI 的自动化框架,支持 Web 和 Android 平台的 UI 自动化操作。
4+
5+
## 概述
6+
7+
Midscene Python 提供全面的 UI 自动化能力,具有以下核心特性:
8+
9+
- **自然语言驱动**:使用自然语言描述自动化任务
10+
- **多平台支持**:支持 Web(Selenium/Playwright)和 Android(ADB)
11+
- **AI 模型集成**:支持 GPT-4V、Qwen2.5-VL、Gemini 等多种视觉语言模型
12+
- **可视化调试**:提供详细的执行报告和调试信息
13+
- **缓存机制**:智能缓存提升执行效率
14+
15+
## 项目架构
16+
17+
```
18+
midscene-python/
19+
├── midscene/ # 核心框架
20+
│ ├── core/ # 核心框架
21+
│ │ ├── agent/ # Agent系统
22+
│ │ ├── insight/ # AI推理引擎
23+
│ │ ├── ai_model/ # AI模型集成
24+
│ │ ├── yaml/ # YAML脚本执行器
25+
│ │ └── types.py # 核心类型定义
26+
│ ├── web/ # Web集成
27+
│ │ ├── selenium/ # Selenium集成
28+
│ │ ├── playwright/ # Playwright集成
29+
│ │ └── bridge/ # Bridge模式
30+
│ ├── android/ # Android集成
31+
│ │ ├── device.py # 设备管理
32+
│ │ └── agent.py # Android Agent
33+
│ ├── cli/ # 命令行工具
34+
│ ├── mcp/ # MCP协议支持
35+
│ ├── shared/ # 共享工具
36+
│ └── visualizer/ # 可视化报告
37+
├── examples/ # 示例代码
38+
├── tests/ # 测试用例
39+
└── docs/ # 文档
40+
```
41+
42+
## 技术栈
43+
44+
- **Python 3.9+**:核心运行环境
45+
- **Pydantic**:数据验证和序列化
46+
- **Selenium/Playwright**:Web 自动化
47+
- **OpenCV/Pillow**:图像处理
48+
- **HTTPX/AIOHTTP**:HTTP 客户端
49+
- **Typer**:CLI 框架
50+
- **Loguru**:日志记录
51+
52+
## 快速开始
53+
54+
### 安装
55+
56+
```bash
57+
pip install midscene-python
58+
```
59+
60+
### 基础用法
61+
62+
```python
63+
from midscene import Agent
64+
from midscene.web import SeleniumWebPage
65+
66+
# 创建 Web Agent
67+
with SeleniumWebPage.create() as page:
68+
    agent = Agent(page)
69+
70+
    # 使用自然语言进行自动化操作
71+
    await agent.ai_action("点击登录按钮")
72+
    await agent.ai_action("输入用户名 'test@example.com'")
73+
    await agent.ai_action("输入密码 'password123'")
74+
    await agent.ai_action("点击提交按钮")
75+
76+
    # 数据提取
77+
    user_info = await agent.ai_extract("提取用户个人信息")
78+
79+
    # 断言验证
80+
    await agent.ai_assert("页面显示欢迎信息")
81+
```
82+
83+
## 主要特性
84+
85+
### 🤖 AI 驱动的自动化
86+
87+
使用自然语言描述操作,AI 自动理解并执行:
88+
89+
```python
90+
await agent.ai_action("在搜索框中输入'Python教程'并搜索")
91+
```
92+
93+
### 🔍 智能元素定位
94+
95+
支持多种定位策略,自动选择最优方案:
96+
97+
```python
98+
element = await agent.ai_locate("登录按钮")
99+
```
100+
101+
### 📊 数据提取
102+
103+
从页面提取结构化数据:
104+
105+
```python
106+
products = await agent.ai_extract({
107+
    "products": [
108+
        {"name": "产品名称", "price": "价格", "rating": "评分"}
109+
    ]
110+
})
111+
```
112+
113+
### ✅ 智能断言
114+
115+
AI 理解页面状态,进行智能断言:
116+
117+
```python
118+
await agent.ai_assert("用户已成功登录")
119+
```
120+
121+
### 📝 致谢
122+
123+
感谢Midscene项目:https://github.com/web-infra-dev/midscene 提供的灵感和技术参考
124+
125+
## 许可证
126+
127+
MIT License
