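- Pager: Fetches the responses for every page in a given range, based on the paginated URL request format.
- Ruler: Specifies the rules for parsing a page response.
- URL Collector: Relies on Pager and Ruler to collect the set of URLs of all pages whose data ultimately needs to be crawled.
- Data Collector: Reads URLs from URL Collector and, given a set of Rulers, crawls the relevant data.
- Data Storage: Reads data from Data Collector and stores it at a specified location. Only CSV is supported for now.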
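
The sketch below illustrates how the five components listed above might fit together. It is not this repository's actual code: every class name, function name, and signature here (`Pager`, `Ruler`, `collect_urls`, `collect_data`, `store_csv`, the injected `fetch` callable, and the in-memory `FAKE_SITE`) is an assumption made for illustration, and a regex stands in for whatever parsing rules the real Ruler uses.

```python
import csv
import io
import re
from typing import Callable, Dict, Iterator, List, TextIO

# Hypothetical: any callable that maps a URL to its response text.
Fetch = Callable[[str], str]


class Pager:
    """Yields the response of every page number in [start, end]."""

    def __init__(self, url_format: str, start: int, end: int, fetch: Fetch):
        self.url_format = url_format
        self.start = start
        self.end = end
        self.fetch = fetch

    def responses(self) -> Iterator[str]:
        for page in range(self.start, self.end + 1):
            yield self.fetch(self.url_format.format(page=page))


class Ruler:
    """A named parsing rule; assumed here to be a regex with one capture group."""

    def __init__(self, name: str, pattern: str):
        self.name = name
        self.pattern = re.compile(pattern)

    def apply(self, response: str) -> List[str]:
        return self.pattern.findall(response)


def collect_urls(pager: Pager, ruler: Ruler) -> List[str]:
    """URL Collector: apply one Ruler to every paged response."""
    urls: List[str] = []
    for response in pager.responses():
        urls.extend(ruler.apply(response))
    return urls


def collect_data(urls: List[str], rulers: List[Ruler], fetch: Fetch) -> List[Dict[str, str]]:
    """Data Collector: fetch each URL and keep the first match of each Ruler."""
    rows: List[Dict[str, str]] = []
    for url in urls:
        response = fetch(url)
        row = {"url": url}
        for ruler in rulers:
            matches = ruler.apply(response)
            row[ruler.name] = matches[0] if matches else ""
        rows.append(row)
    return rows


def store_csv(rows: List[Dict[str, str]], out: TextIO) -> None:
    """Data Storage: write the collected rows as CSV."""
    if not rows:
        return
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# An in-memory stand-in for a real site, so the sketch runs offline.
FAKE_SITE = {
    "list?page=1": '<a href="item/1">a</a><a href="item/2">b</a>',
    "list?page=2": '<a href="item/3">c</a>',
    "item/1": "<h1>Alpha</h1>",
    "item/2": "<h1>Beta</h1>",
    "item/3": "<h1>Gamma</h1>",
}


def fake_fetch(url: str) -> str:
    return FAKE_SITE[url]


def run_demo() -> str:
    pager = Pager("list?page={page}", 1, 2, fake_fetch)
    urls = collect_urls(pager, Ruler("link", r'href="([^"]+)"'))
    rows = collect_data(urls, [Ruler("title", r"<h1>(.*?)</h1>")], fake_fetch)
    buffer = io.StringIO()
    store_csv(rows, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(run_demo())
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a real crawler would substitute a function that performs the network request.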
songshine/crawler