forked from binux/pyspider

A Powerful Spider (Web Crawler) System in Python.

mutour/pyspider

pyspider

A powerful spider (web crawler) system in Python.

Demo code: gist:9424801

Installation

Docker

# mysql
docker run -it -d --name mysql dockerfile/mysql
# rabbitmq
docker run -it -d --name rabbitmq dockerfile/rabbitmq

# scheduler
docker run -it -d --name scheduler --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider scheduler
# fetcher, run multiple instances if needed.
docker run -it -d -m 64m --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider fetcher
# processor, run multiple instances if needed.
docker run -it -d -m 128m --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider processor
# webui
docker run -it -d -p 5000:5000 --link mysql:mysql --link rabbitmq:rabbitmq --link scheduler:scheduler binux/pyspider webui
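
The fetcher and processor commands above can be repeated to run several instances. Below is a minimal sketch of a Python helper that reassembles those same `docker run` flags so the repeated command lines can be generated instead of typed by hand. The function names `docker_run_command` and `scaled_commands` are hypothetical and not part of pyspider. The sketch only prints the commands and does not start any containers.

```python
import shlex

# Links shared by every pyspider component in the commands above.
BASE_LINKS = ["mysql:mysql", "rabbitmq:rabbitmq"]


def docker_run_command(component, memory=None, extra_links=(), ports=()):
    """Build the `docker run` argument list for one pyspider component.

    Hypothetical helper: it only reassembles flags shown in this README.
    """
    args = ["docker", "run", "-it", "-d"]
    if memory:
        args += ["-m", memory]          # memory limit, e.g. "64m"
    for port in ports:
        args += ["-p", port]            # published port, e.g. "5000:5000"
    for link in BASE_LINKS + list(extra_links):
        args += ["--link", link]
    args += ["binux/pyspider", component]
    return args


def scaled_commands(component, count, memory=None):
    """Return `count` identical shell command lines for one component."""
    line = " ".join(shlex.quote(a) for a in docker_run_command(component, memory))
    return [line for _ in range(count)]


if __name__ == "__main__":
    # Print two fetcher and two processor commands with the limits used above.
    for line in scaled_commands("fetcher", 2, memory="64m"):
        print(line)
    for line in scaled_commands("processor", 2, memory="128m"):
        print(line)
```

The printed lines can be pasted into a shell or passed to `subprocess.run` one at a time. The instance counts here are arbitrary and should be adjusted to the crawl load.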

Documents

Contribute

License

Licensed under the Apache License, Version 2.0

Languages

  • Python 82.4%
  • JavaScript 8.9%
  • CSS 4.4%
  • HTML 4.3%