DaMacho/RocketPunch_crawling

RocketPunch_crawling

Please read the comments in each file before using it.

Objective

  • To build a crawler that collects job postings from rocketpunch.com, a startup community.

    • Because I am looking for a job.
  • To practice my skills.

    • Coding often is good practice.
  • Just for fun, while spending the time productively.

Components

  • crawler.py
  • jobsdao.py
  • jobsdao_mongo.py
  • config.py
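
The file names suggest that jobsdao.py and jobsdao_mongo.py are data access objects (DAOs) for MySQL and MongoDB. The repository's actual code is not shown here, so the following is only a minimal sketch of the DAO idea. The class name `JobsDAO`, the table layout, and the methods `save` and `find_by_keyword` are all hypothetical, and `sqlite3` from the standard library stands in for MySQL so the example runs without a database server.

```python
import sqlite3


class JobsDAO:
    """Hypothetical DAO sketch; not the repository's actual jobsdao.py."""

    def __init__(self, conn):
        self.conn = conn
        # Assumed schema: one row per job post, keyed by its URL.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "url TEXT PRIMARY KEY, company TEXT, title TEXT)"
        )

    def save(self, url, company, title):
        # INSERT OR REPLACE keeps a single row per URL, so crawling the
        # same post twice updates it instead of creating a duplicate.
        self.conn.execute(
            "INSERT OR REPLACE INTO jobs (url, company, title) VALUES (?, ?, ?)",
            (url, company, title),
        )
        self.conn.commit()

    def find_by_keyword(self, keyword):
        # Parameterized LIKE query; returns (url, company, title) tuples.
        cur = self.conn.execute(
            "SELECT url, company, title FROM jobs "
            "WHERE title LIKE ? ORDER BY url",
            (f"%{keyword}%",),
        )
        return cur.fetchall()


# Example usage with made-up data and an in-memory database.
dao = JobsDAO(sqlite3.connect(":memory:"))
dao.save("https://example.com/jobs/1", "Acme", "Backend Engineer")
dao.save("https://example.com/jobs/1", "Acme", "Senior Backend Engineer")
dao.save("https://example.com/jobs/2", "Beta", "Designer")
print(dao.find_by_keyword("Backend"))
```

Keeping storage behind a small class like this would let the crawler stay unaware of which database it writes to, which may be why the repository has one DAO file per backend.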

Efficiency

  • 308 job posts / 25 minutes on AWS t2.micro
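
To put the figure above in more familiar units, the short calculation below converts it to posts per minute and seconds per post. It only restates the measured numbers and assumes nothing else about the crawler.

```python
posts = 308      # job posts crawled (from the measurement above)
minutes = 25     # elapsed time on an AWS t2.micro

posts_per_minute = posts / minutes
seconds_per_post = minutes * 60 / posts

print(f"{posts_per_minute:.2f} posts/minute")   # 12.32 posts/minute
print(f"{seconds_per_post:.2f} seconds/post")   # 4.87 seconds/post
```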

Requirements

  • Python 3.x
  • MySQL
  • MongoDB
  • chromedriver
  • pyvirtualdisplay
  • selenium
  • requests
  • pymysql
  • pymongo
  • json (Python standard library)
  • re (Python standard library)
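
A quick way to check that the third-party Python packages from the list above are installed is sketched below. The helper name `missing_modules` is made up for this example, and the import names are assumed to match the package names listed.

```python
import importlib.util

# Third-party packages from the requirements list (import names assumed).
REQUIRED = ["pyvirtualdisplay", "selenium", "requests", "pymysql", "pymongo"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required Python packages are installed.")
```

MySQL, MongoDB, and chromedriver are separate programs, so this check does not cover them.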

Supplement points

  • Crawling is slower than expected.
  • Since the crawled data stays in the DB, a web app dashboard on top of it would be a useful addition.
  • A search feature in that dashboard would also be helpful.

About

Job post crawler for rocketpunch.com (a Korean startup community)
