A simple web automation script based on Capybara-Selenium/Nokogiri, telegram-bot-ruby, Thor, and google_url_shortener.
SQLite database containing ~100K listings: https://github.com/jahan-paisley/rahnama_dot_com_scraper/blob/master/data/people_ads.db
bundle install
thor list
thor rahnama:generate_dic # Generate Dictionary based on Ads words
thor rahnama:help [COMMAND] # Describe available commands or one specific command
thor rahnama:scrap_ads # Scrape the Rahnama.com Real Estate Ads based on the provided links.txt
thor rahnama:send # Send ads to Telegram Channel
thor rahnama:send_daily_digest # Send a daily digest of ads to Telegram Channel
thor rahnama:update_elasticsearch # Update Elasticsearch data
Output is saved in a SQLite database and sent to this Telegram channel: https://telegram.me/hamshahri_ads
I've set up Elasticsearch and Kibana to make it easier to search and visualize the data as it grows.
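For readers who want to see what posting an ad to the channel involves, here is a minimal sketch that talks to the Telegram Bot API `sendMessage` method directly through Ruby's standard `net/http`. This is not the project's code: the project uses the telegram-bot-ruby gem, and the helper names `build_send_message_request` and `send_ad`, the token value, and the `@hamshahri_ads` chat id (derived from the channel URL) are assumptions for illustration only.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build (but do not send) a Bot API sendMessage request.
# The token is a placeholder; a real one comes from BotFather.
def build_send_message_request(token, chat_id, text)
  uri = URI("https://api.telegram.org/bot#{token}/sendMessage")
  request = Net::HTTP::Post.new(uri)
  # Form-encode the two required sendMessage parameters.
  request.set_form_data("chat_id" => chat_id, "text" => text)
  request
end

# Hypothetical helper: send the request over HTTPS and return the response.
def send_ad(token, chat_id, text)
  request = build_send_message_request(token, chat_id, text)
  Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
    http.request(request)
  end
end
```

Calling `send_ad("YOUR_BOT_TOKEN", "@hamshahri_ads", "New listing ...")` would post one message, assuming the bot is an administrator of the channel. Building the request separately from sending it keeps the first helper testable without network access.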
- Save the logo and the filename of bmp file in ads and associate them with advertisers
- Tokenize ads to extract important and most used keywords
- Make the process scheduled and automatic using whenever
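The keyword-extraction item above could start from something like the following sketch. It is an assumption-laden illustration in plain Ruby, not the logic behind `rahnama:generate_dic`: the method name `top_keywords`, the tiny English `STOP_WORDS` list, and the sample ads are all hypothetical. The `[[:alpha:]]` character class matches Unicode letters, so the same approach should apply to Persian ad text with a suitable stop-word list.

```ruby
# Hypothetical stop-word list; a real one would be tuned to the ad language.
STOP_WORDS = %w[the a an and or of in for with to].freeze

# Count word frequencies across all ads and return the most frequent ones
# as [word, count] pairs, ties broken alphabetically.
def top_keywords(ads, limit: 5)
  counts = Hash.new(0)
  ads.each do |ad|
    # Split each ad into runs of letters, ignoring digits and punctuation.
    ad.downcase.scan(/[[:alpha:]]+/).each do |word|
      next if word.length < 2 || STOP_WORDS.include?(word)
      counts[word] += 1
    end
  end
  counts.sort_by { |word, count| [-count, word] }.first(limit)
end

# Made-up sample ads, for illustration only.
sample_ads = [
  "Apartment for rent in Tehran",
  "Apartment for sale",
  "Office for rent"
]
top_keywords(sample_ads, limit: 2).each { |word, count| puts "#{word}: #{count}" }
```

Sorting by negative count and then by word keeps the result deterministic when several words occur equally often.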