Your task is to write a Puppy Scraper. Your scraper will search Craigslist for all pets in the SF area and use regex to return the date, title, and location for any posting that matches the words "pup", "puppy", "puppies", or "dog".
- Use regex to only return results where the title matches "pup", "puppy", "puppies", or "dog".
- In addition to matching the keywords above, only return results where the title DOES NOT include "house", "item", "boots", "walker", or "sitter".
- Bonus: Only return results that have an image.
  - Hint: This will involve scraping an extra page element in `get_page_results` as well as updating the logic in `filter_links`. You will need to visit http://sfbay.craigslist.org/sfc/pet in the browser and use Inspect Element to find the page element containing the image flag ("pic").
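
The keyword and exclusion rules above can be tried out on plain strings before touching the scraper. The following is a minimal sketch, not the repo's solution: the sample titles, the constant names `WANTED` and `UNWANTED`, and the helper `dog_title?` are all made up for illustration.

```ruby
# Hypothetical sample titles; real titles come from the scraped Craigslist page.
TITLES = [
  "Adorable puppies need a home",
  "Dog walker available weekends",
  "Free dog house",
  "Lab pup, 10 weeks old",
  "Kitten looking for family"
].freeze

# Matches any of the target keywords, ignoring case.
# "pup" already matches inside "puppy" and "puppies"; the longer
# words are listed only to mirror the requirements.
WANTED = /pup|puppy|puppies|dog/i

# Matches any word that should disqualify a title.
UNWANTED = /house|item|boots|walker|sitter/i

# Hypothetical helper: true when a title has a wanted keyword
# and none of the unwanted ones.
def dog_title?(title)
  title.match?(WANTED) && !title.match?(UNWANTED)
end

matches = TITLES.select { |title| dog_title?(title) }
puts matches
```

Note that these patterns match substrings, so a title containing "pupil" or "dogma" would also pass. Wrapping the alternatives in word boundaries (`\b`) is one way to tighten this if it matters for your results.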
- Fork this repo, and clone it onto your local machine.
- From the terminal, run `ruby scraper.rb`.
- Your program returns an empty array (`dogs`). Your goal is to edit the `filter_links` method to add results to the `dogs` array based on the regex matching you write.
- To see all the results scraped from Craigslist before filling out `filter_links`, comment out line 39 and uncomment line 42.
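
As a rough illustration of the filtering step, here is one possible shape for `filter_links`. It is a sketch under assumptions, not the code in `scraper.rb`: it assumes each scraped row is a Hash with `:date`, `:title`, `:location`, and `:pic` keys, and the `require_image:` keyword argument is invented here to cover the bonus. The actual data returned by `get_page_results` and the method's real signature may differ.

```ruby
# Assumption: each row is a Hash with :date, :title, :location and :pic keys.
# The real structure produced by get_page_results may differ.
def filter_links(rows, require_image: false)
  dogs = []
  rows.each do |row|
    title = row[:title]
    # Keep only titles containing one of the wanted keywords.
    next unless title.match?(/pup|puppy|puppies|dog/i)
    # Drop titles containing any excluded word.
    next if title.match?(/house|item|boots|walker|sitter/i)
    # Bonus: optionally drop postings without the image flag.
    next if require_image && !row[:pic]

    dogs << { date: row[:date], title: title, location: row[:location] }
  end
  dogs
end

# Hypothetical rows standing in for scraped results.
rows = [
  { date: "Jan 5", title: "Puppy needs a home", location: "mission district", pic: true },
  { date: "Jan 5", title: "Dog sitter wanted", location: "soma", pic: true },
  { date: "Jan 6", title: "Sweet older dog", location: "sunset", pic: false }
]

puts filter_links(rows).inspect
puts filter_links(rows, require_image: true).inspect
```

With the sample rows, the first call keeps the "Puppy needs a home" and "Sweet older dog" postings, and the second keeps only the posting that has the image flag set.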