bookScraper
formulaOneScraper
getImagesFromUrlScraper
googleWebScrapers
medium-automate-proj
multiUrlFunctionalityScraper
news-scraper
serpApi
youtubeTrendingScraper
.gitignore
README.md

web-scraping-with-node

A collection of web scraping projects built with Node.js and related libraries.

Web Scrapers built using:

Node.js
Express
Cheerio
Axios
Dotenv
File System (fs)
Path
PM2 Process Management (daemon process manager)
Node-Fetch
Unirest
Got-Scraping
Node-Cron
Crawlee
Puppeteer
Playwright
EJS
yt-trending-scraper
PDFKit
Json2Csv
CsvToJson
Docker

Formula One Scraper (Node.js, Cheerio, Node-Fetch, PDFKit)

Scrapes the Formula One website for the latest news, results, and standings, converts the scraped data into a PDF file, and saves it to a local folder.

Book Scraper (Node.js, Cheerio, Axios, Json2Csv, CsvToJson)

Scrapes a website for the latest books, converts the scraped data into a CSV file, and saves it to a local folder.
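The CSV step can be sketched with the Node.js standard library alone. This is a minimal illustration, not the project's code: the project lists Json2Csv for this job, and the `title` and `price` fields, the `toCsv` and `saveCsv` helpers, and the output path are all hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Quote a value when it contains a comma, a double quote, or a line break.
function escapeCsv(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text: one header row, then one row per scraped record.
function toCsv(records, fields) {
  const header = fields.map(escapeCsv).join(",");
  const rows = records.map((record) =>
    fields.map((field) => escapeCsv(record[field])).join(",")
  );
  return [header, ...rows].join("\n");
}

// Write the CSV into a local folder, creating the folder if it is missing.
function saveCsv(records, fields, outDir, fileName) {
  fs.mkdirSync(outDir, { recursive: true });
  const filePath = path.join(outDir, fileName);
  fs.writeFileSync(filePath, toCsv(records, fields), "utf8");
  return filePath;
}
```

For example, `saveCsv(books, ["title", "price"], "./output", "books.csv")` would write one row per scraped book, where `books` is whatever array of objects the scraper produced.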

Hacker News Scraper (Node.js, Cheerio, Got-Scraping, Crawlee, Docker)

Version 1: Scrapes Hacker News for the latest news.
Version 2: The CheerioCrawler version using Crawlee is similar. CheerioCrawler fetches pages over plain HTTP and parses them with Cheerio, so no browser window opens and the whole program runs automatically. All Datasets are stored in a storage folder in the root directory, and the project is containerized using Docker.
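A minimal sketch of what a CheerioCrawler run could look like, assuming Crawlee v3 is installed. The `.titleline > a` selector, the `toStory` helper, and the `crawlFrontPage` function are illustrative assumptions, not taken from the project.

```javascript
// Pure helper (hypothetical): build a record with a trimmed title and an absolute URL.
function toStory(title, href, baseUrl) {
  return { title: title.trim(), url: new URL(href, baseUrl).href };
}

async function crawlFrontPage(startUrl) {
  // Loaded lazily so the helper above works even without Crawlee installed.
  const { CheerioCrawler, Dataset } = require("crawlee");

  const crawler = new CheerioCrawler({
    maxRequestsPerCrawl: 1, // only the start page in this sketch
    async requestHandler({ $, request }) {
      const baseUrl = request.loadedUrl ?? request.url;
      // Assumed selector; adjust it to the markup of the page being scraped.
      const stories = $(".titleline > a")
        .map((_, el) => toStory($(el).text(), $(el).attr("href") ?? "", baseUrl))
        .get();
      // By default Crawlee persists pushed items under ./storage/datasets.
      await Dataset.pushData(stories);
    },
  });

  await crawler.run([startUrl]);
}
```

Calling `crawlFrontPage("https://news.ycombinator.com/")` would then leave the scraped records in the storage folder mentioned above.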

Product Scraper (Node.js, Cheerio, Playwright, Crawlee, Docker)

Version 1: Scrapes a website for a specific product & takes a screenshot of the webpage. Code is currently set for mintmobile.com.
Version 2: The PlaywrightCrawler version using Crawlee is similar, but because it drives a real browser to simulate a user's actions and the browser setting defaults to "headless: false", the designated browser opens visibly and the whole program runs automatically. All Datasets are stored in a storage folder in the root directory, and the project is containerized using Docker.

Amazon Scraper (Node.js, Cheerio, Puppeteer, Playwright)

Version 1: Scrapes Amazon for a specific product & takes a screenshot of the webpage.
Version 2: The Playwright version is similar, but because Playwright drives a real browser to simulate a user's actions and the browser setting defaults to "headless: false", the designated browser opens visibly and the whole program runs automatically.

Yelp Scraper (Node.js, Cheerio, Unirest)

Scrapes Yelp for the latest restaurants and their corresponding information, and saves the results to a local folder.

Google Search Scraper (Node.js, Cheerio, Unirest)

Scrapes Google for the latest search results.

Google Jobs Scraper (Node.js, Cheerio, Unirest, PDFKit)

Runs as a background app via PM2 (process manager). The job scraper scrapes Google for the latest jobs in a specific area, converts the scraped data into a PDF file, saves it to a local folder, and sends it as an email via a custom-made Email Sender App.

Google Images Scraper (Node.js, Cheerio, Unirest)

Scrapes Google for the latest images in an area, and downloads them to a local folder.

Website Image Scraper (Node.js, Puppeteer)

Scrapes a website for all of its images, and downloads them to a local folder.
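Once the `src` values of the page's images have been collected (with Puppeteer, for instance via `page.$$eval("img", (imgs) => imgs.map((img) => img.src))`), the download step can be sketched with the standard library. The helper names below are hypothetical, and the sketch assumes Node.js 18 or later for the built-in `fetch`.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");

// Resolve relative src values against the page URL, keep http(s) only, drop duplicates.
function resolveImageUrls(srcs, pageUrl) {
  const seen = new Set();
  for (const src of srcs) {
    if (!src) continue; // skip empty src attributes
    try {
      const url = new URL(src, pageUrl);
      if (url.protocol === "http:" || url.protocol === "https:") seen.add(url.href);
    } catch {
      // Skip values that cannot be parsed as URLs.
    }
  }
  return [...seen];
}

// Derive a local file name from the URL path, with an index-based fallback.
function fileNameFor(url, index) {
  return path.basename(new URL(url).pathname) || `image-${index}`;
}

// Download every image into outDir; failed responses are skipped.
async function downloadImages(urls, outDir) {
  await fs.mkdir(outDir, { recursive: true });
  const saved = [];
  for (const [index, url] of urls.entries()) {
    const response = await fetch(url);
    if (!response.ok) continue;
    const filePath = path.join(outDir, fileNameFor(url, index));
    await fs.writeFile(filePath, Buffer.from(await response.arrayBuffer()));
    saved.push(filePath);
  }
  return saved;
}
```

Note that two images sharing the same base name would overwrite each other in this sketch; prefixing the index to the file name is one way to avoid that.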

YouTube Trending Scraper (Node.js, Express, yt-trending-scraper, EJS)

Scrapes YouTube for the latest trending videos by country & category.

Multiple Website Scraper (Node.js, Puppeteer, Node-Cron)

Scrapes multiple websites for images and text, can perform operations such as button clicking and form submission, and saves the scraped data to a local folder. Can also be automated using Node-Cron.
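Scheduling with Node-Cron can be sketched as follows. The guard that skips a run while the previous one is still in progress is an added design choice for illustration, not something the project is known to do, and `makeGuardedJob`, `startSchedule`, and the hourly expression are hypothetical.

```javascript
// Wrap a scraping task so that a new run is skipped while the previous one is still running.
function makeGuardedJob(task) {
  let running = false;
  const job = async () => {
    if (running) return false; // previous run has not finished yet
    running = true;
    try {
      await task();
      return true;
    } finally {
      running = false;
    }
  };
  job.isRunning = () => running;
  return job;
}

// Schedule the task; "0 * * * *" means at minute 0 of every hour.
function startSchedule(task, expression = "0 * * * *") {
  // Loaded lazily so the guard above can be used without node-cron installed.
  const cron = require("node-cron");
  return cron.schedule(expression, makeGuardedJob(task));
}
```

Calling `startSchedule(scrapeAllSites)` would then run a hypothetical `scrapeAllSites` function once an hour, skipping any tick that arrives while a long scrape is still underway.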

License

MIT

Author

@keithhetrick
