
Commit 975233b

cron, email sender, pm2 daemon manager added
1 parent ee91031 commit 975233b

5 files changed: +74 −1051 lines

README.md

Lines changed: 5 additions & 1 deletion
@@ -8,6 +8,10 @@ Web Scrapers built using:
  - [Express](https://expressjs.com/)
  - [Cheerio](https://cheerio.js.org/)
  - [Axios](https://www.npmjs.com/package/axios)
+ - [Dotenv](https://www.npmjs.com/package/dotenv)
+ - [File System (fs)](https://nodejs.org/api/fs.html)
+ - [Path](https://nodejs.org/api/path.html)
+ - [PM2 Process Management (daemon process manager)](https://pm2.keymetrics.io/)
  - [Node-Fetch](https://www.npmjs.com/package/node-fetch)
  - [Unirest](https://www.npmjs.com/package/unirest)
  - [Got-Scraping](https://www.npmjs.com/package/got-scraping)
@@ -57,7 +61,7 @@ Web Scrapers built using:

  ### **Google Jobs Scraper _(Node.js, Cheerio, Unirest, PDFKit)_**

-  - _Scrapes Google for the latest jobs in an area, and converts the scraped data into a PDF file in a local folder._
+  - _Running as a background app via [PM2 (Process Management)](https://pm2.keymetrics.io/), the job scraper scrapes Google for the latest jobs in a specific area, converts the scraped data into a PDF file, saves it to a local folder, and sends it as an email via the custom-made [Email Sender App](https://github.com/keithhetrick/nodemailer-project)._

  ### **Google Images Scraper _(Node.js, Cheerio, Unirest)_**
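The updated description says the scraper now runs as a background app under PM2, with the finished PDF handed off to the separate Email Sender App. As a minimal sketch only (not part of this commit), a PM2 ecosystem file could register the scraper as a daemon; the file name `ecosystem.config.js`, the process name, and the option values below are assumptions, while the 8 AM schedule itself is handled by node-cron inside googleJobScraper.js (see the diff further down).

```js
// ecosystem.config.js -- hypothetical PM2 config; NOT part of this commit.
// PM2 keeps the scraper process alive as a daemon, while the daily 8 AM run
// is triggered by node-cron inside the script itself.
module.exports = {
  apps: [
    {
      name: "google-job-scraper", // assumed process name, shows up in `pm2 ls`
      script: "./googleWebScrapers/googleJobScraper/googleJobScraper.js",
      watch: false,      // no restarts on file changes for a scheduled job
      autorestart: true, // bring the daemon back up if it crashes
    },
  ],
};
```

A config like this would typically be launched with `pm2 start ecosystem.config.js` and persisted across reboots with `pm2 save` followed by `pm2 startup`.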

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+ LOCAL_PATH="/Users/keithhetrick/Desktop/VS-Code-Projects/web-scraper-test-2"
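The scraper below reads this path through dotenv. As a hedged illustration only, the following sketch shows how the fs and path modules (both newly listed in the README) could validate LOCAL_PATH before any PDF is written; the helper file name and the error handling are hypothetical and not part of this commit.

```js
// checkLocalPath.js -- hypothetical helper; NOT part of this commit.
require("dotenv").config();
const fs = require("fs");
const path = require("path");

const LOCAL_PATH = process.env.LOCAL_PATH;

// Fail fast if the .env entry is missing or the folder does not exist,
// so the scraper never writes its PDF to an unexpected location.
if (!LOCAL_PATH || !fs.existsSync(path.resolve(LOCAL_PATH))) {
  throw new Error(`LOCAL_PATH is not set or does not exist: ${LOCAL_PATH}`);
}

module.exports = LOCAL_PATH;
```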

googleWebScrapers/googleJobScraper/googleJobScraper.js

Lines changed: 20 additions & 5 deletions
@@ -1,8 +1,13 @@
+ require("dotenv").config();
+
  const cheerio = require("cheerio");
  const unirest = require("unirest");
+ const cron = require("node-cron");
  const fs = require("fs");
  const PDFDocument = require("pdfkit");

+ const LOCAL_PATH = process.env.LOCAL_PATH;
+
  const getJobsData = async () => {
    try {
      const url =

@@ -53,12 +58,10 @@ const getJobsData = async () => {
      // create PDF file
      const doc = new PDFDocument();

-     const file_Name = "dailyJobScraper.pdf";
-     const file_Path = "../../../web-scraper-test-2/";
-
-     // path.resolve('joe.txt'); // '/Users/joe/joe.txt' if run from my home folder
+     const file_Name = `dailyJobScraper-${new Date().getTime()}.pdf`;
+     const file_Path = LOCAL_PATH;

-     // clean file name to elimiate colons, spaces & commas in the file name
+     // sanitize file name
      const full_FileName = (file_Path + file_Name)
        .replace(/:/g, "-")
        .replace(/,/g, "-")

@@ -102,8 +105,20 @@ const getJobsData = async () => {
      });

      doc.end();
+     console.log(
+       "\nPDF file created successfully! Saved to designated path for emailer app.\n"
+     );
    } catch (e) {
      console.log("ERROR:", e);
    }
  };
  getJobsData();
+
+ // ======================================================== \\
+ // ================== CRON SCHEDULER ====================== ||
+ // ======================================================== //
+
+ cron.schedule("0 8 * * *", getJobsData, {
+   scheduled: true,
+   timezone: "America/Chicago",
+ });
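The node-cron block added at the end of the file schedules getJobsData daily at 08:00 America/Chicago. As an assumed usage sketch (not part of this commit), the snippet below spells out what the expression means and shows the start/stop handle that node-cron returns, which can be handy when the process is kept alive by PM2; the file name and log messages are hypothetical.

```js
// cronExample.js -- hypothetical sketch; NOT part of this commit.
const cron = require("node-cron");

// "0 8 * * *" = minute 0, hour 8, every day of month, every month, every day of week,
// i.e. once per day at 08:00 in the configured timezone.
console.log(cron.validate("0 8 * * *")); // true

const task = cron.schedule(
  "0 8 * * *",
  () => console.log("Would call getJobsData() here"),
  { scheduled: true, timezone: "America/Chicago" }
);

// The returned task can be paused and resumed without restarting the daemon.
task.stop();
task.start();
```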
