Get automatic email alerts for all new job postings from 70+ job boards and career websites, matched to your keywords and locations.
Job Scooper currently supports over 70 different sites, such as Monster, Indeed, Facebook, and ZipRecruiter. View all supported sites to see the full list.
Make a copy of example_config.ini and edit its settings to match the search keywords and locations that you want:
```ini
[global_search_options]
search_location[] = "Seattle, WA"

[search_keyword_set]
[search_keyword_set.analytics]
keywords = ["analytics manager", "Digital market", "director"]  ;# will pick up "analytics manager" as well as "senior/sr analytics manager"
```
Then update the config.ini values for your notification email address and output folder path. That's it! Run Job Scooper and let it do the work for you.
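As a rough sketch of what those two settings might look like (the section and key names below are assumptions, not Job Scooper's documented names, so check example_config.ini for the real ones):

```ini
; Hedged sketch -- the section and key names here are assumptions;
; check example_config.ini for the names Job Scooper actually uses.
[global_search_options]
notification_email = "you@example.com"                       ; where alert emails are sent
output_directory = "/private/var/local/jobs_scooper/output"  ; where result files are written
```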
## Power Up Your Results!

If you're looking at job listings across many sites, Job Scooper has some built-in features to make that work much easier:
- Automatic duplicate detection: if the same job is posted on multiple sites, Job Scooper automatically marks all but the first one as a duplicate so you don't waste time reviewing the same job twice.
- Automatic title-only keyword matching: most sites do not support limiting a search to matches in the job title alone. One of the best features of Job Scooper is that it lets you filter to title-only matches for any site, whether or not the site itself supports it.
- Exclude specific companies automatically
- Exclude particular job title matches automatically (see the hedged config sketch after this list)
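Exclusions like these are typically driven by lists in your config file. As a hedged sketch only (the key names `exclude_companies` and `exclude_title_keywords` are assumptions, not Job Scooper's documented settings; verify against example_config.ini):

```ini
; Hedged sketch -- these key names are assumptions, not documented
; Job Scooper settings; verify against example_config.ini.
[global_search_options]
exclude_companies = ["Acme Staffing", "Contoso Recruiting"]   ; skip postings from these companies
exclude_title_keywords = ["intern", "unpaid"]                 ; drop postings whose titles match these terms
```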
That's just the start of what Job Scooper can do...
To run Job Scooper, set the following environment variables:

```bash
export JOBSCOOPER_HOST_VOLDIR=/private/var/local/jobs_scooper
export JOBSCOOPER_CONFIG=/private/var/local/jobs_scooper/configs/jobscooper.ini
export JOBSCOOPER_OUTPUT=/private/var/local/jobs_scooper/output
```
Then type:

```bash
/usr/bin/php run_job_scooper.php
```
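If you run Job Scooper on a schedule, a small wrapper script keeps the setup in one place. This is just a convenience sketch that reuses the exact paths and command shown above; it is not a file that ships with the repo:

```bash
#!/usr/bin/env bash
# run_scooper.sh -- a convenience sketch, not part of the repo.
# Sets the variables and runs the command exactly as documented above.
set -euo pipefail

export JOBSCOOPER_HOST_VOLDIR=/private/var/local/jobs_scooper
export JOBSCOOPER_CONFIG="${JOBSCOOPER_HOST_VOLDIR}/configs/jobscooper.ini"
export JOBSCOOPER_OUTPUT="${JOBSCOOPER_HOST_VOLDIR}/output"

/usr/bin/php run_job_scooper.php
```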
Job Scooper requires the following (a hedged setup sketch follows this list):
- PHP 5.6.24
- Python 2.7
- pip: to install the other required Python modules, run `pip install -r ./python/pyJobNormalizer/requirements.txt`
- NLTK for Python: you will also need the NLTK data available at http://www.nltk.org/data.html
- Selenium Server Standalone: required for dynamic/client-side websites.
  - You can configure the app to run Selenium locally or in Docker, or point it at a Selenium instance running on another host. Check out scripts/start_selenium.sh for an example.
  - To run standalone manually: download selenium-server-standalone-3.0.1.jar and copy it to the /lib directory. Selenium Standalone Server requires [Java 8 from Oracle](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) on macOS 10.12.
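On a Unix-like shell, the Python and Selenium steps above might be scripted roughly like this. Which NLTK corpora Job Scooper actually needs isn't specified here, so downloading everything is a conservative guess, and the Selenium Docker image is one common option rather than the project's prescribed one (see scripts/start_selenium.sh for the project's own example):

```bash
#!/usr/bin/env bash
# Hedged setup sketch for the dependencies listed above.
set -euo pipefail

# Install the Python modules used by Job Scooper's normalizer.
pip install -r ./python/pyJobNormalizer/requirements.txt

# Fetch NLTK data. The required corpora aren't stated, so this grabs
# everything; see http://www.nltk.org/data.html to be more selective.
python -m nltk.downloader all

# One way to get a Selenium Server Standalone instance: the official
# Docker image (an assumption -- pin a 3.x tag to match the jar version
# mentioned above if you need an exact match).
docker run -d -p 4444:4444 --name selenium selenium/standalone-chrome
```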
Non-developers can now author plugins using Data Scraping Studio or the Advanced Web Scraper Chrome extension from [Agenty](http://www.agenty.com).
Just tag the specific job results page fields for the site and set the Agenty field names to match the corresponding Job Scooper field:
```php
array(
    'TotalResultPageCount',
    'NoPostsFound',
    'TotalPostCount',
    'JobPostItem',
    'NextButton',
    'JobSitePostId',
    'Title',
    'Url',
    'Department',
    'Location',
    'Category',
    'Company',
    'PostedAt',
    'EmploymentType',
);
```
Once you've tagged the fields that are available, export the Agenty data scraping agent configuration you authored to a JSON file. In that JSON file, add a key/value under "Pagination" called "Type", with a value equal to the Job Scooper pagination type that matches the job site's pagination and "load more results" style:
"Pagination": {
"Type": "INFINITE-SCROLL-NO-CONTROL"
}
The current list of pagination types supported by Job Scooper can be found in SitePlugins.php.
Save your updated Agenty JSON config file to the plugins/json-based directory and kick off a Job Scooper run. Your new Agenty-authored plugin will run exactly like any other plugin built for Job Scooper.
You can define the full configuration for a job site plugin in a single JSON file. Just drop the new file into the plugins/JsonBased folder and let it rip.
```json
{
    "JobSiteName": "Startjobs",
    "SourceURL": "https://start.jobs/search",
    "Collections": [
        {
            "Name": "PageFields",
            "Fields": [
                {
                    "Name": "JobPostItem",
                    "Selector": ".js-job",
                    "Attribute": "node",
                    "Type": "CSS"
                },
                {
                    "Name": "pages_count",
                    "Selector": "div.js-infinite-scroll",
                    "Attribute": "data-total-pages",
                    "Type": "CSS"
                }
            ]
        },
        {
            "Name": "ItemFields",
            "Fields": [
                {
                    "Name": "Title",
                    "Selector": ".title",
                    "Attribute": "text",
                    "Type": "CSS"
                },
                {
                    "Name": "Location",
                    "Selector": "span.location",
                    "Attribute": "text",
                    "Type": "CSS"
                }
            ]
        }
    ],
    "Pagination": {
        "Type": "INFINITE-SCROLL-NO-CONTROL"
    }
}
```
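A stray brace in a hand-edited file is easy to miss, so it's worth sanity-checking that the JSON parses before dropping it in. A quick PHP one-liner does the trick (the file path here is just this example's name):

```bash
php -r 'json_decode(file_get_contents("plugins/JsonBased/Startjobs.json"));
    echo json_last_error() === JSON_ERROR_NONE ? "valid JSON\n" : json_last_error_msg() . "\n";'
```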
Setting up and running Job Scooper anywhere is now easier thanks to Docker and Docker Compose. With just a few tweaks to the Dockerfile and associated scripts, you can have Job Scooper up and running in a container quickly.
Check out the Dockerfile and build_and_run_docker.* files in the repo for a set of base files that should get you 90% of the way there for your system.
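For a rough idea of what the manual equivalent looks like, the sketch below builds and runs the image. The `jobscooper` image tag and the volume/environment wiring are assumptions based on the JOBSCOOPER_* variables described earlier; treat the repo's build_and_run_docker.* scripts as the authoritative version:

```bash
# A sketch, not the project's official invocation -- see build_and_run_docker.*.
# The "jobscooper" tag and mount paths are assumptions based on the
# JOBSCOOPER_* environment variables described earlier.
docker build -t jobscooper .

docker run --rm \
  -v /private/var/local/jobs_scooper:/private/var/local/jobs_scooper \
  -e JOBSCOOPER_CONFIG=/private/var/local/jobs_scooper/configs/jobscooper.ini \
  -e JOBSCOOPER_OUTPUT=/private/var/local/jobs_scooper/output \
  jobscooper
```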
Get the full set of feature additions and updates in the release notes.
- Version: v4.0 release notes
- Author: Bryan Selner (dev at recoilvelocity dot com)
- Tested mainly in Alpine Linux under Docker for Windows and macOS 10.13. Your mileage might vary on other platforms.
- Issues/Bugs: Please report them!