
Job Market Intelligence Scraper

A full-stack job aggregation and analytics platform that automatically collects, filters and analyzes job listings from multiple global and remote-first sources.

The system is designed to help developers, freelancers and recruiters discover opportunities faster, track hiring trends and eliminate manual job searching.

Built with a FastAPI backend and a React dashboard, the platform separates data extraction, API services and UI into independent layers for scalability and future SaaS expansion.

Key Features

  • Multi-source job aggregation (RemoteOK, WeWorkRemotely, African boards and extensible careers pages)
  • Real-time scraping with manual trigger support
  • Advanced filtering (keyword, location, job type, source)
  • Deduplicated job storage and normalization
  • Analytics dashboard for job trends and insights
  • CSV export for external workflows

What is included

  • FastAPI backend with:
    • GET /jobs
    • POST /scrape/run
    • GET /stats
    • GET /export/csv
  • Pluggable scraper architecture under backend/app/scrapers
  • Deduplication based on job title, company and application URL hash
  • JSON-backed persistence that can be swapped for PostgreSQL or MongoDB later
  • React dashboard with:
    • keyword and location filtering
    • source and job type filtering
    • live scrape trigger
    • analytics cards and trend panels
    • CSV export
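The deduplication described above (a hash of job title, company, and application URL) can be sketched as follows. This is an illustrative reconstruction, not the backend's actual implementation; the normalization rules are assumptions.

```python
import hashlib

def job_key(title: str, company: str, apply_url: str) -> str:
    """Build a stable deduplication key from normalized job fields."""
    normalized = "|".join(part.strip().lower() for part in (title, company, apply_url))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two listings that differ only in whitespace or casing map to the same key.
a = job_key("Backend Engineer", "Acme", "https://acme.example/jobs/1")
b = job_key("  backend engineer ", "ACME", "https://acme.example/jobs/1")
assert a == b
```

Hashing after normalization means cosmetic differences between sources do not create duplicate rows in storage.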

Source coverage

Current MVP behavior:

  • RemoteOK: active scraper using the public API with local fallback samples
  • WeWorkRemotely: active scraper using RSS with local fallback samples
  • Greenhouse: active scraper using public job board feeds configured by board token
  • Lever: active scraper using public postings feeds configured by company slug
  • Ashby: active scraper using public hosted job board endpoints configured by board slug
  • Careers Pages: active scraper that extracts JobPosting JSON-LD from configured company, NGO and international-organization careers pages
  • MyJobMag: active scraper with heuristic extraction for African market listings
  • BrighterMonday: active scraper with heuristic extraction for East African listings
  • Corporate Staffing Services: active scraper for Corporate Staffing company postings
  • Fuzu: active scraper for marketplace-style African job listings

This keeps the architecture production-oriented while focusing on safer, more reliable job sources than high-risk platforms like LinkedIn or Indeed.

Project structure

backend/
  app/
    api/
    data/
    scrapers/
    services/
    main.py
  requirements.txt
frontend/
  src/
  package.json

Backend setup

cd backend
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux
pip install -r requirements.txt
uvicorn app.main:app --reload

The API starts on http://127.0.0.1:8000.
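Fetching filtered jobs from GET /jobs can be done with a small URL builder like the one below. The query parameter names (keyword, location) are assumptions for illustration; check the API for the actual filter names.

```python
import urllib.parse
import urllib.request

API_BASE = "http://127.0.0.1:8000"

def jobs_url(**filters: str) -> str:
    """Build a GET /jobs URL with optional query-string filters."""
    query = urllib.parse.urlencode(filters)
    return f"{API_BASE}/jobs?{query}" if query else f"{API_BASE}/jobs"

url = jobs_url(keyword="python", location="remote")
# with urllib.request.urlopen(url) as resp:  # requires the backend to be running
#     jobs = resp.read()
```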

POST /scrape/run is protected by API key auth, rate limiting, cooldowns, a single active scrape lock and scraper timeouts. For local development the default key is local-dev-scrape-key; change it with JOBINTEL_SCRAPE_API_KEYS.
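A minimal sketch of triggering an authenticated scrape from Python, assuming the key is sent in an X-API-Key header (the header name is an assumption; confirm it against the backend's auth dependency):

```python
import urllib.request

API_BASE = "http://127.0.0.1:8000"
API_KEY = "local-dev-scrape-key"  # default dev key; override via JOBINTEL_SCRAPE_API_KEYS

def build_scrape_request(base: str = API_BASE, key: str = API_KEY) -> urllib.request.Request:
    """Prepare an authenticated POST to /scrape/run."""
    return urllib.request.Request(
        f"{base}/scrape/run",
        method="POST",
        headers={"X-API-Key": key},  # assumed header name
    )

req = build_scrape_request()
# urllib.request.urlopen(req)  # uncomment with the backend running
```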

Frontend setup

cd frontend
npm install
npm run dev

The dashboard starts on http://127.0.0.1:5173.

To point the frontend at a different API URL, create frontend/.env from frontend/.env.example.
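A minimal frontend/.env might look like the following. The variable name VITE_API_URL is a guess based on common Vite conventions; use the actual name from frontend/.env.example.

```
VITE_API_URL=http://127.0.0.1:8000
```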

Notes on persistence

Jobs are stored in backend/app/data/jobs.json. The storage service is isolated in backend/app/services/job_store.py, so moving to PostgreSQL or MongoDB is mostly a repository-layer change.
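The repository-layer swap described above can be sketched as a small interface: any backing store that implements the same methods is interchangeable. The class and method names below are illustrative, not the actual contents of job_store.py.

```python
from typing import Iterable, Protocol

class JobStore(Protocol):
    """Hypothetical repository interface; a PostgresJobStore would satisfy it too."""
    def upsert(self, jobs: Iterable[dict]) -> int: ...
    def list_jobs(self, **filters: str) -> list[dict]: ...

class JsonJobStore:
    """Dict-backed store standing in for the JSON file persistence."""
    def __init__(self) -> None:
        self._jobs: dict[str, dict] = {}

    def upsert(self, jobs: Iterable[dict]) -> int:
        added = 0
        for job in jobs:
            if job["url"] not in self._jobs:
                added += 1
            self._jobs[job["url"]] = job  # keyed by application URL
        return added

    def list_jobs(self, **filters: str) -> list[dict]:
        jobs = list(self._jobs.values())
        for field, value in filters.items():
            jobs = [j for j in jobs if j.get(field) == value]
        return jobs

store = JsonJobStore()
added = store.upsert([
    {"url": "https://example.com/1", "source": "remoteok", "title": "Dev"},
    {"url": "https://example.com/1", "source": "remoteok", "title": "Dev"},  # duplicate
])
```

Because callers only depend on the interface, moving to PostgreSQL means writing a new class with the same two methods rather than touching the API layer.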

Build Status

  • Backend successfully compiled and validated
  • Frontend production build completed

Next plans

  1. Replace JSON storage with PostgreSQL and add Alembic migrations.
  2. Add APScheduler or Celery for recurring scrape jobs.
  3. Introduce Playwright-based authenticated adapters where source terms allow it.
  4. Add alert rules for email or webhook notifications.

Why this project matters

Job seekers and recruiters often rely on multiple platforms to find relevant opportunities. This system centralizes job discovery into a single interface, reducing time spent searching and enabling data-driven insights into the job market.

Instead of manually browsing job boards, users get structured, searchable and exportable job data in real time.

Project Positioning

This project is not just a scraper, but a modular job market intelligence system designed to evolve into a scalable SaaS platform for automated job discovery and analytics.

🖥️ UI Preview

Main landing page and key insights

(screenshot: main page)

Filters for easier job search

(screenshot: filters)

Retrieved jobs

(screenshot: jobs found)
