Tooling for scraping and providing publicly available data from FCSE services. The data is provided using a REST API or webhooks. Requires Node.js >= 20.
The scrapers are implemented as classes (called strategies), each containing the selectors and methods for fetching data from one kind of container (post, announcement, etc.). Adding a new service requires creating a new strategy and linking it. See the example strategy for more info.
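As a rough illustration of the strategy idea, here is a minimal sketch. Every name in it (`ElementLike`, `Strategy`, `ExampleAnnouncementsStrategy`, the selectors, the `strategies` registry) is hypothetical and chosen for illustration; the project's actual interfaces may look different, so treat the bundled example strategy as the reference.

```typescript
// Minimal DOM-like surface so the sketch has no dependencies (assumption).
interface ElementLike {
  textContent: string | null;
  querySelector(selector: string): ElementLike | null;
  getAttribute(name: string): string | null;
}

// Hypothetical strategy contract: selectors plus extraction methods.
interface Strategy {
  containerSelector: string; // matches one post/announcement container
  getId(container: ElementLike): string | null;
  getTitle(container: ElementLike): string | null;
  getLink(container: ElementLike): string | null;
}

// Hypothetical strategy for an imaginary announcements page.
class ExampleAnnouncementsStrategy implements Strategy {
  containerSelector = "article.announcement";
  private titleSelector = "h2 a";

  getId(container: ElementLike): string | null {
    return container.getAttribute("data-id");
  }

  getTitle(container: ElementLike): string | null {
    const text = container.querySelector(this.titleSelector)?.textContent;
    // Empty or missing titles are reported as null.
    return text?.trim() || null;
  }

  getLink(container: ElementLike): string | null {
    const anchor = container.querySelector(this.titleSelector);
    return anchor?.getAttribute("href") ?? null;
  }
}

// "Linking" a strategy could be as simple as a name-to-instance registry.
const strategies: Record<string, Strategy> = {
  example: new ExampleAnnouncementsStrategy(),
};
```

Keeping the selectors as fields and the extraction logic as small methods means a new service mostly needs new selector strings, with custom methods only where the markup is unusual.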
To run the scraper:
- Clone the repository:
git clone https://github.com/finki-hub/finki-scraper.git
- Prepare configuration by copying config/config.sample.json to config/config.json
- Install dependencies:
npm i
- Run the scraper:
npm run start
It's also available as a Docker image:
docker run -d \
--name finki-scraper \
--restart unless-stopped \
-v ./cache:/app/cache \
-v ./config:/app/config \
-v ./logs:/app/logs \
ghcr.io/finki-hub/finki-scraper:latest
Or Docker Compose: docker compose up -d
You can select which scrapers to run declaratively (in the configuration, with the enabled flag) or imperatively: npm run start scraper_1 scraper_2 ... scraper_n
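The two selection modes could be resolved along the following lines. This is only a sketch under stated assumptions: the config shape (`ScrapersConfig`), the function name `selectScrapers`, and the rule that command-line names take precedence over the enabled flag are all hypothetical and not confirmed by the project.

```typescript
// Hypothetical config shape: scraper name -> settings with an `enabled` flag.
interface ScraperSettings {
  enabled?: boolean;
}

type ScrapersConfig = Record<string, ScraperSettings>;

// Assumed rule: names passed on the command line win; otherwise fall back
// to every scraper whose `enabled` flag is true.
function selectScrapers(scrapers: ScrapersConfig, cliNames: string[]): string[] {
  if (cliNames.length > 0) {
    const unknown = cliNames.filter(
      (name) => !Object.prototype.hasOwnProperty.call(scrapers, name),
    );
    if (unknown.length > 0) {
      throw new Error(`Unknown scrapers: ${unknown.join(", ")}`);
    }
    return cliNames;
  }

  return Object.entries(scrapers)
    .filter(([, settings]) => settings.enabled === true)
    .map(([name]) => name);
}

// Example input; in a real entry point the names would typically come
// from process.argv.slice(2).
const sampleConfig: ScrapersConfig = {
  scraper_1: { enabled: true },
  scraper_2: { enabled: false },
};

const selected = selectScrapers(sampleConfig, []);
console.log(selected);
```

Rejecting unknown names early, rather than silently ignoring them, makes a typo on the command line visible before any scraping starts.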
To build and run the project from source:
- Clone the repository:
git clone https://github.com/finki-hub/finki-scraper.git
- Install dependencies:
npm i
- Prepare configuration:
cp config/config.sample.json config/config.json
- Build the project:
npm run build
- Run it:
npm run start
There is an example configuration file available at config/config.sample.json. Copy it to config/config.json and edit it to your liking.
This project is licensed under the terms of the MIT license.