This project provides a tool to scrape Thai school information from Wikipedia and serve it via an API. It is built with Bun, ElysiaJS, and Cheerio.
## Features

- Live Scraping: Fetches up-to-date school data directly from Wikipedia.
- REST API: Serves school data via high-performance ElysiaJS endpoints.
- Data Export: Script to scrape and save all school data to JSON files.
## Prerequisites

- Bun runtime installed.
## Installation

```sh
bun install
```

## Scraping Data

Before running the API, you must scrape the data from Wikipedia. This generates the necessary JSON files in the `dist/` directory.
```sh
bun run scrape
```

Output:

- `dist/schools.json`: Combined list of all schools (pretty).
- `dist/schools.min.json`: Combined list of all schools (minified).
- `dist/provinces/[province].json`: Individual JSON file for each province (pretty).
- `dist/provinces/[province].min.json`: Individual JSON file for each province (minified).
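The pretty and minified variants of each file carry the same data; the difference is only the indentation passed to `JSON.stringify`. A minimal illustration (the record shape with `name` and `province` fields is an assumption, not the scraper's actual schema):

```typescript
// Illustrative only — not the scraper's actual code. The record shape
// (`name`, `province`) is a hypothetical example.
const sample = [{ name: "ตัวอย่าง", province: "ภูเก็ต" }];

const pretty = JSON.stringify(sample, null, 2); // dist/schools.json style
const minified = JSON.stringify(sample);        // dist/schools.min.json style

// Both parse back to identical data; the minified file is simply smaller.
```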
## Running the API

Start the development server:
```sh
bun dev
```

The server will be running at http://localhost:3000.
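Once the server is up, any HTTP client can query it. For example, building a request URL in TypeScript (`URLSearchParams` percent-encodes the Thai province name automatically; the `fetch` call is left commented out since it assumes the dev server above is running):

```typescript
// Build a query against the local dev server.
const url = new URL("http://localhost:3000/schools");
url.searchParams.set("province", "ภูเก็ต");

// With `bun dev` running, this would return the filtered list:
// const schools = await fetch(url).then((r) => r.json());
console.log(url.toString());
```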
## API Endpoints

- `GET /schools`: Retrieve a list of schools.
  - Query Parameters:
    - `q`: Search by school name (optional).
    - `province`: Filter by province (optional).
  - Example:

    ```
    GET /schools?province=ภูเก็ต
    ```

- `GET /`: API information.
- `GET /openapi`: OpenAPI documentation.
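The two query parameters combine as a simple AND filter. A minimal sketch of the filtering the `/schools` endpoint might apply — the field names `name` and `province` are assumptions; the actual handler lives in `src/index.ts`:

```typescript
interface School {
  name: string;     // assumed field name
  province: string; // assumed field name
}

// Substring match on name (`q`) and exact match on `province`; each
// condition is skipped when its parameter is absent.
function filterSchools(schools: School[], q?: string, province?: string): School[] {
  return schools.filter(
    (s) => (!q || s.name.includes(q)) && (!province || s.province === province)
  );
}
```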
## Project Structure

- `src/index.ts`: API server entry point.
- `src/scripts/scrape.ts`: CLI script for scraping and saving data.
- `src/services/scraper.ts`: Scraper logic using Cheerio.
- `src/constants/provinces.ts`: List of Thai provinces.
## Automated Updates

This project uses GitHub Actions to automatically update the school data.
- The workflow runs on the 1st of every 3rd month at midnight UTC.
- It executes the scraper and commits any changes to the `dist/` directory back to the repository.
- You can also manually trigger the "Update School Data" workflow from the Actions tab.
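That schedule corresponds to a cron expression like `0 0 1 */3 *`. A hypothetical sketch of such a workflow file — the repository's actual file under `.github/workflows/` may differ, and the action versions and step names here are assumptions:

```yaml
# Hypothetical sketch of the "Update School Data" workflow.
name: Update School Data
on:
  schedule:
    - cron: "0 0 1 */3 *"  # 1st of every 3rd month, midnight UTC
  workflow_dispatch:        # allows manual runs from the Actions tab
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun install
      - run: bun run scrape
      - name: Commit updated data
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add dist/
          git diff --cached --quiet || git commit -m "chore: update school data"
          git push
```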