Core Scraper & Trend Viewer

A Python-based web application that scrapes trending topics from various online sources (Google Trends, BBC News, Reddit, PTT) and displays them in a unified, clean web interface.

✨ Features

Multi-Source Scraping: Gathers data from Google Trends, BBC News, Reddit, and PTT.
Unified Data Pipeline: All scraped data is standardized into a consistent TrendItem schema for easy processing.
Flask Web Interface: A lightweight web server built with Flask to present the data.
Modern & Responsive UI: The frontend is built with Bootstrap, ensuring a clean look on both desktop and mobile.
Universal Card Layout: All trend items are displayed using a single, consistent card component for a unified user experience.
Configurable Architecture: Easily add or remove data sources by editing the central config.py file.
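
As a rough illustration of the unified schema idea, the sketch below normalizes one raw entry into a single record type. The field names (source, title, url, summary, rank) and the helper from_feed_entry are assumptions made for this example; the actual TrendItem definition lives in scrapers/schema.py and may differ.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TrendItem:
    """Hypothetical unified record; the real fields are defined in scrapers/schema.py."""
    source: str
    title: str
    url: str
    summary: Optional[str] = None
    rank: Optional[int] = None


def from_feed_entry(source: str, entry: dict, rank: int) -> TrendItem:
    """Map one raw feed-style entry (assumed keys: title, link, summary) to a TrendItem."""
    return TrendItem(
        source=source,
        title=entry["title"].strip(),   # normalize stray whitespace
        url=entry["link"],
        summary=entry.get("summary"),   # optional in the raw data
        rank=rank,
    )


raw = {"title": "  Example headline ", "link": "https://example.com/story"}
item = from_feed_entry("bbc", raw, rank=1)
print(asdict(item))
```

Because every scraper would return the same shape, a single card template can render items from any source without source-specific branches.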

🛠️ Tech Stack

Backend: Python, Flask
Scraping: Playwright, Requests, Feedparser, BeautifulSoup
Frontend: HTML, CSS, Bootstrap, Jinja2
Package Management: uv

🚀 Getting Started

Follow these instructions to get a copy of the project up and running on your local machine.

Prerequisites

Python 3.10+
uv (A fast Python package installer and resolver)

Installation

Clone the repository:

git clone https://github.com/LayorX/core-scraper.git
cd core-scraper

Create a virtual environment:
```
uv venv
```
Activate it afterwards if needed: .venv\Scripts\activate on Windows, or source .venv/bin/activate on macOS/Linux.
Install dependencies:
```
uv pip sync pyproject.toml
```
Install the Playwright browser binaries (required for the Google Trends scraper):
```
uv run playwright install
```

🏃 Usage

Scrape Data

To run all scrapers defined in config.py and update the JSON files in the data/ directory, execute:
```
uv run -m main
```
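The config-driven run described above might look roughly like the following. This is a minimal sketch, not the project's main.py: the SOURCES mapping, the placeholder scraper functions, and run_all are hypothetical names, and the one-file-per-source layout is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def scrape_bbc() -> List[dict]:
    # Placeholder standing in for a real scraper module.
    return [{"source": "bbc", "title": "Example headline", "url": "https://example.com/a"}]


def scrape_reddit() -> List[dict]:
    # Placeholder standing in for a real scraper module.
    return [{"source": "reddit", "title": "Example post", "url": "https://example.com/b"}]


# Hypothetical registry, playing the role config.py has in the project:
# adding or removing a source means editing this one mapping.
SOURCES: Dict[str, Callable[[], List[dict]]] = {
    "bbc": scrape_bbc,
    "reddit": scrape_reddit,
}


def run_all(sources: Dict[str, Callable[[], List[dict]]], data_dir: Path) -> List[Path]:
    """Run each configured scraper and write its items to <data_dir>/<name>.json."""
    data_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, scrape in sources.items():
        try:
            items = scrape()
        except Exception as exc:
            # One failing source should not stop the remaining ones.
            print(f"[{name}] failed: {exc}")
            continue
        path = data_dir / f"{name}.json"
        path.write_text(json.dumps(items, ensure_ascii=False, indent=2), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in run_all(SOURCES, Path(tmp)):
            print(path.name)
```

Keeping the registry separate from the loop means the runner never needs to change when a source is added.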
Run the Web Application

On Windows, execute the batch script to start the Flask server:
```
.\run-website.bat
```
The application will be available at http://127.0.0.1:5000.
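
On the read side, the web application presumably loads the JSON files that the scrapers wrote. The helper below is a sketch under that assumption: load_trends is a hypothetical name, and the one-file-per-source layout (each file holding a JSON list of items) is not confirmed by the project.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_trends(data_dir: Path) -> Dict[str, List[dict]]:
    """Read every *.json file in data_dir into {source_name: [items]}."""
    trends: Dict[str, List[dict]] = {}
    for path in sorted(data_dir.glob("*.json")):
        try:
            items = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # Skip unreadable or half-written files instead of failing the page.
            continue
        if isinstance(items, list):
            trends[path.stem] = items
    return trends
```

In a Flask view, the result could then be handed to the template, for example render_template("index.html", trends=load_trends(Path("data"))), with the template looping over each source's items to render the cards.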

📁 Project Structure

core-scraper/
├── .venv/                  # Virtual environment
├── data/                   # Stores scraped data in JSON format
├── scrapers/               # Individual scraper modules for each source
│   ├── __init__.py
│   ├── schema.py           # Defines the unified TrendItem data structure
│   └── ...                 # bbc.py, google.py, etc.
├── static/                 # Static assets (CSS, JS, images)
├── templates/              # Jinja2 HTML templates
│   ├── index.html          # Main page template with universal card
│   └── layout.html         # Base layout
├── .gitignore
├── app.py                  # Main Flask application file
├── config.py               # Central configuration for data sources
├── main.py                 # Main script to trigger all scrapers
├── pyproject.toml          # Project metadata and dependencies
├── README.md               # This file
└── run-website.bat         # Script to run the web server
