A web game similar to Google Feud where users guess the most common words in news article headlines for different time periods. Test your knowledge of current events by predicting which words appear most frequently in recent news!
🌐 Live Demo: [newswordy.vercel.app](https://newswordy.vercel.app)
Players select a time range (past day, week, month, year, etc.) and try to guess the most frequently occurring words in news headlines from that period. Points are awarded based on how common the guessed words are, similar to Google Feud mechanics.
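The Google Feud-style scoring can be sketched as a small pure function. The point values and board size below are assumptions for illustration, not the game's actual formula:

```python
def score_guess(guess: str, ranked_words: list[str], board_size: int = 10) -> int:
    """Score a guess against the top-N ranked headline words.

    Hypothetical scoring: the top-ranked word earns board_size * 100
    points, each rank below it 100 fewer, and a miss earns 0.
    """
    guess = guess.strip().lower()
    if guess not in ranked_words[:board_size]:
        return 0
    rank = ranked_words.index(guess)  # 0-based position on the board
    return (board_size - rank) * 100

# Example board: most frequent headline words for some period
board = ["trump", "election", "climate", "economy", "ukraine"]
```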
- Classic Mode: Guess the most common words from recent news headlines across all selected sources
- Source Comparison: Compare word usage between two groups of sources. Find words that appear more in one group than the other
- Word Association: Choose an anchor word and guess words that frequently appear together with it in headlines
- Comparative Association: Compare word usage between two groups of sources with a selected anchor word
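The association modes boil down to co-occurrence counting: for a chosen anchor word, tally every other word that shares a headline with it. A rough sketch (the real analysis presumably runs with proper tokenization and stopword removal):

```python
from collections import Counter

def co_occurring_words(headlines: list[str], anchor: str) -> Counter:
    """Count words that appear in the same headline as `anchor`."""
    counts = Counter()
    anchor = anchor.lower()
    for headline in headlines:
        words = [w.strip(".,!?'\"").lower() for w in headline.split()]
        if anchor in words:
            counts.update(w for w in words if w != anchor)
    return counts

headlines = [
    "Election results delayed in key states",
    "Climate bill passes ahead of election",
    "Markets rally on economic data",
]
```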
- Frontend: React 19 + TypeScript + Tailwind CSS + Material-UI
- Database: Supabase (PostgreSQL) with custom functions for word frequency queries
- Scraping: Python 3.11 with BeautifulSoup, feedparser, SQLAlchemy
- Deployment: Vercel (frontend)
- Authentication: Auth0
- CI/CD: GitHub Actions (scheduled scraping runs twice daily)
This is a serverless application with no traditional backend API:
- Frontend communicates directly with Supabase using PostgreSQL functions
- Python scraper runs via GitHub Actions (scheduled twice daily at 1 AM and 1 PM UTC)
- News articles and word frequencies are stored in PostgreSQL
- Real-time game data is fetched using Supabase client library
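Conceptually, the PostgreSQL functions aggregate per-word counts over a trailing time window. Here is a pure-Python sketch of that aggregation (the row shape is hypothetical; the real query runs server-side in SQL):

```python
from datetime import date, timedelta
from collections import Counter

# Hypothetical rows mirroring a word-frequency table: (word, headline_date, count)
rows = [
    ("election", date(2024, 6, 1), 12),
    ("election", date(2024, 6, 20), 8),
    ("climate",  date(2024, 6, 21), 15),
    ("storm",    date(2023, 1, 5), 30),
]

def word_frequencies(rows, days_back, today=date(2024, 6, 22)):
    """Aggregate counts over the trailing window, most common first."""
    cutoff = today - timedelta(days=days_back)
    totals = Counter()
    for word, d, n in rows:
        if d >= cutoff:
            totals[word] += n
    return totals.most_common()
```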
```
newswordy/
├── frontend/                 # React TypeScript application
│   ├── src/
│   │   ├── components/       # Reusable UI components
│   │   ├── pages/            # Game pages and routes
│   │   ├── services/         # API and Supabase client
│   │   └── types/            # TypeScript type definitions
│   └── package.json
├── scraper/                  # Python news scraping scripts
│   ├── news_scraper.py       # Main scraper logic
│   ├── word_processor.py     # Word frequency analysis
│   ├── database.py           # Database models and operations
│   ├── config.py             # News sources configuration
│   ├── scheduler.py          # Local scheduling (optional)
│   └── requirements.txt
├── supabase/                 # Database schema and SQL functions
│   ├── schema.sql            # Complete database schema
│   └── functions/            # RPC functions (SQL)
└── .github/
    └── workflows/
        └── run-scraper.yaml  # GitHub Actions workflow
```
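The frequency analysis in `word_processor.py` presumably tokenizes headlines, drops stopwords, and counts what remains. A rough pure-Python sketch (the real module uses NLTK's tokenizer and stopword corpus; the tiny stopword set below is a stand-in):

```python
import re
from collections import Counter

# Tiny stand-in stopword list; the real scraper uses NLTK's stopwords corpus.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "for", "as"}

def headline_word_counts(headlines: list[str]) -> Counter:
    """Tokenize headlines, drop stopwords and short tokens, count the rest."""
    counts = Counter()
    for headline in headlines:
        tokens = re.findall(r"[a-z']+", headline.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts
```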
The scraper collects headlines from 15+ major news sources, including:
- ABC News, Al Jazeera, Axios, BBC News, CBS News
- Fox News, The Guardian, Los Angeles Times, NBC News
- New York Post, Newsmax, NPR, The New York Times
- The Wall Street Journal, The Washington Post, Yahoo News
The source list can easily be expanded by adding a new configuration in `scraper/config.py` and updating the corresponding entries in `frontend/src/types` and `frontend/src/components`.
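A new source entry might look like the sketch below. The field names and shape are hypothetical; mirror the existing entries in `scraper/config.py` rather than this example:

```python
# Hypothetical shape of an entry in scraper/config.py -- the real field
# names may differ; check the existing entries before adding one.
NEW_SOURCE = {
    "name": "Reuters",
    "rss_url": "https://feeds.example.com/reuters/topNews",  # placeholder URL
    "enabled": True,
}

def validate_source(source: dict) -> bool:
    """Minimal sanity check before registering a new source."""
    required = {"name", "rss_url", "enabled"}
    return required <= source.keys() and source["rss_url"].startswith("http")
```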
- Node.js 18+ and npm
- Python 3.11+
- Vercel account (free tier)
- Supabase account (free tier)
- Auth0 account (free tier)
- Git
1. **Clone the repository**

   ```bash
   git clone https://github.com/ethanpschoen/newswordy.git
   cd newswordy
   ```

2. **Navigate to the frontend directory**

   ```bash
   cd frontend
   ```

3. **Install dependencies**

   ```bash
   npm install
   ```

4. **Set up environment variables**

   Create a `.env` file in the `frontend/` directory:

   ```
   REACT_APP_SUPABASE_URL=your_supabase_project_url
   REACT_APP_SUPABASE_ANON_KEY=your_supabase_anon_key
   REACT_APP_AUTH0_DOMAIN=your_auth0_domain
   REACT_APP_AUTH0_CLIENT_ID=your_auth0_client_id
   REACT_APP_AUTH0_AUDIENCE=your_auth0_audience
   ```

5. **Start the development server**

   ```bash
   npm start
   ```

   The app will open at `http://localhost:3000`.
1. **Navigate to the scraper directory**

   ```bash
   cd scraper
   ```

2. **Create a virtual environment (recommended)**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

4. **Download NLTK data (required for word processing)**

   ```bash
   python -c "import nltk; nltk.download('punkt'); nltk.download('punkt_tab'); nltk.download('stopwords')"
   ```

5. **Set up environment variables**

   Create a `.env` file in the `scraper/` directory:

   ```
   DB_HOST=your_supabase_db_host
   DB_PORT=5432
   DB_NAME=postgres
   DB_USER=postgres
   DB_PASSWORD=your_supabase_db_password
   DATABASE_URL=your_supabase_connection_string
   ```

6. **Initialize database tables**

   ```bash
   python database.py
   ```

7. **Run the scraper manually**

   ```bash
   python news_scraper.py
   ```

8. **Optional: run the scraper with the local scheduler**

   ```bash
   python scheduler.py
   ```

   This runs the scraper at 6 AM and 6 PM daily.
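The scheduler's core decision, finding the next 6 AM / 6 PM slot, can be sketched as follows (`scheduler.py`'s actual implementation may differ):

```python
from datetime import datetime, time, timedelta

RUN_TIMES = (time(6, 0), time(18, 0))  # matches the 6 AM / 6 PM schedule

def next_run(now: datetime) -> datetime:
    """Return the next scheduled scrape time after `now`."""
    for t in RUN_TIMES:
        candidate = datetime.combine(now.date(), t)
        if candidate > now:
            return candidate
    # Both of today's slots have passed; wrap to tomorrow's first slot.
    return datetime.combine(now.date() + timedelta(days=1), RUN_TIMES[0])
```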
1. **Set up GitHub Secrets**

   In your GitHub repository, go to Settings → Secrets and variables → Actions, and add:

   - `DB_HOST`
   - `DB_PORT`
   - `DB_NAME`
   - `DB_USER`
   - `DB_PASSWORD`
   - `DATABASE_URL`

2. **Workflow**

   The workflow is already configured in `.github/workflows/run-scraper.yaml`. It runs automatically twice daily (1 AM and 1 PM UTC) and can also be triggered manually via the GitHub Actions UI.
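For reference, the schedule trigger for a twice-daily 1 AM / 1 PM UTC run would look roughly like this (the actual workflow file may differ in detail):

```yaml
# Sketch of the trigger section of .github/workflows/run-scraper.yaml --
# "0 1,13 * * *" fires at 01:00 and 13:00 UTC; workflow_dispatch enables
# manual runs from the Actions tab.
on:
  schedule:
    - cron: "0 1,13 * * *"
  workflow_dispatch:
```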
- News headline scraping from 15+ major sources
- Word frequency analysis by time period
- Multiple game modes (Classic, Comparison, Association, Comparative Association)
- Interactive word guessing game with real-time feedback
- Real-time scoring system
- User authentication and profiles (Auth0)
- Customizable game settings (time periods, sources, max guesses, scoreboard size)
- Advanced filtering options (source selection, time range)
- Word hints system (fill-in-the-blank, first letter)
- Article information display for guessed words
- User statistics tracking (total games, best score, average score)
- Automated twice-daily scraping via GitHub Actions
- Global and personal leaderboards
This is a personal project for learning and portfolio purposes. However, contributions, suggestions, and feedback are welcome!
If you'd like to contribute:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Please ensure your code follows the existing style and includes appropriate tests if applicable.
This project is open source and available for educational purposes.
- Inspired by Google Feud
- Built with React, Supabase, and Python
- News sources provide RSS feeds and public content
Made with ❤️ for news enthusiasts and word game lovers