This README documents the engine that powers the Reddish Trends website. It focuses on:
- Reddit fetching & preprocessing
- Sentiment extraction
- Market enrichment (yfinance)
- The explainable 3-step ranking algorithm (data_processing.py)
- API endpoints and scheduling (main.py)
Quick overview
- Purpose: turn Reddit conversation into ranked, explainable stock signals enriched with market data.
- Primary language: Python
- Main modules:
- sentiment_analysis.py — fetch + preprocess Reddit posts + VADER sentiment + ticker extraction
- market_analysis.py — yfinance enrichment (price, high, low, RSI, percent change)
- data_processing.py — deterministic 3‑step ranking (Top/Worst/Rising)
- market_sentiment_analysis.py — orchestration + parallelization
- main.py — Flask API, caching and APScheduler cron jobs
- gpt_processing.py — optional GPT JSON summarization for headline stocks
Core pipeline (high level)
- Fetch posts and top-level comments from target subreddits and build a "full_text" blob per post.
- Extract tickers using the regex `r"\$[A-Z]+"` (cashtags like $SPY) and compute a compound sentiment score (VADER) per post.
- Aggregate mentions & average sentiment per ticker across posts.
- Enrich each ticker using yfinance (price, range, RSI, etc).
- Run the 3-step ranking algorithm to produce Top, Worst and Rising stocks.
- Optionally call GPT to produce a concise JSON analysis for each headline stock.
- Cache results to cached_analysis.json and expose via API endpoints.
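The enrichment step in market_sentiment_analysis.py fans out over the aggregated tickers in parallel. A minimal sketch, assuming get_stock_data(symbol, period) from market_analysis.py as documented later in this README; the enrich_tickers helper, its input shape, and the thread-pool fan-out are illustrative, not the module's actual API:

```python
# Illustrative sketch only: enrich_tickers and the input shape are assumptions;
# get_stock_data(symbol, period) is the documented market_analysis helper.
from concurrent.futures import ThreadPoolExecutor

from market_analysis import get_stock_data

def enrich_tickers(ticker_stats, period="1mo", max_workers=8):
    """ticker_stats: {"$SPY": {"count": 3, "sentiment": 0.42}, ...} from the Reddit step."""
    def fetch(symbol):
        # Strip the cashtag prefix before handing the symbol to yfinance (assumption).
        return symbol, get_stock_data(symbol.lstrip("$"), period)

    # A bounded worker pool keeps the yfinance calls parallel but predictable.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for symbol, market in pool.map(fetch, ticker_stats):
            ticker_stats[symbol]["market"] = market
    return ticker_stats
```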
Reddit fetching & preprocessing (implementation notes)
- Fetch behavior (sentiment_analysis.py):
  - Retrieve `limit` submissions per subreddit for the chosen `post_type` (hot/new/top/rising/controversial).
  - For "top" and "controversial", optionally pass a time_filter (hour/day/week/month/year/all).
  - Load top-level comments only via submission.comments.replace_more(limit=0).
  - Limit comments per post to `comment_limit` to keep latency predictable.
  - Build `full_text` = "Post Title: ... Post Text: ... Top Comments: ..." and run VADER on that blob.
  - Extract tickers using `r"\$[A-Z]+"` and aggregate counts + sentiment scores (see the sketch below).
- Why this design:
- Bounded comment retrieval = predictable runtime.
- Title + body + top comments = useful context, lower noise vs deep-comment traversal.
- Direct mapping ticker → example post yields explainability for UX.
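A condensed sketch of the per-post scoring described above, assuming the standalone vaderSentiment package; averaging the compound scores per ticker is an assumption about the aggregation, and score_posts is not the real function name:

```python
import re
from collections import defaultdict

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
TICKER_RE = re.compile(r"\$[A-Z]+")  # cashtags such as $SPY, $TSLA

def score_posts(full_texts):
    """Aggregate mention counts and mean compound sentiment per ticker."""
    counts, scores = defaultdict(int), defaultdict(list)
    for text in full_texts:
        compound = analyzer.polarity_scores(text)["compound"]  # in [-1, 1]
        for ticker in set(TICKER_RE.findall(text)):  # count each ticker once per post
            counts[ticker] += 1
            scores[ticker].append(compound)
    return {t: {"count": counts[t], "sentiment": sum(s) / len(s)} for t, s in scores.items()}
```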
Market enrichment (market_analysis.py)
- Uses yfinance to download historical data for the requested period (1d, 5d, 1mo, 3mo, 6mo, 1y, etc.).
- Computes:
- current price, high, low
- absolute change and percentage change vs period-open
- RSI (14-period) when enough data exists
- get_stock_data(symbol, period) returns a uniform dict or an error message when data is missing/delisted.
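A hedged sketch of the enrichment math, assuming plain yfinance history data; the real get_stock_data may structure its return differently, but the RSI-14 and percent-change calculations follow the description above:

```python
import yfinance as yf

def get_stock_data(symbol, period="1mo"):
    """Return price/range/RSI stats for one symbol, or an error dict if no data."""
    hist = yf.Ticker(symbol.lstrip("$")).history(period=period)
    if hist.empty:
        return {"error": f"No data for {symbol} (missing or possibly delisted)"}

    close = hist["Close"]
    current, opened = float(close.iloc[-1]), float(close.iloc[0])

    # 14-period RSI (simple rolling-mean variant) only when enough bars exist.
    rsi = None
    if len(close) > 14:
        delta = close.diff()
        avg_gain = delta.clip(lower=0).rolling(14).mean()
        avg_loss = (-delta.clip(upper=0)).rolling(14).mean()
        rsi = float((100 - 100 / (1 + avg_gain / avg_loss)).iloc[-1])

    return {
        "price": round(current, 2),
        "high": round(float(hist["High"].max()), 2),
        "low": round(float(hist["Low"].min()), 2),
        "change": round(current - opened, 2),                          # vs first close of the period
        "percentage_change": round((current - opened) / opened * 100, 2),
        "rsi": round(rsi, 2) if rsi is not None else None,
    }
```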
The ranking algorithm (data_processing.py) — explainable 3-step filter
- Per-subreddit peak selection:
- For each subreddit, pick the stock(s) that have the highest sentiment (or lowest for worst).
- Cross-subreddit frequency:
- Select stocks that reappear across multiple subreddits' peak lists — frequency across communities is rewarded.
- Subreddit mention weight:
- If symbols tie on frequency, break ties using the mention count (how many times it was mentioned in the subreddit where it was strongest).
- Result: repeated, cross-community traction outranks one-off spikes (explainable and reproducible).
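A compact, self-contained sketch of the three steps; the input shape and the pick_top_stock name are assumptions about how data_processing.py represents its per-subreddit results, not its actual signatures:

```python
def pick_top_stock(analysis):
    """analysis maps subreddit -> {symbol: {"sentiment": float, "count": int}}."""
    # Step 1: per-subreddit peak selection (highest sentiment in each community).
    peaks = {}
    for sub, stocks in analysis.items():
        best = max(s["sentiment"] for s in stocks.values())
        peaks[sub] = [sym for sym, s in stocks.items() if s["sentiment"] == best]

    # Step 2: cross-subreddit frequency — count how many peak lists each symbol appears in.
    freq = {}
    for symbols in peaks.values():
        for sym in symbols:
            freq[sym] = freq.get(sym, 0) + 1

    # Step 3: break frequency ties by mention count in the subreddit where the symbol peaked.
    def tiebreak(sym):
        strongest = max((stocks for stocks in analysis.values() if sym in stocks),
                        key=lambda stocks: stocks[sym]["sentiment"])
        return strongest[sym]["count"]

    return max(freq, key=lambda sym: (freq[sym], tiebreak(sym)))
```

Swapping max for min in step 1 gives the Worst variant of the same filter.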
API endpoints (main.py)
- POST /api/home
- request.type == "getgeneralanalysis"
- Returns the cached analysis if present and fresh (<24h); otherwise triggers a fresh analysis.
- request.type == "redogeneralanalysis"
- Forces a fresh analysis and updates the cache (saved to cached_analysis.json).
- request.type == "getgeneralanalysis"
- POST /api/playground
- request.type == "getplaygroundgeneralanalysis"
- Run general analysis with custom parameters (subreddits, limit, comment_limit, sort, time, period).
- request.type == "getplaygroundspecificanalysis"
- Run specific-stock analysis with custom parameters (subreddits, stocks, etc).
- request.type == "getplaygroundgeneralanalysis"
- Notes:
- The API validates Origin and Referer headers (configured for https://www.reddishtrends.com); a sketch of this guard follows below.
- Playground endpoints do not write to the global cache.
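The origin/referer guard mentioned above could look roughly like the following Flask before_request hook; the actual check in main.py may be stricter or applied per route:

```python
from flask import Flask, abort, request

app = Flask(__name__)
ALLOWED_ORIGIN = "https://www.reddishtrends.com"

@app.before_request
def verify_origin():
    # Reject requests whose Origin/Referer do not point at the production site.
    origin = request.headers.get("Origin", "")
    referer = request.headers.get("Referer", "")
    if not (origin.startswith(ALLOWED_ORIGIN) or referer.startswith(ALLOWED_ORIGIN)):
        abort(403)
```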
Scheduler & caching
- Cached file: cached_analysis.json.
- APScheduler jobs:
- Cron job: daily at 12:00 PM US/Eastern → scheduled_analysis() → perform_general_analysis().
- Interval fallback: every 24 hours to guarantee at least one run per day.
- Startup job checks cache age and triggers update if outdated.
- Cached responses include "last_updated" timestamp.
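A minimal sketch of that schedule, assuming APScheduler's BackgroundScheduler inside main.py (where perform_general_analysis is defined); job options such as misfire handling are omitted:

```python
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

def scheduled_analysis():
    # Refresh the cache; perform_general_analysis() is the routine documented above.
    perform_general_analysis()

scheduler = BackgroundScheduler(timezone="US/Eastern")
scheduler.add_job(scheduled_analysis, CronTrigger(hour=12, minute=0))  # daily 12:00 PM ET
scheduler.add_job(scheduled_analysis, "interval", hours=24)            # 24h fallback run
scheduler.start()
```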
Quick examples
Call the engine API (curl)
```bash
curl -X POST http://localhost:5000/api/home \
  -H "Content-Type: application/json" \
  -d '{"request":{"type":"getgeneralanalysis"}}'
```

Playground example (Python)
```python
import requests

payload = {
    "request": {
        "type": "getplaygroundgeneralanalysis",
        "parameters": {
            "subreddits": ["wallstreetbets", "stocks"],
            "limit": 20,
            "comment_limit": 5,
            "sort": "hot",
            "time": None,
            "period": "1mo"
        }
    }
}

r = requests.post("http://localhost:5000/api/playground", json=payload)
print(r.json())
```

Internal usage (engine dev)
```python
from market_sentiment_analysis import run_general_analysis
from data_processing import get_top_stock, get_worst_stock, get_rising_stock

analysis = run_general_analysis(["wallstreetbets", "stocks"], limit=10)
top = get_top_stock(analysis)
worst = get_worst_stock(analysis)
rising = get_rising_stock(analysis, limit=3)
```

Sample Top_Stock output (abbreviated)
```json
{
  "symbol": "$SPY",
  "company_name": "SPDR S&P 500 ETF Trust",
  "count": 3,
  "sentiment": 8.71,
  "price": 576.68,
  "percentage_change": -1.21,
  "rsi": 28.53,
  "GPT_Analysis": {"overview": "...", "prediction": "...", "Confidence Score": 78}
}
```

Operational tips
- Put API keys and secrets in a .env file (do not commit).
- Rate-limit protections: keep `limit` and `comment_limit` reasonable in production to avoid hitting API caps.
- Add unit tests around data_processing.py for tie-breaks and edge cases (delisted symbols, missing data).
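For the .env tip above, a typical loading pattern, assuming python-dotenv; the variable names are illustrative, not the engine's actual configuration keys:

```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment
REDDIT_CLIENT_ID = os.getenv("REDDIT_CLIENT_ID")        # illustrative names
REDDIT_CLIENT_SECRET = os.getenv("REDDIT_CLIENT_SECRET")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
```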
Reddish Trends — Haider Malik
© 2025 Haider Malik. All rights reserved.
