Bayesian Estimation of Sentiment Impact on Stock Prices

Latest headlines → VADER sentiment → Bayesian Student-t regression (PyMC) → next-day log-return and multi-day price forecasts with uncertainty.

Authors
Shreemadhi Babu Rajendra Prasad (24207575) · Saipavan Narayanasamy (24233785) - M.Sc. in Data & Computational Science, University College Dublin

Poster: poster/final_project_poster_A0.pdf

About the project

Goal. Turn daily headlines into a quantitative sentiment signal and measure its predictive effect on next-day returns; produce uncertainty-aware price forecasts over short horizons.

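Why Student-t? Heavy-tailed residuals keep outliers from dominating the fit. Fat tails are common in daily returns, partly as a by-product of volatility clustering, which this model does not capture explicitly.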

Why Bayesian? Full posteriors + diagnostics (ESS, $\hat{R}$) + calibrated prediction intervals.

Why Streamlit? A fast, transparent interface to explore data, diagnostics, and forecasts.
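
To make the Student-t point above concrete, the sketch below compares how strongly a Normal and a Student-t log-density penalise a large residual. It uses only the standard library; the choice of ν = 4 and the residual of six scale units are arbitrary illustrative values, not fitted results from the app.

```python
import math


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a Normal distribution, for comparison."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


# Illustrative values only: a residual six scale units from the centre.
outlier = 6.0
penalty_t = student_t_logpdf(0.0, nu=4) - student_t_logpdf(outlier, nu=4)
penalty_normal = normal_logpdf(0.0) - normal_logpdf(outlier)
```

In this example the Normal log-density drops by 18 nats at the outlier, while the Student-t with ν = 4 drops by about 5.8, so a single extreme return pulls the regression line far less under the heavy-tailed likelihood.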

Workflow overview

Overview

We build a small research app that:

pulls the latest news headlines per ticker,
scores each headline with NLTK VADER (compound),
aggregates to a daily sentiment signal $z_t$, and
fits a Bayesian Student-t regression (with PyMC) for next-day log-return and 3-day price forecasts, reporting 94% HDIs for parameters and 90% prediction intervals (PIs) for prices.
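
The scoring-and-aggregation steps above can be sketched as follows. This is a minimal standard-library illustration, not the app's actual code: the function name `daily_sentiment` and the example scores are hypothetical. In the app, each compound score would presumably come from NLTK's `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def daily_sentiment(scored_headlines):
    """Average per-headline compound scores into one value per calendar day.

    `scored_headlines` is an iterable of (day, compound) pairs, where
    `compound` is a VADER compound score in [-1, 1].
    """
    by_day = defaultdict(list)
    for day, compound in scored_headlines:
        by_day[day].append(compound)
    # Sort by day so the result reads as a time series z_t.
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


# Made-up scores for illustration only (not real headlines).
example = [
    (date(2025, 1, 3), 0.30),
    (date(2025, 1, 2), 0.50),
    (date(2025, 1, 2), -0.10),
]
z = daily_sentiment(example)
```

A plain mean treats every headline equally; weighting by source or recency would be a different design choice.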

What the app does

Headlines → sentiment: For each ticker, fetch recent public headlines and score each one with VADER (compound). Average the scores by day to create $z_t$.
Bayesian regression: Fit a Student-t regression of each day's log-return $r_t$ on the previous day's sentiment $z_{t-1}$ (lag 1). The heavy-tailed likelihood makes the fit robust to outliers.
Uncertainty first-class: Report 94% HDIs for $(\alpha,\beta,\sigma,\nu)$ and 90% PIs for predicted prices.
Forecasts: Produce next-3-day price forecast table and chart.
Comparison: Side-by-side β (sentiment effect) table across two tickers + indexed history vs mean forecast plot.
Reproducible logging: Append each run to a local CSV at results/predictions_log.csv (kept out of Git by .gitignore).
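
The lag-1 set-up in the regression step above means each return is paired with the sentiment from the previous day. A minimal sketch of that alignment is shown below. The function name `lagged_pairs`, the day labels, and the prices are hypothetical, and it assumes "previous day" means the previous row in the price series, which may differ from how the app handles weekends and holidays.

```python
import math


def lagged_pairs(closes, sentiment):
    """Pair each day's log-return r_t with the previous row's sentiment z_{t-1}.

    `closes` is a list of (day_label, close_price) in chronological order;
    `sentiment` maps day_label -> daily sentiment. Rows whose previous day
    has no sentiment value are skipped.
    """
    pairs = []
    for (prev_day, prev_close), (day, close) in zip(closes, closes[1:]):
        if prev_day in sentiment:
            r_t = math.log(close / prev_close)  # log-return on `day`
            pairs.append((day, sentiment[prev_day], r_t))
    return pairs


# Made-up prices and sentiment for illustration only.
closes = [("d0", 100.0), ("d1", 102.0), ("d2", 101.0)]
sentiment = {"d0": 0.2, "d1": -0.1}
pairs = lagged_pairs(closes, sentiment)
```

Each tuple holds the return day, the lagged sentiment used as the predictor, and the log-return used as the response.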

Bayesian Model

We model daily log-returns with heavy tails:

$$ r_t \;=\; \alpha + \beta\, z_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{Student-t}(\nu, 0, \sigma) $$

$r_t$: log-return on day $t$, the day after the sentiment observation
$z_{t-1}$: the previous day's (lag-1) VADER daily average
Parameters $(\alpha,\beta,\sigma,\nu)$ are inferred with PyMC (NUTS).
β answers: does sentiment on day $t-1$ move the return on day $t$?
Price forecasts are obtained by transforming simulated log-return paths to prices.
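
The transformation from log-return paths to prices can be sketched as below. This is a standard-library illustration under stated assumptions, not the app's implementation: the helper names `percentile` and `price_forecast` are hypothetical, and the Gaussian draws are only a stand-in for posterior-predictive draws from the fitted model.

```python
import math
import random


def percentile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def price_forecast(last_close, return_draws):
    """Summarise simulated log-return paths as price forecasts.

    `return_draws` is a list of paths; each path holds one simulated
    log-return per day ahead. The price h days ahead is the last close
    times exp(sum of the first h log-returns on that path).
    """
    horizon = len(return_draws[0])
    rows = []
    for h in range(horizon):
        prices = sorted(
            last_close * math.exp(sum(path[: h + 1])) for path in return_draws
        )
        rows.append({
            "day_ahead": h + 1,
            "price_mean": sum(prices) / len(prices),
            "price_p05": percentile(prices, 0.05),  # lower end of the 90% PI
            "price_p95": percentile(prices, 0.95),  # upper end of the 90% PI
        })
    return rows


# Stand-in draws: Gaussian noise, NOT output of the Student-t model.
rng = random.Random(0)
draws = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(2000)]
table = price_forecast(100.0, draws)
```

Because log-returns add along a path, the interval widens with the horizon, which is why a three-day forecast is less certain than the next-day one.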

Install & Run

Python 3.9+ recommended · Docs

1) Create & activate a virtual environment

Windows

python -m venv venv
venv\Scripts\activate

macOS/Linux

python -m venv venv
source venv/bin/activate

2) Install packages

pip install -r requirements.txt

3) One-time: download VADER lexicon used by NLTK

python -m nltk.downloader vader_lexicon

4) Launch the app

streamlit run app/streamlit_app.py

Open the local URL shown by Streamlit (by default http://localhost:8501).

Using the app

Inputs

Ticker A (required) and Ticker B (optional)
Run Speed: Fast / Standard / Accurate (controls MCMC draws / tuning)

Tabs

Ticker 1 / Ticker 2: company blurb, Latest Headlines, Predicted Log-Return, Next Price (90% PI), Today’s Sentiment, Posterior Summary, 3-day Price Forecast (table + chart).
Comparison: quick table of $\beta$ (mean + 94% HDI) and Day-1 price forecast; indexed history vs mean forecast dots.
Run log: a banner displays whether the log CSV was Created or Appended. You can also download the log directly from the UI.

Outputs & Run Log

Figures & tables shown in the UI

Predicted log-return (mean + 94% HDI)
Next price (mean + 90% PI)
Posterior summary for $(\alpha, \beta, \sigma, \nu)$ with diagnostics (ESS, $\hat{R}$)
3-day price forecast: (day_ahead, price_mean, price_p05, price_p95) + chart

Run log CSV: results/predictions_log.csv (local; ignored by Git)
Contains timestamp, tickers, posterior summaries and key forecast numbers (including day-ahead price mean and PI endpoints).
Useful for auditing, comparisons across runs, and lightweight experimentation.
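
The create-or-append behaviour of the run log can be sketched as follows. This is an assumption-laden illustration rather than the app's code: the function name `append_run` and most column names are hypothetical (only `price_mean`, `price_p05` and `price_p95` appear in the forecast table above), and the values are placeholders. The demo writes to a temporary directory so it does not touch a real results/predictions_log.csv.

```python
import csv
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_run(log_path, row):
    """Append one run to the CSV log and return "Created" or "Appended"."""
    path = Path(log_path)
    is_new = not path.exists()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()  # header is written once, on creation
        writer.writerow(row)
    return "Created" if is_new else "Appended"


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "results" / "predictions_log.csv"
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "ticker": "AAA",      # placeholder ticker
        "beta_mean": 0.0,     # placeholder values, not real results
        "price_mean": 100.0,
        "price_p05": 97.0,
        "price_p95": 103.0,
    }
    first = append_run(log, row)
    second = append_run(log, row)
    with log.open(newline="", encoding="utf-8") as handle:
        logged = list(csv.DictReader(handle))
```

This sketch assumes every run writes the same columns; if the schema changes between runs, the header written on creation will no longer match later rows.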

Outputs

Forecast charts

Comparison

Comparison table

Tested Predictions

We tested the app by predicting the next-day return of BIC on 20 August and checking the forecast against the actual closing price on 21 August, as reported by Yahoo Finance.

Repo structure

project/
├─ app/
│  └─ streamlit_app.py          
├─ poster/
│  └─ final_project_poster_A0.pdf
├─ literature/                   
├─ outputs/                      
├─ results/
│  └─ predictions_log.csv        
├─ requirements.txt
└─ README.md

Limitations & future work

The sentiment signal's predictive power may be weak or noisy; finding real-world alpha is hard.
Headline sampling & VADER rules can bias the signal — try domain-tuned or LLM sentiment.
Extend to multivariate models (market/sector factors), hierarchical priors, or state-space models with stochastic volatility.
Evaluation: add rolling backtests; CRPS/quantile loss for PIs; compare with AR/ARX/GARCH baselines.
Scheduled data refresh, richer news sources, and caching.
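
For the evaluation item above, two of the simplest checks on prediction intervals are empirical coverage and the quantile (pinball) loss. The sketch below is a hypothetical illustration with made-up numbers. It is not part of the app, and the function names are assumptions.

```python
def pinball_loss(actual, predicted_quantile, q):
    """Quantile (pinball) loss for one forecast of the q-th quantile."""
    diff = actual - predicted_quantile
    return max(q * diff, (q - 1) * diff)


def interval_coverage(actuals, lowers, uppers):
    """Share of realised values that fall inside [lower, upper]."""
    hits = sum(lo <= a <= hi for a, lo, hi in zip(actuals, lowers, uppers))
    return hits / len(actuals)


# Made-up prices and interval endpoints for illustration only.
actuals = [101.0, 98.0, 105.0, 100.0]
p05 = [97.0, 97.5, 98.0, 98.5]
p95 = [103.0, 103.5, 104.0, 104.5]

coverage = interval_coverage(actuals, p05, p95)
mean_loss_p95 = sum(
    pinball_loss(a, hi, 0.95) for a, hi in zip(actuals, p95)
) / len(actuals)
```

Over a long rolling backtest, a well-calibrated 90% PI should contain roughly 90% of the realised prices; a lower average pinball loss indicates sharper, better-placed quantile forecasts.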

Disclaimer: For research/education only — not financial advice.

Tech stack

Python · Streamlit · PyMC · ArviZ · NumPy · pandas · Matplotlib · NLTK (VADER) · requests/bs4 · yfinance

License

MIT — see LICENSE.

Cite

If you reference this project:

Narayanasamy, S.; Rajendra Prasad, S.B. (2025). Bayesian Estimation of Sentiment Impact on Stock Prices. Version 1.0.0. MIT License. Poster: poster/final_project_poster_A0.pdf.

BibTeX

@misc{narayanasamy_prasad_2025,
  title={Bayesian Estimation of Sentiment Impact on Stock Prices},
  author={Narayanasamy, Saipavan and Rajendra Prasad, Shreemadhi Babu},
  year={2025},
  note={Version 1.0.0. Poster: poster/final_project_poster_A0.pdf},
  howpublished={GitHub repository}
}

Acknowledgments

VADER sentiment (NLTK)
Public headline sources used by the app; Yahoo price data
UCD — ACM40960 Projects in Maths Modelling

Contributors

Saipavan Narayanasamy (24233785) - saipavan.narayanasamy@ucdconnect.ie
Shreemadhi Babu Rajendra Prasad (24207575) - shreemadhi.baburajendrapra@ucdconnect.ie
