This project evaluates multiple approaches to sentiment classification on financial text data, ranging from traditional ML models to zero-shot large language models (LLMs).
We benchmark the performance of several models on financial sentiment classification, comparing accuracy, F1 scores, and per-source breakdowns across diverse datasets.
| Model | Type | Notes |
|---|---|---|
| Logistic Regression | Traditional ML | TF-IDF + LogisticRegression |
| FinBERT | Transformer | BERT fine-tuned on financial text |
| SpaCy + TextBlob | Rule-based | Polarity → Label mapping |
| o4-mini (Azure) | LLM | o4-mini via Azure OpenAI |
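The "Polarity → Label mapping" for the rule-based model can be sketched as a simple thresholding step. This is only an illustration, not necessarily what `4_spacy_textblob.ipynb` does: the function name `polarity_to_label`, the `0.05` threshold, and the three label names are assumptions. The only fact relied on is that TextBlob reports polarity as a float in the range -1.0 to 1.0.

```python
def polarity_to_label(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    The threshold and the label names are illustrative assumptions.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    # Scores close to zero are treated as neutral.
    return "neutral"


# Toy polarity scores, made up for this example.
scores = [0.62, -0.40, 0.01]
labels = [polarity_to_label(s) for s in scores]
print(labels)  # ['positive', 'negative', 'neutral']
```

A symmetric dead zone around zero keeps weakly polar sentences from being forced into a positive or negative class; the width of that zone would need tuning on the actual data.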
project-root/
├── notebooks/ ← Main experiment notebooks
│ ├── 1_data_preprocessing.ipynb
│ ├── 2_logistic_regression_baseline.ipynb
│ ├── 3_finbert_inference.ipynb
│ ├── 4_spacy_textblob.ipynb
│ ├── 5_o4mini_inference.ipynb
│ └── demo.ipynb ← 📊 Summary notebook with visuals
│
├── scripts/ ← Reusable utilities
│ ├── metrics.py
│ ├── preprocessing.py
│ └── plot_utils.py
│
├── data/
│ └── processed/ ← Cleaned CSVs from preprocessing
│
├── models/ ← Saved predictions & classification reports
│
└── README.md
git clone https://github.com/your-username/financial-sentiment-analysis.git
cd financial-sentiment-analysis
pip install -r requirements.txt

Requires: Python 3.10+, Jupyter, scikit-learn, transformers, spacy, seaborn
Optional:
python -m spacy download en_core_web_sm

Start by running the following notebooks in order:
- 1_data_preprocessing.ipynb – Loads and prepares datasets
- 2_logistic_regression_baseline.ipynb
- 3_finbert_inference.ipynb
- 4_spacy_textblob.ipynb
- 5_o4mini_inference.ipynb
- demo.ipynb – 📊 Visit this first! Summary visualizations and model comparison.
A side-by-side comparison of model F1 scores by source and sentiment class is provided in demo.ipynb.
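As a rough idea of what "F1 by source and sentiment class" means, the following dependency-free sketch groups predictions by source and computes a per-class F1 for each group. It is not the code in `scripts/metrics.py`; the function names, the source names (`news`, `tweets`), the label set, and the records are all made up for illustration.

```python
from collections import defaultdict


def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Assumption: a class absent from both truth and predictions scores 0.0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def f1_by_source(records, labels):
    """Group (source, true_label, predicted_label) records by source."""
    grouped = defaultdict(lambda: ([], []))
    for source, true_label, predicted_label in records:
        grouped[source][0].append(true_label)
        grouped[source][1].append(predicted_label)
    return {src: f1_per_class(t, p, labels) for src, (t, p) in grouped.items()}


# Toy records, invented for this example only.
records = [
    ("news", "positive", "positive"),
    ("news", "negative", "positive"),
    ("news", "negative", "negative"),
    ("tweets", "neutral", "neutral"),
    ("tweets", "positive", "neutral"),
]
labels = ["negative", "neutral", "positive"]
result = f1_by_source(records, labels)

for source, per_class in result.items():
    for label, score in per_class.items():
        print(f"{source:7s} {label:9s} F1={score:.2f}")
```

In practice the same breakdown could be produced with scikit-learn's `f1_score(..., average=None)` applied to each source's rows; the hand-written version is shown only to make the definition explicit.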
To run 5_o4mini_inference.ipynb, you need:
- Azure OpenAI access
- A deployed o4-mini model (deployment name: o4-mini)
- Your endpoint and key in the notebook (environment variables are recommended over hard-coded values)
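One way to keep the endpoint and key out of the notebook is to read them from environment variables and fail early if they are missing. This is a minimal sketch, not the notebook's actual code: the helper `load_azure_settings` and the variable names `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `AZURE_OPENAI_DEPLOYMENT` are assumptions chosen for illustration, so adjust them to whatever your setup uses.

```python
import os


def load_azure_settings(env=os.environ):
    """Read Azure OpenAI settings from environment variables.

    The variable names below are illustrative assumptions.
    """
    required = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "endpoint": env["AZURE_OPENAI_ENDPOINT"],
        "api_key": env["AZURE_OPENAI_API_KEY"],
        # Falls back to the deployment name mentioned above.
        "deployment": env.get("AZURE_OPENAI_DEPLOYMENT", "o4-mini"),
    }


# Usage with placeholder values (passed as a dict so nothing real is needed here).
settings = load_azure_settings({
    "AZURE_OPENAI_ENDPOINT": "https://example-resource.openai.azure.com",
    "AZURE_OPENAI_API_KEY": "placeholder-key",
})
print(settings["deployment"])  # o4-mini
```

In the notebook, calling `load_azure_settings()` with no arguments would read from the real environment, so the key never has to appear in a saved cell.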
This project is for educational and research purposes only. You may adapt and share with attribution.