
Optimizing Stock Price Forecasting Using Modern RNNs

Evaluating the Impact of Sentiment Analysis on LSTM, GRU, and Attention-CNN-LSTM Models

TensorFlow · Scikit-Learn · Gensim · SpaCy · Pandas · NumPy · Matplotlib · Seaborn

📄 WRITE-UP: Project Write-Up
📔 NOTEBOOK: Jupyter Notebook

📌 Overview

This project focuses on optimizing the predictive capabilities of modern RNN models for stock price movements of TSLA, AAPL, and GOOG. The goal is to enhance forecasting accuracy by utilizing historical stock data and news sentiment data. The analysis evaluates the performance of LSTM, GRU, and Attention-CNN-LSTM models, tested with and without sentiment data, to determine their effectiveness in stock price prediction.

🚀 Key Takeaways

  1. GRU models generally outperformed LSTM models in both accuracy and RMSE, suggesting that a simpler recurrent architecture suits this dataset.
  2. The Attention-CNN-LSTM hybrid, despite its added complexity, consistently performed worse than the plain LSTM and GRU models.
  3. Incorporating news sentiment data consistently lowered accuracy and raised RMSE, indicating that it added noise rather than predictive signal.
  4. GRU models were the most robust to this noise, performing well both with and without sentiment features.
  5. Balancing model complexity against input features, and filtering news to articles directly relevant to the target stock, are the most promising directions for improvement.

👇 Jump to results and discussion

📂 Table of Contents

📋 Motivation

My aim in this project was to explore the dynamic field of stock price prediction by modeling a variety of technology-sector stocks: the volatile TSLA, the innovative AAPL, and the robust GOOG. Professionally, the project deepened my expertise in advanced time series analysis, sharpening forecasting accuracy and offering insights useful to investors and financial analysts. It also allowed me to examine the role of architectural complexity in time series modeling and to demonstrate the impact of language data on stock predictions, combining technical depth with valuable experience at the intersection of finance and technology.

🎯 Approach

  1. Data Collection: Acquired stock data from Yahoo Finance and news data from Alpha Vantage covering TSLA, AAPL, and GOOG from March 3rd, 2022, to July 3rd, 2024.
  2. Data Preprocessing: Cleaned and engineered features for both stock and news datasets, including handling missing values, converting date formats, creating lag features, and performing sentiment analysis.
  3. Baseline Modeling: Implemented Moving Average and XGBoost Regressor models for initial stock price predictions based on historical data and engineered features.
  4. Model Development: Developed LSTM, GRU, and Attention-CNN-LSTM models to capture sequential dependencies and sentiment-driven insights for improved stock price prediction.
  5. Evaluation: Evaluated model performance using metrics such as RMSE and accuracy, visualizing predictions against actual stock prices and sentiment trends to assess effectiveness in capturing market dynamics and sentiment-driven fluctuations.

💾 Dataset

  • Time Frame: March 3, 2022 to July 3, 2024 (about 2 years and 4 months)
  • Stock Choices
    • TSLA: Chosen to challenge the models with its high volatility during the selected period, complemented by extensive news and social media coverage, and leadership in AI and autonomous driving technologies.
    • AAPL: Selected for its consistent innovation, significant media presence, and diverse product range.
    • GOOG: Included due to its robust news coverage, stable growth trajectory, and extensive technological initiatives.
  • Sources
    • News Data: Alpha Vantage
      • Provides comprehensive financial APIs for stock market data, including news headlines and sentiment analysis.
    • Stock Data: Yahoo Finance (via the yfinance package)
      • Offers historical and real-time stock data accessible programmatically through the yfinance Python package (a minimal retrieval sketch for both sources follows below).
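The snippet below is a minimal retrieval sketch for both sources, assuming the tickers and date range listed above. The Alpha Vantage call uses its NEWS_SENTIMENT endpoint, and ALPHA_VANTAGE_KEY is a placeholder for a personal API key; the project's actual download scripts may differ.

```python
import requests
import yfinance as yf

TICKERS = ["TSLA", "AAPL", "GOOG"]
START, END = "2022-03-03", "2024-07-03"

# Daily OHLCV history from Yahoo Finance via the yfinance package
prices = {ticker: yf.download(ticker, start=START, end=END) for ticker in TICKERS}

# News titles and summaries from Alpha Vantage's NEWS_SENTIMENT endpoint.
# ALPHA_VANTAGE_KEY is a placeholder for a personal API key.
ALPHA_VANTAGE_KEY = "YOUR_API_KEY"

def fetch_news(ticker: str) -> list[dict]:
    """Return the news feed (title, summary, publication time) for one ticker."""
    params = {
        "function": "NEWS_SENTIMENT",
        "tickers": ticker,
        "time_from": "20220303T0000",   # start of the study window
        "limit": 1000,
        "apikey": ALPHA_VANTAGE_KEY,
    }
    response = requests.get("https://www.alphavantage.co/query", params=params)
    response.raise_for_status()
    return response.json().get("feed", [])

news = {ticker: fetch_news(ticker) for ticker in TICKERS}
```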

🔨 Preprocessing

Preprocessing

  • Standard data preprocessing: missing value check, duplicate removal, data type conversion.
  • Filtered both datasets so that the news and stock data cover the same time window, particularly after feature engineering.
  • Used the SpaCy and SpaCy-Cleaner libraries to remove stopwords and punctuation and to lemmatize tokens, normalizing the text data (a rough equivalent is sketched below).
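As a rough illustration of this cleaning step, the sketch below reproduces the same normalization with plain spaCy (the project itself wires these steps through SpaCy-Cleaner); the en_core_web_sm pipeline is an assumption.

```python
import spacy

# Small English pipeline; the parser and NER are not needed for cleaning
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])

def clean_text(text: str) -> str:
    """Lowercased lemmas with stopwords, punctuation, and whitespace removed."""
    doc = nlp(text)
    return " ".join(
        tok.lemma_.lower()
        for tok in doc
        if not tok.is_stop and not tok.is_punct and not tok.is_space
    )

# Example: normalize a headline before sentiment scoring
clean_text("Tesla's quarterly deliveries beat expectations, shares jump")
```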

Feature Engineering

  • Implemented the FinBERT model using PyTorch (the model's default library) to tokenize and compute sentiment scores (positive, negative, neutral) for both news titles and summaries (see the sketch after this list).
  • Computed mean sentiment scores (title_pos, title_neg, title_neu, summary_pos, summary_neg, summary_neu) grouped by stock and date to summarize sentiment trends.
  • Extracted time features such as day of the week, day, month, and year from the date column to capture temporal patterns.
  • Created lag features (lag_1 to lag_7) to incorporate historical prices as predictors.
  • Computed rolling mean and standard deviation over 7-day and 14-day windows to capture short-term trends and volatility.
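A condensed sketch of the sentiment and price features is shown below. It assumes the ProsusAI/finbert checkpoint from Hugging Face and a per-ticker DataFrame with a 'close' column; the exact checkpoint and column names used in the project may differ.

```python
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# FinBERT checkpoint (assumed); its labels are positive / negative / neutral
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")

def sentiment_scores(texts: list[str]) -> pd.DataFrame:
    """One row of positive/negative/neutral probabilities per input text."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    labels = [model.config.id2label[i] for i in range(probs.shape[1])]
    return pd.DataFrame(probs.numpy(), columns=labels)

def add_price_features(df: pd.DataFrame) -> pd.DataFrame:
    """Lag and rolling-window features on the closing price ('close' is assumed)."""
    for lag in range(1, 8):                        # lag_1 ... lag_7
        df[f"lag_{lag}"] = df["close"].shift(lag)
    for window in (7, 14):                         # short-term trend and volatility
        df[f"rolling_mean_{window}"] = df["close"].rolling(window).mean()
        df[f"rolling_std_{window}"] = df["close"].rolling(window).std()
    return df
```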

🧠 Model Development

Prediction Models

Baseline: Moving Average and XGBoost Regressor

  • Moving Average: the average closing price over the past 7 and 14 days is used as the prediction.
  • XGBoost Regressor: a gradient-boosting implementation that builds decision trees sequentially, each correcting the errors of the previous ones; it incorporates regularization, handles missing values, supports parallel processing, and prunes branches for better performance and scalability (a minimal sketch of both baselines follows this list).
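The sketch below illustrates both baselines under stated assumptions: a chronologically ordered per-ticker DataFrame df with a 'close' column and the engineered features from the previous section, plus an arbitrary 60-day test split and hyperparameters that are not the project's tuned values.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from xgboost import XGBRegressor

# Moving-average baseline: tomorrow's close is predicted as the mean of the
# previous 7 closes (shifted so only past prices are used).
df["ma7_pred"] = df["close"].rolling(7).mean().shift(1)

# XGBoost baseline on the engineered features (column prefixes are assumptions).
features = [c for c in df.columns if c.startswith(("lag_", "rolling_", "title_", "summary_"))]
train, test = df.iloc[:-60], df.iloc[-60:]         # simple chronological split

xgb = XGBRegressor(n_estimators=500, learning_rate=0.05, max_depth=4)
xgb.fit(train[features], train["close"])
preds = xgb.predict(test[features])

print("XGBoost RMSE:", np.sqrt(mean_squared_error(test["close"], preds)))
```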

Vanilla and Sentiment-Enhanced RNN Models

  • LSTM: uses input, output, and forget gates to manage long-term dependencies and prevent the vanishing gradient problem in sequential data.
  • GRU: a simplified LSTM using reset and update gates to manage dependencies and improve training efficiency.
  • Attention-CNN-LSTM (only with sentiment data): a hybrid model combining convolutional layers for feature extraction, attention mechanisms for focusing on relevant information, and LSTM layers for sequence prediction.
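The sketches below illustrate these architectures in Keras. The window length, layer sizes, and attention settings are assumptions, since the README does not fix them; the actual values come from the hyperparameter search described under Implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN, N_FEATURES = 14, 1      # e.g. 14-day windows of one feature (assumption)

def build_recurrent(cell: str = "gru", units: int = 64) -> tf.keras.Model:
    """Plain LSTM or GRU regressor predicting the next closing price."""
    rnn = layers.GRU if cell == "gru" else layers.LSTM
    model = tf.keras.Sequential([
        layers.Input(shape=(SEQ_LEN, N_FEATURES)),
        rnn(units),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def build_attention_cnn_lstm(units: int = 64) -> tf.keras.Model:
    """Hybrid model: Conv1D feature extraction, self-attention, then an LSTM head."""
    inputs = layers.Input(shape=(SEQ_LEN, N_FEATURES))
    x = layers.Conv1D(32, kernel_size=3, padding="same", activation="relu")(inputs)
    x = layers.MultiHeadAttention(num_heads=2, key_dim=16)(x, x)   # self-attention
    x = layers.LSTM(units)(x)
    outputs = layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```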

Implementation

1. Data Preparation: StockData

  • Packages all data preparation tasks, including scaling, sequence creation, and train-test splitting.
  • Handles the inverse transformation of scaled data for result interpretation.
  • Plots the data by training and test sets for visualization.
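A minimal sketch of such a helper, assuming a univariate closing-price series, a 14-day window, and an 80/20 chronological split (the plotting method is omitted, and for brevity the scaler is fit on the full series, whereas the original class may fit it on the training split only):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

class StockData:
    """Scales a price series, builds (window, next-value) pairs, and splits them."""

    def __init__(self, series: np.ndarray, seq_len: int = 14, train_frac: float = 0.8):
        self.scaler = MinMaxScaler()
        scaled = self.scaler.fit_transform(series.reshape(-1, 1))
        X, y = [], []
        for i in range(seq_len, len(scaled)):
            X.append(scaled[i - seq_len:i])     # the previous seq_len days
            y.append(scaled[i])                 # the next day's scaled close
        X, y = np.array(X), np.array(y)
        split = int(len(X) * train_frac)        # chronological train/test split
        self.X_train, self.X_test = X[:split], X[split:]
        self.y_train, self.y_test = y[:split], y[split:]

    def inverse(self, values: np.ndarray) -> np.ndarray:
        """Map scaled values (e.g. predictions) back to the original price scale."""
        return self.scaler.inverse_transform(values.reshape(-1, 1)).ravel()
```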

2. Model Development: Hypermodel

  • Implements Keras Tuner for hyperparameter tuning and model selection (a sketch follows this list).
  • Defines the model architecture and hyperparameter search space.
  • Compiles the model for fine-tuning with the chosen optimizer and loss function.
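A sketch of such a hypermodel with Keras Tuner is shown below; the search space (cell type, units, dropout, learning rate) is an assumption, not the project's exact configuration.

```python
import keras_tuner as kt
import tensorflow as tf
from tensorflow.keras import layers

class StockHyperModel(kt.HyperModel):
    """Tunable LSTM/GRU regressor; build() defines the search space."""

    def __init__(self, input_shape):
        super().__init__()
        self.input_shape = input_shape

    def build(self, hp):
        rnn = layers.GRU if hp.Choice("cell", ["gru", "lstm"]) == "gru" else layers.LSTM
        model = tf.keras.Sequential([
            layers.Input(shape=self.input_shape),
            rnn(hp.Int("units", min_value=32, max_value=128, step=32)),
            layers.Dropout(hp.Float("dropout", 0.0, 0.4, step=0.1)),
            layers.Dense(1),
        ])
        lr = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
        model.compile(optimizer=tf.keras.optimizers.Adam(lr), loss="mse")
        return model

# Example: random search over the space defined above
tuner = kt.RandomSearch(StockHyperModel((14, 1)), objective="val_loss", max_trials=10)
```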

3. Training and Evaluation: StockModel

  • Integrates data and hypermodel classes for model training and evaluation.
  • Conducts hyperparameter tuning to find the best model configuration.
  • Evaluates the best model on the test data, calculating RMSE and accuracy.
  • Plots predicted vs. actual stock prices for visualization of model performance.
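The evaluation reports RMSE and accuracy; a sketch of both metrics is below, where accuracy is interpreted as directional accuracy (the share of days on which the predicted direction of the price change is correct), which is a common convention and an assumption here.

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-squared error on the original price scale."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def directional_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Fraction of days on which the predicted price change has the correct sign."""
    return float(np.mean(np.sign(np.diff(y_true)) == np.sign(np.diff(y_pred))))
```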

📈 Results and Discussion

Model Architecture and Performance

  • GRU models generally outperformed LSTM models in terms of both accuracy and RMSE, indicating that the GRU's simpler architecture might be more effective for stock price prediction in this dataset.
  • Attention-CNN-LSTM models, despite their added complexity, provided no performance gains and consistently performed worse than the LSTM and GRU models.

Role of Sentiment in Stock Prediction

  • Incorporating sentiment data consistently resulted in lower accuracy and higher RMSE, indicating that it added noise rather than improving model performance; the effect was most noticeable with LSTM models for GOOG.
  • GRU models performed robustly both with and without sentiment data, suggesting they handle noise better, though they still gained no significant benefit from sentiment analysis.

Overarching Conclusions

  • Simpler recurrent architectures such as GRUs yielded better and more robust results in terms of accuracy, RMSE, and noise-handling compared to traditional or complex hybrid models.
  • Filtering and refining input data, such as focusing on relevant news articles, may reduce noise and improve the relevance of sentiment analysis and its impact on stock prediction models.
  • Achieving a balance between model complexity and input features is crucial for optimal stock prediction: overly complex models like the Attention-CNN-LSTM performed significantly worse than all other models, and models incorporating sentiment data consistently displayed higher RMSE than those without it.

πŸͺ Future Work

  1. Additional Refinement of News Data: Focus on news directly related to the target stock to reduce noise and improve relevance.
  2. Testing with Other Stocks and Asset Types: Evaluate the models' performance on a broader range of stocks and asset classes, such as other tech companies and different index funds.
  3. Incorporating Alternative Data Sources: Integrate data from financial reports or economic indicators to enhance model inputs and potentially improve prediction accuracy.
