This project aims to predict the closing price of TATA Motors stock using various machine learning algorithms and technical indicators. The data used for this project ranges from 2020 to 2023.
The following libraries are used in this project:
- pandas for data manipulation and analysis
- yfinance for downloading stock data from Yahoo Finance
- numpy for mathematical operations
- matplotlib and seaborn for plotting and visualization
- plotly for interactive charts
- cufflinks for creating charts using pandas dataframes
- warnings for ignoring warnings
- sklearn for machine learning algorithms and evaluation metrics
The data for TATA Motors stock is downloaded from Yahoo Finance using the yfinance library, with the start and end dates set to cover 1999 to 2023. The data is then explored to understand the general characteristics and trends of the stock.
The following exploratory data analysis techniques are used in this project:
- Calculation of mean, median, standard deviation, max and min of closing price and opening price
- Distribution of daily returns
- Candlestick chart to show the variation between the opening, closing, highest and lowest prices in each period
- Moving average chart to show the trend of the closing price and opening price
- Correlation heatmap to show the relationship between different features
- Technical indicators such as simple moving average and Bollinger Bands to identify trends and volatility
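As a rough illustration of the first two items in the list above, the sketch below computes the summary statistics and the daily returns for a closing-price series. The prices are synthetic and randomly generated (an assumption made so the example runs without a download); in the project the `Close` column of the yfinance data would take their place.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices stand in for the downloaded data (assumption).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=60)
close = pd.Series(400 + rng.normal(0, 5, size=60).cumsum(), index=dates, name="Close")

# Summary statistics of the closing price.
summary = close.agg(["mean", "median", "std", "max", "min"])

# Daily return: percentage change from the previous trading day's close.
# The first value has no previous day, so it is dropped.
daily_returns = close.pct_change().dropna()

print(summary)
print(daily_returns.describe())
```

The distribution of `daily_returns` can then be plotted as a histogram with matplotlib or seaborn.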
The following features are engineered in this project:
- Weekly moving average for the Closing Price
- Bollinger Bands for the Closing Price
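A minimal sketch of how these two features could be built with pandas is shown below. The window sizes are assumptions, not values taken from the project: a 5-trading-day window is used for the weekly moving average, and the conventional 20-day window with bands at two standard deviations is used for the Bollinger Bands. The prices are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices stand in for the downloaded data (assumption).
rng = np.random.default_rng(1)
dates = pd.bdate_range("2023-01-02", periods=60)
df = pd.DataFrame({"Close": 400 + rng.normal(0, 5, size=60).cumsum()}, index=dates)

# Weekly moving average: mean of the last 5 trading days (assumed window).
df["SMA_5"] = df["Close"].rolling(window=5).mean()

# Bollinger Bands: 20-day moving average plus/minus 2 rolling standard deviations.
# Note: pandas' rolling std defaults to the sample standard deviation (ddof=1).
mid = df["Close"].rolling(window=20).mean()
std = df["Close"].rolling(window=20).std()
df["BB_mid"] = mid
df["BB_upper"] = mid + 2 * std
df["BB_lower"] = mid - 2 * std

# The first rows have incomplete windows and contain NaN, so drop them.
features = df.dropna()
print(features.tail())
```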
The following machine learning algorithms are used in this project:
- Linear Regression
- Random Forest Regressor
- Support Vector Regression (SVR)
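The sketch below shows one way the three models could be fitted and compared with scikit-learn. The data is synthetic, and the 80/20 chronological split and the hyperparameters are assumptions for illustration only, so the printed scores will not match the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.svm import SVR

# Synthetic features and target stand in for the engineered stock features (assumption).
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.05, size=200)

# Chronological split: the first 80% of rows for training, the rest for testing.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=100, random_state=0),
    "SVR": SVR(),
}

# Fit each model and record its R2 score on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 = {scores[name]:.4f}")
```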
Before training our models, we need to scale our data to ensure that the scale of each feature is consistent. This is important because some algorithms, such as SVR, are sensitive to the scale of the input data. To scale the data, we will use the MinMaxScaler class from the scikit-learn library. This class transforms each feature by subtracting its minimum and dividing by its range (maximum minus minimum), which maps the values to the interval [0, 1] by default. Subtracting the mean and dividing by the standard deviation is what StandardScaler does instead.
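As a small illustration of the scaling step, the sketch below applies MinMaxScaler to a toy feature matrix and checks the result against the min-max formula computed by hand. The numbers are made up, and fitting the scaler on the training rows only is an assumption about good practice rather than something stated in the project.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy feature matrix: two features on very different scales (made-up values).
X_train = np.array([[100.0, 1.0], [150.0, 3.0], [200.0, 5.0]])
X_test = np.array([[175.0, 2.0]])

# Fit on the training rows only, then reuse the same min and range for the test rows.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# The same transformation written out: (x - min) / (max - min), per feature.
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)
manual = (X_train - col_min) / (col_max - col_min)

print(X_train_scaled)
print(X_test_scaled)
```

After scaling, every training feature lies in [0, 1]; test values can fall outside that interval if they exceed the range seen during training.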
The models are trained and tested, and each one is evaluated using the R2 score:
- R2 score for Linear Regression: 0.9516679456767138
- R2 score for Random Forest Regressor: 0.9298548382265217
- R2 score for Support Vector Regression: 0.34399857626537955
The best performing model is selected based on the evaluation metrics.
Based on the R2 scores, the linear regression model performed best, with an R2 score of 0.9516679456767138.
The Linear Regression model will be used to predict the closing price of Tata Motors stock. After that, we will evaluate the model by checking the R2 score, mean absolute error, and mean squared error. We will then use this model to make predictions on the future closing prices of Tata Motors stock. Finally, the actual and predicted values are plotted in a scatter plot, and an inference chart is plotted to show how the predicted prices support the long-term momentum.
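The evaluation step described above can be sketched as follows. The data is synthetic and the split is an assumption, so the printed metrics are illustrative and are not the project's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic features and target stand in for the scaled stock features (assumption).
rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(100, 2))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0, 0.1, size=100)

# Chronological split: first 80 rows for training, last 20 for testing.
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# The three metrics named in the text.
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)

print(f"R2:  {r2:.4f}")
print(f"MAE: {mae:.4f}")
print(f"MSE: {mse:.4f}")
```

A scatter plot of `y_test` against `y_pred` (for example with `matplotlib.pyplot.scatter`) gives the actual-versus-predicted view; points close to the diagonal indicate accurate predictions.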