This repository contains the implementation of a sentiment analysis system designed to classify e-commerce product reviews as positive or negative. The project combines machine learning, transfer learning, and web development techniques to provide an end-to-end solution for real-time sentiment analysis.
Understanding customer sentiment in product reviews is crucial for businesses and consumers. This project aims to address this challenge by:
- Developing an initial model using a Recurrent Neural Network (RNN) for sentiment classification.
- Improving accuracy using transfer learning with a pre-trained BERT model.
- Deploying the final model in a Flask-based web application that provides real-time predictions for user-input Amazon product URLs.
- Python
- TensorFlow
- PyTorch
- Hugging Face Transformers
- Flask
- BeautifulSoup
- Scrapy
- Matplotlib
- Seaborn
- Data Collection
  - Datasets of reviews were collected from Amazon, Yelp, and IMDb.
  - Additional reviews were scraped using Python tools.
- Preprocessing
  - Text data was cleaned using tokenization, stop word removal, and lemmatization.
- Model Development
  - A custom RNN was initially trained but achieved low accuracy (~60%).
  - A pre-trained BERT model was fine-tuned, achieving 92% accuracy.
- Web App Deployment
  - A Flask-based app was developed where users can input an Amazon product URL.
  - The app scrapes reviews from the product page and performs real-time sentiment analysis.
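
The preprocessing step listed above (tokenization, stop word removal, lemmatization) can be illustrated with a minimal, standard-library-only sketch. This is not the project's actual code: the `preprocess` function, the `STOP_WORDS` set, and the `LEMMAS` lookup table are all hypothetical, and the lookup table is only a stand-in for a real lemmatizer such as those provided by NLTK or spaCy.

```python
import re
from typing import List

# Tiny illustrative stop-word list (assumption); a real pipeline would
# likely use a fuller list from a library such as NLTK or spaCy.
STOP_WORDS = {"a", "an", "and", "but", "i", "in", "is", "it",
              "of", "the", "this", "to", "was", "were"}

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"batteries": "battery", "days": "day", "died": "die", "loved": "love"}


def preprocess(review: str) -> List[str]:
    """Lowercase, tokenize, drop stop words, then map tokens to lemmas."""
    # Tokenization: keep runs of letters and apostrophes, discard the rest.
    tokens = re.findall(r"[a-z']+", review.lower())
    # Stop word removal.
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Lemmatization stand-in: unknown tokens pass through unchanged.
    return [LEMMAS.get(t, t) for t in kept]


print(preprocess("I loved this phone, but the batteries died in two days!"))
# -> ['love', 'phone', 'battery', 'die', 'two', 'day']
```

Keeping the three stages as separate steps makes it easy to swap the lookup table for a library lemmatizer without touching the rest of the function.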
| Metric | RNN | BERT |
|---|---|---|
| Accuracy | ~60% | 92.18% |
| Precision | - | 89.45% |
| Recall | - | 95.64% |
| F1-Score | - | 92.45% |
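
As a quick sanity check on the BERT figures above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values; the `f1_score` helper is a hypothetical name used only for illustration, not a function from this repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded BERT precision and recall from the results table.
bert_f1 = f1_score(0.8945, 0.9564)
print(f"{bert_f1:.4f}")
```

The result comes out at roughly 92.4%, which is close to the reported 92.45%; the small gap is in line with what rounding of the inputs would produce.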
- Python 3.8 or later
- Install the required libraries:

      pip install -r requirements.txt
- Clone the repository:

      git clone https://github.com/username/sentiment-analysis
      cd sentiment-analysis

- Run the Flask app:

      python app.py

- Open your browser and navigate to http://127.0.0.1:5000.
- Input an Amazon product URL to get a sentiment analysis report.
- Mitigate overfitting in the BERT model using regularization techniques.
- Expand the dataset to include reviews from more diverse domains.
- Explore more advanced transfer learning models like GPT or RoBERTa.
- Kevin Igweh
- Bruna Jacinto Grassi
We thank Dr. Ajmery Sultana for her guidance and support throughout this project.