Sentiment Analysis on Amazon Product Reviews

Overview

This repository contains the project "Sentiment Analysis on Amazon Product Reviews" by Filipa Rodrigues, Mariana Borralho, and Nuno Ferreira, students at ISCTE-IUL. The project explores text classification methods applied to Amazon product reviews, using both traditional machine learning models and deep learning techniques.

Authors

Filipa Rodrigues - ISCTE-IUL, faprs@iscte-iul.pt, student no. 99865
Mariana Borralho - ISCTE-IUL, msrbo2@iscte-iul.pt, student no. 120417
Nuno Ferreira - ISCTE-IUL, ntrjf@iscte-iul.pt, student no. 120557

Project Description

The project begins with a baseline sentiment classification of the test data using pre-trained tools from the TextBlob, VADER Sentiment, and Stanza libraries. Text preprocessing is then applied to improve the supervised machine learning classifiers; among these, Support Vector Machine and Logistic Regression show promising results with a TF-IDF vector representation. Significant further gains were achieved with the Transformer-XL model, which is designed to handle long sequences.
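
To illustrate the TF-IDF representation mentioned above, here is a minimal pure-Python sketch. It is not the project's code: the function name `tfidf_vectors`, the whitespace tokenization, and the smoothed IDF formula `log((1 + N) / (1 + df)) + 1` are assumptions chosen for illustration, and a library vectorizer may weight and normalize terms differently.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document (illustrative sketch)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vector = {}
        for term, count in counts.items():
            # Term frequency relative to the document length.
            tf = count / len(tokens)
            # Smoothed IDF (an assumed variant) so no term gets a zero weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            vector[term] = tf * idf
        vectors.append(vector)
    return vectors


# Made-up reviews for illustration only.
reviews = [
    "great product works great",
    "terrible product broke quickly",
]
vectors = tfidf_vectors(reviews)
```

In this toy example, "great" receives a higher weight than "product" in the first review because it occurs twice there and appears in only one of the two documents.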

Key Features

Pre-trained sentiment analysis tools (TextBlob, VADER Sentiment, Stanza).
Text preprocessing, including tokenization, stop-word removal, stemming, and lemmatization.
Machine learning models, including SVM, Naïve Bayes, and Logistic Regression.
Transformer models such as Transformer-XL, plus the generative model GPT-3.5 Turbo, for enhanced sentiment classification.
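
The preprocessing steps listed above could be sketched roughly as follows, using only the standard library. This is an illustration, not the notebook's pipeline: the stop-word list is a tiny hypothetical one, `crude_stem` is a toy suffix stripper rather than a real stemming algorithm such as Porter's, and lemmatization is left out because it needs a dictionary resource.

```python
import re

# A tiny, hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "i", "to", "of"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "ly", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


tokens = preprocess("This charger stopped working and it was badly packaged.")
# tokens == ["charger", "stopp", "work", "bad", "packag"]
```

The rough stems ("stopp", "packag") show why a proper stemmer or lemmatizer is usually preferred in practice.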

Data Description

The dataset used in this project comprises 48,902 training reviews and 2,417 test reviews from Amazon, each labeled as either 'positive' or 'negative'.
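
Before training, it is often useful to check how the two labels are distributed. The sketch below counts labels in a CSV file. The column names `review` and `sentiment` are assumptions, since the actual file layout is not documented here, and the inline sample merely stands in for the real data.

```python
import csv
import io
from collections import Counter


def label_distribution(csv_file, label_column="sentiment"):
    """Count how many rows carry each label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip().lower() for row in reader)


# Inline sample standing in for the real file; the column names are assumed.
sample = io.StringIO(
    "review,sentiment\n"
    "Works perfectly,positive\n"
    "Stopped charging after a week,negative\n"
    "\"Great value, fast delivery\",positive\n"
)
counts = label_distribution(sample)
total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
```

With a real file, an open handle (for example from `open(path, newline="", encoding="utf-8")`) would be passed in place of the `io.StringIO` sample.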

Repository Structure

data/
models/
notebooks/
src/
LICENSE
README.md

Installation

To replicate this analysis, install the required Python libraries: pip install -r requirements.txt

Usage

To run the sentiment analysis, navigate to the src directory, open Text_Mining.ipynb, and run each cell.

Results

The project explored various combinations of preprocessing techniques and models. Of the models tested, Transformer-XL performed best, which is consistent with its design for handling long text sequences.
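
This README does not state which metrics were used to compare the models. As an assumption, the sketch below computes the usual binary-classification metrics (accuracy, precision, recall, F1), treating 'positive' as the positive class. The function name `binary_metrics` and the example labels are hypothetical and are not project results.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every ratio against an empty denominator.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not project results.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy helps when the two classes are not equally frequent.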

Dependencies

Pandas
NLTK
Scikit-learn
spaCy
TensorFlow
PyTorch
Transformers

Contributions

This project welcomes contributions from other students or researchers interested in sentiment analysis. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thanks to the ISCTE-IUL faculty team for their guidance and for providing the dataset, and to OpenAI for providing access to the GPT models.


