This project focuses on dynamically optimizing product prices based on sentiment analysis of customer reviews. The model combines sentiment analysis with reinforcement learning to adjust prices in a way that maximizes revenue while maintaining customer satisfaction.
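At a high level, the loop pairs a per-review sentiment score with a bandit-style agent that chooses among price adjustments. The sketch below is illustrative only: the tiny word-list scorer stands in for TextBlob polarity, and the action set, reward formula, and epsilon-greedy update are simplified assumptions, not the project's actual implementation.

```python
import random

# Toy sentiment scorer standing in for TextBlob polarity (range -1..1).
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "poor", "terrible", "broken"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(-1.0, min(1.0, 5.0 * score / max(len(words), 1)))

ACTIONS = [-0.05, 0.0, 0.05]  # lower price 5%, hold, raise 5%

def reward(action, sent, rating):
    # Revenue proxy: raising the price only pays off when customers are happy.
    satisfaction = 0.5 * sent + 0.5 * (rating - 3) / 2  # both terms in ~[-1, 1]
    return action * 100 * satisfaction

# Epsilon-greedy agent over the three price actions.
q = {a: 0.0 for a in ACTIONS}
counts = {a: 0 for a in ACTIONS}

def step(review_text, rating, epsilon=0.1):
    a = random.choice(ACTIONS) if random.random() < epsilon else max(q, key=q.get)
    r = reward(a, sentiment(review_text), rating)
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]  # incremental mean update of the action value
    return a, r
```

Running `step` over a stream of reviews nudges the value estimates toward the price action that best balances revenue against satisfaction.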
- Apache Spark: The base engine for large-scale data processing.
- GCP Dataproc Clusters: Used for processing data at a large scale on Google Cloud Platform.
- Databricks Workspace: Used as an alternative environment for running the model.
We used the Amazon Reviews Dataset (2023), available on Hugging Face, which includes customer reviews for a variety of products. The dataset provides information such as:
- asin: Unique identifier for each product.
- rating: Customer ratings (1-5).
- text: Review text.
- timestamp: Unix timestamp for the review.
- verified_purchase: Indicates if the review is from a verified purchase.
You can find the dataset here: Amazon Reviews Dataset
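The dataset ships as JSON Lines, one review per line with the fields listed above. A minimal loader (plain Python here; the project itself would read the same files with Spark) might look like this — the sample records are fabricated for illustration:

```python
import json
from io import StringIO

# Two fabricated rows in the Amazon Reviews 2023 JSON-Lines layout.
sample = StringIO(
    '{"asin": "B01ABC", "rating": 5, "text": "Love it", '
    '"timestamp": 1577836800, "verified_purchase": true}\n'
    '{"asin": "B02DEF", "rating": 2, "text": "Broke fast", '
    '"timestamp": 1609459200, "verified_purchase": false}\n'
)

def load_reviews(fh):
    """Parse a JSONL stream, keeping only verified-purchase reviews."""
    reviews = []
    for line in fh:
        record = json.loads(line)
        if record.get("verified_purchase"):
            reviews.append(record)
    return reviews

reviews = load_reviews(sample)
```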
- Upload the Tables:
  - First, upload the `All_Beauty.jsonl` and `meta_All_Beauty.jsonl` files to the `/FileStore/tables` directory in Databricks.
- Import the Notebook:
  - Import the `Review-based-price-optimization.ipynb` file into your Databricks workspace.
- Attach a Cluster:
  - Attach a cluster to your notebook and run the code.
- Set Up GCP Bucket:
  - Set up a GCP bucket using the GCP Console and create two folders inside it: `/data` and `/scripts`.
- Upload Files:
  - Upload the `install_textblob.sh` file to the `/scripts` folder in the bucket using the following `gsutil` command:

    ```shell
    gsutil cp install_textblob.sh gs://<your-bucket-name>/scripts/install_textblob.sh
    ```

  - Also upload your dataset (e.g., `Clothing_Shoes_and_Jewelry.jsonl`) to the `/data` folder using the following `gsutil` command:

    ```shell
    gsutil cp /<file-location>/Clothing_Shoes_and_Jewelry.jsonl gs://<your-bucket-name>/data/
    ```
- Set Up Dataproc Cluster:
  - Set up the GCP Dataproc cluster using the GCP Console or the following `gcloud` command, making sure to use `install_textblob.sh` as an initialization action:

    ```shell
    ./google-cloud-sdk/bin/gcloud dataproc clusters create <your-cluster-name> \
      --enable-component-gateway --bucket <your-bucket-name> \
      --region us-central1 --master-machine-type n2-standard-2 --master-boot-disk-type pd-balanced \
      --initialization-actions=gs://<your-bucket-name>/scripts/install_textblob.sh \
      --master-boot-disk-size 32 --num-workers 2 --worker-machine-type n2-standard-2 \
      --worker-boot-disk-type pd-balanced --worker-boot-disk-size 32 \
      --image-version 2.2-debian12 --project <your-project-name>
    ```
- Export Environment Variables:
  - Once the cluster is running, SSH into the master node and export the following environment variables:

    ```shell
    export DATA_BUCKET="gs://<your-bucket-name>/data"
    export file_location="Clothing_Shoes_and_Jewelry.jsonl"
    export meta_file_location="meta_Clothing_Shoes_and_Jewelry.jsonl"
    ```
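Inside the Python script, these exports can be picked up at runtime to build the GCS input paths. This is a sketch under the assumption that the script reads them via `os.environ`; the default values are placeholders, not real bucket names:

```python
import os

# Fall back to placeholder values when the variables are not exported.
data_bucket = os.environ.get("DATA_BUCKET", "gs://<your-bucket-name>/data")
file_location = os.environ.get("file_location", "Clothing_Shoes_and_Jewelry.jsonl")
meta_file_location = os.environ.get(
    "meta_file_location", "meta_Clothing_Shoes_and_Jewelry.jsonl"
)

# Full GCS paths that a Spark reader (e.g. spark.read.json) could consume.
reviews_path = f"{data_bucket}/{file_location}"
meta_path = f"{data_bucket}/{meta_file_location}"
```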
- Upload the Python Script:
  - Upload the `Review-based-price-optimization.py` file to the master node.
- Run the Spark Job:
  - Use the following `spark-submit` command to start the execution of the script:

    ```shell
    spark-submit Review-based-price-optimization.py \
      --cluster=my-dataproc-cluster \
      --region=us-central1 \
      --properties=DATA_BUCKET=gs://<your-bucket-name>,DATA_LOCATION=us-central1
    ```
- Observe the Results:
  - Once the job has completed, examine the model's output: it shows the agent's price-adjustment actions and the corresponding rewards based on review sentiment and ratings.
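One way to read that output is to average the observed reward per price action. The snippet below is a hypothetical post-processing step, not part of the repository; the `(action, reward)` pairs are fabricated for illustration:

```python
from collections import defaultdict

# Hypothetical (action, reward) pairs such as the job might log, where the
# action is a fractional price change (-5%, hold, +5%).
log = [(-0.05, -1.2), (0.0, 0.0), (0.05, 3.1), (0.05, 2.4), (-0.05, 0.8)]

def summarize(pairs):
    """Average observed reward per price action."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, r in pairs:
        totals[action] += r
        counts[action] += 1
    return {a: totals[a] / counts[a] for a in totals}

summary = summarize(log)
best_action = max(summary, key=summary.get)  # action with highest mean reward
```

For this fabricated log, the +5% action has the highest average reward, so it would be the adjustment the agent favors.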
We welcome contributions! If you have ideas to improve the model or encounter any issues, feel free to fork the repository and submit a pull request.
This project is open source and available under the MIT License.