Skip to content

AhmedEssam19/Streaming-Sentiment-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Architecture

  • We create a TCP socket between Twitter’s API and Spark, which waits for the call of the Spark Structured Streaming and then sends the Twitter data.
  • We receive the data from the TCP socket and preprocess it with the pyspark library, which is Python’s API for Spark.
  • We pass the data to RESTful API where the model is located.
  • We apply sentiment analysis using fine-tuned BERT-Base-Uncased and return Positive, Negative, or Neutral.
  • Finally, we save the tweet and the sentiment analysis polarity in a parquet file.

Steps to Run The Application

  • Run pip install -r requirements.py command.
  • Download the saved model from this link into the same path of *.py files (03_Scripts directory).
  • Go to 03_Scripts directory.
  • Run chmod u+x run_server command.
  • Run ./run_server command.
  • Type the keyword you want in the ‘keyword.txt’ file.
  • In another terminal, run python twitter_connection.py.
  • In another terminal, run python sentiment_analysis.py.
  • After a reasonable amount of time, terminate the running of twitter_connection.py and sentiment_analysis.py.
  • In the end run, get_insights.py to get some insights about the keyword like how many tweets included that keyword and if the public is positive, negative, or neutral about it.

Evaluation

  • Test set (unseen data): the model achieved 86.8% accuracy on the test set and 86.3% on the validation set.
  • Streaming data (keyword: Harry Potter)

alt text

  • Streaming data (keyword: Interstellar)

alt text

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published