reddis_data_viz

Data science work for RedditInsight.

  1. Segmented data by subreddit
  2. Used NLTK to separate the words in titles by their parts of speech
  3. Developed frequency analysis of nouns by subreddit
  4. Munged the dataset for a predictive model: extracted the day of week and hour of day each post was created, and built categorical variables from the subreddit and domain features.
  5. Evaluated the predictive value of the model, then decided to focus on data visualizations.
  6. Developed clustering analysis of subreddit data for subreddits that had natural topic segmentation.
  7. Developed noun frequency analysis by subreddit
  8. Visualizations created from this work are at https://github.com/sheltowt/redditD3
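
The munging in step 4 can be sketched roughly as follows. This is an illustrative sketch using only the Python standard library, not the code from this repository. The field names (`created_utc`, `subreddit`, `domain`), the sample rows, and the helper functions are all assumptions, with the field names modeled on Reddit's post JSON.

```python
from datetime import datetime, timezone

# Hypothetical sample rows; field names are assumptions, not the project's schema.
posts = [
    {"subreddit": "python", "domain": "github.com", "created_utc": 1388534400},
    {"subreddit": "dataisbeautiful", "domain": "imgur.com", "created_utc": 1388847600},
]


def time_features(created_utc):
    """Return (day_of_week, hour_of_day) in UTC for a Unix timestamp.

    day_of_week follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    dt = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return dt.weekday(), dt.hour


def category_codes(values):
    """Map each distinct value to an integer code, assigned in sorted order."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


def build_features(posts):
    """Turn raw post dicts into rows of numeric features."""
    subreddit_codes = category_codes(p["subreddit"] for p in posts)
    domain_codes = category_codes(p["domain"] for p in posts)
    rows = []
    for p in posts:
        day_of_week, hour_of_day = time_features(p["created_utc"])
        rows.append({
            "day_of_week": day_of_week,
            "hour_of_day": hour_of_day,
            "subreddit_code": subreddit_codes[p["subreddit"]],
            "domain_code": domain_codes[p["domain"]],
        })
    return rows


features = build_features(posts)
for row in features:
    print(row)
```

Integer codes are the simplest categorical encoding, but they imply an ordering that subreddits and domains do not have, so a one-hot encoding may suit some models better.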