This is a set of comments scraped from Reddit posts. Top-level comments were saved from the fifty largest subreddits by subscriber count (as of April 2020). Up to 100 comments were saved per post, from up to the top 1,000 posts of each subreddit.
The comments are stored in separate .txt files, one per subreddit. Two additional .txt files are included: one has statistics for each comment file (word count, character count), and the other lists all of the subreddits, formatted as a Python list for easy use.
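As a rough illustration of how these files might be read, here is a minimal sketch. It assumes the subreddit list file contains nothing but a single Python list literal, and that the comment files are UTF-8 text; the function names and any file paths you pass in are hypothetical, not names taken from the dataset.

```python
import ast
from pathlib import Path


def load_subreddit_list(path):
    """Parse a text file whose whole content is a Python list literal."""
    text = Path(path).read_text(encoding="utf-8")
    # literal_eval only evaluates literals, so it is safe on untrusted text.
    names = ast.literal_eval(text.strip())
    if not isinstance(names, list):
        raise ValueError("expected a Python list literal")
    return [str(name) for name in names]


def text_stats(path):
    """Recompute word and character counts for one comment file."""
    text = Path(path).read_text(encoding="utf-8")
    # Words are counted as whitespace-separated tokens (an assumption;
    # the counts shipped with the dataset may use a different rule).
    return {"words": len(text.split()), "characters": len(text)}
```

`ast.literal_eval` is used instead of `eval` so that nothing in the file can be executed as code.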
For some reason, uploading the zip file directly to GitHub ran into problems, so I have uploaded it to Google Drive instead. I decided to keep this repository so that people have an easier time finding the dataset.
- Python
- PRAW (Python Reddit API Wrapper)
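
The collection described above could be sketched with PRAW roughly as follows. This is not the original scraping script: the function names, the placeholder credentials, the example subreddit, and the output file name are all assumptions, and the original may have selected or ordered comments differently.

```python
from itertools import islice


def top_level_comment_bodies(submission, max_comments=100):
    """Return the text of up to max_comments top-level comments."""
    # Drop "load more comments" placeholders so only real comments remain.
    submission.comments.replace_more(limit=0)
    # Iterating the comment forest directly yields top-level comments only.
    return [c.body for c in islice(submission.comments, max_comments)]


def scrape_subreddit(reddit, name, max_posts=1000, max_comments=100):
    """Collect top-level comments from the top posts of one subreddit."""
    bodies = []
    top_posts = reddit.subreddit(name).top(time_filter="all", limit=max_posts)
    for submission in top_posts:
        bodies.extend(top_level_comment_bodies(submission, max_comments))
    return bodies


def main():
    import praw  # third-party package: pip install praw

    # Placeholder credentials (hypothetical); create your own Reddit app.
    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="comment-scraper sketch",
    )
    comments = scrape_subreddit(reddit, "AskReddit", max_posts=5)
    # Hypothetical output name: one .txt file per subreddit.
    with open("AskReddit.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(comments))
```

Call `main()` after filling in real credentials. The PRAW import lives inside `main()` so the two helper functions can be exercised without the package or network access.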