Intro-Hadoop-MapReduce

Analyze discussion forum data with Hadoop and MapReduce

Useful field names:

Final Project:

"forum_node.tsv": "id" "title" "tagnames" "author_id" "body" "node_type" "parent_id" "abs_parent_id" "added_at" "score" "state_string" "last_edited_id" "last_activity_by_id" "last_activity_at" "active_revision_id" "extra" "extra_ref_id" "extra_count" "marked"

"forum_users.tsv": "user_ptr_id" "reputation" "gold" "silver" "bronze"
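
As an illustration of how the forum_node.tsv field order can be used in a Hadoop Streaming style mapper, here is a minimal sketch. It assumes the file is tab-delimited with double-quoted fields and a header row, which this README does not state explicitly; the function name `map_author_node_type` is hypothetical and not taken from the repository.

```python
import csv
import io

# Field order as listed above for forum_node.tsv.
FORUM_NODE_FIELDS = [
    "id", "title", "tagnames", "author_id", "body", "node_type",
    "parent_id", "abs_parent_id", "added_at", "score", "state_string",
    "last_edited_id", "last_activity_by_id", "last_activity_at",
    "active_revision_id", "extra", "extra_ref_id", "extra_count", "marked",
]


def map_author_node_type(stream):
    """Yield (author_id, node_type) for each well-formed data row.

    Assumption: rows are tab-delimited and fields may be double-quoted.
    """
    reader = csv.reader(stream, delimiter="\t")
    for row in reader:
        if len(row) != len(FORUM_NODE_FIELDS):
            continue  # skip malformed rows
        record = dict(zip(FORUM_NODE_FIELDS, row))
        if record["id"] == "id":
            continue  # skip the header row
        yield record["author_id"], record["node_type"]


if __name__ == "__main__":
    import sys

    # In a streaming job the mapper reads stdin and writes key<TAB>value.
    for author_id, node_type in map_author_node_type(sys.stdin):
        print("{0}\t{1}".format(author_id, node_type))
```

Looking fields up by name through `dict(zip(...))` rather than by position keeps the mapper readable when only a few of the 19 columns are needed.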

StackExchange:

[('Body', "text here"), ('ViewCount', '1191'), ('Title', 'Why does the Macbook Pro Unibody crash on hibernate under Windows?'), ('LastActivityDate', '2009-07-15T21:15:21.323'), ('AnswerCount', '3'), ('CommentCount', '2'), ('AcceptedAnswerId', '3841'), ('Score', '4'), ('PostTypeId', '1'), ('OwnerUserId', '26'), ('Tags', ''), ('CreationDate', '2009-07-15T07:17:13.970'), ('FavoriteCount', '1'), ('Id', '37')]
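
A record in this shape (a list of key/value string pairs) can be turned into a dictionary and its numeric fields cast before mapping. The following is a sketch only: the function name `summarize_post` and the choice of output keys are hypothetical, and it assumes every value arrives as a string in the format shown in the sample above.

```python
from datetime import datetime

# Timestamp format as it appears in the sample record.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def summarize_post(pairs):
    """Convert a list of (field, value) string pairs into typed values."""
    post = dict(pairs)
    created = datetime.strptime(post["CreationDate"], TIMESTAMP_FORMAT)
    return {
        "id": int(post["Id"]),
        # In the Stack Exchange data dump, PostTypeId 1 marks a question.
        "is_question": post.get("PostTypeId") == "1",
        "score": int(post.get("Score", 0)),
        "answer_count": int(post.get("AnswerCount", 0)),
        "creation_hour": created.hour,
    }


sample = [
    ("Body", "text here"), ("ViewCount", "1191"),
    ("AnswerCount", "3"), ("Score", "4"), ("PostTypeId", "1"),
    ("CreationDate", "2009-07-15T07:17:13.970"), ("Id", "37"),
]
summary = summarize_post(sample)
```

Fields such as AnswerCount appear only on some posts, so the sketch reads them with `dict.get` and a default instead of indexing directly.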

Analyzing Reddit comments:

subreddit: The subreddit the comment was posted in
author: Username of the comment author
body: Comment text
created_utc: UTC timestamp of when the comment was posted
ups: Comment upvotes
downs: Comment downvotes
gilded: 1 if the user was given Reddit gold for the comment, 0 otherwise
archived: 1 if the comment was archived, 0 otherwise
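
The fields above lend themselves to a simple map and reduce pair, for example the average net score (ups minus downs) per subreddit. This is a sketch under assumptions: each comment is taken to be already parsed into a dictionary keyed by the field names above, and the names `mapper` and `reducer` are illustrative rather than taken from the repository. The `sorted` call stands in for the shuffle and sort phase that Hadoop performs between the two steps.

```python
from itertools import groupby


def mapper(comments):
    """Emit (subreddit, net_score) for each comment dictionary."""
    for comment in comments:
        net_score = int(comment["ups"]) - int(comment["downs"])
        yield comment["subreddit"], net_score


def reducer(sorted_pairs):
    """Average the net scores per subreddit.

    Assumes the pairs arrive grouped by key, as they would after the sort phase.
    """
    for subreddit, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        scores = [score for _, score in group]
        yield subreddit, sum(scores) / len(scores)


comments = [
    {"subreddit": "aww", "ups": "3", "downs": "1"},
    {"subreddit": "python", "ups": "5", "downs": "0"},
    {"subreddit": "aww", "ups": "2", "downs": "0"},
]
averages = list(reducer(sorted(mapper(comments))))
```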

tiechengsu/Intro-Hadoop-MapReduce

Folders and files

Latest commit

History

Repository files navigation

Intro-Hadoop-MapReduce

Useful field names:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages