AlexanderVhd/DM-Map-Reduce-Program

DM-Map-Reduce-Program

Overview: This program reads a CSV file of tweets (the bundled CSV file contains over 7,000 Trump tweets) and calculates how often each word appears across all the tweets, using a custom map-reduce algorithm. The words can be counted in two ways: sequentially, or in parallel, where multiple cores share the work. The user chooses which method to use.
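The sequential map-reduce count described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the `text` column name, and the word-splitting regex are all assumptions.

```python
import csv
import io
import re
from collections import defaultdict


def read_tweets(csv_file, column="text"):
    """Yield the tweet text of each row (the column name is an assumption)."""
    for row in csv.DictReader(csv_file):
        yield row[column]


def map_words(tweet):
    """Map step: emit a (word, 1) pair for every word in one tweet."""
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", tweet.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def count_words_sequential(tweets):
    """Run the map step on every tweet in turn, then reduce once."""
    pairs = []
    for tweet in tweets:
        pairs.extend(map_words(tweet))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # In-memory stand-in for the CSV file, so the sketch runs on its own.
    sample = io.StringIO("text\nMake it great\nGreat again and again\n")
    print(count_words_sequential(read_tweets(sample)))
```

To run it on a real file, pass an open file object to `read_tweets`, for example `open(path, newline="", encoding="utf-8")`.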

Language used: Python for both the sequential and the distributed solution

Distributed Solution: I used Python's multiprocessing module for the parallel solution, using its Pool class to map the counting function over the sublists of tweets. I used pool.imap_unordered instead of pool.map because it yields each sublist's result as soon as a worker finishes it, in whatever order the results arrive, rather than collecting every result into a list in input order first. Since the partial word counts can be merged in any order, this makes it slightly faster than pool.map.
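A rough sketch of this approach with `multiprocessing.Pool.imap_unordered` is shown below. The helper names, the chunking strategy, and the simple whitespace tokenizer are assumptions for illustration; the repository's own code may differ.

```python
from collections import Counter
from multiprocessing import Pool


def split_into_sublists(items, n_chunks):
    """Split items into at most n_chunks roughly equal sublists."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_sublist(tweets):
    """Map step run in a worker: count the words in one sublist of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return counts


def merge_counts(partials):
    """Reduce step: add up the partial counts in whatever order they arrive."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_words_parallel(tweets, processes=4):
    """Count words across tweets using a pool of worker processes."""
    sublists = split_into_sublists(tweets, processes)
    with Pool(processes=processes) as pool:
        # imap_unordered yields each partial result as soon as it is ready.
        return merge_counts(pool.imap_unordered(count_sublist, sublists))


if __name__ == "__main__":
    tweets = ["make it great", "great again", "again and again"]
    print(count_words_parallel(tweets, processes=2))
```

Because adding counts is commutative, the order in which the partial results arrive does not change the final totals, which is why the unordered variant is safe here.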

User Flow: User Selects Counting Method

image

User Flow: User Selects Sequential Method

image

User Flow: User Selects Parallel Method

image
