Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
Updated Mar 17, 2025 - Python
A Python word counter module to quickly count the number of words in a sentence
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks…
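The map/reduce split described above can be illustrated with a minimal in-memory word count. This is only a sketch, not the implementation from any repository listed here: `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical names, and the grouping step runs in a single process rather than on a cluster.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    """Map step: emit an intermediate (word, 1) pair for every word."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: list[int]) -> int:
    """Reduce step: merge all values that share the same intermediate key."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[tuple[str, str]],
    map_fn: Callable[[str, str], Iterable[tuple[str, int]]],
    reduce_fn: Callable[[str, list[int]], int],
) -> dict[str, int]:
    """Hypothetical single-process driver: map, group by key, then reduce."""
    intermediate: dict[str, list[int]] = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            intermediate[ikey].append(ivalue)
    return {ikey: reduce_fn(ikey, values) for ikey, values in intermediate.items()}


# Made-up sample documents, keyed by an arbitrary document id.
docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog and the fox")]
word_counts = run_mapreduce(docs, map_words, reduce_counts)
```

Because the driver only depends on the two function signatures, other jobs can reuse it by swapping in a different map and reduce function.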
Miscellaneous Python scripts, mainly for practice and learning
A simple tool to perform a basic topic analysis on Facebook-post comments
Goodreads Year In Books Word Count
News trend analysis using Hadoop in a virtualized CentOS environment
Step-by-step guide to installing Hadoop on Ubuntu 16.04.3, with a MapReduce example using Streaming
This repository contains solutions to common mapper and reducer problems in Hadoop using Python
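A mapper and reducer written for Hadoop Streaming typically exchange tab-separated `key<TAB>value` lines, with the reducer receiving its input sorted by key. The sketch below follows that convention for word counting. It is not taken from the repositories above: `mapper`, `reducer`, and `simulate` are hypothetical names, and `simulate` replaces the shuffle-and-sort phase with a local `sorted` call so the pair can be tried without a cluster.

```python
from itertools import groupby
from typing import Iterable, Iterator


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Sum counts for each word; assumes the input is already sorted by key."""
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in sorted_lines
        if line.strip()  # skip blank lines defensively
    )
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def simulate(lines: Iterable[str]) -> list[str]:
    """Local stand-in for the sort between the map and reduce phases."""
    return list(reducer(sorted(mapper(lines))))


result = simulate(["Hello world", "hello Hadoop"])
```

In an actual streaming job, each function would live in its own script that reads from `sys.stdin` and prints to standard output.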
State of the Union dataset
A framework for running MapReduce programs, implemented based on the MapReduce paper