This repository contains analytics projects built on the Big Data ecosystem (Hadoop, Spark, Storm, HBase, and ZooKeeper), listed below:
Real-world use cases implemented with Hadoop MapReduce design patterns (Top-K, secondary sorting, filtering, summarization, join, friend recommendation).
Simplified real-world scenarios using Apache Spark and MLlib (email spam detection, user purchase statistics, Twitter data analysis with Hive, etc.).
Simple examples using Apache Storm (GitHub commit count, Twitter stream analysis, topology statistics, etc.).
An example HBase aggregation client that computes the row count, min/max, and average of the values in a table, plus a region coprocessor that hooks into the value before a get operation.
An example of a distributed queue using Apache ZooKeeper and the Curator framework (originally from Netflix).
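
To give a feel for the Top-K pattern named in the MapReduce use cases, here is a minimal sketch in plain Java with no Hadoop dependency. It is not the code from this repository: the class name `TopKSketch`, the helpers `localTopK` and `globalTopK`, and the sample numbers are all hypothetical. It assumes the usual shape of the pattern, where each mapper keeps only its K largest values and a single reducer merges those candidates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of the Top-K pattern; not taken from this repository.
class TopKSketch {

    // "Map" side: keep only the k largest values seen in one input split.
    static List<Integer> localTopK(List<Integer> split, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int value : split) {
            heap.offer(value);
            if (heap.size() > k) {
                heap.poll(); // evict the smallest, so at most k values remain
            }
        }
        return new ArrayList<>(heap);
    }

    // "Reduce" side: merge every mapper's candidates and pick the global top k,
    // returned in descending order.
    static List<Integer> globalTopK(List<List<Integer>> localResults, int k) {
        List<Integer> candidates = new ArrayList<>();
        for (List<Integer> local : localResults) {
            candidates.addAll(local);
        }
        List<Integer> top = localTopK(candidates, k);
        top.sort(Collections.reverseOrder());
        return top;
    }

    public static void main(String[] args) {
        // Three made-up input splits, one per simulated mapper.
        List<List<Integer>> splits = List.of(
                List.of(5, 42, 17, 8),
                List.of(99, 3, 64),
                List.of(23, 71, 1, 50));

        List<List<Integer>> locals = new ArrayList<>();
        for (List<Integer> split : splits) {
            locals.add(localTopK(split, 3));
        }
        System.out.println(globalTopK(locals, 3)); // prints [99, 71, 64]
    }
}
```

Because each simulated mapper forwards at most K values, the reducer only has to look at K values per split rather than the whole input, which is the main reason the pattern scales.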