Big Data Analytics:

This repository contains analytics projects built on the Big Data ecosystem (Hadoop, Spark, Storm, HBase, and ZooKeeper), listed below:

Hadoop Analytics

Real-world use cases implemented with Hadoop MapReduce design patterns (TopK, Secondary Sorting, Filtering, Summarization, Join, Friend Recommendation).
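
As an illustration of the TopK pattern named above, here is a minimal sketch in plain Java with no Hadoop dependencies. It is not code from this repository: the class `TopKSketch` and the method `topK` are hypothetical names. The sketch assumes the common form of the pattern, in which each mapper keeps only its local top K in a bounded min-heap and a single reducer applies the same step to the merged candidates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not taken from the repository: shows the bounded
// min-heap that a Top-K mapper and reducer would each maintain.
class TopKSketch {

    // Returns the k largest values in descending order.
    static List<Integer> topK(Iterable<Integer> values, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int v : values) {
            heap.offer(v);
            if (heap.size() > k) {
                heap.poll(); // evict the smallest so at most k values remain
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        // "Map" phase: each input split yields its local top 2.
        List<Integer> split1 = topK(List.of(5, 1, 9, 3), 2);
        List<Integer> split2 = topK(List.of(7, 8, 2), 2);

        // "Reduce" phase: merge the local candidates and take the top 2 again.
        List<Integer> merged = new ArrayList<>(split1);
        merged.addAll(split2);
        System.out.println(topK(merged, 2)); // prints [9, 8]
    }
}
```

Because each split forwards at most K candidates, the reducer sees a small input no matter how large the original data set is.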

Spark Analytics

Simplified real-world scenarios using Apache Spark and MLlib (email spam detection, user purchase statistics, Twitter data analysis with Hive, etc.).

Storm Analytics

This project contains simple examples using Storm (GitHub commit count, Twitter stream analysis, topology statistics, etc.).

Hbase-coprocessor

An example of an HBase aggregation client that computes the row count, min, max, and average values of a table. Also includes a region coprocessor that hooks into the value before a get operation.
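
The idea behind coprocessor-based aggregation can be sketched without the HBase API: each region computes a small partial result next to its data, and the client merges the partials. The sketch below is an illustration only, not code from this repository; the class `RegionAggregateSketch` and its methods are hypothetical, and the values are assumed to be plain `long` cells.

```java
import java.util.List;

// Hypothetical type for illustration; this is not the HBase API.
// One instance holds the partial aggregate computed for a single region.
class RegionAggregateSketch {
    final long count;
    final long sum;
    final long min;
    final long max;

    RegionAggregateSketch(long count, long sum, long min, long max) {
        this.count = count;
        this.sum = sum;
        this.min = min;
        this.max = max;
    }

    // Region side: scan the local values once and keep only four numbers.
    static RegionAggregateSketch ofRegion(List<Long> values) {
        long sum = 0;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new RegionAggregateSketch(values.size(), sum, min, max);
    }

    // Client side: combine two partial results into one.
    RegionAggregateSketch merge(RegionAggregateSketch other) {
        return new RegionAggregateSketch(
                count + other.count,
                sum + other.sum,
                Math.min(min, other.min),
                Math.max(max, other.max));
    }

    // The average is derived from the merged sum and count,
    // never by averaging the per-region averages.
    double average() {
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    public static void main(String[] args) {
        RegionAggregateSketch regionA = ofRegion(List.of(4L, 10L));
        RegionAggregateSketch regionB = ofRegion(List.of(1L, 7L, 8L));
        RegionAggregateSketch total = regionA.merge(regionB);
        System.out.println("count=" + total.count + " min=" + total.min
                + " max=" + total.max + " avg=" + total.average());
        // prints count=5 min=1 max=10 avg=6.0
    }
}
```

Carrying the sum and count separately matters because regions can hold different numbers of rows, so averaging the per-region averages would weight small regions too heavily.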

Zookeeper distributed-queue

An example of a distributed queue using Apache ZooKeeper and the Curator framework (originally from Netflix).
