Big Data Analytics:

This repository contains analytics projects built on the Big Data ecosystem (Hadoop, Spark, Storm, HBase, and ZooKeeper), listed below:

Hadoop Analytics

Real-world use cases built with Hadoop MapReduce design patterns (TopK, Secondary Sorting, Filtering, Summarization, Join, Friend Recommendation).
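As an illustration of the TopK pattern named above, here is a minimal sketch in plain Python rather than Hadoop itself. It only simulates the idea: each mapper keeps the k best records of its own input split, and a single reducer merges those candidates. The function names, the `(key, score)` record shape, and the sample data are assumptions made for this example and are not taken from the repository.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical record shape for illustration: (key, score).
Record = Tuple[str, int]


def map_top_k(split: Iterable[Record], k: int) -> List[Record]:
    """Mapper side: keep only the k highest-scoring records of one split."""
    return heapq.nlargest(k, split, key=lambda r: r[1])


def reduce_top_k(partials: Iterable[List[Record]], k: int) -> List[Record]:
    """Reducer side: merge per-split candidates into the global top k."""
    merged = [record for part in partials for record in part]
    return heapq.nlargest(k, merged, key=lambda r: r[1])


# Three made-up input splits standing in for HDFS blocks.
splits = [
    [("a", 5), ("b", 9), ("c", 1)],
    [("d", 7), ("e", 3)],
    [("f", 8), ("g", 2), ("h", 6)],
]

top3 = reduce_top_k((map_top_k(s, 3) for s in splits), 3)
print(top3)
```

Because every mapper emits at most k records, the reducer sees at most k records per split instead of the full input, which is the point of the pattern.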

Spark Analytics

Simplified real-world scenarios using Apache Spark and MLlib (email spam detection, user purchase statistics, Twitter data analysis with Hive, etc.).

Storm Analytics

This project contains simple examples with Storm (GitHub commit count, Twitter stream analysis, topology statistics, etc.).

Hbase-coprocessor

An example of an HBase aggregation client that computes the row count, minimum, maximum, and average of a table's values. Also includes a region coprocessor that hooks in before a get operation.
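The aggregation idea can be sketched without HBase: each region computes a small partial result over its own rows, and the client combines the partials into table-wide values. The sketch below is plain Python under that assumption. The `Partial` class, the function names, and the sample values are hypothetical and do not reflect the repository's code or the HBase API. It also assumes every region holds at least one value.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Partial:
    """Per-region partial aggregate (hypothetical structure)."""
    count: int
    total: int
    minimum: int
    maximum: int


def aggregate_region(values: List[int]) -> Partial:
    """Region side: summarize the values stored in one non-empty region."""
    return Partial(len(values), sum(values), min(values), max(values))


def merge(partials: Iterable[Partial]) -> Partial:
    """Client side: combine per-region partials into one table-wide result."""
    parts = list(partials)
    return Partial(
        count=sum(p.count for p in parts),
        total=sum(p.total for p in parts),
        minimum=min(p.minimum for p in parts),
        maximum=max(p.maximum for p in parts),
    )


# Two made-up regions of a table.
regions = [[4, 10], [1, 7, 8]]
result = merge(aggregate_region(r) for r in regions)
average = result.total / result.count
print(result.count, result.minimum, result.maximum, average)
```

The average is derived from the merged sum and count rather than by averaging per-region averages, which would be wrong when regions hold different numbers of rows.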

Zookeeper distributed-queue

An example of a distributed queue using Apache ZooKeeper and the Curator framework, originally developed at Netflix.