hadoop-proof-of-concept

This repository contains proofs of concept for several Hadoop technologies.

Hadoop MapReduce is demonstrated on the IPL dataset.

The objective with MapReduce is to determine:

The total number of matches played in every season.
The number of matches played in a particular stadium.
The toss decision: how many times the toss winner chose to bat and how many times to field, from season 1 to season 9.
The number of matches played by a particular team at a particular stadium.
The total number of matches played by every team.
The total number of matches won by every team.
How many times a team has won both the toss and the match at a particular stadium.
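
Most of the objectives above reduce to the same pattern: emit a (key, 1) pair per match, then sum the pairs per key. The following is a minimal sketch of that pattern in plain Python, not the repository's actual MapReduce job. The column names (`season`, `venue`) and the sample rows are hypothetical placeholders; the real IPL dataset may use different headers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; column names and values are assumptions for illustration.
SAMPLE_CSV = """season,venue,toss_winner,toss_decision,winner
2008,Stadium X,Team A,bat,Team A
2008,Stadium Y,Team B,field,Team C
2009,Stadium X,Team A,field,Team B
"""


def map_matches(rows, key_field):
    """Map step: emit a (key, 1) pair for every match row."""
    for row in rows:
        yield row[key_field], 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_by(csv_text, key_field):
    """Run the map and reduce steps over CSV text, grouping by key_field."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return reduce_counts(map_matches(rows, key_field))


matches_per_season = count_by(SAMPLE_CSV, "season")
matches_per_venue = count_by(SAMPLE_CSV, "venue")
```

Changing `key_field` (or emitting a composite key such as team plus stadium) covers the other counting objectives; in a real Hadoop job the grouping between the two steps is done by the framework's shuffle phase.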

Hive is demonstrated on the McDonalds dataset.

The objective with Hive is to determine:

The total number of items on the menu.
The maximum calories in each category.
The top 10 items with the highest calories.
The top 10 items with the highest sugars.
The top 10 items with the highest protein.
The top 10 category-item pairs with the highest Vitamin A and Vitamin C.
To partition the menu by category (for example: Category = Breakfast, Category = Beverages, Category = Desserts).
To select a value from a bucket after partitioning.
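
To make the intent of these queries concrete, here is a rough sketch in plain Python of what a grouped maximum, a top-N selection, and a partition by category compute. It is not the repository's Hive code. The field names (`category`, `item`, `calories`) and the menu rows are hypothetical, chosen only for illustration.

```python
import heapq

# Hypothetical menu rows; field names and values are assumptions for illustration.
MENU = [
    {"category": "Breakfast", "item": "Item A", "calories": 300},
    {"category": "Breakfast", "item": "Item B", "calories": 450},
    {"category": "Beverages", "item": "Item C", "calories": 150},
    {"category": "Desserts", "item": "Item D", "calories": 340},
]


def max_calories_by_category(rows):
    """Comparable in spirit to MAX(calories) with GROUP BY category."""
    best = {}
    for row in rows:
        cat = row["category"]
        best[cat] = max(best.get(cat, row["calories"]), row["calories"])
    return best


def top_n(rows, field, n=10):
    """Comparable in spirit to ORDER BY field DESC LIMIT n."""
    return heapq.nlargest(n, rows, key=lambda row: row[field])


def partition_by_category(rows):
    """Group rows by category, loosely mirroring a table partitioned on category."""
    groups = {}
    for row in rows:
        groups.setdefault(row["category"], []).append(row)
    return groups


total_items = len(MENU)
top_two = [row["item"] for row in top_n(MENU, "calories", n=2)]
```

In Hive itself, partitioning and bucketing are declared on the table and affect how the data is stored, so the dictionary of lists here only mirrors the logical grouping.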

Pig is demonstrated on the 2016 Olympics in Rio de Janeiro dataset.

The objective with Pig is to determine:

Sqoop is demonstrated on the Loandata dataset.

The objective with Sqoop is:

1. To show export and import commands using Sqoop.
