This repository contains proofs of concept for several Hadoop technologies.
Hadoop MapReduce is demonstrated on the IPL dataset.
The objectives with MapReduce are to determine:
- The total number of matches played in every season.
- The number of matches played in a particular stadium.
- The toss decision: how many times the toss winner chose to bat and how many times to field, from season 1 to season 9.
- The number of matches played by a particular team at a particular stadium.
- The total number of matches played by every team.
- The total number of matches won by every team.
- How many times a team has won both the toss and the match at a particular stadium.
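The first objective above (matches per season) follows the usual map, group, and sum pattern. Below is a minimal plain-Java sketch of that pattern; it does not use the Hadoop API and is not the code in this repository. The class name `SeasonMatchCount`, the sample rows, and the assumption that the season is the second comma-separated field are all hypothetical, so adjust the column index to the real IPL CSV layout.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of this repository.
class SeasonMatchCount {

    // "Map" step: extract the key (season) from one CSV record.
    // ASSUMPTION: the season is the second field (index 1).
    static String mapSeason(String csvLine) {
        String[] fields = csvLine.split(",");
        return fields[1].trim();
    }

    // "Shuffle + reduce" step: group records by season and sum a count of 1 per record.
    static Map<String, Integer> countBySeason(List<String> csvLines) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by season for readable output
        for (String line : csvLines) {
            totals.merge(mapSeason(line), 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the assumed layout: id,season,stadium
        List<String> sample = List.of(
            "m1,2008,StadiumA",
            "m2,2008,StadiumB",
            "m3,2009,StadiumA");
        countBySeason(sample)
            .forEach((season, count) -> System.out.println(season + "\t" + count));
    }
}
```

In a real Hadoop job the same idea would typically be split into a Mapper that emits (season, 1) pairs and a Reducer that sums the values for each season. The other per-stadium and per-team counts differ mainly in which field, or combination of fields, is used as the key.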
Hive is demonstrated on the McDonald's dataset.
The objectives with Hive are to determine:
- The total number of items on the menu.
- The maximum calories in each category.
- The top 10 items with the highest calories.
- The top 10 items with the highest sugars.
- The top 10 items with the highest protein.
- The top 10 category-item pairs with the highest Vitamin A and Vitamin C.
- How to partition the menu by Category (for example: Category = Breakfast, Category = Beverages, Category = Desserts).
- How to pick any value from a bucket after partitioning.
Pig is demonstrated on the 2016 Rio de Janeiro Olympics dataset.
The objectives with Pig are to:
- Find total participants by country.
- Find total male & female participants.
- Find total male participants and total female participants per country.
- Find the total gold and silver medals won.
- Find the oldest participant.
- Find the youngest participant.
- Find number of participants with respect to a particular sport & country.
- Find total participants per sport.
Sqoop is demonstrated on the Loandata dataset.
The objective with Sqoop is:
- To show export and import commands using Sqoop.