
This is a basic project made with PigLatin scripts to perform a movie analysis


aparajithguha/PigLatin


Movie Analysis using Pig

This is a basic project that uses Pig Latin scripts to analyze movie data, along with read and write operations in several formats (delimited text, JSON, and Avro).

Load the data

movies = load 'movie.txt' using PigStorage(',') as (id:int, title:chararray, year:int, rating:float, views:int);
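
To check that the schema was applied as intended, a quick inspection along these lines may help. This is a minimal sketch; the alias `preview` is an illustrative name that does not appear in the original scripts.

```pig
-- Print the schema Pig attached to the relation
describe movies;

-- Peek at a few rows instead of dumping the whole file
-- ('preview' is a hypothetical alias used only for this check)
preview = limit movies 5;
dump preview;
```

Fields that cannot be cast to the declared type (for example, a non-numeric rating) load as null rather than failing the job, so a short preview is a cheap way to spot malformed rows.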

Split movies into low-rated and high-rated

SPLIT movies into lowrated if rating<3.0f, highrated if rating>3.0f;
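
Because both conditions use strict comparisons, a movie rated exactly 3.0 matches neither branch and ends up in neither relation. If such movies are meant to count as high rated (an assumption about intent, not something the project states), one possible variant is:

```pig
-- Variant: treat a rating of exactly 3.0 as high rated (assumed intent)
SPLIT movies INTO lowrated IF rating < 3.0f, highrated IF rating >= 3.0f;
```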

Count the low- and high-rated movies

grouplow = group lowrated all;
grouphigh = group highrated all;

lowratedcount = foreach grouplow generate COUNT(lowrated.id);
dump lowratedcount;

highratedcount = foreach grouphigh generate COUNT(highrated.id);
dump highratedcount;
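
`COUNT` skips tuples whose first field is null, so the counts above leave out any row with a missing `id`. If every row should be counted regardless of nulls, a sketch using `COUNT_STAR` could look like this (`lowratedtotal` is a hypothetical alias):

```pig
-- COUNT_STAR counts every tuple in the bag, including those with null fields
lowratedtotal = foreach grouplow generate COUNT_STAR(lowrated);
dump lowratedtotal;
```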

Average movie views

groupmovies = group movies all;
avgmovies = foreach groupmovies generate AVG(movies.views);
dump avgmovies;

Most-viewed low-rated movie

lowsorted = order lowrated by views desc; -- separate MAX(title) and MAX(views) would mix values from different movies
maxview = limit lowsorted 1;

Least-viewed high-rated movie

highsorted = order highrated by views asc; -- built-in names are case sensitive (MIN, not min); sorting keeps title and views paired
minview = limit highsorted 1;

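Low-rated movies: count and average rating by year

groupyear = group lowrated by year; -- GROUP BY takes no ASC/DESC; sorting is done with ORDER below
lowbyyear = foreach groupyear generate group as release, COUNT(lowrated.rating) as count, AVG(lowrated.rating) as avgrating;
maxyear4 = order lowbyyear by release asc;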

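High-rated movies: count by year

hgroupyear = group highrated by year; -- GROUP BY takes no ASC/DESC; sorting is done with ORDER below
highbyyear = foreach hgroupyear generate group as release, COUNT(highrated.rating) as count;
hyear = order highbyyear by release desc;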

Store data as JSON, Avro, and delimited text (PigStorage)

store lowrated into 'json' USING JsonStorage();
store highrated into 'jsonh' USING JsonStorage();

store lowrated into 'avro' USING AvroStorage();
store highrated into 'havro' USING AvroStorage();

store maxyear4 into 'text' USING PigStorage(',');
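
To cover the read side of these formats, the stored outputs can be loaded back roughly as follows. This is a sketch, not part of the original scripts: the aliases `lowjson`, `lowavro`, and `yearstats` and the field names given for the text output are assumptions, and it presumes the same Pig version and loaders that the store statements rely on.

```pig
-- Read the JSON output back; the schema string mirrors the original load statement
lowjson = load 'json' using JsonLoader('id:int, title:chararray, year:int, rating:float, views:int');

-- Avro files embed their schema, so no schema argument is passed here
lowavro = load 'avro' using AvroStorage();

-- Delimited text carries no schema; these field names and types are assumed
-- (COUNT yields a long, AVG yields a double)
yearstats = load 'text' using PigStorage(',') as (release:int, total:long, avgrating:double);
dump yearstats;
```

Pig refuses to overwrite an existing output directory, so the `store` statements above fail on a second run unless the directories are removed first (for example with `rmf json` in the Grunt shell).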
