This is an assignment from the CSE545 (Big Data Analytics) course on working with MapReduce and Spark.
- a1p1_mane - Implemented a back-end for a MapReduce system and tested it on a couple of MapReduce jobs, such as word count and set difference
- a1p2a_mane - Implemented WordCount and SetDifference in Spark with certain restrictions
- a1p2b_mane - Searched for mentions of industry words in the blog authorship corpus. The goal is first to find all of the possible industries in which bloggers were classified, then to search each blogger’s posts for mentions of those industries, counting the mentions by month and year.
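
For readers unfamiliar with the two jobs named under a1p1_mane, here is a minimal single-process sketch of word count and set difference expressed as map and reduce functions. It is not the assignment's actual back-end: the helper `run_map_reduce`, the mapper and reducer names, the set labels `"R"` and `"S"`, and the sample inputs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Hypothetical in-memory MapReduce: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    # Map phase: each (key, value) record may emit any number of pairs.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: keys are visited in sorted order for stable output.
    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


def wc_map(doc_id, text):
    # Emit (word, 1) for every whitespace-separated, lowercased token.
    for word in text.lower().split():
        yield word, 1


def wc_reduce(word, counts):
    yield word, sum(counts)


def sd_map(set_name, elements):
    # Emit (element, name of the set it came from).
    for element in elements:
        yield element, set_name


def sd_reduce(element, set_names):
    # Keep elements that appear in R but not in S (assumed labels).
    if "R" in set_names and "S" not in set_names:
        yield element


word_counts = run_map_reduce(
    [(1, "the cat saw the dog"), (2, "The dog ran")], wc_map, wc_reduce
)
difference = run_map_reduce(
    [("R", ["apple", "banana", "cherry"]), ("S", ["banana"])], sd_map, sd_reduce
)
print(word_counts)
print(difference)
```

The same mapper and reducer shapes carry over loosely to Spark, where the grouping step would be handled by the framework rather than by a dictionary.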