A network-based recommendation engine written as MapReduce jobs in Python, designed to run on top of Hadoop.
The recommendation engine is divided into three parts:
- Pre-process Job
- Resource Allocation Job
- Recommendation Job
Each job has its own mapper and reducer.
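As a rough illustration of what a mapper/reducer pair looks like under Hadoop Streaming, here is a minimal sketch. It is not one of the three jobs in this project: the input format (`user,movie` lines) and the names `map_lines` and `reduce_lines` are assumptions made for the example. It counts how many users rated each movie, relying only on the Streaming convention that records are tab-separated `key<TAB>value` lines and that the reducer receives them sorted by key.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: turn 'user,movie' input lines into 'movie<TAB>1' records."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _user, movie = parts
        yield f"{movie}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key; input must be sorted by key."""
    records = (
        line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line
    )
    for movie, group in groupby(records, key=lambda kv: kv[0]):
        total = sum(int(count) for _key, count in group)
        yield f"{movie}\t{total}"


# In a real streaming script, the mapper file would end with something like:
#     import sys
#     for record in map_lines(sys.stdin):
#         print(record)
# and the reducer file would do the same with reduce_lines.
```

Keeping the logic in generator functions rather than reading `sys.stdin` directly makes each step easy to try locally on a list of strings before submitting it to the cluster.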
MongoDB is used in this version for real-time data storage and streaming.
The MovieLens data set (ml-1m) is used as the reference data set.
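For orientation, the ratings file in ml-1m stores one rating per line as `UserID::MovieID::Rating::Timestamp`. The sketch below shows one way such a line could be parsed; the `Rating` tuple and `parse_rating` function are hypothetical names for this example and are not taken from the project's code.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: int      # whole-star rating, 1 to 5
    timestamp: int   # seconds since the Unix epoch


def parse_rating(line):
    """Parse one 'UserID::MovieID::Rating::Timestamp' line into a Rating."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), int(rating), int(timestamp))


# Example with a line in the ml-1m ratings format.
example = parse_rating("1::1193::5::978300760\n")
print(example.user_id, example.movie_id, example.rating)
```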
Running Python MapReduce programs on Hadoop requires Hadoop Streaming. The Hadoop Streaming command is in the bin directory of this project; replace the directories in it to match your setup.
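The command in the bin directory is the one to use. For readers unfamiliar with Hadoop Streaming, the sketch below only shows the general shape of such an invocation by assembling it as an argument list. Every path here (the streaming jar location, the HDFS directories, the script names) is a placeholder assumption, and `streaming_command` is a hypothetical helper, not part of this project.

```python
import os
import shlex


def streaming_command(streaming_jar, input_dir, output_dir, mapper, reducer):
    """Build a Hadoop Streaming argument list (nothing is executed here)."""
    return [
        "hadoop", "jar", streaming_jar,
        # Generic option: ship the scripts to the cluster nodes.
        "-files", f"{mapper},{reducer}",
        "-input", input_dir,
        "-output", output_dir,
        # Shipped files land in the task's working directory by base name.
        "-mapper", f"python {os.path.basename(mapper)}",
        "-reducer", f"python {os.path.basename(reducer)}",
    ]


# All values below are placeholders; substitute your own paths.
cmd = streaming_command(
    "/path/to/hadoop-streaming.jar",
    "/user/me/input",
    "/user/me/output",
    "mapper.py",
    "reducer.py",
)
print(shlex.join(cmd))
```

Printing the command first makes it easy to check the directories before running it, for example with `subprocess.run(cmd, check=True)`.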