This is a movie recommendation system that recommends movies based on the ratings given by the user. It uses Django for its backend. This is the code of the website hosted at https://movie-watch-time.herokuapp.com/mainpage/form
This website uses user-user collaborative filtering, item-item collaborative filtering, and matrix factorization through gradient descent to recommend movies to the user.
Note - NumPy and pandas are used only for data loading and basic computations such as matrix multiplication. All the algorithms are implemented from scratch without the use of any specialized library.
Whenever a user lands on the website, they have to rate 5 movies. When they submit the ratings, those ratings are compared with the ratings of all the users in the database (there are 500 of them) for those particular movies, and the similarity is computed using cosine similarity. If the similarity is above a certain threshold, the system takes the mean of the ratings of the top k most similar users and recommends movies accordingly.
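The user-user step described above can be sketched roughly as follows. This is a minimal illustration and not the code from this repository: the function names, the dictionary/list data layout, and the values of `k`, `threshold` and `n` are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend_user_user(new_ratings, user_matrix, k=3, threshold=0.9, n=2):
    """new_ratings: {movie_index: rating} for the movies the visitor rated.
    user_matrix: one dense rating row per stored user (hypothetical layout).
    Returns up to n movie indices, or None if nobody is similar enough."""
    rated = sorted(new_ratings)
    target = [new_ratings[m] for m in rated]

    # Compare the visitor with every stored user on the rated movies only.
    similar = []
    for user_id, row in enumerate(user_matrix):
        sim = cosine_similarity(target, [row[m] for m in rated])
        if sim >= threshold:
            similar.append((sim, user_id))
    if not similar:
        return None  # the caller can fall back to another strategy

    # Average the ratings of the top k most similar users per unseen movie.
    top = sorted(similar, reverse=True)[:k]
    means = {}
    for m in range(len(user_matrix[0])):
        if m not in new_ratings:
            means[m] = sum(user_matrix[uid][m] for _, uid in top) / len(top)
    return sorted(means, key=means.get, reverse=True)[:n]
```

Returning `None` when no stored user clears the threshold keeps the fallback decision in the caller, which is one possible way to wire up the behaviour described here.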
If no sufficiently similar user is found, the item-item filter is activated and returns movies related to the ones the user liked, along with some top movies that have the highest IMDb ratings in the database. After recommending to the user, the program runs gradient descent on the newly received ratings and stores them in the database for future reference.
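A rough sketch of the item-item fallback is shown below. It is only an illustration under assumptions: the helper names, the "liked" cutoff of 4, the use of cosine similarity between movie columns, and the split between related and top-rated movies are hypothetical choices, not details taken from this repository.

```python
import math

def column(matrix, j):
    """Ratings that every stored user gave to movie j."""
    return [row[j] for row in matrix]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend_item_item(new_ratings, user_matrix, imdb_ratings,
                        liked_at=4, n_related=2, n_popular=2):
    """new_ratings: {movie_index: rating}; imdb_ratings: list indexed by movie.
    Returns related movies first, then the highest-rated remaining movies."""
    liked = [m for m, r in new_ratings.items() if r >= liked_at]
    candidates = [m for m in range(len(user_matrix[0])) if m not in new_ratings]

    related = []
    if liked:
        # Score each unseen movie by its best similarity to any liked movie.
        def score(m):
            col = column(user_matrix, m)
            return max(cosine(col, column(user_matrix, l)) for l in liked)
        related = sorted(candidates, key=score, reverse=True)[:n_related]

    # Pad with the top IMDb-rated movies that were not already picked.
    popular = sorted((m for m in candidates if m not in related),
                     key=lambda m: imdb_ratings[m], reverse=True)[:n_popular]
    return related + popular
```

If the user liked none of the rated movies, this sketch simply returns the top IMDb-rated candidates.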
The user ratings are taken from the MovieLens 2M dataset, and matrix factorization was run on the initial user rating matrix to fill in all the cells of the matrix.
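To illustrate how matrix factorization through gradient descent can fill in the missing cells, here is a minimal sketch using stochastic gradient descent in plain Python. The function names, the number of latent factors, the learning rate, the regularization strength and the epoch count are assumptions for the example, not the settings used in this project.

```python
import random

def factorize(ratings, n_factors=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """ratings: list of rows, with None marking an unknown rating.
    Learns user factors P and movie factors Q so that P[u] . Q[i] ~ rating."""
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.random() for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(n_factors)] for _ in range(n_items)]
    known = [(u, i, r) for u, row in enumerate(ratings)
             for i, r in enumerate(row) if r is not None]

    for _ in range(epochs):
        for u, i, r in known:
            pred = sum(P[u][f] * Q[i][f] for f in range(n_factors))
            err = r - pred
            # Gradient step on the squared error with L2 regularization.
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q):
    """Dense matrix of predicted ratings, including the previously empty cells."""
    return [[sum(p * q for p, q in zip(pu, qi)) for qi in Q] for pu in P]
```

Calling `predict(*factorize(ratings))` returns a matrix with no empty cells. The same update loop could in principle be rerun on only the new visitor's ratings, although how this project does that is not shown here.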
For optimum performance and to get a dense matrix, the movies that were most rated by the users (there are 499 of them in the database) are taken from the 2M dataset. This also ensures that the movies in the database are popular ones and that the majority of users know some of them.
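Picking the most-rated movies amounts to counting ratings per movie and keeping the largest counts. The sketch below assumes the ratings are available as `(user_id, movie_id, rating)` tuples. That layout and the function name are hypothetical, and the standard library is used here to keep the example self-contained.

```python
from collections import Counter

def most_rated_movies(ratings, top_n):
    """ratings: iterable of (user_id, movie_id, rating) tuples.
    Returns the top_n movie ids ordered by how many ratings they received."""
    counts = Counter(movie_id for _, movie_id, _ in ratings)
    return [movie_id for movie_id, _ in counts.most_common(top_n)]

sample = [
    (1, 10, 4.0), (2, 10, 5.0), (3, 10, 3.0),
    (1, 20, 2.0), (2, 20, 4.0),
    (1, 30, 5.0),
]
print(most_rated_movies(sample, 2))  # [10, 20]
```

Keeping only frequently rated movies means fewer empty cells in the user rating matrix, which is the density benefit mentioned above.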
Download this repository
Use the package manager pip to install the dependencies of this project.
pip install -r requirements.txt
To run the website on localhost:
cd movie_recommendation_system
python manage.py runserver
Now open your web browser and paste the following URL:
http://127.0.0.1:8000/mainpage/form
This is the folder where the Django website resides. All the frontend and backend code lives here.
This folder contains all the Python scripts that I used for different operations, ranging from applying gradient descent on the initial matrix, to selecting the top 500 movies from the 2M dataset, to checking whether movies of every genre are present in the database, and a lot more.
The CSV files of the original dataset were huge and difficult to work with, so to ease the process I extracted only the useful information from the whole dataset and stored it in this folder. Unfortunately, due to GitHub's file size restrictions, I could not upload all the files.
This folder contains some txt files which were used to store data for testing purposes.
https://www.templatemonster.com/website-templates/57969.html
MovieLens 2M dataset
StackOverflow
https://towardsdatascience.com/various-implementations-of-collaborative-filtering-100385c6dfe0
https://github.com/yanneta/pytorch-tutorials/blob/master/collaborative-filtering-nn.ipynb
https://medium.com/@james_aka_yale/the-4-recommendation-engines-that-can-predict-your-movie-tastes-bbec857b8223
https://beckernick.github.io/matrix-factorization-recommender/
https://medium.com/@cfpinela/recommender-systems-user-based-and-item-based-collaborative-filtering-5d5f375a127f
https://towardsdatascience.com/collaborative-filtering-based-recommendation-systems-exemplified-ecbffe1c20b1
https://medium.com/@connectwithghosh/simple-matrix-factorization-example-on-the-movielens-dataset-using-pyspark-9b7e3f567536
https://github.com/khanhnamle1994/movielens
http://www.albertauyeung.com/post/python-matrix-factorization/
https://stackoverflow.com/questions/52993954/how-to-store-np-arrays-into-psql-database-and-django
https://docs.scipy.org/doc/numpy-1.15.0/