Undergraduate project for the course Business Management Information Methods (BSc in Computer Science at the University of Milano-Bicocca).
This project analyzes the MovieLens 100K dataset to develop a recommendation system based on K-NN (K-Nearest Neighbors). The key phases are:
- Data Acquisition: Use of the MovieLens 100K dataset, which contains 100,000 movie ratings from 943 users, along with demographic information about those users.
- Exploratory Data Analysis: Examination of user characteristics (age, gender, occupation) and movie features (genres) to understand their influence on ratings.
- Recommendation System: Implementation of the K-NN algorithm to predict missing ratings, optimizing the hyperparameter K, and utilizing similarity metrics such as Cosine Similarity and Pearson Correlation.
- Clustering: Grouping users into clusters based on the similarity of their ratings to improve personalized recommendations.
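
As an illustration of the Recommendation System phase, here is a minimal sketch of user-based K-NN rating prediction with the two similarity metrics named above. It is not the project's actual implementation: the function names (`cosine_sim`, `pearson_sim`, `predict_rating`), the toy rating matrix, and the convention that `0` marks a missing rating are all assumptions made for this example.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine similarity over the items both users rated (0 = missing, assumed)."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    x, y = a[mask], b[mask]
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0


def pearson_sim(a, b):
    """Pearson correlation over the items both users rated."""
    mask = (a > 0) & (b > 0)
    if mask.sum() < 2:
        return 0.0
    # Center each user's co-rated ratings on their own mean.
    x = a[mask] - a[mask].mean()
    y = b[mask] - b[mask].mean()
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0


def predict_rating(ratings, user, item, k=2, sim=cosine_sim):
    """Similarity-weighted mean of `item` ratings from the k most similar users."""
    candidates = [
        (sim(ratings[user], ratings[v]), ratings[v, item])
        for v in range(ratings.shape[0])
        if v != user and ratings[v, item] > 0
    ]
    # Keep the k nearest neighbours, then drop non-positive similarities.
    top = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    top = [(s, r) for s, r in top if s > 0]
    if not top:
        return None  # no usable neighbour rated this item
    weights = np.array([s for s, _ in top])
    values = np.array([r for _, r in top])
    return float(weights @ values / weights.sum())


# Hypothetical toy matrix: rows are users, columns are movies, 0 = not rated.
ratings = np.array([
    [5, 4, 0, 1],
    [5, 5, 4, 1],
    [1, 2, 1, 5],
    [4, 4, 5, 2],
], dtype=float)

print(predict_rating(ratings, user=0, item=2, k=2, sim=cosine_sim))
print(predict_rating(ratings, user=0, item=2, k=2, sim=pearson_sim))
```

In a sketch like this, the hyperparameter K could be tuned by hiding a portion of the known ratings, predicting them for several values of `k`, and keeping the value with the lowest error on the hidden ratings.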
The objective is to comprehend user preferences and deliver targeted recommendations.
More details about the results are available in the Report (IT version).
This project is licensed under the MIT License - see the LICENSE file for details.