Sparse Factor Analysis Collaborative Filtering

This is the directory for the CS229 project by Andy Gilbert and Andrew Hilger.

Summary

This project implements a sparse factor analysis collaborative filtering algorithm for building a recommendation system.

In principle, only two files need to be run. RecommenderSystem.py performs business and user clustering and saves the tourist and local datasets for a given metro area. Train.py trains the sparse factor analysis model on those datasets, tests it at each iteration, and exports the training and testing results for analysis.

PlotResults.ipynb can be used to view those results.
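The two-step workflow above can be sketched as a small driver script. This is only an illustration: the script names come from this README, but running them with no command-line arguments is an assumption, and `run_pipeline` is a hypothetical helper that is not part of the repository.

```python
import subprocess
import sys

# Script names are taken from this README. Invoking them with no
# arguments is an assumption; check each script for required settings.
PIPELINE = ["RecommenderSystem.py", "Train.py"]


def run_pipeline(scripts=PIPELINE, dry_run=False):
    """Run each script in order with the current Python interpreter.

    Returns the list of commands. With dry_run=True nothing is executed.
    """
    commands = [[sys.executable, script] for script in scripts]
    if not dry_run:
        for cmd in commands:
            # check=True stops the pipeline if a script exits with an error,
            # so Train.py never runs on missing or partial datasets.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands without running them; pass dry_run=False to execute.
    for cmd in run_pipeline(dry_run=True):
        print(" ".join(cmd))
```

Once both scripts have finished, the exported results can be opened in PlotResults.ipynb.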

For the purpose of the other files, see the descriptions below.

Python Files:

canny_cf.py: contains the functions for the Sparse Factor Analysis Collaborative Filtering.
convert_json_to_csv.py: contains a function to convert Yelp JSON files to CSV format.
get_city_proportion.py: contains code to calculate features for users and businesses.
get_metro_features.py: contains additional functions to calculate features for users and businesses.
plot_and_split.py: contains code to plot clusters onto a world, EU, or US map. Also contains some splitting code for the business dataset.
RecommenderSystem.py: an export of RecommenderSystem.ipynb, used to run the code on the server. The code clusters the dataset, splits it into local and tourist versions, and saves the split versions for a given metro area.
RunCanny.py: used to test the canny_cf.py code.
Train.py: used to perform the training on the tourist and local datasets using sparse factor analysis.
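
For readers unfamiliar with the technique behind canny_cf.py and Train.py, the following is a minimal sketch of factor analysis fitted by EM on a partially observed rating matrix. It is not the code in canny_cf.py: the function name `sparse_fa_em`, its parameters, the dense boolean `mask` representation, and the single shared noise variance `psi` are all assumptions made for illustration. Ratings are assumed to be mean-centered beforehand.

```python
import numpy as np


def sparse_fa_em(Y, mask, k=2, n_iter=50, seed=0):
    """EM for factor analysis with missing ratings (illustrative sketch).

    Model: y_ij = lam_i . x_j + noise, with x_j ~ N(0, I) and noise ~ N(0, psi).
    Y    : (n_users, n_items) ratings; entries where mask is False are ignored.
    mask : boolean array of the same shape, True where a rating is observed.
    Returns item loadings (n_items, k), noise variance psi, and the
    posterior means of the user factors (n_users, k).
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = Y.shape
    Y = np.where(mask, Y, 0.0)  # neutralize unobserved entries (e.g. NaN)
    Lam = 0.1 * rng.standard_normal((n_items, k))
    psi = 1.0
    means = np.zeros((n_users, k))
    covs = np.zeros((n_users, k, k))

    for _ in range(n_iter):
        # E-step: Gaussian posterior of each user's factors, using only
        # the loadings of the items that user actually rated.
        for j in range(n_users):
            L = Lam[mask[j]]
            covs[j] = np.linalg.inv(np.eye(k) + L.T @ L / psi)
            means[j] = covs[j] @ (L.T @ Y[j, mask[j]]) / psi

        # M-step: per-item least squares over the users who rated the item,
        # with E[x x^T] = cov + mean mean^T as the expected second moment.
        for i in range(n_items):
            users = mask[:, i]
            if not users.any():
                continue  # no ratings for this item; keep its loading as is
            m = means[users]
            A = (covs[users] + m[:, :, None] * m[:, None, :]).sum(axis=0)
            b = m.T @ Y[users, i]
            Lam[i] = np.linalg.solve(A, b)

        # Noise variance: expected squared residual over observed entries.
        resid = Y - means @ Lam.T
        quad = np.einsum("ik,jkl,il->ji", Lam, covs, Lam)
        psi = max(float((resid ** 2 + quad)[mask].sum() / mask.sum()), 1e-6)

    return Lam, psi, means
```

A predicted rating for user j and item i is then `means[j] @ Lam[i]`. Looping over users and items keeps the sketch readable; an implementation meant for a full metro-area dataset would more likely work from sparse matrices.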

Jupyter Notebooks:

PlotResults.ipynb: Notebook used to visualize the results of training.
RecommenderSystem.ipynb: Notebook used to perform data cleaning, feature analysis and extraction, and clustering. Ultimately, exports the tourist and local split datasets.

Other

map/: contains map data for the US state map.
requirements.txt: the necessary python packages for the project.
