This repository contains (1) an implementation of our models and (2) reproducible experiments for the paper.
We compare DRMdot and DRML2 against the following baselines:
- SLIM: Sparse LInear Methods for Top-N Recommender systems
- CDAE: Collaborative Denoising AutoEncoder
- BPR: Bayesian Personalized Ranking for Matrix Factorization
- WMF: Weighted Matrix Factorization
- WARP: Weighted Approximated Ranking Pairwise Matrix Factorization
- CML: Collaborative Metric Learning
- SQLRANK: Stochastic Queueing List Ranking
- SRRMF: Square Loss Ranking Regularizer Matrix Factorization
We did not implement SQLRANK and SRRMF ourselves, because implementations by the original authors already exist: SQLRANK is written in Julia and SRRMF in MATLAB. Except for these two models, we wrote Python scripts for training and evaluating the baselines and our models.
We assume that you have installed Python using Anaconda and that your environment is equipped with CUDA. Other Python distributions should work, but we have not tested them.
We use PyTorch with CUDA as the backend for our ML implementations.
- PyTorch (version: 1.5.0)
- CUDA (version: 10.1)
For Conda:
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
For pip:
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
will install PyTorch built against CUDA 10.1. Please refer to the PyTorch installation instructions to find the appropriate version for your environment.
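To confirm that the installed build matches the versions above, you can run a quick check like the one below; it only prints the PyTorch version, CUDA availability, and the CUDA version PyTorch was built against.

```python
# Sanity check: print the installed PyTorch version, whether CUDA is available,
# and the CUDA version PyTorch was built against.
import torch

print("PyTorch:", torch.__version__)            # expected: 1.5.0
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)        # expected: 10.1
```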
We use the packages listed below (alphabetical order):
External Libraries:
- PyTorch *
- SLIM
  - Follow the installation guide in the SLIM repository (https://github.com/KarypisLab/SLIM).
  - It is required only when evaluating SLIM in our code.
If you are using Conda,
conda create --name drm_test python=3.7.3
conda activate drm_test
or refer to virtualenv.
- Install the required packages and build the evaluation module.
pip install -r requirements.txt
cd ~/code/eval/
python setup.py build_ext --inplace
cd ~/code/
- Install submodules (implicit, SLIM, spotlight).
git submodule update --init --recursive
- Preprocess the raw datasets using `Data preprocessing.ipynb`.
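If you want a rough idea of what this step involves before opening the notebook, the sketch below shows a typical implicit-feedback preprocessing pipeline (binarize ratings, re-index users and items, split, save sparse matrices). The column names, rating threshold, and output file names are illustrative assumptions, not the exact logic of `Data preprocessing.ipynb`.

```python
# Illustrative sketch of implicit-feedback preprocessing; the actual notebook
# (`Data preprocessing.ipynb`) may use different columns, thresholds, and splits.
import numpy as np
import pandas as pd
from scipy import sparse

raw = pd.read_csv("ml-20m/ratings.csv")               # columns: userId, movieId, rating, timestamp
raw = raw[raw["rating"] >= 4.0].copy()                # assumption: ratings >= 4 count as positive feedback

# Re-index users and items to contiguous ids.
raw["uid"] = raw["userId"].astype("category").cat.codes
raw["iid"] = raw["movieId"].astype("category").cat.codes
n_users, n_items = raw["uid"].nunique(), raw["iid"].nunique()

# Random 80/20 split of interactions into train and test.
mask = np.random.rand(len(raw)) < 0.8
train, test = raw[mask], raw[~mask]

def to_csr(df):
    """Build a binary user-item interaction matrix."""
    return sparse.csr_matrix(
        (np.ones(len(df)), (df["uid"], df["iid"])), shape=(n_users, n_items)
    )

sparse.save_npz("ml-20m-train.npz", to_csr(train))
sparse.save_npz("ml-20m-test.npz", to_csr(test))
```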
- Run `python <some_name>-pt.py`. Usage instructions are documented in the scripts; pass the `-h` flag to check them. For example,
python mp-ours-pt.py --dataset_name=ml-20m --kk=50 --infer_dot=0
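The flags in the example above (`--dataset_name`, `--kk`, `--infer_dot`) are parsed from the command line. The sketch below only illustrates an argparse-style interface of that shape, with assumed defaults and help strings; it is not the scripts' actual argument list (use `-h` for that).

```python
# Hypothetical sketch of the command-line interface of the *-pt.py scripts;
# run the real scripts with -h to see their actual arguments and defaults.
import argparse

parser = argparse.ArgumentParser(description="Train a model on a preprocessed dataset.")
parser.add_argument("--dataset_name", type=str, default="ml-20m",
                    help="which preprocessed dataset to use (e.g. ml-20m, sketchfab)")
parser.add_argument("--kk", type=int, default=50,
                    help="latent dimensionality (illustrative meaning)")
parser.add_argument("--infer_dot", type=int, default=0,
                    help="1 for the dot-product variant, 0 for the L2 variant (assumption)")
args = parser.parse_args()

print(args)
```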
- After running `python <some_name>-pt.py`, run `python <some_name>-fb.py`. For example,
python ours-fb.py --dataset_name=ml-20m --pos_sample=1 --infer_dot=0
- To check the training results, open `Evaluation result.ipynb`.
We use these datasets:
- SketchFab: https://github.com/EthanRosenthal/rec-a-sketch
- Epinion: https://www.cse.msu.edu/~tangjili/datasetcode/truststudy.htm
- MovieLens-20M: https://grouplens.org/datasets/movielens/20m/
- Melon: https://arena.kakao.com/c/7/data
- I have a `ModuleNotFoundError: No module named 'models'` error.
  - Go to `~/code/` and run the `.py` file from there.
- I have a `ModuleNotFoundError: No module named 'eval.rec_eval'` error.
  - Go to `code/eval` and run `python setup.py build_ext --inplace`.
- I have a `ModuleNotFoundError: No module named 'SLIM'` error.
  - Install the SLIM package directly from https://github.com/KarypisLab/SLIM.
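As a generic alternative to changing directories for the first two errors, you can put the repository's `code/` directory on `sys.path` explicitly; the path below is a placeholder for your own checkout.

```python
# Generic workaround for the 'models' / 'eval.rec_eval' import errors:
# add the repository's code/ directory to sys.path before importing.
# The eval.rec_eval import still requires the Cython extension to have been
# built with `python setup.py build_ext --inplace`.
import sys
sys.path.insert(0, "/path/to/your/checkout/code")  # placeholder path

import models              # resolves once code/ is on sys.path
from eval import rec_eval  # resolves once the extension is built
```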
- Early stopping is used:
  - We validate every 5 epochs (every 3 epochs for WMF).
  - For SLIM, we did not use early stopping.
- We used a batch size of 500 for CDAE; we did not observe performance differences across batch sizes.
- For SLIM, we use coordinate descent.
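For reference, the early-stopping schedule described above (validate every few epochs and keep the best score) can be sketched as below. The patience value and the maximize-the-score convention are assumptions for illustration, not the exact criterion used in the training scripts.

```python
# Illustrative early-stopping loop: validate every `validate_every` epochs and
# stop after `patience` consecutive validations without improvement.
def train_with_early_stopping(train_one_epoch, validate,
                              n_epochs=300, validate_every=5, patience=3):
    best_score, best_epoch, bad_rounds = float("-inf"), 0, 0
    for epoch in range(1, n_epochs + 1):
        train_one_epoch()
        if epoch % validate_every != 0:
            continue
        score = validate()  # e.g. a validation ranking metric to maximize
        if score > best_score:
            best_score, best_epoch, bad_rounds = score, epoch, 0
        else:
            bad_rounds += 1
            if bad_rounds >= patience:
                break
    return best_score, best_epoch
```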