Test for Collaborative Filtering

As an example, suppose we have the following data:

There are 6 products (R1,R2,R3,R4,R5 and R6);
There are 3 "training set" users (Bob, Chris and Diana);
The query instance is Alice, whose rating on product R5 we want to predict.

User/Product	R1	R2	R3	R4	R5	R6
Alice	2	-	4	4	-	5
Bob	1	5	4	-	3	4
Chris	5	2	-	2	1	-
Diana	3	-	2	2	-	4

But the data does not come in this table format. Instead, imagine a log file where each row is: movieID, customerID, rating (products R1 to R6 are movies 1 to 6). For simplicity, let's suppose Alice's ID is 1, Bob's is 2, Chris's is 3 and Diana's is 4.

Our training data, then, comes in the following format. Note that since Alice is the query instance, she does not appear in the training data.

movieID	customerID	rating
1	2	1
2	2	5
3	2	4
5	2	3
6	2	4
1	3	5
2	3	2
4	3	2
5	3	1
1	4	3
3	4	2
4	4	2
6	4	4
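
To check that the log really encodes the table above, the rows can be pivoted back into a user-by-movie matrix. The following is only a minimal sketch: the helper name log_to_matrix and the choice of NaN for missing ratings are assumptions made for illustration, not part of mtxslv_collab_filter().

```python
import numpy as np

# Training log rows in (movieID, customerID, rating) order, copied from the table above.
training_log = np.array([[1, 2, 1], [2, 2, 5], [3, 2, 4], [5, 2, 3],
                         [6, 2, 4], [1, 3, 5], [2, 3, 2], [4, 3, 2],
                         [5, 3, 1], [1, 4, 3], [3, 4, 2], [4, 4, 2],
                         [6, 4, 4]])


def log_to_matrix(log, n_movies):
    """Hypothetical helper: pivot log rows into a users x movies matrix.

    Missing ratings are stored as NaN. Rows follow the sorted customer IDs.
    """
    user_ids = np.unique(log[:, 1])  # sorted unique customer IDs
    matrix = np.full((len(user_ids), n_movies), np.nan)
    for movie_id, customer_id, rating in log:
        row = np.searchsorted(user_ids, customer_id)  # row index of this customer
        matrix[row, movie_id - 1] = rating            # movie IDs start at 1
    return user_ids, matrix


user_ids, ratings = log_to_matrix(training_log, n_movies=6)
print(user_ids)  # customer IDs for Bob, Chris and Diana
print(ratings)   # one row per customer, NaN where the table shows "-"
```

Each row of the printed matrix should line up with the Bob, Chris and Diana rows of the original table.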

The query instance (Alice's log) is:

movieID	customerID	rating
1	1	2
3	1	4
4	1	4
6	1	5

What we want to predict is Alice's rating for movie 5. We can estimate this value using mtxslv_collab_filter(). Let's define the training and testing data:

import numpy as np

# Each row is [movieID, customerID, rating]
training_set_example = np.array([[1,2,1],[2,2,5],[3,2,4],[5,2,3],
                                 [6,2,4],[1,3,5],[2,3,2],[4,3,2],
                                 [5,3,1],[1,4,3],[3,4,2],[4,4,2],
                                 [6,4,4]])

testing_instance = np.array([[1,1,2],[3,1,4],[4,1,4],[6,1,5]])

Now apply the function!

estimativa = mtxslv_collab_filter(testing_instance,5,training_set_example)

The manually calculated value is 4.336075363. Test it yourself: estimativa is 4.336096188777122, which agrees with the manual value to four decimal places.
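
For readers who want to redo the manual calculation, here is a minimal sketch of one common way to do it. It assumes a mean-centered, Pearson-weighted neighborhood prediction: each user's mean is taken over all of their ratings, the correlation is computed over co-rated movies only, and the weighted deviations are normalized by the sum of absolute weights. Whether mtxslv_collab_filter() is implemented exactly this way is an assumption. The function name predict_rating is hypothetical.

```python
import numpy as np

training_log = np.array([[1, 2, 1], [2, 2, 5], [3, 2, 4], [5, 2, 3],
                         [6, 2, 4], [1, 3, 5], [2, 3, 2], [4, 3, 2],
                         [5, 3, 1], [1, 4, 3], [3, 4, 2], [4, 4, 2],
                         [6, 4, 4]])
query_log = np.array([[1, 1, 2], [3, 1, 4], [4, 1, 4], [6, 1, 5]])


def predict_rating(query_log, target_movie, training_log):
    """Hypothetical sketch of a Pearson-weighted, mean-centered prediction."""
    query = {int(m): float(r) for m, _, r in query_log}
    query_mean = np.mean(list(query.values()))  # mean over all query ratings

    numerator, weight_sum = 0.0, 0.0
    for customer_id in np.unique(training_log[:, 1]):
        rows = training_log[training_log[:, 1] == customer_id]
        neighbor = {int(m): float(r) for m, _, r in rows}
        if target_movie not in neighbor:
            continue  # this user cannot vote on the target movie
        neighbor_mean = np.mean(list(neighbor.values()))

        common = sorted(set(query) & set(neighbor))  # co-rated movies
        if not common:
            continue
        a = np.array([query[m] - query_mean for m in common])
        b = np.array([neighbor[m] - neighbor_mean for m in common])
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
        if denom == 0:
            continue  # correlation undefined, skip this neighbor
        weight = (a * b).sum() / denom  # Pearson-style similarity

        numerator += weight * (neighbor[target_movie] - neighbor_mean)
        weight_sum += abs(weight)

    if weight_sum == 0:
        return query_mean  # no usable neighbors: fall back to the query mean
    return query_mean + numerator / weight_sum


prediction = predict_rating(query_log, 5, training_log)
print(prediction)
```

Under these assumptions only Bob and Chris contribute, since Diana has not rated movie 5, and the result lands close to the values quoted above.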
