mahout - What's the difference between item-based collaborative filtering recommendation and content-based recommendation?

- September 15, 2010

I am puzzled by item-based recommendation in 《Mahout in Action》. This is the algorithm in the book:

    for every item i that u has no preference for yet
      for every item j that u has a preference for
        compute a similarity s between i and j
        add u's preference for j, weighted by s, to a running average
    return the top items, ranked by weighted average
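To make the loop structure concrete, here is a minimal Python sketch of that pseudocode. It is not Mahout's implementation: the function name `recommend`, the `prefs` dictionary layout, and the pluggable `similarity` callable are all assumptions made for illustration, and the weighted average assumes non-negative similarity values.

```python
from typing import Callable, Dict, Hashable, List, Tuple

# Hypothetical layout: user -> {item: preference value}
Prefs = Dict[Hashable, Dict[Hashable, float]]


def recommend(
    user: Hashable,
    prefs: Prefs,
    similarity: Callable[[Hashable, Hashable], float],
    top_n: int = 3,
) -> List[Tuple[Hashable, float]]:
    """Rank items the user has not rated by a similarity-weighted average."""
    rated = prefs[user]
    all_items = {item for items in prefs.values() for item in items}
    scores = {}
    # for every item i that u has no preference for yet
    for i in all_items - set(rated):
        weighted_sum = 0.0
        weight_total = 0.0
        # for every item j that u has a preference for
        for j, pref in rated.items():
            s = similarity(i, j)        # similarity s between i and j
            weighted_sum += s * pref    # u's preference for j, weighted by s
            weight_total += s
        if weight_total > 0:            # skip items with no similar rated item
            scores[i] = weighted_sum / weight_total
    # return the top items, ranked by weighted average
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

The `similarity` argument is left open on purpose: how it is computed is exactly what separates item-based collaborative filtering from content-based recommendation.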

How can I calculate the similarity between items? If I use the content, isn't it content-based recommendation?

Item-based collaborative filtering

The original item-based recommendation is based entirely on user-item ratings (e.g., a user rated a movie 3 stars, or a user "likes" a video). When you compute the similarity between items, you are not supposed to know anything other than all users' history of ratings. So the similarity between items is computed from the ratings, not from the metadata of the item content.

Let me give you an example. Suppose you only have access to the rating data below:

    user 1 likes: movie, cooking
    user 2 likes: movie, biking, hiking
    user 3 likes: biking, cooking
    user 4 likes: hiking

Suppose now you want to make recommendations for user 4.

First you create an inverted index for the items, and you get:

    movie:     user 1, user 2
    cooking:   user 1, user 3
    biking:    user 2, user 3
    hiking:    user 2, user 4
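Building that inverted index could be sketched in Python as follows. The names `likes` and `build_inverted_index` are made up for this illustration; the data is the rating data from the example.

```python
from collections import defaultdict

# Rating data from the example: user -> items the user likes
likes = {
    "user 1": ["movie", "cooking"],
    "user 2": ["movie", "biking", "hiking"],
    "user 3": ["biking", "cooking"],
    "user 4": ["hiking"],
}


def build_inverted_index(likes):
    """Map each item to the set of users who like it."""
    index = defaultdict(set)
    for user, items in likes.items():
        for item in items:
            index[item].add(user)
    return dict(index)


inverted_index = build_inverted_index(likes)
for item, users in inverted_index.items():
    print(item, sorted(users))
```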

Since this is a binary rating (like or not), we can use a similarity measure such as Jaccard similarity to compute item similarity.

                                     |user 1|
    similarity(movie, cooking) = ---------------- = 1/3
                                  |user 1, 2, 3|

In the numerator, user 1 is the only element that movie and cooking both have. In the denominator, the union of movie and cooking has 3 distinct users (user 1, 2, 3). |.| here denotes the size of the set. So we know the similarity between movie and cooking is 1/3 in our case. You just do the same thing for all possible item pairs (i, j).
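As a rough sketch, the Jaccard computation over all item pairs might look like this in Python. The `jaccard` helper and the `pair_similarity` dictionary keyed by `frozenset` are choices made for this example only; the inverted index is the one from the example data.

```python
from itertools import combinations

# Inverted index from the example: item -> users who like it
inverted_index = {
    "movie": {"user 1", "user 2"},
    "cooking": {"user 1", "user 3"},
    "biking": {"user 2", "user 3"},
    "hiking": {"user 2", "user 4"},
}


def jaccard(a, b):
    """|a intersect b| / |a union b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Similarity for every unordered item pair (i, j)
pair_similarity = {
    frozenset((i, j)): jaccard(inverted_index[i], inverted_index[j])
    for i, j in combinations(inverted_index, 2)
}

print(pair_similarity[frozenset(("movie", "cooking"))])
```

With this data, every pair shares exactly one of three distinct users and so gets 1/3, except cooking and hiking, which share no users and get 0.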

After you are done with the similarity computation for all pairs, say you need to make a recommendation for user 4.

Look at the similarity score similarity(hiking, x), where x is any other tag you might have.

If you need to make a recommendation for user 3, you can aggregate the similarity scores over each item in that user's list. For example,

    score(movie)  = similarity(biking, movie) + similarity(cooking, movie)
    score(hiking) = similarity(biking, hiking) + similarity(cooking, hiking)
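The aggregation step can be sketched the same way. This is only an illustration under the example's assumptions (binary likes, Jaccard similarity, scores summed over the user's liked items); `users_of` and `score_items` are hypothetical names.

```python
# Rating data from the example: user -> set of liked items
likes = {
    "user 1": {"movie", "cooking"},
    "user 2": {"movie", "biking", "hiking"},
    "user 3": {"biking", "cooking"},
    "user 4": {"hiking"},
}

# Inverted index: item -> set of users who like it
users_of = {}
for user, items in likes.items():
    for item in items:
        users_of.setdefault(item, set()).add(user)


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_items(user):
    """Score each item the user has not liked by summing its similarity
    to every item the user has liked; highest score first."""
    liked = likes[user]
    candidates = set(users_of) - liked
    scores = {
        c: sum(jaccard(users_of[c], users_of[j]) for j in liked)
        for c in candidates
    }
    # Sort by descending score, then by name so ties are deterministic
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


for user in ("user 3", "user 4"):
    print(user, score_items(user))
```

For user 3 this gives score(movie) = 1/3 + 1/3 = 2/3 and score(hiking) = 1/3 + 0 = 1/3, so movie ranks first. For user 4, movie and biking tie at 1/3 and cooking scores 0.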

Content-based recommendation

The point of content-based recommendation is that we have to know the content of both the user and the item. Usually you construct a user-profile and an item-profile using the content of a shared attribute space. For example, for a movie, you represent it with the movie stars in it and its genres (using a binary coding, for example). For the user profile, you can do the same thing based on which movie stars, genres, etc. the user likes. Then the similarity of user and item can be computed using, e.g., cosine similarity.

Here is a concrete example:

Suppose this is our user-profile (using binary encoding, 0 means not-like, 1 means like), which contains each user's preference over 5 movie stars and 5 movie genres:

             movie stars 0 - 4    movie genres
    user 1:    0 0 0 1 1          1 1 1 0 0
    user 2:    1 1 0 0 0          0 0 0 1 1
    user 3:    0 0 0 1 1          1 1 1 1 0

Suppose this is our movie-profile:

             movie stars 0 - 4    movie genres
    movie1:    0 0 0 0 1          1 1 0 0 0
    movie2:    1 1 1 0 0          0 0 1 0 1
    movie3:    0 0 1 0 1          1 0 1 0 1

To calculate how good a movie is for a user, we use cosine similarity:

                                 dot-product(user 1, movie1)
    similarity(user 1, movie1) = ---------------------------
                                   ||user 1|| x ||movie1||

                                 0x0+0x0+0x0+1x0+1x1+1x1+1x1+1x0+0x0+0x0
                               = ---------------------------------------
                                            sqrt(5) x sqrt(3)

                               = 3 / (sqrt(5) x sqrt(3)) = 0.77460

Similarly:

    similarity(user 2, movie2) = 3 / (sqrt(4) x sqrt(5)) = 0.67082
    similarity(user 3, movie3) = 3 / (sqrt(6) x sqrt(5)) = 0.54772

If you want to give one recommendation for user i, just pick the movie j that has the highest similarity(i, j).
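The content-based side can be sketched in plain Python as well. The profile vectors are copied from the example (5 movie-star bits followed by 5 genre bits); the function names `cosine` and `best_movie` are assumptions for this illustration.

```python
import math

# Binary profiles from the example: 5 movie-star bits, then 5 genre bits
user_profiles = {
    "user 1": [0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
    "user 2": [1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    "user 3": [0, 0, 0, 1, 1, 1, 1, 1, 1, 0],
}
movie_profiles = {
    "movie1": [0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    "movie2": [1, 1, 1, 0, 0, 0, 0, 1, 0, 1],
    "movie3": [0, 0, 1, 0, 1, 1, 0, 1, 0, 1],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_movie(user):
    """Pick the movie j with the highest similarity(user, j)."""
    profile = user_profiles[user]
    return max(movie_profiles, key=lambda m: cosine(profile, movie_profiles[m]))


print(round(cosine(user_profiles["user 1"], movie_profiles["movie1"]), 5))
print(best_movie("user 1"))
```

Note that nothing here looks at other users' ratings: the score depends only on the attributes of the user and the movie, which is the key difference from the item-based collaborative filtering approach.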

Hope this helps.
