mahout - What's the difference between Collaborative Filtering item-based recommendation and content-based recommendation?


I am puzzled by item-based recommendation in "Mahout in Action". The book gives this algorithm:

    for every item i that u has no preference for yet
      for every item j that u has a preference for
        compute a similarity s between i and j
        add u's preference for j, weighted by s, to a running average
    return the top items, ranked by weighted average
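The pseudocode above can be sketched in Python roughly as follows. This is only an illustration, not Mahout's implementation: the names `recommend`, `toy_sim` and `lookup` are hypothetical, and the similarity values are made up. The similarity function is passed in as a parameter, which is exactly the part the question asks about.

```python
from typing import Callable, Dict, List, Tuple


def recommend(
    prefs: Dict[str, float],
    all_items: List[str],
    similarity: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rank items the user has not rated by a similarity-weighted average."""
    scored = []
    for i in all_items:
        if i in prefs:  # skip items the user already has a preference for
            continue
        weighted_sum = 0.0
        weight_total = 0.0
        for j, pref in prefs.items():
            s = similarity(i, j)
            weighted_sum += s * pref  # u's preference for j, weighted by s
            weight_total += s
        if weight_total > 0:
            scored.append((i, weighted_sum / weight_total))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up similarities, purely for illustration (assumption, not real data).
toy_sim = {
    frozenset(("A", "C")): 0.9,
    frozenset(("B", "C")): 0.1,
    frozenset(("A", "D")): 0.5,
    frozenset(("B", "D")): 0.5,
}


def lookup(i: str, j: str) -> float:
    return toy_sim.get(frozenset((i, j)), 0.0)


result = recommend({"A": 5.0, "B": 1.0}, ["A", "B", "C", "D"], lookup)
print(result)
```

Here item C ranks above item D because it is most similar to the item the user rated highly.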

How can I calculate the similarity between items? If it uses the content, isn't that content-based recommendation?

Item-based collaborative filtering

The original item-based recommendation is based entirely on user-item ratings (e.g., a user rated a movie 3 stars, or a user "likes" a video). When you compute the similarity between items, you are not supposed to know anything other than the users' history of ratings. So the similarity between items is computed from the ratings instead of from the metadata of the item content.

Let me give you an example. Suppose you only have access to rating data like the following:

    user 1 likes: movie, cooking
    user 2 likes: movie, biking, hiking
    user 3 likes: biking, cooking
    user 4 likes: hiking

Suppose now you want to make recommendations for user 4.

First you create an inverted index for the items, which gives you:

    movie:     user 1, user 2
    cooking:   user 1, user 3
    biking:    user 2, user 3
    hiking:    user 2, user 4
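Building that inverted index takes only a few lines. A minimal sketch, assuming the likes are held in a plain dictionary (the names `likes`, `invert` and `item_index` are illustrative):

```python
from collections import defaultdict

# The rating data from the example: user -> items the user likes.
likes = {
    "user 1": ["movie", "cooking"],
    "user 2": ["movie", "biking", "hiking"],
    "user 3": ["biking", "cooking"],
    "user 4": ["hiking"],
}


def invert(likes_by_user):
    """Turn user -> items into item -> set of users who like it."""
    index = defaultdict(set)
    for user, items in likes_by_user.items():
        for item in items:
            index[item].add(user)
    return dict(index)


item_index = invert(likes)
print(sorted(item_index["hiking"]))  # ['user 2', 'user 4']
```

Storing the users as sets makes the intersection and union needed for a set-based similarity measure cheap to compute.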

Since this is a binary rating (like or not), we can use a similarity measure such as Jaccard similarity to compute the item similarity.

                                 |user1|
    similarity(movie, cooking) = ----------- = 1/3
                                 |user1,2,3|

In the numerator, user1 is the only user that movie and cooking both have. In the denominator, the union of movie and cooking has 3 distinct users (user1,2,3). |.| here denotes the size of the set. So the similarity between movie and cooking is 1/3 in our case. You do the same thing for all possible item pairs (i,j).
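As a sketch of that all-pairs step, the snippet below computes the Jaccard similarity for every pair of items in the example index. The names `jaccard` and `pair_sims` are illustrative, and the index is written out literally so the snippet runs on its own.

```python
from itertools import combinations

# Inverted index from the example: item -> set of users who like it.
item_index = {
    "movie": {"user 1", "user 2"},
    "cooking": {"user 1", "user 3"},
    "biking": {"user 2", "user 3"},
    "hiking": {"user 2", "user 4"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# One entry per unordered item pair, keyed in alphabetical order.
pair_sims = {
    (i, j): jaccard(item_index[i], item_index[j])
    for i, j in combinations(sorted(item_index), 2)
}

for pair, sim in pair_sims.items():
    print(pair, round(sim, 3))
```

With this data every pair scores 1/3 except cooking and hiking, which share no users and score 0.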

After you are done with the similarity computation for all pairs, say you need to make a recommendation for user 4.

  • Look at the similarity score similarity(hiking, x), where x is any other tag you might have.

If you need to make a recommendation for user 3, you can aggregate the similarity scores over each item in that user's list. For example,

    score(movie)  = similarity(biking, movie) + similarity(cooking, movie)
    score(hiking) = similarity(biking, hiking) + similarity(cooking, hiking)
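The aggregation for user 3 can be sketched end to end as below. This is one possible reading of the steps above, using a plain sum of Jaccard similarities; the function names `jaccard` and `score_candidates` are illustrative.

```python
likes = {
    "user 1": ["movie", "cooking"],
    "user 2": ["movie", "biking", "hiking"],
    "user 3": ["biking", "cooking"],
    "user 4": ["hiking"],
}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_candidates(user, likes_by_user):
    """Score each item the user has not liked by summing its similarity
    to every item the user has liked, then sort by descending score."""
    item_users = {}
    for u, items in likes_by_user.items():
        for item in items:
            item_users.setdefault(item, set()).add(u)

    liked = likes_by_user[user]
    scores = {}
    for candidate in item_users:
        if candidate in liked:
            continue
        scores[candidate] = sum(
            jaccard(item_users[candidate], item_users[j]) for j in liked
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = score_candidates("user 3", likes)
print(ranking)
```

For user 3, movie scores 1/3 + 1/3 and hiking scores 1/3 + 0, so movie is ranked first. No item metadata is used anywhere, only who liked what.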

Content-based recommendation

The point of content-based recommendation is that we have to know the content of both the user and the item. Usually you construct a user profile and an item profile using the content of a shared attribute space. For example, you can represent a movie by the movie stars in it and its genres (using a binary coding, for example). For the user profile, you can do the same thing based on which movie stars and genres the user likes. Then the similarity between a user and an item can be computed using, e.g., cosine similarity.

Here is a concrete example:

Suppose this is our user profile (using binary encoding, where 0 means not-like and 1 means like), which contains each user's preference over 5 movie stars and 5 movie genres:

             movie stars 0 - 4    movie genres
    user 1:    0 0 0 1 1          1 1 1 0 0
    user 2:    1 1 0 0 0          0 0 0 1 1
    user 3:    0 0 0 1 1          1 1 1 1 0

Suppose this is our movie profile:

             movie stars 0 - 4    movie genres
    movie1:    0 0 0 0 1          1 1 0 0 0
    movie2:    1 1 1 0 0          0 0 1 0 1
    movie3:    0 0 1 0 1          1 0 1 0 1

To calculate how good a movie is for a user, we use cosine similarity:

                                 dot-product(user1, movie1)
    similarity(user 1, movie1) = --------------------------
                                   ||user1|| x ||movie1||

                                 0x0+0x0+0x0+1x0+1x1+1x1+1x1+1x0+0x0+0x0
                               = ---------------------------------------
                                           sqrt(5) x sqrt(3)

                               = 3 / (sqrt(5) x sqrt(3)) = 0.77460

Similarly:

    similarity(user 2, movie2) = 3 / (sqrt(4) x sqrt(5)) = 0.67082
    similarity(user 3, movie3) = 3 / (sqrt(6) x sqrt(5)) = 0.54772

If you want to give one recommendation to user i, just pick the movie j that has the highest similarity(i, j).
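The content-based example can be checked with a short script. This is a minimal sketch using the profiles above; the names `users`, `movies`, `cosine` and `best_movie` are illustrative, and each vector lists the 5 movie-star flags followed by the 5 genre flags.

```python
import math

# Profiles from the example: 5 movie-star flags, then 5 genre flags.
users = {
    "user 1": [0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
    "user 2": [1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
    "user 3": [0, 0, 0, 1, 1, 1, 1, 1, 1, 0],
}
movies = {
    "movie1": [0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    "movie2": [1, 1, 1, 0, 0, 0, 0, 1, 0, 1],
    "movie3": [0, 0, 1, 0, 1, 1, 0, 1, 0, 1],
}


def cosine(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_movie(user):
    """Pick the movie with the highest cosine similarity to the user."""
    return max(movies, key=lambda m: cosine(users[user], movies[m]))


print(round(cosine(users["user 1"], movies["movie1"]), 5))  # 0.7746
for user in users:
    print(user, "->", best_movie(user))
```

Unlike the item-based approach, this never looks at what other users liked. It compares each user's own profile with the item's attributes.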

Hope this helps.

