Lab assignment 2: RS frameworks¶
How to avoid writing everything yourself?¶
- use one of the great many RecSys frameworks (https://github.com/ACMRecSys/recsys-evaluation-frameworks)
- they vary in their coverage of algorithms, metrics, use cases, etc.
- we want to start simple => LensKit for Python (LKPY)
2025 note: the newest LensKit version (2025.1) deviates slightly from the tutorial below, so check which version you're installing and, if needed, see the appropriate tutorials on the LensKit website.
Task 1: Setting things up¶
install LensKit (pip install lenskit, or pip install lenskit==0.14.4 to match this tutorial)
familiarize yourself with the framework + what is supported (data, algorithms, evaluation methods): https://lkpy.lenskit.org/stable/index.html, https://lkpy.lenskit.org/0.14.4/index.html, https://github.com/lenskit/lkpy
from lenskit import batch, topn, util
from lenskit import crossfold as xf
from lenskit.algorithms import Recommender, funksvd, item_knn, basic
import pandas as pd
Task 2: Simple recommender output¶
- continue with the data from labs1 (including your ratings)
- familiarize yourself with the lenskit.Recommender methods
- implement a basic training loop for one algorithm (e.g., ItemItem KNN; use all available data)
- recommend for yourself
Starting code¶
# update this with your real ratings (note that OID = movieId from the moviesDF)
myRatings = {
    "UID": [611, 611],
    "OID": [174055, 152081],
    "rating": [4.0, 5.0],
    "timestamp": [0, 0]
}
# moviesDF and df: same as in labs1
moviesDF = pd.read_csv("movies.csv", sep=",")
moviesDF.movieId = moviesDF.movieId.astype(int)
moviesDF.set_index("movieId", inplace=True)
df = pd.read_csv("ratings.csv", sep=",")
df.columns = ["UID","OID","rating","timestamp"]
ratingCounts = df.groupby("OID")["UID"].count()  # number of ratings per movie
moviesDF["RatingCount"] = ratingCounts
moviesDF["year"] = moviesDF.title.str.extract(r'\(([0-9]+)\)')  # release year parsed from the title
moviesDF["year"] = moviesDF.year.astype("float")
moviesDF.fillna(0, inplace=True)  # movies without ratings or a parsable year get 0
moviesDF.tail()
movieId | title | genres | RatingCount | year |
---|---|---|---|---|
193581 | Black Butler: Book of the Atlantic (2017) | Action|Animation|Comedy|Fantasy | 1.0 | 2017.0 |
193583 | No Game No Life: Zero (2017) | Animation|Comedy|Fantasy | 1.0 | 2017.0 |
193585 | Flint (2017) | Drama | 1.0 | 2017.0 |
193587 | Bungo Stray Dogs: Dead Apple (2018) | Action|Animation | 1.0 | 2018.0 |
193609 | Andrew Dice Clay: Dice Rules (1991) | Comedy | 1.0 | 1991.0 |
# append your ratings to the DataFrame df
df = pd.concat([df, pd.DataFrame(myRatings)], ignore_index=True)
# add movie titles to df (for clarity)
movieTitles = moviesDF.title.loc[df.OID]
df["movieTitle"] = movieTitles.values
df.columns = ["user","item","rating","timestamp","title"]  # LensKit requires the "user", "item", and "rating" column names
df.tail()
  | user | item | rating | timestamp | title |
---|---|---|---|---|---|
100833 | 610 | 168250 | 5.0 | 1494273047 | Get Out (2017) |
100834 | 610 | 168252 | 5.0 | 1493846352 | Logan (2017) |
100835 | 610 | 170875 | 3.0 | 1493846415 | The Fate of the Furious (2017) |
100836 | 611 | 174055 | 4.0 | 0 | Dunkirk (2017) |
100837 | 611 | 152081 | 5.0 | 0 | Zootopia (2016) |
Usage of the "most popular" algorithm from LensKit¶
- Replace this code with the method you're interested in (e.g., ItemItem KNN); a minimal ItemItem sketch follows the example output below
top_k = 20
pop_alg = basic.PopScore(score_method='quantile')
pop_alg_clone = util.clone(pop_alg)  # some algorithms behave strangely if they are fitted multiple times
pop_rec = Recommender.adapt(pop_alg_clone)  # wrapper around the algorithm (by default, recommends the top-scoring items the user has not rated yet)
pop_rec.fit(df)  # normally, a train-test split would be performed before this
users = [611]  # the users you want recommendations for - normally, all users in the test set
recs = batch.recommend(pop_rec, users, top_k, n_jobs=1)
recs.head()
# In batch.recommend, n_jobs=1 is set to prevent the following error that some students have been running into:
#   BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
# Setting a higher n_jobs may speed up the process for those not affected by this error.
  | item | score | user | rank |
---|---|---|---|---|
0 | 356 | 1.000000 | 611 | 1 |
1 | 318 | 0.996737 | 611 | 2 |
2 | 296 | 0.993594 | 611 | 3 |
3 | 593 | 0.990549 | 611 | 4 |
4 | 2571 | 0.987782 | 611 | 5 |
rec_items = recs["item"].values.tolist()
moviesDF.loc[rec_items]
movieId | title | genres | RatingCount | year |
---|---|---|---|---|
356 | Forrest Gump (1994) | Comedy|Drama|Romance|War | 329.0 | 1994.0 |
318 | Shawshank Redemption, The (1994) | Crime|Drama | 317.0 | 1994.0 |
296 | Pulp Fiction (1994) | Comedy|Crime|Drama|Thriller | 307.0 | 1994.0 |
593 | Silence of the Lambs, The (1991) | Crime|Horror|Thriller | 279.0 | 1991.0 |
2571 | Matrix, The (1999) | Action|Sci-Fi|Thriller | 278.0 | 1999.0 |
260 | Star Wars: Episode IV - A New Hope (1977) | Action|Adventure|Sci-Fi | 251.0 | 1977.0 |
480 | Jurassic Park (1993) | Action|Adventure|Sci-Fi|Thriller | 238.0 | 1993.0 |
110 | Braveheart (1995) | Action|Drama|War | 237.0 | 1995.0 |
589 | Terminator 2: Judgment Day (1991) | Action|Sci-Fi | 224.0 | 1991.0 |
527 | Schindler's List (1993) | Drama|War | 220.0 | 1993.0 |
2959 | Fight Club (1999) | Action|Crime|Drama|Thriller | 218.0 | 1999.0 |
1 | Toy Story (1995) | Adventure|Animation|Children|Comedy|Fantasy | 215.0 | 1995.0 |
1196 | Star Wars: Episode V - The Empire Strikes Back... | Action|Adventure|Sci-Fi | 211.0 | 1980.0 |
2858 | American Beauty (1999) | Drama|Romance | 204.0 | 1999.0 |
50 | Usual Suspects, The (1995) | Crime|Mystery|Thriller | 204.0 | 1995.0 |
47 | Seven (a.k.a. Se7en) (1995) | Mystery|Thriller | 203.0 | 1995.0 |
780 | Independence Day (a.k.a. ID4) (1996) | Action|Adventure|Sci-Fi|Thriller | 202.0 | 1996.0 |
150 | Apollo 13 (1995) | Adventure|Drama|IMAX | 201.0 | 1995.0 |
1198 | Raiders of the Lost Ark (Indiana Jones and the... | Action|Adventure | 200.0 | 1981.0 |
4993 | Lord of the Rings: The Fellowship of the Ring,... | Adventure|Fantasy | 198.0 | 2001.0 |
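As referenced above, here is a minimal sketch that swaps the popularity scorer for ItemItem KNN in the same pipeline (assuming LensKit 0.14; the 20-neighbor setting is just an illustrative starting value, not a recommended one):

# sketch: swap the popularity scorer for ItemItem KNN (same pipeline as above)
knn_alg = item_knn.ItemItem(20)  # 20 neighbors; an illustrative starting value
knn_rec = Recommender.adapt(util.clone(knn_alg))
knn_rec.fit(df)  # again, no train/test split yet - Task 5 adds that
knn_recs = batch.recommend(knn_rec, [611], top_k, n_jobs=1)
moviesDF.loc[knn_recs["item"].values.tolist()]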
Task 3: Explore parameter space¶
- check the impact on your recommendations when you change, e.g., the neighborhood size, aggregation method, or feedback type (a sketch of such a sweep follows this list)
- try at least 5-10 configurations
- note which configurations gave you good/bad results and mark the best one
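A possible starting point for the sweep, as a sketch (assuming LensKit 0.14; the configurations below are only examples, and the parameter names follow the item_knn.ItemItem signature: nnbrs, min_sim, feedback):

# sketch: try a few ItemItem configurations and eyeball the recommendations for user 611
configs = [
    {"nnbrs": 5}, {"nnbrs": 20}, {"nnbrs": 50},
    {"nnbrs": 20, "min_sim": 0.01},
    {"nnbrs": 20, "feedback": "implicit"},  # ignore rating values, use the interactions only
]
for cfg in configs:
    alg = Recommender.adapt(item_knn.ItemItem(**cfg))
    alg.fit(df)
    recs = batch.recommend(alg, [611], 20, n_jobs=1)
    print(cfg)
    print(moviesDF.loc[recs["item"].values.tolist()].title.head(10))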
Task 4: Another algorithm¶
- select another algorithm from the matrix factorization family (e.g., FunkSVD, lenskit_implicit.BPR, or als.BiasedMF)
- construct the initial loop and experiment a bit with the hyperparameters (e.g., learning rate, regularization, #features, #iterations for FunkSVD); see the sketch after this list
- note which configurations gave you good/bad results and mark the best one
- compare with the KNN-based approaches
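A minimal FunkSVD sketch to start from (assuming LensKit 0.14; the hyperparameter values below are only illustrative and are meant to be tuned):

# sketch: FunkSVD with a handful of hyperparameters to experiment with
svd_alg = funksvd.FunkSVD(50, iterations=100, lrate=0.001, reg=0.015)  # #features, #iterations, learning rate, regularization
svd_rec = Recommender.adapt(util.clone(svd_alg))
svd_rec.fit(df)
svd_recs = batch.recommend(svd_rec, [611], 20, n_jobs=1)
moviesDF.loc[svd_recs["item"].values.tolist()]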
Task 5: Hyperparameter Tuning¶
Try to get the best hyperparameter values automatically, using off-line evaluation¶
The following steps are nicely described in the LensKit Getting Started tutorial (https://lkpy.lenskit.org/0.14.4/gettingstarted#Running-the-Evaluation):
- Split the data into train and validation sets
- Use a simple grid search: define a few reasonable values for the most meaningful parameters
- For each configuration:
  - Train the recommendation algorithm (using the train set)
  - Let the algorithm recommend for all users
  - Evaluate the recommendations (select a target metric of your choice, e.g., hit rate or nDCG)
- Select the best configuration (i.e., the one with the highest nDCG or other target metric)
- Use this best configuration to recommend for yourself - how good/bad were the recommendations? Were they better than those from your original algorithm in the previous labs?
Alternatively, check LensKitAuto, which can do some of the heavy lifting for you (https://github.com/ISG-Siegen/lenskit-auto/tree/main)
#TODO: define hyperparameter configurations to be evaluated (a simple grid search is just fine)
# LensKit provides several data partitioning variants. The algorithms we played with cannot work under the strong generalization paradigm (partitioning users; xf.partition_users()),
# therefore we focus on weak generalization, i.e., assigning each row to the train or test set at random
# for simplicity, we only create one train/test variant, i.e., no cross-validation
train, test = xf.sample_rows(df, None, 20000)  # randomly sampled train and test sets; the test set has a fixed size of 20,000 rows (~20%)
print(train.shape, test.shape)
#TODO: for each hyperparameter setting, fit the algorithm, generate test recommendations, and store the results
#TODO: identify the best hyperparameter variant
#TODO: fit the best variant on all data and recommend for yourself (a sketch of this whole loop follows the output below)
(80838, 5) (20000, 5)
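One way the TODOs above might be filled in, as a sketch (assuming LensKit 0.14; it follows the topn.RecListAnalysis pattern from the Getting Started tutorial linked above, and the nnbrs grid is only an example):

# sketch: grid search over ItemItem neighborhood sizes, scored by mean nDCG on the test split
rla = topn.RecListAnalysis()
rla.add_metric(topn.ndcg)

test_users = test.user.unique()
results = {}
for nnbrs in [5, 10, 20, 50, 100]:  # example grid
    alg = Recommender.adapt(item_knn.ItemItem(nnbrs))
    alg.fit(train)  # train on the train split only
    recs = batch.recommend(alg, test_users, 20, n_jobs=1)
    scores = rla.compute(recs, test)  # per-user metrics against the held-out ratings
    results[nnbrs] = scores["ndcg"].mean()
    print(nnbrs, results[nnbrs])

best_nnbrs = max(results, key=results.get)  # configuration with the highest mean nDCG
best_rec = Recommender.adapt(item_knn.ItemItem(best_nnbrs))
best_rec.fit(df)  # refit on all data before recommending to yourself
moviesDF.loc[batch.recommend(best_rec, [611], 20, n_jobs=1)["item"].values.tolist()]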