{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lab assignment 2: RS frameworks\n",
    "\n",
    "### How not to write everything by yourself?\n",
    "- utilize one of great many different RecSys frameworks (https://github.com/ACMRecSys/recsys-evaluation-frameworks)\n",
    "- varying coverage of algorithms, metrics, use-cases etc. \n",
    "- We want to start simple => LensKit\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Task 1: Setting things up\n",
    "\n",
    "- install LensKit (https://lkpy.readthedocs.io/, https://github.com/lenskit/lkpy) \n",
    "- familiarize yourself with the framework + what is supported (data, algorithms, evaluation methods)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\lpesk\\AppData\\Roaming\\Python\\Python38\\site-packages\\pandas\\core\\computation\\expressions.py:20: UserWarning: Pandas requires version '2.7.3' or newer of 'numexpr' (version '2.7.1' currently installed).\n",
      "  from pandas.core.computation.check import NUMEXPR_INSTALLED\n"
     ]
    }
   ],
   "source": [
    "from lenskit import batch, topn, util\n",
    "from lenskit import crossfold as xf\n",
    "from lenskit.algorithms import Recommender, funksvd, item_knn, basic\n",
    "from lenskit import topn\n",
    "\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task 2: Simple recommender output\n",
    "- continue with data from labs1 (including your ratings)\n",
    "- familiarize yourself with lenskit.Recommender methods\n",
    "- implement basic training loop for one algorithm (e.g., ItemItem KNN)\n",
    "- recommend for yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Starting code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# update this with your real ratings (note that OID = movieId from the moviesDF)\n",
    "myRatings = {\n",
    "    \"UID\":[611,611],\n",
    "    \"OID\":[174055,152081],\n",
    "    \"rating\":[4.0,5.0],\n",
    "    \"timestamp\":[0,0]\n",
    "            }"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>genres</th>\n",
       "      <th>RatingCount</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>movieId</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>193581</th>\n",
       "      <td>Black Butler: Book of the Atlantic (2017)</td>\n",
       "      <td>Action|Animation|Comedy|Fantasy</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2017.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193583</th>\n",
       "      <td>No Game No Life: Zero (2017)</td>\n",
       "      <td>Animation|Comedy|Fantasy</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2017.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193585</th>\n",
       "      <td>Flint (2017)</td>\n",
       "      <td>Drama</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2017.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193587</th>\n",
       "      <td>Bungo Stray Dogs: Dead Apple (2018)</td>\n",
       "      <td>Action|Animation</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2018.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193609</th>\n",
       "      <td>Andrew Dice Clay: Dice Rules (1991)</td>\n",
       "      <td>Comedy</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1991.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                             title  \\\n",
       "movieId                                              \n",
       "193581   Black Butler: Book of the Atlantic (2017)   \n",
       "193583                No Game No Life: Zero (2017)   \n",
       "193585                                Flint (2017)   \n",
       "193587         Bungo Stray Dogs: Dead Apple (2018)   \n",
       "193609         Andrew Dice Clay: Dice Rules (1991)   \n",
       "\n",
       "                                  genres  RatingCount    year  \n",
       "movieId                                                        \n",
       "193581   Action|Animation|Comedy|Fantasy          1.0  2017.0  \n",
       "193583          Animation|Comedy|Fantasy          1.0  2017.0  \n",
       "193585                             Drama          1.0  2017.0  \n",
       "193587                  Action|Animation          1.0  2018.0  \n",
       "193609                            Comedy          1.0  1991.0  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#moviesDF and df: same as in labs1\n",
    "\n",
    "moviesDF = pd.read_csv(\"movies.csv\", sep=\",\")\n",
    "moviesDF.movieId = moviesDF.movieId.astype(int)\n",
    "moviesDF.set_index(\"movieId\", inplace=True)\n",
    "\n",
    "df = pd.read_csv(\"ratings.csv\", sep=\",\")\n",
    "df.columns=[\"UID\",\"OID\",\"rating\",\"timestamp\"]\n",
    "\n",
    "ratingCounts = df.groupby(\"OID\")[\"UID\"].count()\n",
    "moviesDF[\"RatingCount\"] = ratingCounts\n",
    "moviesDF[\"year\"] = moviesDF.title.str.extract(r'\\(([0-9]+)\\)')\n",
    "moviesDF[\"year\"] = moviesDF.year.astype(\"float\")\n",
    "moviesDF.fillna(0, inplace=True)\n",
    "moviesDF.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user</th>\n",
       "      <th>item</th>\n",
       "      <th>rating</th>\n",
       "      <th>timestamp</th>\n",
       "      <th>title</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>100833</th>\n",
       "      <td>610</td>\n",
       "      <td>168250</td>\n",
       "      <td>5.0</td>\n",
       "      <td>1494273047</td>\n",
       "      <td>Get Out (2017)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100834</th>\n",
       "      <td>610</td>\n",
       "      <td>168252</td>\n",
       "      <td>5.0</td>\n",
       "      <td>1493846352</td>\n",
       "      <td>Logan (2017)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100835</th>\n",
       "      <td>610</td>\n",
       "      <td>170875</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1493846415</td>\n",
       "      <td>The Fate of the Furious (2017)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100836</th>\n",
       "      <td>611</td>\n",
       "      <td>174055</td>\n",
       "      <td>4.0</td>\n",
       "      <td>0</td>\n",
       "      <td>Dunkirk (2017)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100837</th>\n",
       "      <td>611</td>\n",
       "      <td>152081</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0</td>\n",
       "      <td>Zootopia (2016)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        user    item  rating   timestamp                           title\n",
       "100833   610  168250     5.0  1494273047                  Get Out (2017)\n",
       "100834   610  168252     5.0  1493846352                    Logan (2017)\n",
       "100835   610  170875     3.0  1493846415  The Fate of the Furious (2017)\n",
       "100836   611  174055     4.0           0                  Dunkirk (2017)\n",
       "100837   611  152081     5.0           0                 Zootopia (2016)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#append your ratings to the dataFrame df. \n",
    "df = pd.concat([df,pd.DataFrame(myRatings)], ignore_index=True)\n",
    "\n",
    "#add movieTitles to the df (for clarity)\n",
    "movieTitles = moviesDF.title.loc[df.OID]\n",
    "df[\"movieTitle\"] = movieTitles.values\n",
    "df.columns = [\"user\",\"item\",\"rating\",\"timestamp\",\"title\"] #LensKit require \"user\",\"item\",\"rating\" column names \n",
    "\n",
    "df.tail()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Usage of the \"most popular\" algorithm from LensKit\n",
    "- Change this to get the methods you're interested in"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>item</th>\n",
       "      <th>score</th>\n",
       "      <th>user</th>\n",
       "      <th>rank</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>356</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>611</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>318</td>\n",
       "      <td>0.996737</td>\n",
       "      <td>611</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>296</td>\n",
       "      <td>0.993594</td>\n",
       "      <td>611</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>593</td>\n",
       "      <td>0.990549</td>\n",
       "      <td>611</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2571</td>\n",
       "      <td>0.987782</td>\n",
       "      <td>611</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   item     score  user  rank\n",
       "0   356  1.000000   611     1\n",
       "1   318  0.996737   611     2\n",
       "2   296  0.993594   611     3\n",
       "3   593  0.990549   611     4\n",
       "4  2571  0.987782   611     5"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "top_k = 20\n",
    "\n",
    "pop_alg = basic.PopScore(score_method='quantile')\n",
    "pop_alg_clone = util.clone(pop_alg) # some algorithms behave strange if they are fitted multiple times\n",
    "pop_rec = Recommender.adapt(pop_alg_clone) #wrapper around an algorithm (select top-scoring items as recommendations by default)\n",
    "pop_rec.fit(df) #normally, some train-test split should be performed before this\n",
    "users = [611] #all you're interested in - normally, those are all users in the test set\n",
    "recs = batch.recommend(pop_rec, users, top_k)\n",
    "recs.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>genres</th>\n",
       "      <th>RatingCount</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>movieId</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>356</th>\n",
       "      <td>Forrest Gump (1994)</td>\n",
       "      <td>Comedy|Drama|Romance|War</td>\n",
       "      <td>329.0</td>\n",
       "      <td>1994.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>318</th>\n",
       "      <td>Shawshank Redemption, The (1994)</td>\n",
       "      <td>Crime|Drama</td>\n",
       "      <td>317.0</td>\n",
       "      <td>1994.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>296</th>\n",
       "      <td>Pulp Fiction (1994)</td>\n",
       "      <td>Comedy|Crime|Drama|Thriller</td>\n",
       "      <td>307.0</td>\n",
       "      <td>1994.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>593</th>\n",
       "      <td>Silence of the Lambs, The (1991)</td>\n",
       "      <td>Crime|Horror|Thriller</td>\n",
       "      <td>279.0</td>\n",
       "      <td>1991.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2571</th>\n",
       "      <td>Matrix, The (1999)</td>\n",
       "      <td>Action|Sci-Fi|Thriller</td>\n",
       "      <td>278.0</td>\n",
       "      <td>1999.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>260</th>\n",
       "      <td>Star Wars: Episode IV - A New Hope (1977)</td>\n",
       "      <td>Action|Adventure|Sci-Fi</td>\n",
       "      <td>251.0</td>\n",
       "      <td>1977.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>480</th>\n",
       "      <td>Jurassic Park (1993)</td>\n",
       "      <td>Action|Adventure|Sci-Fi|Thriller</td>\n",
       "      <td>238.0</td>\n",
       "      <td>1993.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>110</th>\n",
       "      <td>Braveheart (1995)</td>\n",
       "      <td>Action|Drama|War</td>\n",
       "      <td>237.0</td>\n",
       "      <td>1995.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>589</th>\n",
       "      <td>Terminator 2: Judgment Day (1991)</td>\n",
       "      <td>Action|Sci-Fi</td>\n",
       "      <td>224.0</td>\n",
       "      <td>1991.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>527</th>\n",
       "      <td>Schindler's List (1993)</td>\n",
       "      <td>Drama|War</td>\n",
       "      <td>220.0</td>\n",
       "      <td>1993.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2959</th>\n",
       "      <td>Fight Club (1999)</td>\n",
       "      <td>Action|Crime|Drama|Thriller</td>\n",
       "      <td>218.0</td>\n",
       "      <td>1999.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Toy Story (1995)</td>\n",
       "      <td>Adventure|Animation|Children|Comedy|Fantasy</td>\n",
       "      <td>215.0</td>\n",
       "      <td>1995.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1196</th>\n",
       "      <td>Star Wars: Episode V - The Empire Strikes Back...</td>\n",
       "      <td>Action|Adventure|Sci-Fi</td>\n",
       "      <td>211.0</td>\n",
       "      <td>1980.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2858</th>\n",
       "      <td>American Beauty (1999)</td>\n",
       "      <td>Drama|Romance</td>\n",
       "      <td>204.0</td>\n",
       "      <td>1999.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50</th>\n",
       "      <td>Usual Suspects, The (1995)</td>\n",
       "      <td>Crime|Mystery|Thriller</td>\n",
       "      <td>204.0</td>\n",
       "      <td>1995.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>Seven (a.k.a. Se7en) (1995)</td>\n",
       "      <td>Mystery|Thriller</td>\n",
       "      <td>203.0</td>\n",
       "      <td>1995.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>780</th>\n",
       "      <td>Independence Day (a.k.a. ID4) (1996)</td>\n",
       "      <td>Action|Adventure|Sci-Fi|Thriller</td>\n",
       "      <td>202.0</td>\n",
       "      <td>1996.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>150</th>\n",
       "      <td>Apollo 13 (1995)</td>\n",
       "      <td>Adventure|Drama|IMAX</td>\n",
       "      <td>201.0</td>\n",
       "      <td>1995.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1198</th>\n",
       "      <td>Raiders of the Lost Ark (Indiana Jones and the...</td>\n",
       "      <td>Action|Adventure</td>\n",
       "      <td>200.0</td>\n",
       "      <td>1981.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4993</th>\n",
       "      <td>Lord of the Rings: The Fellowship of the Ring,...</td>\n",
       "      <td>Adventure|Fantasy</td>\n",
       "      <td>198.0</td>\n",
       "      <td>2001.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                     title  \\\n",
       "movieId                                                      \n",
       "356                                    Forrest Gump (1994)   \n",
       "318                       Shawshank Redemption, The (1994)   \n",
       "296                                    Pulp Fiction (1994)   \n",
       "593                       Silence of the Lambs, The (1991)   \n",
       "2571                                    Matrix, The (1999)   \n",
       "260              Star Wars: Episode IV - A New Hope (1977)   \n",
       "480                                   Jurassic Park (1993)   \n",
       "110                                      Braveheart (1995)   \n",
       "589                      Terminator 2: Judgment Day (1991)   \n",
       "527                                Schindler's List (1993)   \n",
       "2959                                     Fight Club (1999)   \n",
       "1                                         Toy Story (1995)   \n",
       "1196     Star Wars: Episode V - The Empire Strikes Back...   \n",
       "2858                                American Beauty (1999)   \n",
       "50                              Usual Suspects, The (1995)   \n",
       "47                             Seven (a.k.a. Se7en) (1995)   \n",
       "780                   Independence Day (a.k.a. ID4) (1996)   \n",
       "150                                       Apollo 13 (1995)   \n",
       "1198     Raiders of the Lost Ark (Indiana Jones and the...   \n",
       "4993     Lord of the Rings: The Fellowship of the Ring,...   \n",
       "\n",
       "                                              genres  RatingCount    year  \n",
       "movieId                                                                    \n",
       "356                         Comedy|Drama|Romance|War        329.0  1994.0  \n",
       "318                                      Crime|Drama        317.0  1994.0  \n",
       "296                      Comedy|Crime|Drama|Thriller        307.0  1994.0  \n",
       "593                            Crime|Horror|Thriller        279.0  1991.0  \n",
       "2571                          Action|Sci-Fi|Thriller        278.0  1999.0  \n",
       "260                          Action|Adventure|Sci-Fi        251.0  1977.0  \n",
       "480                 Action|Adventure|Sci-Fi|Thriller        238.0  1993.0  \n",
       "110                                 Action|Drama|War        237.0  1995.0  \n",
       "589                                    Action|Sci-Fi        224.0  1991.0  \n",
       "527                                        Drama|War        220.0  1993.0  \n",
       "2959                     Action|Crime|Drama|Thriller        218.0  1999.0  \n",
       "1        Adventure|Animation|Children|Comedy|Fantasy        215.0  1995.0  \n",
       "1196                         Action|Adventure|Sci-Fi        211.0  1980.0  \n",
       "2858                                   Drama|Romance        204.0  1999.0  \n",
       "50                            Crime|Mystery|Thriller        204.0  1995.0  \n",
       "47                                  Mystery|Thriller        203.0  1995.0  \n",
       "780                 Action|Adventure|Sci-Fi|Thriller        202.0  1996.0  \n",
       "150                             Adventure|Drama|IMAX        201.0  1995.0  \n",
       "1198                                Action|Adventure        200.0  1981.0  \n",
       "4993                               Adventure|Fantasy        198.0  2001.0  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rec_items = recs[\"item\"].values.tolist()\n",
    "moviesDF.loc[rec_items]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task 3: Explore parameter space\n",
    "- check what is the impact on your recommendations if you, e.g., change the neighbors volume, aggregation, or feedback type\n",
    "- try at least 5-10 configurations\n",
    "- note what configurations gave you good/bad results, **mark the best one**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task 4: Another algorithm\n",
    "- select another algorithm from the matrix factorization family (e.g., FunkSVD, lenskit_implicit.BPR, or ALS.BiasedMF) \n",
    "- construct the initial loop and experiment a bit with hyperparameters (e.g., learning rate, regularization, #features, #iterations for FunkSVD)\n",
    "- note what configurations gave you good/bad results, **mark the best one**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task 5: Hyperparameter Tuning\n",
    "### Try to get the best hyperparameter values automatically, using off-line evaluation\n",
    "Following steps are nicely described in the LensKit Getting started tutorial (https://lkpy.readthedocs.io/en/stable/GettingStarted.html#Running-the-Evaluation)\n",
    "- Split data to train and validation sets\n",
    "- Utilize GRID search, define a few reasonable values for the most meaningful parameters\n",
    "- For each configuration:\n",
    "  - Train the recommending algorithm (using train set)\n",
    "  - Let the algorithm recommend for all users\n",
    "  - Evaluate the recommendations (select a target metric of your choice, e.g., nDCG)\n",
    "- select the best configuration (i.e., the one with the highest nDCG)\n",
    "\n",
    "- **Use this new configuration to recommend yourself** - how good/bad were the recommendations?\n",
    "\n",
    "- Alternatively, check LensKitAuto which can do some of the heavy-lifting for you (https://github.com/ISG-Siegen/lenskit-auto/tree/main)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(80674, 5) (20164, 5)\n"
     ]
    }
   ],
   "source": [
    "#TODO: define hyperparameter configurations to be evaluated (simple grid search is just fine)\n",
    "for train,test in  xf.partition_users(df, 1, xf.SampleFrac(0.2)): #define random sampled train and test sets\n",
    "    print(train.shape, test.shape)\n",
    "    #TODO: foreach hyperparameter setting fit the algorithm, test recommendations and store results\n",
    "    #TODO: identify the best variant of hyperparameters\n",
    "    #TODO: fit the best variant on all data and recommend to you"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}