While last time we focused on out-of-the-box functionality of the recommendation frameworks, this time we will explore how convenient it is to actually modify some portions of the framework.
How difficult is it to build a custom evaluation metric and integrate it into the framework? For this task, use some dataset with genre information available (e.g., MovieLens, LibraryThing, GoodBooks, etc.). Apart from standard evaluation metrics, you would like to check a custom-made one as well. How difficult is this in your selected framework?
Consider "genre-wise serendipity" as your target. Serendipity aims to measure how many recommendations were both relevant and surprising. For genre-wise serendipity, we gonna define the "surprisingness" through genres. In particular, the item is surprising, if and only if its genres are not present in the genres of the existing user profile (i.e., there is no rated item with the same genre in the users train set). Relevance can be considered as an existence of the recommended item in the test set (you can also apply a filter on the numerical rating if you want to). Furthermore, you want this metric to be defined in a recall fashion - i.e., how many of the potentially serendipitous recommendations (from the user profile) the algorithm actually recommended?
Implement the metric, incorporate it into the framework, and use it in some evaluation scenario (e.g., compare two algorithms, or several hyperparameter settings of one, w.r.t. genre-wise serendipity).
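To make the definition concrete, here is a minimal, framework-agnostic sketch of the per-user computation. All names (`genre_serendipity_recall`, `item_genres`, etc.) are hypothetical; wiring it into your framework's metric interface is the actual task.

```python
def genre_serendipity_recall(recommended, train_items, test_items, item_genres):
    """Genre-wise serendipity, recall-style, for a single user.

    recommended: list of recommended item ids
    train_items / test_items: sets of item ids from the user's train / test split
    item_genres: dict mapping item id -> set of genres
    """
    # Genres already present in the user's train profile.
    profile_genres = set()
    for i in train_items:
        profile_genres |= item_genres.get(i, set())

    def is_surprising(item):
        # Surprising iff none of the item's genres appear in the profile.
        return item_genres.get(item, set()).isdisjoint(profile_genres)

    # Potentially serendipitous items: relevant (in the test set) AND surprising.
    potential = {i for i in test_items if is_surprising(i)}
    if not potential:
        return None  # undefined for this user; skip or count as zero, your choice

    hits = potential & set(recommended)
    return len(hits) / len(potential)
```

In an evaluation run you would average this value over users (skipping users for which it is undefined, or treating them as zero, depending on how your framework handles per-user metrics).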
Probably the most common thing you may need from an RS framework is the ability to test your own algorithm in it. How difficult would that be in your framework? Another notorious issue is the usage of additional data beyond user feedback within the frameworks. Therefore, we will focus on content-based RS. Consider a simple Item KNN working on top of content-based similarity; an example is https://github.com/yjeong5126/movie_recommender/blob/master/content_based_filtering/content_based_recommender.ipynb.
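For orientation, a rough standalone sketch of such a content-based Item KNN is below (the linked notebook follows the same idea): TF-IDF vectors over item metadata (here genres) and cosine similarity between items. The file name and column names are assumptions based on the MovieLens format, and the aggregation over the profile is simplified to a plain mean.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Assumed input: MovieLens-style metadata with columns movieId, genres ("Action|Sci-Fi").
items = pd.read_csv("movies.csv")
tfidf = TfidfVectorizer(token_pattern=r"[^|]+")   # treat each genre as one token
item_vectors = tfidf.fit_transform(items["genres"])
sim = cosine_similarity(item_vectors)             # dense item-item similarity matrix

def recommend(profile_item_ids, n=10):
    """Score items by their average similarity to the user's rated items."""
    idx = items.index[items["movieId"].isin(profile_item_ids)].to_numpy()
    scores = sim[:, idx].mean(axis=1)              # aggregate similarity over the profile
    scores[idx] = -np.inf                          # do not recommend already-seen items
    top = np.argsort(-scores)[:n]
    return items.iloc[top]["movieId"].tolist()
```

The interesting part of the task is not this snippet itself but how much glue code your chosen framework requires to expose it as a regular algorithm that can be trained, evaluated, and compared against the built-in baselines.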