Streaming and video platforms like YouTube, Netflix, and Disney+ use recommendation systems to suggest relevant movies to their users based on those users' historical interactions. In this case study, I build several recommendation systems from user ratings of films and tune their hyperparameters. The question is: given a viewer's prior ratings on a set of films, or those of similar viewers, what movies can we recommend? The data used for this project is a subset of the dataset found here. The recommendation systems built for this project can be applied to other types of items as well.
I will build the following recommendation systems for the movie ratings dataset:
- user-based collaborative filtering recommendation system (with K Nearest Neighbors)
- item-based collaborative filtering recommendation system (with K Nearest Neighbors)
I will also perform some exploratory data analysis on the original data, create a deployable function that outputs our recommendations of n movies for any user (based on the created algorithms), determine precision and recall @ K, and visually represent the predicted values from our optimized models.
The ratings dataset contains the following attributes: userId, movieId, rating, and timestamp.
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from surprise import accuracy
# The Reader class is used to parse a file containing ratings. It should be in the structure: user ; item ; rating
from surprise.reader import Reader
# The dataset class for loading datasets
from surprise.dataset import Dataset
# for model tuning, computes accuracy metrics for an algorithm on various combinations of parameters
# Helps find the best parameters for a prediction algorithm.
from surprise.model_selection import GridSearchCV
from surprise.model_selection import RandomizedSearchCV
# for splitting the rating data into train/test datasets
from surprise.model_selection import train_test_split
# for implementing the similarity-based recommendation system
from surprise.prediction_algorithms.knns import KNNBasic
# for implementing matrix factorization-based recommendation system
from surprise.prediction_algorithms.matrix_factorization import SVD
# Defaultdict is a sub-class of the dictionary class. Makes a dictionary-like object except
# unlike dictionaries, provides a default value for keys that do not exist as opposed to raising a KeyError.
from collections import defaultdict
# for cross validation
from surprise.model_selection import KFold
rating = pd.read_csv('/Users/faisal/Desktop/Portfolio Projects/Recommendation System Project - Movies/ratings.csv')
Let's check the info of the data
rating.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100004 entries, 0 to 100003
Data columns (total 4 columns):
 #   Column     Non-Null Count   Dtype
---  ------     --------------   -----
 0   userId     100004 non-null  int64
 1   movieId    100004 non-null  int64
 2   rating     100004 non-null  float64
 3   timestamp  100004 non-null  int64
dtypes: float64(1), int64(3)
memory usage: 3.1 MB
#Dropping timestamp column
rating = rating.drop(['timestamp'], axis=1)
rating.head(30)
index | userId | movieId | rating |
---|---|---|---|
0 | 1 | 31 | 2.5 |
1 | 1 | 1029 | 3.0 |
2 | 1 | 1061 | 3.0 |
3 | 1 | 1129 | 2.0 |
4 | 1 | 1172 | 4.0 |
5 | 1 | 1263 | 2.0 |
6 | 1 | 1287 | 2.0 |
7 | 1 | 1293 | 2.0 |
8 | 1 | 1339 | 3.5 |
9 | 1 | 1343 | 2.0 |
10 | 1 | 1371 | 2.5 |
11 | 1 | 1405 | 1.0 |
12 | 1 | 1953 | 4.0 |
13 | 1 | 2105 | 4.0 |
14 | 1 | 2150 | 3.0 |
15 | 1 | 2193 | 2.0 |
16 | 1 | 2294 | 2.0 |
17 | 1 | 2455 | 2.5 |
18 | 1 | 2968 | 1.0 |
19 | 1 | 3671 | 3.0 |
20 | 2 | 10 | 4.0 |
21 | 2 | 17 | 5.0 |
22 | 2 | 39 | 5.0 |
23 | 2 | 47 | 4.0 |
24 | 2 | 50 | 4.0 |
25 | 2 | 52 | 3.0 |
26 | 2 | 62 | 3.0 |
27 | 2 | 110 | 4.0 |
28 | 2 | 144 | 3.0 |
29 | 2 | 150 | 5.0 |
plt.figure(figsize = (12, 5))
sns.countplot(x="rating", data=rating)
plt.tick_params(labelsize = 10)
plt.title("Distribution of Movie Ratings", fontsize = 20)
plt.xlabel("Rating", fontsize = 10)
plt.ylabel("Number of Ratings", fontsize = 10)
plt.show()
There are 100004 ratings in the dataset. The count plot shows how they are distributed: ratings of 4 and 3 are the most common, with frequencies of nearly 30k and 20k respectively, followed by a rating of 5 with a frequency of about 15k. The ratings are skewed towards 3, 4, and 5 relative to the other values.
Let us see what a user-item interactions matrix looks like below. It has a cell for each user-movie pair, filled with that user's rating if available. What is immediately noticeable is how few cells hold actual ratings. That is because there are many items, and most users rate only a small share of them, let alone all of them. While it is inefficient to work with an extremely large pandas dataframe composed mostly of empty values, we will use it here to calculate how sparse the matrix is:
user_item_interactions_matrix = rating.pivot(index='userId', columns='movieId', values='rating')
user_item_interactions_matrix
userId (rows) / movieId (columns) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ... | 161084 | 161155 | 161594 | 161830 | 161918 | 161944 | 162376 | 162542 | 162672 | 163949 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 | NaN | NaN | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
667 | NaN | NaN | NaN | NaN | NaN | 4.0 | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
668 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
669 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
670 | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
671 | 5.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
671 rows × 9066 columns
# Total number of empty (null) cells in the matrix
null_in_matrix = user_item_interactions_matrix.isnull().sum().sum()
# Total number of filled (non-null) cells in the matrix
nonnull_in_matrix = user_item_interactions_matrix.count().sum()
# Share of cells that hold a rating (the matrix density); sparsity is 1 minus this value
matrix_density = nonnull_in_matrix/(null_in_matrix+nonnull_in_matrix)
matrix_density
0.016439141608663475
Only 1.64% of the matrix is filled with values, so about 98.36% of it is empty. This illustrates how sparse the matrices behind recommendation systems generally are.
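The same fraction can be cross-checked without building the pivot table, because each user-movie pair fills exactly one cell. A minimal sketch, using the counts reported in the outputs above (100004 ratings, 671 users, 9066 movies); the variable names are illustrative and not part of the notebook:

```python
# Counts taken from the outputs above (rating.info() and the pivot table shape).
n_ratings = 100004
n_users = 671
n_movies = 9066

# Density: share of user-movie cells that contain a rating.
density = n_ratings / (n_users * n_movies)
# Sparsity: share of cells that are empty.
sparsity = 1 - density

print(f"density  = {density:.4%}")
print(f"sparsity = {sparsity:.4%}")
```

This avoids materialising a 671 x 9066 dataframe just to count empty cells, which matters more as the number of users and movies grows.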
# The number of unique users:
rating['userId'].nunique()
671
# The number of unique movies:
rating['movieId'].nunique()
9066
# We run a group by on userID and movieID:
rating.groupby(['userId', 'movieId']).count()
userId | movieId | rating |
---|---|---|
1 | 31 | 1 |
1 | 1029 | 1 |
1 | 1061 | 1 |
1 | 1129 | 1 |
1 | 1172 | 1 |
... | ... | ... |
671 | 6268 | 1 |
671 | 6269 | 1 |
671 | 6365 | 1 |
671 | 6385 | 1 |
671 | 6565 | 1 |
100004 rows × 1 columns
rating.groupby(['userId', 'movieId']).count()['rating'].sum()
100004
The grouped table has 100004 rows and its counts sum to 100004, the same as the total number of observations noted before. Every user-movie pair therefore occurs exactly once: no user rated the same movie twice.
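The check that matters here is that the number of distinct (userId, movieId) pairs equals the number of rows. A small sketch of that idea on toy data (the pairs below are made up, with one deliberate repeat):

```python
from collections import Counter

# Toy (userId, movieId) pairs; the last one repeats the first on purpose.
pairs = [(1, 31), (1, 1029), (2, 10), (1, 31)]

# Count how often each pair occurs and keep only the repeated ones.
pair_counts = Counter(pairs)
duplicates = {pair: n for pair, n in pair_counts.items() if n > 1}

# If these two numbers differ, at least one pair occurs more than once.
print(len(pairs), len(pair_counts))
print(duplicates)
```

On the real dataframe, the same question can be asked with pandas' `DataFrame.duplicated` on the `userId` and `movieId` columns.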
rating['movieId'].value_counts()
356       341
296       324
318       311
593       304
260       291
         ... 
98604       1
103659      1
104419      1
115927      1
6425        1
Name: movieId, Length: 9066, dtype: int64
The movie with ID 356 has the most interactions in the dataset, with 341 ratings. We also see that several movies have only 1 rating (seen at the bottom of the list above).
# Plotting the distribution of the 341 ratings given to movieId 356
plt.figure(figsize=(7,7))
rating[rating['movieId'] == 356]['rating'].value_counts().plot(kind='bar')
plt.title("Distribution of Movie Ratings for Movie 356", fontsize = 20)
plt.xlabel('Rating')
plt.ylabel('Count')
plt.show()
For movieId 356 (the most rated movie in the set), the most frequent ratings are 4 and then 5, each with a count of over 100. They are followed by a rating of 3 with a frequency of less than 60, with the remaining ratings tapering off. This suggests that the movie is liked by the majority of the users who rated it.
rating['userId'].value_counts()
547    2391
564    1868
624    1735
15     1700
73     1610
       ... 
296      20
289      20
249      20
221      20
1        20
Name: userId, Length: 671, dtype: int64
We see that the user with userId 547 interacted with the most movies, giving 2391 ratings.
What is the average number of ratings given per user?
rating['userId'].value_counts().mean()
149.03725782414307
On average, a user gave approximately 149 ratings.
# Finding user-movie interactions distribution
count_interactions = rating.groupby('userId').count()['movieId']
count_interactions
userId
1       20
2       76
3       51
4      204
5      100
      ... 
667     68
668     20
669     37
670     31
671    115
Name: movieId, Length: 671, dtype: int64
# Plotting user-movie interactions distribution
plt.figure(figsize=(15,7))
sns.histplot(count_interactions)
plt.title('Distribution of User-Movie Interactions',size=20)
plt.xlabel('Number of Interactions by Users')
plt.show()
As expected, the distribution is right-skewed: the bulk of users had relatively few interactions, and only a few users had interactions numbering in the hundreds or over a thousand.
A rank-based recommendation system recommends items to users based on the popularity of each item. This type of system is useful for dealing with the cold start problem. This is the problem that arises when a new user joins a system: we cannot recommend a movie based on their historical interactions with our dataset of movies (since they have none), so recommendations are based on general ranking instead. But even outside of a cold start situation, users may be interested in how movies are generally ranked by others.
We start by taking the average of all the ratings provided to each movie and then rank them based on their average rating.
# Calculating average ratings
average_rating = rating.groupby('movieId').mean()['rating']
# Calculating the count of ratings
count_rating = rating.groupby('movieId').count()['rating']
# Joining count and average of ratings into a dataframe
counts_ratings = pd.DataFrame({'avg_rating':average_rating, 'rating_count':count_rating})
counts_ratings.head()
movieId | avg_rating | rating_count |
---|---|---|
1 | 3.872470 | 247 |
2 | 3.401869 | 107 |
3 | 3.161017 | 59 |
4 | 2.384615 | 13 |
5 | 3.267857 | 56 |
We now make a function that recommends the top n movies based on their average ratings. The function also takes a threshold on the number of ratings a movie needs before it can be recommended (to exclude cases where a movie was rated 5 stars by only a handful of people).
def top_n_movies(data, n, min_interaction=100):
    # Keeping only movies with more than min_interaction ratings (strict inequality)
    recommendations = data[data['rating_count'] > min_interaction]
    # Sorting values based on their average rating
    recommendations = recommendations.sort_values(by='avg_rating', ascending=False)
    return recommendations.index[:n]
The function can take in arguments to change n and the minimum number of interactions.
list(top_n_movies(counts_ratings, 10, 50))
[858, 318, 913, 1221, 50, 1252, 904, 1203, 527, 6016]
list(top_n_movies(counts_ratings, 10, 100))
[858, 318, 1221, 50, 527, 1193, 608, 296, 2858, 58559]
list(top_n_movies(counts_ratings, 10, 250))
[318, 296, 260, 2571, 593, 356, 480]
Note that in the last example, only 7 movies have more than 250 ratings, so fewer than 10 are returned.
User-Based Collaborative Filtering is used by many websites for their recommendations. The model predicts the items a user might like based on the ratings given to those items by users whose tastes are similar to the target user's. This requires some rating history from the user, which is used to group them with similar users and then derive recommendations from what those similar users rated.
We can build this kind of system using only user-item interaction data, which may come in the form of ratings (as in this example), likes (e.g. likes on Facebook/YouTube/Instagram/Twitter, or swipes on a dating app), purchase/use (buying a product, or perhaps data on it being used), and reading (a book being read by someone), among other possible interactions.
We will build a Similarity/Neighborhood based system using K-nearest neighbors (KNN) to find similar users based on the cosine similarity metric. The `surprise` library will help us build additional models.
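To make the similarity idea concrete, here is a minimal sketch of cosine similarity between two users, computed only over the movies both have rated. The function name and the toy users are hypothetical, and this is an illustration of the metric rather than the library's own implementation, which may differ in details:

```python
import math

def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity between two users over the movies both have rated.

    ratings_u, ratings_v: dicts mapping movieId -> rating.
    Returns 0.0 when the users share no movies.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[m] * ratings_v[m] for m in common)
    norm_u = math.sqrt(sum(ratings_u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(ratings_v[m] ** 2 for m in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy users (made-up ratings)
alice = {1: 4.0, 2: 5.0, 3: 1.0}
bob = {1: 4.0, 2: 5.0, 4: 3.0}
carol = {1: 1.0, 3: 5.0}

print(cosine_similarity(alice, bob))    # identical ratings on the shared movies
print(cosine_similarity(alice, carol))  # opposite preferences on the shared movies
```

Because ratings are all positive, cosine similarity on raw ratings stays between 0 and 1 and tends to be fairly high even for users who disagree.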
We first have to load the `rating` dataset (a pandas dataframe) into a different format used by the `surprise` library, called `surprise.dataset.DatasetAutoFolds`. We use the `surprise` classes `Reader` and `Dataset` to accomplish this.
# Instantiate Reader and set the rating scale
reader = Reader(rating_scale=(0, 5))
# Load the rating dataset into the format needed
data = Dataset.load_from_df(rating[['userId', 'movieId', 'rating']], reader)
# Split the data into train and test dataset
trainset, testset = train_test_split(data, test_size=0.2, random_state=30)
# We set up our similarity parameter options for the user-based KNN algorithm in surprise. Note that 'user_based': True means similarities are computed between users rather than items
sim_options = {'name': 'cosine',
'user_based': True}
# We define our K Nearest Neighbour algorithm
knn_user = KNNBasic(sim_options=sim_options,verbose=False)
# We train the algorithm on the trainset, i.e. fitting the model on the train dataset
knn_user.fit(trainset)
<surprise.prediction_algorithms.knns.KNNBasic at 0x7f8f7cd91040>
# We now predict the ratings for our testset based on our trained model:
predictions = knn_user.test(testset)
# We now compute the RMSE (root-mean-square error) to measure the difference between
# our predicted values and the actual (a lower RMSE is better):
accuracy.rmse(predictions)
RMSE: 0.9901
0.9901142861029354
The baseline model gives us an RMSE = 0.9901 on the test set.
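For reference, RMSE is the square root of the mean squared difference between actual and predicted ratings. A small self-contained sketch on made-up pairs (the helper name `rmse` and the toy values are illustrative only):

```python
import math

def rmse(pairs):
    """Root-mean-square error over (actual, predicted) rating pairs."""
    squared_errors = [(actual - predicted) ** 2 for actual, predicted in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy (actual, predicted) pairs
toy = [(4.0, 4.26), (3.0, 3.5), (5.0, 4.0)]
print(rmse(toy))
```

Because errors are squared before averaging, a single prediction that is off by a full star weighs much more than several that are off by a quarter of a star.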
# First let's find a test case. We turn the testset into a pandas dataframe for easy searching:
testset_df = pd.DataFrame(testset, columns=['userId', 'movieId', 'rating'])
# We select a random user from this testset dataframe
testset_df[testset_df['userId']==10]
index | userId | movieId | rating |
---|---|---|---|
1339 | 10 | 1036 | 3.0 |
2528 | 10 | 735 | 4.0 |
2574 | 10 | 2840 | 3.0 |
6509 | 10 | 2890 | 4.0 |
8793 | 10 | 1196 | 4.0 |
11486 | 10 | 50 | 5.0 |
11620 | 10 | 1240 | 4.0 |
17373 | 10 | 2410 | 2.0 |
19778 | 10 | 152 | 4.0 |
# Let's use our model to predict what userId 10's rating was for movieId 1240:
knn_user.predict(10, 1240, r_ui=4, verbose=True)
# note that r_ui is our identification of the real rating value, meant to help us compare the prediction
# with the value in the output below:
user: 10 item: 1240 r_ui = 4.00 est = 4.26 {'actual_k': 40, 'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=4.262125602458993, details={'actual_k': 40, 'was_impossible': False})
The actual rating for this user-item pair is 4, and the rating predicted by this similarity-based baseline model is 4.26.
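A basic KNN recommender forms such an estimate as a similarity-weighted average of the ratings that the most similar users gave to the movie. The sketch below illustrates that idea under stated assumptions: the function name, the toy neighbors, and the fallback value `global_mean=3.5` are placeholders, not values from this notebook.

```python
def knn_predict(neighbors, k=40, min_k=1, global_mean=3.5):
    """Similarity-weighted average of ratings from the k most similar neighbors.

    neighbors: list of (similarity, rating) pairs for users who rated the movie.
    Returns (estimate, actual_k). Falls back to global_mean when fewer than
    min_k neighbors with positive similarity are available.
    """
    # Keep the k most similar neighbors, then drop non-positive similarities.
    top_k = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    usable = [(sim, r) for sim, r in top_k if sim > 0]
    if len(usable) < min_k:
        return global_mean, len(usable)
    weighted_sum = sum(sim * r for sim, r in usable)
    total_sim = sum(sim for sim, _ in usable)
    return weighted_sum / total_sim, len(usable)

# Toy neighbors as (similarity, rating) pairs
toy_neighbors = [(0.9, 4.0), (0.5, 5.0), (0.1, 2.0)]
print(knn_predict(toy_neighbors, k=2))
print(knn_predict(toy_neighbors, k=2, min_k=3))
```

The second value returned plays the same role as `actual_k` in the prediction details above: it reports how many neighbors actually contributed to the estimate.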
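Here are the different hyperparameters of the KNNBasic algorithm:
- k (int): the maximum number of neighbors to take into account for aggregation. Default is 40.
- min_k (int): the minimum number of neighbors to take into account for aggregation. If there are not enough neighbors, the prediction is set to the global mean of all ratings. Default is 1.
- sim_options (dict): a dictionary of options for the similarity measure, such as the similarity name (msd, cosine, pearson, pearson_baseline) and whether similarities are computed between users or items (user_based).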
Taken from the official documentation: https://surprise.readthedocs.io/en/stable/knn_inspired.html
# setting up a parameter grid to tune the hyperparameters with different set ups
param_grid = {'k': [10, 20, 30, 40], 'min_k': [1, 2, 3, 6],
'sim_options': {'name': ['msd', 'cosine', 'pearson'],
'user_based': [True]}
}
# performing cross validation to tune the hyperparameters with the above set ups
# note that cv (=4) is referring to the cross-validation generator, and determines the splitting strategy. Default = 5
# n_jobs refers to the number of jobs to run in parallel. -1 means to use all processors.
grid_obj = GridSearchCV(KNNBasic, param_grid, measures=['rmse', 'mae'], cv=4, n_jobs=-1)
# fitting the models on our data
grid_obj.fit(data)
# best RMSE score
print(grid_obj.best_score['rmse'])
# combination of parameters that gave the best RMSE score
print(grid_obj.best_params['rmse'])
0.958735094378809
{'k': 20, 'min_k': 3, 'sim_options': {'name': 'msd', 'user_based': True}}
Once the grid search is complete, we can read off the optimal values for each hyperparameter, as shown above. They give a mean cross-validated RMSE of 0.9587 (the baseline model scored 0.9901 on its test split). This is obtained with k = 20, min_k = 3, and msd (mean squared difference) as the similarity measure.
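For intuition about the winning similarity measure, here is a minimal sketch of an MSD-based similarity: the mean squared difference between two users' ratings on their shared movies, turned into a similarity as 1 / (msd + 1). The function name and toy users are hypothetical, and this illustrates the idea rather than the library's exact code.

```python
def msd_similarity(ratings_u, ratings_v):
    """Similarity derived from the mean squared difference on co-rated movies.

    ratings_u, ratings_v: dicts mapping movieId -> rating.
    Returns a value in (0, 1]; 1.0 means identical ratings on the shared movies.
    Returns 0.0 when the users share no movies.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    msd = sum((ratings_u[m] - ratings_v[m]) ** 2 for m in common) / len(common)
    return 1.0 / (msd + 1.0)

# Toy users (made-up ratings)
user_a = {1: 4.0, 2: 5.0}
user_b = {1: 4.0, 2: 5.0}
user_c = {1: 1.0, 2: 5.0}

print(msd_similarity(user_a, user_b))
print(msd_similarity(user_a, user_c))
```

One possible reason MSD does better here is that it penalises differences in the actual rating values, whereas cosine on raw positive ratings can stay high even when two users disagree. That is a plausible explanation, not something this notebook tests.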
Let's compare our RMSE and MAE at every split (we had cv=4) to analyze the impact of each value of the hyperparameters we set:
results_df = pd.DataFrame.from_dict(grid_obj.cv_results)
results_df
# Note that there are 48 rows, because we had 4 K values, 4 min_k values, and 3 similarity options (4x4x3=48).
# There are also 4 splits
# Note that the optimal model as determined above (K:20, min_k:3, similarity calculating with MSD) is row #18
# We see from looking at that row that it's not only ranked 1 for RMSE, but also in terms of MAE.
index | split0_test_rmse | split1_test_rmse | split2_test_rmse | split3_test_rmse | mean_test_rmse | std_test_rmse | rank_test_rmse | split0_test_mae | split1_test_mae | split2_test_mae | ... | std_test_mae | rank_test_mae | mean_fit_time | std_fit_time | mean_test_time | std_test_time | params | param_k | param_min_k | param_sim_options |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.968500 | 0.967511 | 0.967462 | 0.973975 | 0.969362 | 0.002696 | 14 | 0.740562 | 0.741354 | 0.741308 | ... | 0.001559 | 10 | 0.124339 | 0.023717 | 0.984527 | 0.010010 | {'k': 10, 'min_k': 1, 'sim_options': {'name': ... | 10 | 1 | {'name': 'msd', 'user_based': True} |
1 | 1.013142 | 1.010957 | 1.009534 | 1.015179 | 1.012203 | 0.002145 | 47 | 0.780481 | 0.780469 | 0.777229 | ... | 0.001818 | 44 | 0.391916 | 0.039213 | 1.004618 | 0.017984 | {'k': 10, 'min_k': 1, 'sim_options': {'name': ... | 10 | 1 | {'name': 'cosine', 'user_based': True} |
2 | 1.014842 | 1.015055 | 1.014217 | 1.021676 | 1.016448 | 0.003034 | 48 | 0.786215 | 0.788655 | 0.786656 | ... | 0.002043 | 48 | 0.600040 | 0.039681 | 0.998136 | 0.027596 | {'k': 10, 'min_k': 1, 'sim_options': {'name': ... | 10 | 1 | {'name': 'pearson', 'user_based': True} |
3 | 0.959367 | 0.958528 | 0.960230 | 0.966986 | 0.961278 | 0.003350 | 5 | 0.735701 | 0.736427 | 0.737423 | ... | 0.001835 | 4 | 0.114061 | 0.024479 | 1.001976 | 0.017335 | {'k': 10, 'min_k': 2, 'sim_options': {'name': ... | 10 | 2 | {'name': 'msd', 'user_based': True} |
4 | 1.004415 | 1.002364 | 1.002606 | 1.008475 | 1.004465 | 0.002447 | 42 | 0.775621 | 0.775542 | 0.773344 | ... | 0.001711 | 39 | 0.356967 | 0.038834 | 0.998386 | 0.010494 | {'k': 10, 'min_k': 2, 'sim_options': {'name': ... | 10 | 2 | {'name': 'cosine', 'user_based': True} |
5 | 1.006313 | 1.004238 | 1.004409 | 1.012986 | 1.006987 | 0.003558 | 46 | 0.781966 | 0.782680 | 0.781353 | ... | 0.001927 | 47 | 0.561399 | 0.044681 | 0.995363 | 0.028158 | {'k': 10, 'min_k': 2, 'sim_options': {'name': ... | 10 | 2 | {'name': 'pearson', 'user_based': True} |
6 | 0.960375 | 0.957266 | 0.959853 | 0.965845 | 0.960835 | 0.003123 | 3 | 0.735853 | 0.735461 | 0.736993 | ... | 0.001403 | 2 | 0.108140 | 0.034608 | 1.003535 | 0.022668 | {'k': 10, 'min_k': 3, 'sim_options': {'name': ... | 10 | 3 | {'name': 'msd', 'user_based': True} |
7 | 1.004869 | 1.000625 | 1.001621 | 1.006870 | 1.003496 | 0.002501 | 39 | 0.775255 | 0.774106 | 0.772518 | ... | 0.001394 | 38 | 0.334498 | 0.045401 | 0.999577 | 0.015634 | {'k': 10, 'min_k': 3, 'sim_options': {'name': ... | 10 | 3 | {'name': 'cosine', 'user_based': True} |
8 | 1.004187 | 1.000512 | 1.002002 | 1.009106 | 1.003952 | 0.003250 | 41 | 0.780463 | 0.779976 | 0.780059 | ... | 0.001365 | 45 | 0.541609 | 0.068116 | 1.007394 | 0.028322 | {'k': 10, 'min_k': 3, 'sim_options': {'name': ... | 10 | 3 | {'name': 'pearson', 'user_based': True} |
9 | 0.964688 | 0.961550 | 0.962498 | 0.969919 | 0.964664 | 0.003241 | 10 | 0.739222 | 0.738192 | 0.739009 | ... | 0.001315 | 7 | 0.106994 | 0.025079 | 1.018442 | 0.013254 | {'k': 10, 'min_k': 6, 'sim_options': {'name': ... | 10 | 6 | {'name': 'msd', 'user_based': True} |
10 | 1.007669 | 1.003289 | 1.002651 | 1.009606 | 1.005804 | 0.002924 | 45 | 0.777555 | 0.775541 | 0.773379 | ... | 0.001809 | 40 | 0.347500 | 0.046539 | 1.034602 | 0.013462 | {'k': 10, 'min_k': 6, 'sim_options': {'name': ... | 10 | 6 | {'name': 'cosine', 'user_based': True} |
11 | 1.006310 | 1.003358 | 1.000850 | 1.010455 | 1.005243 | 0.003576 | 43 | 0.782827 | 0.780964 | 0.780324 | ... | 0.001856 | 46 | 0.530090 | 0.068997 | 1.017254 | 0.032123 | {'k': 10, 'min_k': 6, 'sim_options': {'name': ... | 10 | 6 | {'name': 'pearson', 'user_based': True} |
12 | 0.967281 | 0.965681 | 0.965182 | 0.970979 | 0.967281 | 0.002271 | 12 | 0.740574 | 0.741134 | 0.740341 | ... | 0.001098 | 9 | 0.102820 | 0.023205 | 1.158971 | 0.019833 | {'k': 20, 'min_k': 1, 'sim_options': {'name': ... | 20 | 1 | {'name': 'msd', 'user_based': True} |
13 | 1.000454 | 0.997529 | 0.996555 | 1.002829 | 0.999342 | 0.002472 | 37 | 0.771130 | 0.770975 | 0.768479 | ... | 0.001243 | 31 | 0.349755 | 0.050120 | 1.169554 | 0.020392 | {'k': 20, 'min_k': 1, 'sim_options': {'name': ... | 20 | 1 | {'name': 'cosine', 'user_based': True} |
14 | 1.003872 | 1.005062 | 1.002794 | 1.010243 | 1.005493 | 0.002857 | 44 | 0.776682 | 0.780451 | 0.776486 | ... | 0.002080 | 43 | 0.529473 | 0.068175 | 1.158393 | 0.028317 | {'k': 20, 'min_k': 1, 'sim_options': {'name': ... | 20 | 1 | {'name': 'pearson', 'user_based': True} |
15 | 0.958137 | 0.956681 | 0.957932 | 0.963968 | 0.959180 | 0.002820 | 2 | 0.735714 | 0.736207 | 0.736456 | ... | 0.001290 | 3 | 0.112827 | 0.029644 | 1.181457 | 0.011530 | {'k': 20, 'min_k': 2, 'sim_options': {'name': ... | 20 | 2 | {'name': 'msd', 'user_based': True} |
16 | 0.991615 | 0.988819 | 0.989535 | 0.996042 | 0.991503 | 0.002815 | 26 | 0.766270 | 0.766048 | 0.764594 | ... | 0.001078 | 24 | 0.363823 | 0.051842 | 1.178888 | 0.023163 | {'k': 20, 'min_k': 2, 'sim_options': {'name': ... | 20 | 2 | {'name': 'cosine', 'user_based': True} |
17 | 0.995249 | 0.994136 | 0.992874 | 1.001454 | 0.995928 | 0.003299 | 34 | 0.772432 | 0.774475 | 0.771183 | ... | 0.001826 | 37 | 0.553375 | 0.074193 | 1.143949 | 0.038753 | {'k': 20, 'min_k': 2, 'sim_options': {'name': ... | 20 | 2 | {'name': 'pearson', 'user_based': True} |
18 | 0.959146 | 0.955417 | 0.957555 | 0.962823 | 0.958735 | 0.002705 | 1 | 0.735866 | 0.735241 | 0.736027 | ... | 0.000868 | 1 | 0.106124 | 0.023634 | 1.150807 | 0.023635 | {'k': 20, 'min_k': 3, 'sim_options': {'name': ... | 20 | 3 | {'name': 'msd', 'user_based': True} |
19 | 0.992074 | 0.987056 | 0.988537 | 0.994417 | 0.990521 | 0.002895 | 23 | 0.765903 | 0.764613 | 0.763768 | ... | 0.000868 | 19 | 0.358750 | 0.044207 | 1.164428 | 0.038408 | {'k': 20, 'min_k': 3, 'sim_options': {'name': ... | 20 | 3 | {'name': 'cosine', 'user_based': True} |
20 | 0.993099 | 0.990371 | 0.990439 | 0.997529 | 0.992860 | 0.002912 | 30 | 0.770929 | 0.771771 | 0.769889 | ... | 0.001105 | 34 | 0.542027 | 0.066554 | 1.147722 | 0.040137 | {'k': 20, 'min_k': 3, 'sim_options': {'name': ... | 20 | 3 | {'name': 'pearson', 'user_based': True} |
21 | 0.963465 | 0.959709 | 0.960206 | 0.966910 | 0.962572 | 0.002890 | 7 | 0.739235 | 0.737972 | 0.738042 | ... | 0.000936 | 5 | 0.105963 | 0.019881 | 1.171825 | 0.017947 | {'k': 20, 'min_k': 6, 'sim_options': {'name': ... | 20 | 6 | {'name': 'msd', 'user_based': True} |
22 | 0.994911 | 0.989757 | 0.989581 | 0.997186 | 0.992859 | 0.003290 | 29 | 0.768204 | 0.766047 | 0.764629 | ... | 0.001354 | 25 | 0.339043 | 0.032478 | 1.166456 | 0.028285 | {'k': 20, 'min_k': 6, 'sim_options': {'name': ... | 20 | 6 | {'name': 'cosine', 'user_based': True} |
23 | 0.995246 | 0.993246 | 0.989273 | 0.998894 | 0.994165 | 0.003475 | 33 | 0.773294 | 0.772759 | 0.770153 | ... | 0.001647 | 36 | 0.522085 | 0.051545 | 1.169217 | 0.040152 | {'k': 20, 'min_k': 6, 'sim_options': {'name': ... | 20 | 6 | {'name': 'pearson', 'user_based': True} |
24 | 0.969599 | 0.968206 | 0.967466 | 0.973860 | 0.969783 | 0.002475 | 15 | 0.743918 | 0.744420 | 0.743122 | ... | 0.001094 | 14 | 0.104816 | 0.029667 | 1.270555 | 0.014865 | {'k': 30, 'min_k': 1, 'sim_options': {'name': ... | 30 | 1 | {'name': 'msd', 'user_based': True} |
25 | 0.998203 | 0.995712 | 0.994068 | 1.000349 | 0.997083 | 0.002392 | 36 | 0.770164 | 0.770413 | 0.767002 | ... | 0.001450 | 27 | 0.332329 | 0.022147 | 1.257186 | 0.017914 | {'k': 30, 'min_k': 1, 'sim_options': {'name': ... | 30 | 1 | {'name': 'cosine', 'user_based': True} |
26 | 1.001701 | 1.003070 | 1.001477 | 1.007994 | 1.003560 | 0.002631 | 40 | 0.774768 | 0.778606 | 0.775314 | ... | 0.001928 | 42 | 0.481376 | 0.031619 | 1.252545 | 0.033845 | {'k': 30, 'min_k': 1, 'sim_options': {'name': ... | 30 | 1 | {'name': 'pearson', 'user_based': True} |
27 | 0.960477 | 0.959230 | 0.960234 | 0.966870 | 0.961703 | 0.003020 | 6 | 0.739058 | 0.739493 | 0.739237 | ... | 0.001204 | 8 | 0.104204 | 0.032686 | 1.271140 | 0.015019 | {'k': 30, 'min_k': 2, 'sim_options': {'name': ... | 30 | 2 | {'name': 'msd', 'user_based': True} |
28 | 0.989344 | 0.986986 | 0.987031 | 0.993545 | 0.989227 | 0.002670 | 20 | 0.765304 | 0.765486 | 0.763117 | ... | 0.001193 | 20 | 0.339827 | 0.032094 | 1.283173 | 0.036460 | {'k': 30, 'min_k': 2, 'sim_options': {'name': ... | 30 | 2 | {'name': 'cosine', 'user_based': True} |
29 | 0.993059 | 0.992122 | 0.991544 | 0.999185 | 0.993977 | 0.003055 | 32 | 0.770518 | 0.772631 | 0.770012 | ... | 0.001625 | 35 | 0.479198 | 0.035317 | 1.251465 | 0.036990 | {'k': 30, 'min_k': 2, 'sim_options': {'name': ... | 30 | 2 | {'name': 'pearson', 'user_based': True} |
30 | 0.961484 | 0.957969 | 0.959858 | 0.965728 | 0.961260 | 0.002864 | 4 | 0.739209 | 0.738527 | 0.738808 | ... | 0.000787 | 6 | 0.099319 | 0.025825 | 1.276763 | 0.016132 | {'k': 30, 'min_k': 3, 'sim_options': {'name': ... | 30 | 3 | {'name': 'msd', 'user_based': True} |
31 | 0.989804 | 0.985219 | 0.986030 | 0.991916 | 0.988243 | 0.002737 | 18 | 0.764938 | 0.764050 | 0.762291 | ... | 0.000998 | 17 | 0.311707 | 0.019349 | 1.291635 | 0.018562 | {'k': 30, 'min_k': 3, 'sim_options': {'name': ... | 30 | 3 | {'name': 'cosine', 'user_based': True} |
32 | 0.990905 | 0.988350 | 0.989105 | 0.995251 | 0.990903 | 0.002677 | 25 | 0.769015 | 0.769926 | 0.768718 | ... | 0.000896 | 28 | 0.473089 | 0.037097 | 1.259978 | 0.029378 | {'k': 30, 'min_k': 3, 'sim_options': {'name': ... | 30 | 3 | {'name': 'pearson', 'user_based': True} |
33 | 0.965792 | 0.962250 | 0.962503 | 0.969804 | 0.965087 | 0.003061 | 11 | 0.742578 | 0.741258 | 0.740823 | ... | 0.000969 | 11 | 0.094662 | 0.022660 | 1.285592 | 0.007842 | {'k': 30, 'min_k': 6, 'sim_options': {'name': ... | 30 | 6 | {'name': 'msd', 'user_based': True} |
34 | 0.992647 | 0.987925 | 0.987077 | 0.994693 | 0.990586 | 0.003182 | 24 | 0.767238 | 0.765484 | 0.763152 | ... | 0.001489 | 22 | 0.303433 | 0.015776 | 1.295503 | 0.017618 | {'k': 30, 'min_k': 6, 'sim_options': {'name': ... | 30 | 6 | {'name': 'cosine', 'user_based': True} |
35 | 0.993056 | 0.991231 | 0.987937 | 0.996619 | 0.992211 | 0.003137 | 28 | 0.771380 | 0.770915 | 0.768982 | ... | 0.001373 | 32 | 0.463117 | 0.032395 | 1.272161 | 0.023716 | {'k': 30, 'min_k': 6, 'sim_options': {'name': ... | 30 | 6 | {'name': 'pearson', 'user_based': True} |
36 | 0.971819 | 0.970652 | 0.969774 | 0.976464 | 0.972177 | 0.002579 | 16 | 0.746726 | 0.747165 | 0.745819 | ... | 0.001210 | 16 | 0.088560 | 0.016189 | 1.344052 | 0.002330 | {'k': 40, 'min_k': 1, 'sim_options': {'name': ... | 40 | 1 | {'name': 'msd', 'user_based': True} |
37 | 0.997436 | 0.994927 | 0.993781 | 1.000072 | 0.996554 | 0.002423 | 35 | 0.770404 | 0.770382 | 0.767337 | ... | 0.001468 | 29 | 0.298181 | 0.012399 | 1.355221 | 0.014596 | {'k': 40, 'min_k': 1, 'sim_options': {'name': ... | 40 | 1 | {'name': 'cosine', 'user_based': True} |
38 | 1.001040 | 1.002607 | 1.000866 | 1.007635 | 1.003037 | 0.002740 | 38 | 0.774272 | 0.778207 | 0.774710 | ... | 0.002019 | 41 | 0.462850 | 0.032911 | 1.361154 | 0.021104 | {'k': 40, 'min_k': 1, 'sim_options': {'name': ... | 40 | 1 | {'name': 'pearson', 'user_based': True} |
39 | 0.962718 | 0.961699 | 0.962559 | 0.969493 | 0.964117 | 0.003128 | 9 | 0.741866 | 0.742238 | 0.741934 | ... | 0.001318 | 13 | 0.087653 | 0.008961 | 1.355691 | 0.016998 | {'k': 40, 'min_k': 2, 'sim_options': {'name': ... | 40 | 2 | {'name': 'msd', 'user_based': True} |
40 | 0.988570 | 0.986194 | 0.986742 | 0.993266 | 0.988693 | 0.002783 | 19 | 0.765544 | 0.765455 | 0.763452 | ... | 0.001285 | 21 | 0.292765 | 0.010649 | 1.347102 | 0.018239 | {'k': 40, 'min_k': 2, 'sim_options': {'name': ... | 40 | 2 | {'name': 'cosine', 'user_based': True} |
41 | 0.992392 | 0.991654 | 0.990927 | 0.998822 | 0.993449 | 0.003145 | 31 | 0.770022 | 0.772232 | 0.769408 | ... | 0.001725 | 33 | 0.441962 | 0.022554 | 1.328246 | 0.025217 | {'k': 40, 'min_k': 2, 'sim_options': {'name': ... | 40 | 2 | {'name': 'pearson', 'user_based': True} |
42 | 0.963722 | 0.960441 | 0.962183 | 0.968354 | 0.963675 | 0.002940 | 8 | 0.742017 | 0.741272 | 0.741505 | ... | 0.000907 | 12 | 0.079550 | 0.005020 | 1.356796 | 0.012442 | {'k': 40, 'min_k': 3, 'sim_options': {'name': ... | 40 | 3 | {'name': 'msd', 'user_based': True} |
43 | 0.989031 | 0.984426 | 0.985741 | 0.991637 | 0.987709 | 0.002821 | 17 | 0.765178 | 0.764020 | 0.762626 | ... | 0.001047 | 18 | 0.296294 | 0.009255 | 1.357297 | 0.014308 | {'k': 40, 'min_k': 3, 'sim_options': {'name': ... | 40 | 3 | {'name': 'cosine', 'user_based': True} |
44 | 0.990236 | 0.987880 | 0.988487 | 0.994888 | 0.990373 | 0.002746 | 22 | 0.768519 | 0.769527 | 0.768114 | ... | 0.000995 | 26 | 0.462276 | 0.039154 | 1.383258 | 0.044460 | {'k': 40, 'min_k': 3, 'sim_options': {'name': ... | 40 | 3 | {'name': 'pearson', 'user_based': True} |
45 | 0.968020 | 0.964711 | 0.964822 | 0.972418 | 0.967493 | 0.003139 | 13 | 0.745386 | 0.744003 | 0.743520 | ... | 0.001083 | 15 | 0.079170 | 0.004983 | 1.404382 | 0.033151 | {'k': 40, 'min_k': 6, 'sim_options': {'name': ... | 40 | 6 | {'name': 'msd', 'user_based': True} |
46 | 0.991876 | 0.987135 | 0.986788 | 0.994414 | 0.990053 | 0.003222 | 21 | 0.767478 | 0.765454 | 0.763487 | ... | 0.001523 | 23 | 0.305159 | 0.015380 | 1.412816 | 0.035783 | {'k': 40, 'min_k': 6, 'sim_options': {'name': ... | 40 | 6 | {'name': 'cosine', 'user_based': True} |
47 | 0.992389 | 0.990762 | 0.987319 | 0.996256 | 0.991681 | 0.003213 | 27 | 0.770884 | 0.770516 | 0.768378 | ... | 0.001467 | 30 | 0.429268 | 0.007631 | 1.294838 | 0.043806 | {'k': 40, 'min_k': 6, 'sim_options': {'name': ... | 40 | 6 | {'name': 'pearson', 'user_based': True} |
48 rows × 22 columns
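The 48 rows are the Cartesian product of the three hyperparameter lists. A short sketch that enumerates them in the same order as the rows shown in the table above (k, then min_k, then similarity name); the ordering is read off the table, not a documented guarantee of the library:

```python
from itertools import product

k_values = [10, 20, 30, 40]
min_k_values = [1, 2, 3, 6]
similarity_names = ['msd', 'cosine', 'pearson']

# One dict per combination, shaped like the entries of the 'params' column.
combinations = [
    {'k': k, 'min_k': min_k, 'sim_options': {'name': name, 'user_based': True}}
    for k, min_k, name in product(k_values, min_k_values, similarity_names)
]

print(len(combinations))
print(combinations[18])
```

With cv=4, each of these combinations is fitted and evaluated four times, so the search trains 192 models in total.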
Now we will build the final model using the optimal hyperparameter values found by the grid search cross-validation above.
# using the optimal similarity measure for user-user based collaborative filtering
sim_options = {'name': 'msd',
'user_based': True}
# creating an instance of KNNBasic with optimal hyperparameter values
optimized_KNN_user = KNNBasic(sim_options=sim_options, k=20, min_k=3, verbose=False)
# training the algorithm on the trainset
optimized_KNN_user.fit(trainset)
# predicting ratings for the testset
predictions = optimized_KNN_user.test(testset)
# computing RMSE on testset
accuracy.rmse(predictions)
RMSE: 0.9529
0.9529100824421844
After tuning the hyperparameters, the test-set RMSE has dropped from 0.9901 to 0.9529, a good improvement.
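For reference, RMSE is the square root of the mean squared difference between actual and predicted ratings, so lower is better. A minimal sketch in plain Python; the `rmse` helper and the sample ratings are made up for illustration and are not taken from the dataset:

```python
import math

def rmse(pairs):
    """Root mean squared error over (actual, predicted) rating pairs."""
    squared_errors = [(actual - predicted) ** 2 for actual, predicted in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical (actual, predicted) pairs, for illustration only
sample = [(4.0, 4.5), (3.0, 2.5), (5.0, 4.0)]
print(round(rmse(sample), 4))
```

Because the errors are squared before averaging, one large miss (the third pair here) weighs more than several small ones.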
Predicting the rating for userId=10 and movieId=1240 with the optimized model
optimized_KNN_user.predict(10, 1240, r_ui=4, verbose=True)
user: 10 item: 1240 r_ui = 4.00 est = 4.14 {'actual_k': 20, 'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=4.143332797654944, details={'actual_k': 20, 'was_impossible': False})
With the baseline KNN model, the predicted rating was 4.26; it is now 4.14, which is closer to the actual rating of 4.
# Identify the 20 nearest neighbors of the user with inner id 10
# (get_neighbors takes and returns Surprise inner ids, which can differ from raw userIds):
near_uid_10 = optimized_KNN_user.get_neighbors(10, k=20)
near_uid_10
[13, 19, 22, 66, 86, 116, 124, 160, 179, 184, 207, 212, 230, 239, 269, 294, 295, 300, 301, 304]
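Surprise's `get_neighbors` works with inner ids (the indices assigned when the trainset is built), so to look up neighbors by raw userId the ids need converting in both directions. A hedged sketch of such a wrapper; the helper name `neighbors_by_raw_uid` is hypothetical, and it assumes the model was fitted on `trainset` with `user_based: True`:

```python
def neighbors_by_raw_uid(algo, trainset, raw_uid, k=20):
    """Return the raw user ids of the k nearest neighbors of raw_uid."""
    # Trainset.to_inner_uid raises ValueError if the user is not in the trainset
    inner_uid = trainset.to_inner_uid(raw_uid)
    inner_neighbors = algo.get_neighbors(inner_uid, k=k)
    # Map every neighbor back to the id used in the original ratings data
    return [trainset.to_raw_uid(inner_id) for inner_id in inner_neighbors]

# Usage with the objects from this notebook (not executed here):
# neighbors_by_raw_uid(optimized_KNN_user, trainset, raw_uid=10, k=20)
```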
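We will create a function whose input parameters are:
data: the ratings dataframe (with userId, movieId, and rating columns)
user_id: the user for whom we want recommendations
top_n: the number of movies to recommend
algo: the trained algorithm used to predict ratings
We can reuse this function with our later algorithms as well.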
def get_recommendation(data, user_id, top_n, algo):
    # empty list for storing recommended movie ids
    recommendations = []
    # user-item interactions matrix built from the input data
    user_item_interactions_matrix = data.pivot(index='userId', columns='movieId', values='rating')
    # movie ids this user has not interacted with yet (values are still null)
    user_ratings = user_item_interactions_matrix.loc[user_id]
    non_interacted_movies = user_ratings[user_ratings.isnull()].index.tolist()
    # looping through every movie id the user has not interacted with yet
    for item_id in non_interacted_movies:
        # predicting this user's rating for the movie with the given algorithm
        est = algo.predict(user_id, item_id).est
        # appending the (movie id, predicted rating) pair to the recommendations list
        recommendations.append((item_id, est))
    # sorting by predicted rating in descending order
    recommendations.sort(key=lambda x: x[1], reverse=True)
    # returning the top n movies with the highest predicted ratings for this user
    return recommendations[:top_n]
recommendations = get_recommendation(rating,10,20,optimized_KNN_user)
recommendations
[(3038, 5), (309, 4.999999999999999), (6669, 4.881355932203389), (98491, 4.821987480438185), (178, 4.784881983866148), (2920, 4.784530386740332), (1860, 4.7713154312585075), (6776, 4.738562091503268), (4783, 4.733784741814604), (5017, 4.733386572357538), (4263, 4.731378922557885), (26326, 4.723524337675515), (7075, 4.704496788008566), (3414, 4.677535050537985), (1192, 4.662620550158588), (41527, 4.649484536082474), (116, 4.646309855193815), (116897, 4.6424191994394315), (2938, 4.6364787840405315), (766, 4.633780069379941)]
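The function sorts every candidate movie even though only the top n are returned. As an optional variation (not part of the original function), `heapq.nlargest` from the standard library can select the top n directly. The helper name `top_n_by_estimate` and the sample scores below are illustrative assumptions:

```python
import heapq

def top_n_by_estimate(scored_items, top_n):
    """Return the top_n (item_id, estimate) pairs, highest estimate first."""
    return heapq.nlargest(top_n, scored_items, key=lambda pair: pair[1])

# Hypothetical (movieId, predicted rating) pairs, for illustration only
scored = [(101, 3.2), (102, 4.8), (103, 4.1), (104, 2.7)]
print(top_n_by_estimate(scored, 2))
```

For a few thousand movies the difference is negligible; this mainly matters when the candidate list is very large relative to n.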
In an item-based collaborative filtering recommendation system, we look for similarities between items based on how users have rated them. If a user rated one movie highly, the system looks for other movies rated highly by the people who also rated that movie highly. The advantage of such a model in real-world applications is computational: a user-user model must compute similarities between every pair of users, and those similarities need recomputing more often because users are added at a greater rate than items (not to mention that each user's individual profile is constantly changing as well). Item-item models are therefore less computationally expensive, and they also tend to give more accurate predictions than user-based models in situations where fewer items are rated.
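To make item-item similarity concrete, here is a minimal sketch of cosine similarity between two movies, computed only over the users who rated both. The tiny ratings dictionary and the function name are made up for illustration; Surprise builds its similarity matrices internally and may differ in details:

```python
import math

# Hypothetical ratings: movie -> {user: rating}
ratings_by_movie = {
    "A": {"u1": 5.0, "u2": 3.0, "u3": 4.0},
    "B": {"u1": 4.0, "u2": 2.0, "u4": 5.0},
}

def item_cosine(ratings, movie_i, movie_j):
    """Cosine similarity between two movies over their common raters."""
    common = ratings[movie_i].keys() & ratings[movie_j].keys()
    if not common:
        return 0.0  # no shared raters: treat the movies as unrelated
    dot = sum(ratings[movie_i][u] * ratings[movie_j][u] for u in common)
    norm_i = math.sqrt(sum(ratings[movie_i][u] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings[movie_j][u] ** 2 for u in common))
    return dot / (norm_i * norm_j)

print(round(item_cosine(ratings_by_movie, "A", "B"), 4))
```

Only users u1 and u2 rated both movies, so only their ratings enter the calculation; u3 and u4 are ignored.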
# Defining our similarity parameters. Note that for item-based similarity 'user_based' is False,
# whereas it was True for our user-based systems.
sim_options = {'name': 'cosine',
               'user_based': False}
# Defining our item-based nearest neighbour algorithm based on KNN
knn_item = KNNBasic(sim_options=sim_options, verbose=False)
# Fitting the model on our train dataset (training our model)
knn_item.fit(trainset)
<surprise.prediction_algorithms.knns.KNNBasic at 0x7f8f7cdf2040>
# Using our trained model to predict the ratings in our testset
predictions = knn_item.test(testset)
# Computing the RMSE
accuracy.rmse(predictions)
RMSE: 0.9908
0.9908373031023553
The RMSE for the baseline item-based collaborative filtering recommendation system is 0.9908, which is almost the same as our baseline user-based model (0.9901).
Predicting the rating for userId=10 and movieId=1240, as we did with the other models before
knn_item.predict(10, 1240, r_ui=4, verbose=True)
user: 10 item: 1240 r_ui = 4.00 est = 3.70 {'actual_k': 37, 'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=3.70427765198621, details={'actual_k': 37, 'was_impossible': False})
As we can see, the actual rating for this user-item pair is 4 and the rating predicted by this item-based collaborative filtering system is 3.70. For our optimized user-based collaborative filtering system, the predicted rating was 4.14.
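The estimate from a basic KNN model is a similarity-weighted average of the ratings of the nearest neighbors, and `actual_k` is the number of neighbors that actually contributed. A minimal sketch under that assumption; the function name, the fallback value, and the sample neighbors are hypothetical, and Surprise's own implementation may differ in details:

```python
def knn_weighted_estimate(neighbors, k, min_k, fallback):
    """Similarity-weighted mean of neighbor ratings.

    neighbors: list of (similarity, rating) pairs for items the user has rated.
    Returns (estimate, actual_k). If fewer than min_k neighbors have positive
    similarity, the fallback value (e.g. a global mean) is returned instead.
    """
    nearest = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    used = [(sim, rating) for sim, rating in nearest if sim > 0]
    if not used or len(used) < min_k:
        return fallback, len(used)
    total_sim = sum(sim for sim, _ in used)
    estimate = sum(sim * rating for sim, rating in used) / total_sim
    return estimate, len(used)

# Hypothetical (similarity, rating) pairs, for illustration only
example = [(0.9, 4.0), (0.6, 3.0), (0.3, 5.0), (0.0, 1.0)]
print(knn_weighted_estimate(example, k=3, min_k=2, fallback=3.5))
```

In this sketch, raising `min_k` makes the model fall back more often when few similar items are available, which is the trade-off the grid search below explores.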
# setting up our parameter grid as before:
param_grid = {'k': [20, 30, 40, 50],
              'min_k': [2, 3, 4],
              'sim_options': {'name': ['msd', 'cosine'],
                              'user_based': [False]}}
# performing cross validation to tune our hyperparameters based on param_grid above:
grid_obj = GridSearchCV(KNNBasic, param_grid, measures=['rmse', 'mae'], cv=4, n_jobs=-1)
# fitting the data
grid_obj.fit(data)
# best RMSE score
print(grid_obj.best_score['rmse'])
# combination of parameters that gave the best RMSE score
print(grid_obj.best_params['rmse'])
0.9345402719842373
{'k': 50, 'min_k': 4, 'sim_options': {'name': 'msd', 'user_based': False}}
The optimal hyperparameter values are therefore a k of 50, a min_k of 4, and msd as the similarity measure.
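For reference, the msd (mean squared difference) similarity turns the average squared gap between two items' ratings into a similarity, commonly as 1 / (msd + 1), so identical rating patterns score 1. A small sketch with made-up ratings; the function name and the data are illustrative assumptions:

```python
def msd_similarity(ratings_i, ratings_j):
    """MSD-based similarity between two items given {user: rating} dicts."""
    common = ratings_i.keys() & ratings_j.keys()
    if not common:
        return 0.0  # no shared raters
    msd = sum((ratings_i[u] - ratings_j[u]) ** 2 for u in common) / len(common)
    return 1.0 / (msd + 1.0)

# Hypothetical ratings for two movies
movie_a = {"u1": 5.0, "u2": 3.0, "u3": 4.0}
movie_b = {"u1": 4.0, "u2": 3.0, "u4": 2.0}
print(msd_similarity(movie_a, movie_b))
```

Unlike cosine similarity, this measure is sensitive to the size of the rating gap, which may be one reason it performs better on this data.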
Let's compare the RMSE and MAE on every split (we had cv=4) to see how each combination of hyperparameter values performs:
results_df = pd.DataFrame.from_dict(grid_obj.cv_results)
results_df
| | split0_test_rmse | split1_test_rmse | split2_test_rmse | split3_test_rmse | mean_test_rmse | std_test_rmse | rank_test_rmse | split0_test_mae | split1_test_mae | split2_test_mae | ... | std_test_mae | rank_test_mae | mean_fit_time | std_fit_time | mean_test_time | std_test_time | params | param_k | param_min_k | param_sim_options |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.945706 | 0.942101 | 0.956574 | 0.949785 | 0.948542 | 0.005376 | 12 | 0.730235 | 0.728439 | 0.738704 | ... | 0.003903 | 11 | 11.039914 | 0.222799 | 5.749478 | 0.051576 | {'k': 20, 'min_k': 2, 'sim_options': {'name': ... | 20 | 2 | {'name': 'msd', 'user_based': False} |
1 | 1.015208 | 1.011797 | 1.023078 | 1.019955 | 1.017510 | 0.004328 | 24 | 0.793200 | 0.790785 | 0.798380 | ... | 0.002767 | 22 | 25.839880 | 0.349205 | 5.299843 | 0.061522 | {'k': 20, 'min_k': 2, 'sim_options': {'name': ... | 20 | 2 | {'name': 'cosine', 'user_based': False} |
2 | 0.945772 | 0.942114 | 0.956470 | 0.949573 | 0.948482 | 0.005313 | 11 | 0.730338 | 0.728458 | 0.738737 | ... | 0.003888 | 12 | 5.702658 | 0.250246 | 5.842253 | 0.770926 | {'k': 20, 'min_k': 3, 'sim_options': {'name': ... | 20 | 3 | {'name': 'msd', 'user_based': False} |
3 | 1.015269 | 1.011757 | 1.023001 | 1.019872 | 1.017475 | 0.004297 | 23 | 0.793299 | 0.790762 | 0.798419 | ... | 0.002776 | 24 | 24.241407 | 1.093871 | 5.623170 | 0.153250 | {'k': 20, 'min_k': 3, 'sim_options': {'name': ... | 20 | 3 | {'name': 'cosine', 'user_based': False} |
4 | 0.945643 | 0.942045 | 0.956209 | 0.949800 | 0.948424 | 0.005266 | 10 | 0.730321 | 0.728445 | 0.738548 | ... | 0.003828 | 10 | 8.440819 | 0.401175 | 6.414069 | 0.254052 | {'k': 20, 'min_k': 4, 'sim_options': {'name': ... | 20 | 4 | {'name': 'msd', 'user_based': False} |
5 | 1.015233 | 1.011725 | 1.022785 | 1.020047 | 1.017448 | 0.004269 | 22 | 0.793321 | 0.790755 | 0.798263 | ... | 0.002725 | 23 | 31.129241 | 1.004177 | 5.843478 | 0.108527 | {'k': 20, 'min_k': 4, 'sim_options': {'name': ... | 20 | 4 | {'name': 'cosine', 'user_based': False} |
6 | 0.938093 | 0.933224 | 0.948094 | 0.941103 | 0.940128 | 0.005390 | 9 | 0.723661 | 0.721115 | 0.731421 | ... | 0.003797 | 8 | 6.367453 | 0.173658 | 5.878463 | 0.226006 | {'k': 30, 'min_k': 2, 'sim_options': {'name': ... | 30 | 2 | {'name': 'msd', 'user_based': False} |
7 | 0.999796 | 0.997233 | 1.009743 | 1.006647 | 1.003355 | 0.005044 | 21 | 0.779248 | 0.778057 | 0.786132 | ... | 0.003217 | 19 | 22.142526 | 0.658425 | 5.864040 | 0.161969 | {'k': 30, 'min_k': 2, 'sim_options': {'name': ... | 30 | 2 | {'name': 'cosine', 'user_based': False} |
8 | 0.938159 | 0.933237 | 0.947988 | 0.940889 | 0.940068 | 0.005332 | 8 | 0.723764 | 0.721133 | 0.731453 | ... | 0.003792 | 9 | 5.057507 | 0.327190 | 6.165638 | 0.961311 | {'k': 30, 'min_k': 3, 'sim_options': {'name': ... | 30 | 3 | {'name': 'msd', 'user_based': False} |
9 | 0.999857 | 0.997192 | 1.009664 | 1.006562 | 1.003319 | 0.005008 | 20 | 0.779348 | 0.778034 | 0.786170 | ... | 0.003213 | 21 | 24.888792 | 0.495823 | 5.517469 | 0.095663 | {'k': 30, 'min_k': 3, 'sim_options': {'name': ... | 30 | 3 | {'name': 'cosine', 'user_based': False} |
10 | 0.938030 | 0.933167 | 0.947726 | 0.941118 | 0.940010 | 0.005280 | 7 | 0.723747 | 0.721121 | 0.731265 | ... | 0.003724 | 7 | 4.643938 | 0.084388 | 5.695832 | 0.357415 | {'k': 30, 'min_k': 4, 'sim_options': {'name': ... | 30 | 4 | {'name': 'msd', 'user_based': False} |
11 | 0.999820 | 0.997160 | 1.009446 | 1.006740 | 1.003292 | 0.004985 | 19 | 0.779370 | 0.778027 | 0.786015 | ... | 0.003168 | 20 | 22.822024 | 1.284043 | 5.692456 | 0.223372 | {'k': 30, 'min_k': 4, 'sim_options': {'name': ... | 30 | 4 | {'name': 'cosine', 'user_based': False} |
12 | 0.934353 | 0.929619 | 0.943507 | 0.938002 | 0.936370 | 0.005080 | 6 | 0.720238 | 0.717555 | 0.727392 | ... | 0.003616 | 5 | 3.970753 | 0.438742 | 6.409834 | 0.304561 | {'k': 40, 'min_k': 2, 'sim_options': {'name': ... | 40 | 2 | {'name': 'msd', 'user_based': False} |
13 | 0.993172 | 0.988549 | 1.001167 | 0.997651 | 0.995135 | 0.004742 | 18 | 0.773059 | 0.770761 | 0.778338 | ... | 0.002873 | 16 | 18.918488 | 1.433587 | 6.235717 | 0.652947 | {'k': 40, 'min_k': 2, 'sim_options': {'name': ... | 40 | 2 | {'name': 'cosine', 'user_based': False} |
14 | 0.934419 | 0.929632 | 0.943401 | 0.937787 | 0.936310 | 0.005016 | 5 | 0.720341 | 0.717573 | 0.727425 | ... | 0.003606 | 6 | 5.313690 | 1.430690 | 7.476163 | 0.634561 | {'k': 40, 'min_k': 3, 'sim_options': {'name': ... | 40 | 3 | {'name': 'msd', 'user_based': False} |
15 | 0.993234 | 0.988508 | 1.001088 | 0.997565 | 0.995099 | 0.004714 | 17 | 0.773159 | 0.770738 | 0.778376 | ... | 0.002874 | 18 | 24.812498 | 1.922524 | 6.881592 | 0.284170 | {'k': 40, 'min_k': 3, 'sim_options': {'name': ... | 40 | 3 | {'name': 'cosine', 'user_based': False} |
16 | 0.934289 | 0.929563 | 0.943137 | 0.938017 | 0.936251 | 0.004978 | 4 | 0.720324 | 0.717561 | 0.727237 | ... | 0.003546 | 4 | 7.349227 | 1.143405 | 6.895141 | 0.268848 | {'k': 40, 'min_k': 4, 'sim_options': {'name': ... | 40 | 4 | {'name': 'msd', 'user_based': False} |
17 | 0.993197 | 0.988476 | 1.000868 | 0.997744 | 0.995071 | 0.004684 | 16 | 0.773180 | 0.770731 | 0.778221 | ... | 0.002834 | 17 | 24.626819 | 2.129313 | 7.755744 | 1.674537 | {'k': 40, 'min_k': 4, 'sim_options': {'name': ... | 40 | 4 | {'name': 'cosine', 'user_based': False} |
18 | 0.932615 | 0.928321 | 0.941267 | 0.936434 | 0.934659 | 0.004774 | 3 | 0.718627 | 0.715796 | 0.725237 | ... | 0.003494 | 2 | 7.622799 | 1.045246 | 7.149842 | 0.171801 | {'k': 50, 'min_k': 2, 'sim_options': {'name': ... | 50 | 2 | {'name': 'msd', 'user_based': False} |
19 | 0.988790 | 0.983407 | 0.995726 | 0.992014 | 0.989984 | 0.004521 | 15 | 0.768854 | 0.765924 | 0.773967 | ... | 0.002952 | 13 | 23.518415 | 0.959352 | 7.466235 | 0.824199 | {'k': 50, 'min_k': 2, 'sim_options': {'name': ... | 50 | 2 | {'name': 'cosine', 'user_based': False} |
20 | 0.932682 | 0.928334 | 0.941161 | 0.936219 | 0.934599 | 0.004706 | 2 | 0.718730 | 0.715815 | 0.725269 | ... | 0.003477 | 3 | 7.018627 | 0.287776 | 6.880196 | 0.338007 | {'k': 50, 'min_k': 3, 'sim_options': {'name': ... | 50 | 3 | {'name': 'msd', 'user_based': False} |
21 | 0.988852 | 0.983366 | 0.995647 | 0.991928 | 0.989948 | 0.004498 | 14 | 0.768953 | 0.765901 | 0.774005 | ... | 0.002959 | 15 | 21.669544 | 0.890367 | 6.969265 | 0.630893 | {'k': 50, 'min_k': 3, 'sim_options': {'name': ... | 50 | 3 | {'name': 'cosine', 'user_based': False} |
22 | 0.932552 | 0.928264 | 0.940896 | 0.936449 | 0.934540 | 0.004674 | 1 | 0.718713 | 0.715802 | 0.725081 | ... | 0.003427 | 1 | 5.394106 | 0.511203 | 6.754545 | 0.414349 | {'k': 50, 'min_k': 4, 'sim_options': {'name': ... | 50 | 4 | {'name': 'msd', 'user_based': False} |
23 | 0.988816 | 0.983333 | 0.995425 | 0.992108 | 0.989921 | 0.004464 | 13 | 0.768975 | 0.765894 | 0.773850 | ... | 0.002916 | 14 | 19.356863 | 0.485334 | 6.149450 | 0.598869 | {'k': 50, 'min_k': 4, 'sim_options': {'name': ... | 50 | 4 | {'name': 'cosine', 'user_based': False} |
24 rows × 22 columns
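The 24 rows correspond to every combination in the parameter grid: 4 values of k × 3 values of min_k × 2 similarity measures. A short sketch that enumerates them with `itertools.product`; the variable names are illustrative, and this mirrors the grid above rather than how `GridSearchCV` is implemented:

```python
from itertools import product

k_values = [20, 30, 40, 50]
min_k_values = [2, 3, 4]
similarity_names = ['msd', 'cosine']

combinations = [
    {'k': k, 'min_k': min_k, 'sim_options': {'name': name, 'user_based': False}}
    for k, min_k, name in product(k_values, min_k_values, similarity_names)
]

print(len(combinations))      # number of parameter combinations
print(len(combinations) * 4)  # number of model fits with cv=4
```

With cv=4, every added grid value multiplies the number of fits, which is worth keeping in mind given the fit times in the table.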
# creating an instance of KNNBasic with optimal hyperparameter values
optimized_KNN_item = KNNBasic(sim_options={'name': 'msd', 'user_based': False}, k=50, min_k=4,verbose=False)
# Fitting the trainset to the algorithm (training our model)
optimized_KNN_item.fit(trainset)
<surprise.prediction_algorithms.knns.KNNBasic at 0x7f8f7cdf2910>
# predicting ratings for the testset based on our model:
predictions = optimized_KNN_item.test(testset)
# Calculating the RMSE
accuracy.rmse(predictions)
RMSE: 0.9296
0.9296272004578772
After hyperparameter tuning, the test-set RMSE of the item-based collaborative filtering recommendation system improved from 0.9908 (baseline) to 0.9299 (optimized). This is lower, and therefore better, than the RMSE of our optimized user-based model (0.9529).
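To make the metric concrete, here is a minimal sketch of how RMSE is computed from pairs of actual and estimated ratings. The function name `rmse` and the toy ratings are assumptions for illustration only; they are not the notebook's test set, which is scored with Surprise's `accuracy` module.

```python
import math

def rmse(pairs):
    """Root mean squared error over (actual_rating, estimated_rating) pairs."""
    pairs = list(pairs)
    squared_errors = [(actual - est) ** 2 for actual, est in pairs]
    return math.sqrt(sum(squared_errors) / len(pairs))

# Hypothetical ratings, made up for this example.
toy_pairs = [(4.0, 3.5), (3.0, 3.5), (5.0, 4.0), (2.0, 3.0)]
print(round(rmse(toy_pairs), 4))  # prints 0.7906
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, so a lower value means the model is, on average, closer to the true ratings.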
optimized_KNN_item.predict(10,1240, r_ui=4, verbose=True)
user: 10 item: 1240 r_ui = 4.00 est = 3.82 {'actual_k': 37, 'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=3.8240093739711267, details={'actual_k': 37, 'was_impossible': False})
The estimated rating is 3.82, which is closer to the actual rating of 4 than our unoptimized item-based prediction of 3.7.
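The comparison can be checked with a couple of lines of arithmetic. This is only a sketch using the three numbers quoted above; the variable names are my own.

```python
actual = 4.0          # r_ui passed to predict()
baseline_est = 3.7    # unoptimized item-based estimate
optimized_est = 3.82  # optimized item-based estimate (rounded)

baseline_err = abs(actual - baseline_est)
optimized_err = abs(actual - optimized_est)

# The optimized model's absolute error is smaller for this user-item pair.
print(round(baseline_err, 2), round(optimized_err, 2))  # prints 0.3 0.18
```

A single pair is only an anecdote, of course; the test-set RMSE is the more reliable comparison.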
optimized_KNN_item.predict(10, 105, verbose=True)
user: 10 item: 105 r_ui = None est = 3.67 {'actual_k': 34, 'was_impossible': False}
Prediction(uid=10, iid=105, r_ui=None, est=3.6674504899091827, details={'actual_k': 34, 'was_impossible': False})
The baseline model predicted 3.7, whereas the optimized model gives an estimated rating of 3.67.
optimized_KNN_item.get_neighbors(10, k=20)
[44, 93, 112, 132, 138, 210, 223, 231, 350, 363, 389, 400, 451, 457, 502, 520, 529, 577, 586, 684]
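Note that these are different from the nearest neighbors produced by the optimized_KNN_user algorithm: [13, 19, 22, 66, 86, 116, 124, 160, 179, 184, 207, 212, 230, 239, 269, 294, 295, 300, 301, 304]. This is expected: get_neighbors works with Surprise's inner ids, and for an item-based model it returns neighboring items rather than neighboring users, so the two lists are not directly comparable.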
recommendations = get_recommendation(rating, 10, 20, optimized_KNN_item)
recommendations
[(78321, 4.8175582990397805), (3158, 4.686746987951807), (3161, 4.666666666666667), (2801, 4.652173913043478), (2837, 4.594594594594595), (3357, 4.594594594594595), (3207, 4.565217391304349), (4972, 4.565217391304349), (26394, 4.565217391304349), (6268, 4.560975609756098), (1870, 4.5), (2388, 4.5), (3790, 4.5), (30883, 4.5), (43177, 4.449993480245142), (6598, 4.433333333333333), (8199, 4.414452709883103), (760, 4.411764705882352), (4568, 4.409090909090908), (6506, 4.409090909090908)]
Compare with those obtained from our user-based recommendation system:
[(3038, 5), (309, 4.999999999999999), (6669, 4.881355932203389), (98491, 4.821987480438185), (178, 4.784881983866148), (2920, 4.784530386740332), (1860, 4.7713154312585075), (6776, 4.738562091503268), (4783, 4.733784741814604), (5017, 4.733386572357538), (4263, 4.731378922557885), (26326, 4.723524337675515), (7075, 4.704496788008566), (3414, 4.677535050537985), (1192, 4.662620550158588), (41527, 4.649484536082474), (116, 4.646309855193815), (116897, 4.6424191994394315), (2938, 4.6364787840405315), (766, 4.633780069379941)]
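The two top-20 lists can also be compared programmatically. A minimal sketch, with the movie ids copied from the two outputs above (the variable names are my own):

```python
# Movie ids recommended to user 10 by each optimized model (copied from the outputs above).
item_based_ids = [78321, 3158, 3161, 2801, 2837, 3357, 3207, 4972, 26394, 6268,
                  1870, 2388, 3790, 30883, 43177, 6598, 8199, 760, 4568, 6506]
user_based_ids = [3038, 309, 6669, 98491, 178, 2920, 1860, 6776, 4783, 5017,
                  4263, 26326, 7075, 3414, 1192, 41527, 116, 116897, 2938, 766]

shared = set(item_based_ids) & set(user_based_ids)
union = set(item_based_ids) | set(user_based_ids)

# Jaccard overlap: size of the intersection divided by size of the union.
jaccard = len(shared) / len(union)
print(sorted(shared), jaccard)  # prints [] 0.0
```

No movie appears in both lists, so the two systems give user 10 entirely different recommendations.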
Matrix factorization breaks down ("factorizes") the original user-item interaction matrix into component matrices that let us assign latent features to both the items and the users, which helps us find recommendations for each user. For example, movies may be described by latent features that resemble genres (comedy, romance, thriller, action, etc.), and users may similarly be assigned latent features reflecting their taste for those genres (user1 only likes comedy and action, while user2 only likes comedy and romance, etc.).
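The genre example above can be sketched with a few hand-written vectors. Everything here is hypothetical: the feature order (comedy, action, romance), the weights, and the movie names are made up, whereas a real model learns its latent features from the ratings and they are rarely this interpretable.

```python
# Hypothetical latent features, in the order (comedy, action, romance).
user_factors = {
    "user1": [1.0, 1.0, 0.0],  # likes comedy and action
    "user2": [1.0, 0.0, 1.0],  # likes comedy and romance
}
movie_factors = {
    "action_comedy": [0.5, 0.5, 0.0],
    "romantic_comedy": [0.5, 0.0, 0.5],
    "pure_romance": [0.0, 0.0, 0.8],
}

def affinity(user, movie):
    """Dot product of the user's and the movie's latent feature vectors."""
    return sum(u * m for u, m in zip(user_factors[user], movie_factors[movie]))

def rank_movies(user):
    """Movies sorted from highest to lowest affinity for the given user."""
    return sorted(movie_factors, key=lambda movie: affinity(user, movie), reverse=True)

print(rank_movies("user1"))  # prints ['action_comedy', 'romantic_comedy', 'pure_romance']
print(rank_movies("user2"))  # prints ['romantic_comedy', 'pure_romance', 'action_comedy']
```

The score for a user-movie pair is high only when the user's preferences and the movie's features line up, which is the intuition behind using the factor matrices to fill in unobserved ratings.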
There are several ways to factorize a matrix or to learn its factors, including Singular Value Decomposition (SVD) and optimization-based approaches such as Stochastic Gradient Descent (SGD) and Alternating Least Squares (ALS). For this project, we will use Singular Value Decomposition.
SVD decomposes a user-item matrix into the following three matrices:
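In the usual textbook notation (the symbols below are my own labels, assuming m users, n items, and k latent features), the decomposition can be written as:

```latex
A_{m \times n} \;=\; U_{m \times k}\; \Sigma_{k \times k}\; V^{T}_{k \times n}
```

Here U relates users to latent features, Σ is a diagonal matrix whose entries weight the importance of each latent feature, and V^T relates latent features to items. As far as I understand the Surprise documentation, its SVD class does not compute this exact decomposition; it learns user and item factor vectors by stochastic gradient descent and predicts a rating as the global mean plus user and item biases plus the dot product of the two factor vectors.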
# instantiating SVD matrix factorization:
svd = SVD()
# Fitting the model and training the algorithm:
svd.fit(trainset)
<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7f8f7d9949d0>
# Running the trained algorithm on the testset to predict values
predictions = svd.test(testset)
predictions
# You can see the real rating (r_ui) vs the estimate (est) predicted below:
[Prediction(uid=498, iid=6934, r_ui=3.0, est=2.65314780152513, details={'was_impossible': False}), Prediction(uid=254, iid=287, r_ui=4.0, est=3.1465212523468953, details={'was_impossible': False}), Prediction(uid=577, iid=357, r_ui=4.5, est=4.57062338026445, details={'was_impossible': False}), Prediction(uid=30, iid=1686, r_ui=4.0, est=4.027992576360476, details={'was_impossible': False}), Prediction(uid=28, iid=953, r_ui=2.0, est=4.595855097406188, details={'was_impossible': False}), Prediction(uid=413, iid=708, r_ui=4.0, est=3.578715137977655, details={'was_impossible': False}), Prediction(uid=213, iid=1961, r_ui=2.5, est=3.1162348373534763, details={'was_impossible': False}), Prediction(uid=240, iid=380, r_ui=1.5, est=3.511808457093065, details={'was_impossible': False}), Prediction(uid=644, iid=2858, r_ui=4.0, est=4.3011159258110245, details={'was_impossible': False}), Prediction(uid=458, iid=2571, r_ui=4.5, est=4.235376364622334, details={'was_impossible': False}), Prediction(uid=648, iid=2360, r_ui=4.0, est=3.5906993548477906, details={'was_impossible': False}), Prediction(uid=468, iid=2935, r_ui=3.5, est=3.251395024375941, details={'was_impossible': False}), Prediction(uid=15, iid=4033, r_ui=4.0, est=2.4490133505190728, details={'was_impossible': False}), Prediction(uid=91, iid=1270, r_ui=5.0, est=4.799798895092669, details={'was_impossible': False}), Prediction(uid=607, iid=185, r_ui=3.0, est=3.0911351654005506, details={'was_impossible': False}), Prediction(uid=472, iid=3363, r_ui=4.0, est=4.0332817425457845, details={'was_impossible': False}), Prediction(uid=74, iid=1377, r_ui=4.0, est=3.7208206247662754, details={'was_impossible': False}), Prediction(uid=418, iid=4085, r_ui=3.5, est=3.7928636750251368, details={'was_impossible': False}), Prediction(uid=140, iid=1197, r_ui=3.5, est=4.219938210602982, details={'was_impossible': False}), Prediction(uid=461, iid=2473, r_ui=1.0, est=2.1900752576817193, details={'was_impossible': False}), Prediction(uid=658, 
iid=1271, r_ui=5.0, est=4.1564438718348375, details={'was_impossible': False}), Prediction(uid=133, iid=6953, r_ui=1.5, est=2.2900909972292616, details={'was_impossible': False}), Prediction(uid=552, iid=466, r_ui=3.0, est=2.633532545771328, details={'was_impossible': False}), Prediction(uid=220, iid=3053, r_ui=4.0, est=3.1262860012005405, details={'was_impossible': False}), Prediction(uid=654, iid=4993, r_ui=5.0, est=5, details={'was_impossible': False}), Prediction(uid=442, iid=8207, r_ui=4.5, est=4.020503012803998, details={'was_impossible': False}), Prediction(uid=584, iid=519, r_ui=2.0, est=3.1211959350963117, details={'was_impossible': False}), Prediction(uid=648, iid=47, r_ui=3.0, est=3.571780556881917, details={'was_impossible': False}), Prediction(uid=460, iid=2997, r_ui=5.0, est=3.593246432480685, details={'was_impossible': False}), Prediction(uid=88, iid=1307, r_ui=4.0, est=3.6648269155885567, details={'was_impossible': False}), Prediction(uid=7, iid=1394, r_ui=3.0, est=3.586495343004258, details={'was_impossible': False}), Prediction(uid=580, iid=1377, r_ui=3.5, est=2.950133700512026, details={'was_impossible': False}), Prediction(uid=15, iid=4210, r_ui=4.5, est=2.48151680541915, details={'was_impossible': False}), Prediction(uid=468, iid=912, r_ui=3.5, est=3.8618033876298945, details={'was_impossible': False}), Prediction(uid=500, iid=69406, r_ui=4.0, est=3.29467040158341, details={'was_impossible': False}), Prediction(uid=547, iid=4116, r_ui=4.0, est=3.192096429640739, details={'was_impossible': False}), Prediction(uid=110, iid=34, r_ui=3.0, est=4.060097070133576, details={'was_impossible': False}), Prediction(uid=176, iid=50068, r_ui=5.0, est=2.8574154718413314, details={'was_impossible': False}), Prediction(uid=468, iid=6662, r_ui=3.0, est=3.0439583661528284, details={'was_impossible': False}), Prediction(uid=621, iid=2926, r_ui=5.0, est=4.002198166476512, details={'was_impossible': False}), Prediction(uid=243, iid=7925, r_ui=3.5, 
est=3.4364840116960518, details={'was_impossible': False}), Prediction(uid=17, iid=7044, r_ui=2.5, est=3.490579234208077, details={'was_impossible': False}), Prediction(uid=15, iid=2161, r_ui=3.0, est=2.4707896831059255, details={'was_impossible': False}), Prediction(uid=56, iid=48982, r_ui=4.0, est=3.5206199331880437, details={'was_impossible': False}), Prediction(uid=34, iid=3062, r_ui=4.0, est=3.8674844274419544, details={'was_impossible': False}), Prediction(uid=73, iid=141866, r_ui=4.0, est=3.3809586172251667, details={'was_impossible': False}), Prediction(uid=73, iid=2901, r_ui=3.5, est=3.6074793249535655, details={'was_impossible': False}), Prediction(uid=366, iid=46578, r_ui=4.0, est=3.6216177759201695, details={'was_impossible': False}), Prediction(uid=195, iid=3361, r_ui=3.0, est=2.964145092934821, details={'was_impossible': False}), Prediction(uid=103, iid=1552, r_ui=2.0, est=3.1113659566890375, details={'was_impossible': False}), Prediction(uid=39, iid=338, r_ui=3.0, est=3.4458570836971956, details={'was_impossible': False}), Prediction(uid=664, iid=84601, r_ui=3.5, est=3.7065472776315866, details={'was_impossible': False}), Prediction(uid=128, iid=225, r_ui=3.0, est=3.7778989414427135, details={'was_impossible': False}), Prediction(uid=491, iid=2013, r_ui=5.0, est=3.143017250818892, details={'was_impossible': False}), Prediction(uid=609, iid=2267, r_ui=5.0, est=2.2045252289854065, details={'was_impossible': False}), Prediction(uid=295, iid=3510, r_ui=3.5, est=3.846913215666073, details={'was_impossible': False}), Prediction(uid=297, iid=608, r_ui=2.0, est=3.867472059364122, details={'was_impossible': False}), Prediction(uid=624, iid=480, r_ui=3.0, est=3.5579094553548374, details={'was_impossible': False}), Prediction(uid=3, iid=593, r_ui=3.0, est=4.1246786664328745, details={'was_impossible': False}), Prediction(uid=159, iid=913, r_ui=4.0, est=4.036180133419946, details={'was_impossible': False}), Prediction(uid=587, iid=2243, r_ui=4.5, 
est=4.109355971859891, details={'was_impossible': False}), Prediction(uid=159, iid=2688, r_ui=3.5, est=3.0018160980925606, details={'was_impossible': False}), Prediction(uid=56, iid=2502, r_ui=4.0, est=4.177320824532177, details={'was_impossible': False}), Prediction(uid=4, iid=1298, r_ui=4.0, est=4.476719534357683, details={'was_impossible': False}), Prediction(uid=300, iid=4022, r_ui=4.5, est=4.061164172595064, details={'was_impossible': False}), Prediction(uid=23, iid=805, r_ui=3.5, est=3.5233916807766876, details={'was_impossible': False}), Prediction(uid=150, iid=3039, r_ui=3.5, est=3.4710010975401273, details={'was_impossible': False}), Prediction(uid=605, iid=1474, r_ui=2.0, est=2.4485623728995325, details={'was_impossible': False}), Prediction(uid=194, iid=172, r_ui=3.0, est=2.836745778166533, details={'was_impossible': False}), Prediction(uid=137, iid=1848, r_ui=2.0, est=3.2514441443056485, details={'was_impossible': False}), Prediction(uid=505, iid=2288, r_ui=3.0, est=3.6002553068032004, details={'was_impossible': False}), Prediction(uid=231, iid=1240, r_ui=5.0, est=4.060015377874343, details={'was_impossible': False}), Prediction(uid=642, iid=1597, r_ui=2.0, est=3.625507962552107, details={'was_impossible': False}), Prediction(uid=580, iid=2080, r_ui=3.5, est=3.630818533941275, details={'was_impossible': False}), Prediction(uid=418, iid=4155, r_ui=3.0, est=3.481554461260737, details={'was_impossible': False}), Prediction(uid=466, iid=1220, r_ui=4.0, est=4.443251988305794, details={'was_impossible': False}), Prediction(uid=550, iid=1291, r_ui=4.0, est=4.067989093643902, details={'was_impossible': False}), Prediction(uid=584, iid=8665, r_ui=5.0, est=4.255661799467348, details={'was_impossible': False}), Prediction(uid=452, iid=457, r_ui=4.0, est=4.062370073398119, details={'was_impossible': False}), Prediction(uid=641, iid=1391, r_ui=3.0, est=3.583021678284209, details={'was_impossible': False}), Prediction(uid=584, iid=527, r_ui=5.0, 
est=4.694969192684612, details={'was_impossible': False}), Prediction(uid=468, iid=25916, r_ui=2.5, est=2.870895378361822, details={'was_impossible': False}), Prediction(uid=534, iid=479, r_ui=4.0, est=3.655682048618046, details={'was_impossible': False}), Prediction(uid=128, iid=509, r_ui=5.0, est=4.314020113817726, details={'was_impossible': False}), Prediction(uid=205, iid=42723, r_ui=0.5, est=3.0504961101700148, details={'was_impossible': False}), Prediction(uid=452, iid=4638, r_ui=1.0, est=2.6675256625023653, details={'was_impossible': False}), Prediction(uid=12, iid=1215, r_ui=5.0, est=2.9243010881392695, details={'was_impossible': False}), Prediction(uid=15, iid=70862, r_ui=2.5, est=2.619036094117699, details={'was_impossible': False}), Prediction(uid=529, iid=920, r_ui=4.0, est=3.4996465508187873, details={'was_impossible': False}), Prediction(uid=577, iid=8387, r_ui=1.0, est=3.7141494136683684, details={'was_impossible': False}), Prediction(uid=575, iid=4557, r_ui=3.0, est=3.124769472480914, details={'was_impossible': False}), Prediction(uid=104, iid=80219, r_ui=4.0, est=4.048383966162777, details={'was_impossible': False}), Prediction(uid=190, iid=2861, r_ui=3.0, est=3.6086541969483283, details={'was_impossible': False}), Prediction(uid=522, iid=99145, r_ui=3.5, est=3.1499738918831484, details={'was_impossible': False}), Prediction(uid=88, iid=497, r_ui=4.0, est=3.691445119183034, details={'was_impossible': False}), Prediction(uid=25, iid=1354, r_ui=1.0, est=3.656271607837323, details={'was_impossible': False}), Prediction(uid=624, iid=7132, r_ui=5.0, est=3.3137824751969536, details={'was_impossible': False}), Prediction(uid=461, iid=1247, r_ui=3.0, est=4.0903446223210755, details={'was_impossible': False}), Prediction(uid=547, iid=334, r_ui=1.0, est=3.2630963846474055, details={'was_impossible': False}), Prediction(uid=328, iid=141, r_ui=3.5, est=3.517054310266442, details={'was_impossible': False}), Prediction(uid=452, iid=349, r_ui=4.0, 
est=3.1624515017553017, details={'was_impossible': False}), Prediction(uid=468, iid=7147, r_ui=3.0, est=3.3941708910957606, details={'was_impossible': False}), Prediction(uid=345, iid=260, r_ui=4.0, est=4.242743731387357, details={'was_impossible': False}), Prediction(uid=423, iid=2174, r_ui=2.0, est=3.637595831095281, details={'was_impossible': False}), Prediction(uid=568, iid=733, r_ui=4.0, est=4.308785272571628, details={'was_impossible': False}), Prediction(uid=196, iid=3618, r_ui=2.0, est=3.689719378837985, details={'was_impossible': False}), Prediction(uid=652, iid=7260, r_ui=5.0, est=4.192193538565518, details={'was_impossible': False}), Prediction(uid=31, iid=55280, r_ui=4.5, est=4.0433373103669945, details={'was_impossible': False}), Prediction(uid=457, iid=145, r_ui=2.0, est=2.062460757370408, details={'was_impossible': False}), Prediction(uid=236, iid=1222, r_ui=3.5, est=4.271277040602393, details={'was_impossible': False}), Prediction(uid=73, iid=1704, r_ui=4.0, est=3.996455805622231, details={'was_impossible': False}), Prediction(uid=664, iid=733, r_ui=4.0, est=3.8008204910386483, details={'was_impossible': False}), Prediction(uid=136, iid=500, r_ui=3.5, est=3.7033672181768686, details={'was_impossible': False}), Prediction(uid=575, iid=3600, r_ui=4.0, est=3.416727389030527, details={'was_impossible': False}), Prediction(uid=426, iid=8368, r_ui=4.0, est=3.516459413787746, details={'was_impossible': False}), Prediction(uid=311, iid=916, r_ui=4.5, est=3.643075265223524, details={'was_impossible': False}), Prediction(uid=358, iid=12, r_ui=1.0, est=2.5778296100577744, details={'was_impossible': False}), Prediction(uid=667, iid=272, r_ui=5.0, est=3.953142870332538, details={'was_impossible': False}), Prediction(uid=28, iid=920, r_ui=5.0, est=4.671259706468373, details={'was_impossible': False}), Prediction(uid=442, iid=3448, r_ui=4.5, est=4.298653422168819, details={'was_impossible': False}), Prediction(uid=273, iid=59315, r_ui=4.0, est=4.665979898425199, 
details={'was_impossible': False}), Prediction(uid=390, iid=208, r_ui=3.0, est=2.6539590073804447, details={'was_impossible': False}), Prediction(uid=564, iid=2449, r_ui=1.0, est=3.453878805503145, details={'was_impossible': False}), Prediction(uid=355, iid=3300, r_ui=4.0, est=3.4089207336555125, details={'was_impossible': False}), Prediction(uid=73, iid=8914, r_ui=3.0, est=3.157949703054485, details={'was_impossible': False}), Prediction(uid=312, iid=1777, r_ui=2.0, est=3.0632297843737097, details={'was_impossible': False}), Prediction(uid=238, iid=6942, r_ui=2.5, est=3.6763414595841484, details={'was_impossible': False}), Prediction(uid=547, iid=99145, r_ui=4.5, est=3.379596107939648, details={'was_impossible': False}), Prediction(uid=660, iid=60069, r_ui=5.0, est=4.105524317842056, details={'was_impossible': False}), Prediction(uid=232, iid=1135, r_ui=4.0, est=3.970943604873557, details={'was_impossible': False}), Prediction(uid=442, iid=52435, r_ui=4.0, est=4.669754288522899, details={'was_impossible': False}), Prediction(uid=472, iid=45501, r_ui=3.0, est=3.3306024221544988, details={'was_impossible': False}), Prediction(uid=595, iid=1916, r_ui=5.0, est=3.8338168942430664, details={'was_impossible': False}), Prediction(uid=185, iid=2018, r_ui=3.0, est=3.7995185465855696, details={'was_impossible': False}), Prediction(uid=358, iid=1566, r_ui=4.0, est=2.7907099452124617, details={'was_impossible': False}), Prediction(uid=575, iid=2174, r_ui=2.0, est=3.5879822172467932, details={'was_impossible': False}), Prediction(uid=73, iid=48, r_ui=2.0, est=2.9216982481444305, details={'was_impossible': False}), Prediction(uid=379, iid=527, r_ui=4.5, est=4.301536633691579, details={'was_impossible': False}), Prediction(uid=283, iid=3746, r_ui=5.0, est=3.429519831514789, details={'was_impossible': False}), Prediction(uid=61, iid=53464, r_ui=2.5, est=2.7566356789992086, details={'was_impossible': False}), Prediction(uid=433, iid=4776, r_ui=3.0, est=3.332512420544828, 
details={'was_impossible': False}), Prediction(uid=295, iid=628, r_ui=4.0, est=4.300611500122906, details={'was_impossible': False}), Prediction(uid=150, iid=5502, r_ui=2.0, est=3.0487213983598114, details={'was_impossible': False}), Prediction(uid=654, iid=2947, r_ui=4.5, est=4.20848921669169, details={'was_impossible': False}), Prediction(uid=307, iid=6322, r_ui=3.5, est=3.6796981029365656, details={'was_impossible': False}), Prediction(uid=97, iid=1084, r_ui=2.5, est=3.4283322685235245, details={'was_impossible': False}), Prediction(uid=189, iid=3869, r_ui=3.0, est=3.114729782351914, details={'was_impossible': False}), Prediction(uid=575, iid=1835, r_ui=2.0, est=3.383638400993034, details={'was_impossible': False}), Prediction(uid=465, iid=4246, r_ui=4.5, est=3.7569354905307395, details={'was_impossible': False}), Prediction(uid=242, iid=1343, r_ui=4.0, est=4.34187045475172, details={'was_impossible': False}), Prediction(uid=84, iid=8961, r_ui=4.0, est=3.867742274253184, details={'was_impossible': False}), Prediction(uid=165, iid=1625, r_ui=3.0, est=3.1495805652709317, details={'was_impossible': False}), Prediction(uid=30, iid=5015, r_ui=4.0, est=3.2013784552469042, details={'was_impossible': False}), Prediction(uid=142, iid=3534, r_ui=3.0, est=2.4207033366165347, details={'was_impossible': False}), Prediction(uid=384, iid=55232, r_ui=3.0, est=3.104873895270951, details={'was_impossible': False}), Prediction(uid=219, iid=2692, r_ui=5.0, est=3.689039724928956, details={'was_impossible': False}), Prediction(uid=98, iid=71535, r_ui=5.0, est=3.7844627171019924, details={'was_impossible': False}), Prediction(uid=585, iid=1262, r_ui=5.0, est=4.388572240461743, details={'was_impossible': False}), Prediction(uid=598, iid=39, r_ui=3.0, est=3.534671267231176, details={'was_impossible': False}), Prediction(uid=324, iid=68319, r_ui=3.0, est=3.221675589519946, details={'was_impossible': False}), Prediction(uid=636, iid=640, r_ui=4.0, est=3.322003689111413, 
details={'was_impossible': False}), Prediction(uid=275, iid=80906, r_ui=5.0, est=4.267002402543054, details={'was_impossible': False}), Prediction(uid=569, iid=1956, r_ui=3.0, est=3.7367159602720745, details={'was_impossible': False}), Prediction(uid=19, iid=366, r_ui=3.0, est=3.346306041837432, details={'was_impossible': False}), Prediction(uid=148, iid=1234, r_ui=4.0, est=4.387754024699763, details={'was_impossible': False}), Prediction(uid=547, iid=53000, r_ui=4.0, est=3.297479593722119, details={'was_impossible': False}), Prediction(uid=262, iid=6016, r_ui=2.0, est=3.2878521219441135, details={'was_impossible': False}), Prediction(uid=426, iid=1580, r_ui=3.0, est=3.5619366156518733, details={'was_impossible': False}), Prediction(uid=101, iid=1270, r_ui=4.5, est=4.084509501588117, details={'was_impossible': False}), Prediction(uid=244, iid=8783, r_ui=2.0, est=3.0156637647201974, details={'was_impossible': False}), Prediction(uid=624, iid=2822, r_ui=2.0, est=2.4314202510956426, details={'was_impossible': False}), Prediction(uid=417, iid=2359, r_ui=4.0, est=4.152230126159849, details={'was_impossible': False}), Prediction(uid=33, iid=1407, r_ui=4.0, est=3.118694681107469, details={'was_impossible': False}), Prediction(uid=379, iid=46578, r_ui=1.5, est=3.9009435243122477, details={'was_impossible': False}), Prediction(uid=452, iid=2269, r_ui=2.0, est=2.6976254090944574, details={'was_impossible': False}), Prediction(uid=611, iid=105504, r_ui=4.0, est=3.9088585060409344, details={'was_impossible': False}), Prediction(uid=300, iid=5445, r_ui=4.5, est=4.365661598650518, details={'was_impossible': False}), Prediction(uid=19, iid=537, r_ui=3.0, est=3.4683067824098894, details={'was_impossible': False}), Prediction(uid=514, iid=653, r_ui=4.0, est=2.6847555332274897, details={'was_impossible': False}), Prediction(uid=564, iid=2716, r_ui=3.0, est=4.250017732627699, details={'was_impossible': False}), Prediction(uid=388, iid=1266, r_ui=5.0, est=4.033358592838185, 
details={'was_impossible': False}), Prediction(uid=472, iid=3811, r_ui=4.0, est=3.81921579998523, details={'was_impossible': False}), Prediction(uid=49, iid=2414, r_ui=3.0, est=3.1654965097908945, details={'was_impossible': False}), Prediction(uid=466, iid=2571, r_ui=2.0, est=4.23282069556359, details={'was_impossible': False}), Prediction(uid=468, iid=3988, r_ui=1.5, est=2.461836522894693, details={'was_impossible': False}), Prediction(uid=363, iid=480, r_ui=4.0, est=3.881963202286653, details={'was_impossible': False}), Prediction(uid=252, iid=104, r_ui=3.0, est=3.6356393515395204, details={'was_impossible': False}), Prediction(uid=128, iid=2888, r_ui=5.0, est=3.160058967030277, details={'was_impossible': False}), Prediction(uid=312, iid=1344, r_ui=4.0, est=3.406794398605453, details={'was_impossible': False}), Prediction(uid=545, iid=515, r_ui=5.0, est=4.203117421757151, details={'was_impossible': False}), Prediction(uid=15, iid=79293, r_ui=1.5, est=2.4792467477110267, details={'was_impossible': False}), Prediction(uid=599, iid=7316, r_ui=4.0, est=3.3374342454247246, details={'was_impossible': False}), Prediction(uid=442, iid=231, r_ui=4.5, est=3.7427716724030184, details={'was_impossible': False}), Prediction(uid=561, iid=41566, r_ui=4.0, est=3.915244543664606, details={'was_impossible': False}), Prediction(uid=77, iid=741, r_ui=4.0, est=3.8572956782116807, details={'was_impossible': False}), Prediction(uid=457, iid=253, r_ui=3.0, est=2.212334376873735, details={'was_impossible': False}), Prediction(uid=592, iid=260, r_ui=5.0, est=3.988321931869019, details={'was_impossible': False}), Prediction(uid=441, iid=1208, r_ui=4.0, est=3.8559677904768797, details={'was_impossible': False}), Prediction(uid=547, iid=98473, r_ui=3.5, est=3.3517913588846695, details={'was_impossible': False}), Prediction(uid=547, iid=307, r_ui=3.0, est=3.5712226004928005, details={'was_impossible': False}), Prediction(uid=297, iid=3421, r_ui=3.0, est=3.616649470377634, 
details={'was_impossible': False}), Prediction(uid=287, iid=8361, r_ui=4.5, est=4.308125821767803, details={'was_impossible': False}), Prediction(uid=245, iid=337, r_ui=3.0, est=3.7532701848922922, details={'was_impossible': False}), Prediction(uid=99, iid=1920, r_ui=3.0, est=2.266314607448167, details={'was_impossible': False}), Prediction(uid=384, iid=1562, r_ui=3.0, est=2.433371338455836, details={'was_impossible': False}), Prediction(uid=338, iid=225, r_ui=3.0, est=3.204552053179155, details={'was_impossible': False}), Prediction(uid=17, iid=1732, r_ui=4.0, est=4.098704274947308, details={'was_impossible': False}), Prediction(uid=194, iid=590, r_ui=4.0, est=3.7736356869152137, details={'was_impossible': False}), Prediction(uid=580, iid=6863, r_ui=3.0, est=3.573765641594622, details={'was_impossible': False}), Prediction(uid=388, iid=2420, r_ui=4.0, est=3.2725227316472347, details={'was_impossible': False}), Prediction(uid=334, iid=85342, r_ui=4.0, est=3.494043346274895, details={'was_impossible': False}), Prediction(uid=353, iid=586, r_ui=2.5, est=2.511774606473722, details={'was_impossible': False}), Prediction(uid=358, iid=1674, r_ui=4.0, est=3.827557776909874, details={'was_impossible': False}), Prediction(uid=103, iid=7438, r_ui=5.0, est=3.584123879102968, details={'was_impossible': False}), Prediction(uid=110, iid=719, r_ui=5.0, est=3.519642512219128, details={'was_impossible': False}), Prediction(uid=141, iid=3513, r_ui=2.0, est=3.203257530611481, details={'was_impossible': False}), Prediction(uid=564, iid=149, r_ui=5.0, est=3.522980918471515, details={'was_impossible': False}), Prediction(uid=409, iid=290, r_ui=5.0, est=4.225548798361855, details={'was_impossible': False}), Prediction(uid=468, iid=6408, r_ui=3.0, est=2.870895378361822, details={'was_impossible': False}), Prediction(uid=468, iid=2455, r_ui=2.5, est=2.9650243983114875, details={'was_impossible': False}), Prediction(uid=261, iid=1584, r_ui=5.0, est=3.988830151952478, 
details={'was_impossible': False}), Prediction(uid=333, iid=2918, r_ui=4.0, est=4.594710880517065, details={'was_impossible': False}), Prediction(uid=299, iid=6162, r_ui=1.5, est=4.051075034500921, details={'was_impossible': False}), Prediction(uid=297, iid=1232, r_ui=5.0, est=3.556411661054438, details={'was_impossible': False}), Prediction(uid=388, iid=62, r_ui=3.0, est=3.131624137553683, details={'was_impossible': False}), Prediction(uid=15, iid=4225, r_ui=1.0, est=2.7133672579574193, details={'was_impossible': False}), Prediction(uid=463, iid=4973, r_ui=4.0, est=3.672587563654358, details={'was_impossible': False}), Prediction(uid=433, iid=34162, r_ui=3.0, est=3.5828780331384738, details={'was_impossible': False}), Prediction(uid=214, iid=2692, r_ui=5.0, est=4.331774531309706, details={'was_impossible': False}), Prediction(uid=205, iid=10, r_ui=3.5, est=3.4513989597018404, details={'was_impossible': False}), Prediction(uid=509, iid=3556, r_ui=3.0, est=3.6077132594880075, details={'was_impossible': False}), Prediction(uid=26, iid=4027, r_ui=3.0, est=3.8146985717157165, details={'was_impossible': False}), Prediction(uid=165, iid=8372, r_ui=1.0, est=2.4903050936145372, details={'was_impossible': False}), Prediction(uid=88, iid=509, r_ui=5.0, est=3.166439227710828, details={'was_impossible': False}), Prediction(uid=365, iid=3753, r_ui=4.0, est=3.6132940323702134, details={'was_impossible': False}), Prediction(uid=232, iid=1259, r_ui=4.0, est=4.224999838173461, details={'was_impossible': False}), Prediction(uid=501, iid=111, r_ui=0.5, est=4.10319756865551, details={'was_impossible': False}), Prediction(uid=365, iid=4995, r_ui=5.0, est=4.278972353999045, details={'was_impossible': False}), Prediction(uid=152, iid=36529, r_ui=3.5, est=3.465860519460434, details={'was_impossible': False}), Prediction(uid=514, iid=288, r_ui=5.0, est=3.4746992633280582, details={'was_impossible': False}), Prediction(uid=275, iid=102194, r_ui=5.0, est=4.2391584155810955, 
details={'was_impossible': False}), Prediction(uid=165, iid=1500, r_ui=4.0, est=3.4438187571312318, details={'was_impossible': False}), Prediction(uid=428, iid=10, r_ui=2.5, est=3.1499543298752277, details={'was_impossible': False}), Prediction(uid=121, iid=587, r_ui=3.0, est=3.524645407776588, details={'was_impossible': False}), Prediction(uid=509, iid=1836, r_ui=3.0, est=2.878843666137257, details={'was_impossible': False}), Prediction(uid=547, iid=7415, r_ui=2.5, est=3.3517913588846695, details={'was_impossible': False}), Prediction(uid=432, iid=32587, r_ui=5.0, est=4.574035391814301, details={'was_impossible': False}), Prediction(uid=483, iid=3527, r_ui=5.0, est=2.9813588150946693, details={'was_impossible': False}), Prediction(uid=177, iid=1831, r_ui=5.0, est=3.0267937124155013, details={'was_impossible': False}), Prediction(uid=299, iid=3160, r_ui=3.5, est=4.047613898317044, details={'was_impossible': False}), Prediction(uid=313, iid=367, r_ui=3.5, est=3.1404840798527878, details={'was_impossible': False}), Prediction(uid=654, iid=588, r_ui=4.5, est=4.122087541688773, details={'was_impossible': False}), Prediction(uid=575, iid=1250, r_ui=3.0, est=3.7149110427778806, details={'was_impossible': False}), Prediction(uid=281, iid=26131, r_ui=3.5, est=4.04593102316094, details={'was_impossible': False}), Prediction(uid=564, iid=211, r_ui=3.0, est=3.7022931695681236, details={'was_impossible': False}), Prediction(uid=624, iid=37211, r_ui=3.0, est=3.109167210539951, details={'was_impossible': False}), Prediction(uid=461, iid=3452, r_ui=2.5, est=2.7453839127766613, details={'was_impossible': False}), Prediction(uid=542, iid=60069, r_ui=4.5, est=3.949155271666814, details={'was_impossible': False}), Prediction(uid=452, iid=3088, r_ui=4.0, est=3.825775800874759, details={'was_impossible': False}), Prediction(uid=621, iid=1136, r_ui=4.5, est=4.311319962824862, details={'was_impossible': False}), Prediction(uid=379, iid=30707, r_ui=3.5, est=3.8257839011932355, 
details={'was_impossible': False}), Prediction(uid=475, iid=48385, r_ui=3.0, est=2.6642287241870584, details={'was_impossible': False}), Prediction(uid=268, iid=34, r_ui=4.5, est=3.75098777151399, details={'was_impossible': False}), Prediction(uid=295, iid=924, r_ui=4.0, est=4.098464983127802, details={'was_impossible': False}), Prediction(uid=96, iid=7587, r_ui=4.0, est=3.818811679194871, details={'was_impossible': False}), Prediction(uid=621, iid=2407, r_ui=2.5, est=3.555829693294888, details={'was_impossible': False}), Prediction(uid=346, iid=344, r_ui=3.5, est=3.2695836395273234, details={'was_impossible': False}), Prediction(uid=480, iid=589, r_ui=5.0, est=4.074604108191192, details={'was_impossible': False}), Prediction(uid=97, iid=5902, r_ui=2.5, est=2.983406111284946, details={'was_impossible': False}), Prediction(uid=564, iid=1297, r_ui=4.0, est=3.8304150374165284, details={'was_impossible': False}), Prediction(uid=405, iid=1958, r_ui=4.0, est=4.059267620941119, details={'was_impossible': False}), Prediction(uid=624, iid=129659, r_ui=3.5, est=2.9299471488604962, details={'was_impossible': False}), Prediction(uid=483, iid=39183, r_ui=2.0, est=3.442722416613096, details={'was_impossible': False}), Prediction(uid=417, iid=1210, r_ui=4.0, est=4.023917520800714, details={'was_impossible': False}), Prediction(uid=324, iid=52435, r_ui=4.5, est=3.76881597597719, details={'was_impossible': False}), Prediction(uid=530, iid=647, r_ui=4.0, est=3.874801908050126, details={'was_impossible': False}), Prediction(uid=412, iid=2396, r_ui=3.0, est=3.06969555083731, details={'was_impossible': False}), Prediction(uid=3, iid=866, r_ui=3.0, est=3.620448091260451, details={'was_impossible': False}), Prediction(uid=588, iid=318, r_ui=5.0, est=4.377785068753205, details={'was_impossible': False}), Prediction(uid=186, iid=2571, r_ui=5.0, est=4.5206825994638695, details={'was_impossible': False}), Prediction(uid=447, iid=410, r_ui=3.0, est=2.799664851825488, 
details={'was_impossible': False}), Prediction(uid=59, iid=2997, r_ui=1.5, est=3.4818613205213653, details={'was_impossible': False}), Prediction(uid=641, iid=514, r_ui=3.0, est=4.252381854188857, details={'was_impossible': False}), Prediction(uid=472, iid=1796, r_ui=2.0, est=3.6649533760607853, details={'was_impossible': False}), Prediction(uid=564, iid=580, r_ui=4.0, est=3.3020935549589505, details={'was_impossible': False}), Prediction(uid=651, iid=381, r_ui=3.0, est=3.4111046921998436, details={'was_impossible': False}), Prediction(uid=251, iid=47099, r_ui=4.5, est=4.33706285216852, details={'was_impossible': False}), Prediction(uid=30, iid=2490, r_ui=5.0, est=3.9276239968655715, details={'was_impossible': False}), Prediction(uid=46, iid=2379, r_ui=5.0, est=4.116131330085319, details={'was_impossible': False}), Prediction(uid=594, iid=2706, r_ui=3.0, est=3.437592043450384, details={'was_impossible': False}), Prediction(uid=505, iid=2750, r_ui=3.5, est=3.132065250218392, details={'was_impossible': False}), Prediction(uid=73, iid=2692, r_ui=4.5, est=4.304738769064051, details={'was_impossible': False}), Prediction(uid=597, iid=1909, r_ui=4.0, est=3.703083193899477, details={'was_impossible': False}), Prediction(uid=522, iid=3409, r_ui=2.5, est=2.9184160471757603, details={'was_impossible': False}), Prediction(uid=431, iid=3481, r_ui=5.0, est=4.198167018839633, details={'was_impossible': False}), Prediction(uid=88, iid=357, r_ui=5.0, est=3.2764052836102584, details={'was_impossible': False}), Prediction(uid=480, iid=27867, r_ui=4.0, est=3.874174112781681, details={'was_impossible': False}), Prediction(uid=358, iid=11, r_ui=3.0, est=2.822939225069593, details={'was_impossible': False}), Prediction(uid=285, iid=1213, r_ui=5.0, est=3.8934261132288794, details={'was_impossible': False}), Prediction(uid=380, iid=72489, r_ui=4.0, est=3.0652183501967216, details={'was_impossible': False}), Prediction(uid=402, iid=5066, r_ui=4.0, est=3.5276720976066853, 
details={'was_impossible': False}), Prediction(uid=544, iid=55820, r_ui=4.5, est=4.826751716721645, details={'was_impossible': False}), Prediction(uid=264, iid=1244, r_ui=5.0, est=3.893145937013814, details={'was_impossible': False}), Prediction(uid=26, iid=49272, r_ui=4.0, est=3.462614001754889, details={'was_impossible': False}), Prediction(uid=532, iid=466, r_ui=4.0, est=3.2341780047257207, details={'was_impossible': False}), Prediction(uid=30, iid=1584, r_ui=5.0, est=3.760639184461441, details={'was_impossible': False}), Prediction(uid=501, iid=74458, r_ui=5.0, est=3.9475760232003445, details={'was_impossible': False}), Prediction(uid=479, iid=8464, r_ui=4.0, est=3.907024751683165, details={'was_impossible': False}), Prediction(uid=574, iid=1370, r_ui=3.0, est=3.4864276526412192, details={'was_impossible': False}), Prediction(uid=472, iid=1041, r_ui=4.0, est=3.630696964783949, details={'was_impossible': False}), Prediction(uid=505, iid=1265, r_ui=3.0, est=3.175580000036824, details={'was_impossible': False}), Prediction(uid=393, iid=753, r_ui=3.5, est=3.154471839609262, details={'was_impossible': False}), Prediction(uid=114, iid=153, r_ui=3.0, est=3.1174676780313826, details={'was_impossible': False}), Prediction(uid=472, iid=390, r_ui=5.0, est=3.7791579379777502, details={'was_impossible': False}), Prediction(uid=12, iid=3408, r_ui=4.0, est=2.8277128701550285, details={'was_impossible': False}), Prediction(uid=656, iid=2987, r_ui=4.0, est=4.612663533272809, details={'was_impossible': False}), Prediction(uid=575, iid=2096, r_ui=4.0, est=3.546482610078715, details={'was_impossible': False}), Prediction(uid=428, iid=1244, r_ui=5.0, est=4.106937419650494, details={'was_impossible': False}), Prediction(uid=213, iid=117529, r_ui=4.0, est=2.5514469117251313, details={'was_impossible': False}), Prediction(uid=306, iid=508, r_ui=4.0, est=3.5676360769929727, details={'was_impossible': False}), Prediction(uid=15, iid=5463, r_ui=1.0, est=2.551202816480763, 
details={'was_impossible': False}), Prediction(uid=620, iid=46578, r_ui=4.0, est=2.9155859079372375, details={'was_impossible': False}), Prediction(uid=605, iid=1285, r_ui=2.0, est=3.651551890043022, details={'was_impossible': False}), Prediction(uid=268, iid=85796, r_ui=1.5, est=3.6194441826285813, details={'was_impossible': False}), Prediction(uid=239, iid=39, r_ui=5.0, est=3.796547154849449, details={'was_impossible': False}), Prediction(uid=356, iid=2722, r_ui=3.5, est=2.183576875431101, details={'was_impossible': False}), Prediction(uid=496, iid=338, r_ui=5.0, est=3.5608998940412504, details={'was_impossible': False}), Prediction(uid=432, iid=780, r_ui=5.0, est=3.738442158293248, details={'was_impossible': False}), Prediction(uid=84, iid=4701, r_ui=3.5, est=3.6004341974525307, details={'was_impossible': False}), Prediction(uid=214, iid=3527, r_ui=5.0, est=3.968250921105762, details={'was_impossible': False}), Prediction(uid=457, iid=333, r_ui=2.5, est=2.3355601754832507, details={'was_impossible': False}), Prediction(uid=378, iid=8376, r_ui=3.5, est=3.0089142050333004, details={'was_impossible': False}), Prediction(uid=571, iid=34338, r_ui=4.5, est=3.7263974328145135, details={'was_impossible': False}), Prediction(uid=564, iid=3313, r_ui=4.0, est=3.567357613490083, details={'was_impossible': False}), Prediction(uid=265, iid=2130, r_ui=4.0, est=4.062483122583292, details={'was_impossible': False}), Prediction(uid=463, iid=4641, r_ui=2.0, est=3.2897634957756354, details={'was_impossible': False}), Prediction(uid=82, iid=165, r_ui=3.0, est=3.283419049777408, details={'was_impossible': False}), Prediction(uid=587, iid=1234, r_ui=4.5, est=4.041361615764424, details={'was_impossible': False}), Prediction(uid=586, iid=4995, r_ui=5.0, est=3.48054820797989, details={'was_impossible': False}), Prediction(uid=219, iid=47640, r_ui=5.0, est=3.2057517838254546, details={'was_impossible': False}), Prediction(uid=624, iid=30812, r_ui=4.0, est=2.988595686172072, 
details={'was_impossible': False}), Prediction(uid=461, iid=41566, r_ui=3.0, est=3.0736198110857376, details={'was_impossible': False}), Prediction(uid=427, iid=2501, r_ui=5.0, est=4.415491341194056, details={'was_impossible': False}), Prediction(uid=120, iid=5693, r_ui=2.0, est=3.378708143752717, details={'was_impossible': False}), Prediction(uid=607, iid=7153, r_ui=4.5, est=4.236651202607002, details={'was_impossible': False}), Prediction(uid=309, iid=5349, r_ui=3.5, est=4.0702658546987, details={'was_impossible': False}), Prediction(uid=460, iid=2693, r_ui=4.0, est=3.291417976539507, details={'was_impossible': False}), Prediction(uid=313, iid=6947, r_ui=2.5, est=3.3975729994411283, details={'was_impossible': False}), Prediction(uid=577, iid=367, r_ui=3.0, est=3.7152686125842584, details={'was_impossible': False}), Prediction(uid=332, iid=6333, r_ui=1.5, est=3.5254850882895243, details={'was_impossible': False}), Prediction(uid=547, iid=938, r_ui=4.5, est=3.7176863206338577, details={'was_impossible': False}), Prediction(uid=457, iid=4002, r_ui=3.0, est=2.1182693658921306, details={'was_impossible': False}), Prediction(uid=285, iid=1608, r_ui=4.0, est=3.4404170902445905, details={'was_impossible': False}), Prediction(uid=501, iid=80693, r_ui=4.5, est=3.8631587789806208, details={'was_impossible': False}), Prediction(uid=355, iid=2710, r_ui=4.5, est=2.8632079195467335, details={'was_impossible': False}), Prediction(uid=17, iid=5867, r_ui=4.0, est=3.648478929966894, details={'was_impossible': False}), Prediction(uid=547, iid=25856, r_ui=4.0, est=3.3517913588846695, details={'was_impossible': False}), Prediction(uid=529, iid=8645, r_ui=3.5, est=3.1288893837749217, details={'was_impossible': False}), Prediction(uid=605, iid=224, r_ui=3.0, est=3.259584740472503, details={'was_impossible': False}), Prediction(uid=624, iid=2291, r_ui=3.0, est=3.580916885487311, details={'was_impossible': False}), Prediction(uid=387, iid=2396, r_ui=4.0, est=4.334368380940083, 
details={'was_impossible': False}), Prediction(uid=230, iid=1265, r_ui=4.0, est=4.506582900776778, details={'was_impossible': False}), Prediction(uid=365, iid=72998, r_ui=5.0, est=4.076733166102683, details={'was_impossible': False}), Prediction(uid=88, iid=1258, r_ui=3.0, est=3.631346585780175, details={'was_impossible': False}), Prediction(uid=547, iid=1719, r_ui=4.0, est=4.322010650350625, details={'was_impossible': False}), Prediction(uid=314, iid=50872, r_ui=2.5, est=4.528315442276778, details={'was_impossible': False}), Prediction(uid=105, iid=1962, r_ui=3.0, est=3.5736182337842926, details={'was_impossible': False}), Prediction(uid=91, iid=134368, r_ui=4.0, est=4.087562843485584, details={'was_impossible': False}), Prediction(uid=23, iid=316, r_ui=3.5, est=3.1261775159545713, details={'was_impossible': False}), Prediction(uid=8, iid=5952, r_ui=4.0, est=4.121336659595516, details={'was_impossible': False}), Prediction(uid=537, iid=590, r_ui=5.0, est=3.8880137485622237, details={'was_impossible': False}), Prediction(uid=441, iid=5621, r_ui=1.0, est=3.2774395513725127, details={'was_impossible': False}), Prediction(uid=56, iid=106920, r_ui=5.0, est=4.111856902687226, details={'was_impossible': False}), Prediction(uid=245, iid=527, r_ui=5.0, est=4.9285327011414894, details={'was_impossible': False}), Prediction(uid=195, iid=2423, r_ui=2.0, est=2.9695567975485013, details={'was_impossible': False}), Prediction(uid=271, iid=1291, r_ui=3.0, est=4.258351975234163, details={'was_impossible': False}), Prediction(uid=433, iid=1923, r_ui=3.0, est=3.817119200339067, details={'was_impossible': False}), Prediction(uid=294, iid=5449, r_ui=4.0, est=3.2624474963230394, details={'was_impossible': False}), Prediction(uid=510, iid=753, r_ui=3.0, est=3.1454012341583715, details={'was_impossible': False}), Prediction(uid=299, iid=3386, r_ui=5.0, est=4.3257328457511965, details={'was_impossible': False}), Prediction(uid=624, iid=3933, r_ui=0.5, est=2.9299471488604962, 
details={'was_impossible': False}), Prediction(uid=262, iid=76030, r_ui=2.0, est=2.571609941472188, details={'was_impossible': False}), Prediction(uid=597, iid=1307, r_ui=4.0, est=4.172517762362527, details={'was_impossible': False}), Prediction(uid=577, iid=1126, r_ui=3.5, est=3.908893778456219, details={'was_impossible': False}), Prediction(uid=468, iid=1090, r_ui=3.0, est=3.2405150378569942, details={'was_impossible': False}), Prediction(uid=574, iid=54259, r_ui=3.5, est=3.7191159418927002, details={'was_impossible': False}), Prediction(uid=300, iid=2006, r_ui=3.5, est=3.800190875044944, details={'was_impossible': False}), Prediction(uid=294, iid=2160, r_ui=3.5, est=4.1028357160056945, details={'was_impossible': False}), Prediction(uid=518, iid=1286, r_ui=4.0, est=3.520119371785259, details={'was_impossible': False}), Prediction(uid=324, iid=7502, r_ui=3.0, est=4.169486365613096, details={'was_impossible': False}), Prediction(uid=654, iid=2501, r_ui=4.5, est=4.601074244875153, details={'was_impossible': False}), Prediction(uid=146, iid=2804, r_ui=4.0, est=4.0148537786433725, details={'was_impossible': False}), Prediction(uid=30, iid=3428, r_ui=4.0, est=3.99513523628313, details={'was_impossible': False}), Prediction(uid=105, iid=784, r_ui=2.5, est=2.4084280377285716, details={'was_impossible': False}), Prediction(uid=564, iid=3428, r_ui=1.0, est=3.470387392484876, details={'was_impossible': False}), Prediction(uid=528, iid=33794, r_ui=3.5, est=3.3784136474341504, details={'was_impossible': False}), Prediction(uid=358, iid=1281, r_ui=5.0, est=4.0787821584372, details={'was_impossible': False}), Prediction(uid=481, iid=27611, r_ui=3.5, est=3.8732305989613556, details={'was_impossible': False}), Prediction(uid=489, iid=2942, r_ui=2.0, est=3.515136051254835, details={'was_impossible': False}), Prediction(uid=111, iid=3897, r_ui=3.5, est=3.981010736743363, details={'was_impossible': False}), Prediction(uid=621, iid=2174, r_ui=4.0, est=3.456242049191535, 
details={'was_impossible': False}), Prediction(uid=440, iid=485, r_ui=4.0, est=3.11449530579893, details={'was_impossible': False}), Prediction(uid=23, iid=597, r_ui=3.0, est=3.2223287492714676, details={'was_impossible': False}), Prediction(uid=468, iid=3099, r_ui=2.5, est=2.5840237150594034, details={'was_impossible': False}), Prediction(uid=73, iid=62956, r_ui=3.5, est=3.25808559096678, details={'was_impossible': False}), Prediction(uid=490, iid=648, r_ui=4.0, est=3.5049815243476785, details={'was_impossible': False}), Prediction(uid=287, iid=106011, r_ui=4.5, est=4.43332036960633, details={'was_impossible': False}), Prediction(uid=624, iid=2993, r_ui=4.5, est=2.9019234672103926, details={'was_impossible': False}), Prediction(uid=318, iid=605, r_ui=3.0, est=3.362529014498488, details={'was_impossible': False}), Prediction(uid=452, iid=919, r_ui=3.0, est=4.25823240279082, details={'was_impossible': False}), Prediction(uid=311, iid=1367, r_ui=2.0, est=2.78032087750592, details={'was_impossible': False}), Prediction(uid=44, iid=733, r_ui=4.0, est=3.4829717903508826, details={'was_impossible': False}), Prediction(uid=4, iid=260, r_ui=5.0, est=4.834134569088334, details={'was_impossible': False}), Prediction(uid=311, iid=6233, r_ui=2.5, est=3.0916393656172634, details={'was_impossible': False}), Prediction(uid=564, iid=1801, r_ui=1.0, est=2.7757542245860973, details={'was_impossible': False}), Prediction(uid=193, iid=356, r_ui=4.0, est=4.437067560460212, details={'was_impossible': False}), Prediction(uid=545, iid=1885, r_ui=1.0, est=3.8945903787691587, details={'was_impossible': False}), Prediction(uid=615, iid=5421, r_ui=3.5, est=3.579713312965499, details={'was_impossible': False}), Prediction(uid=232, iid=1394, r_ui=4.0, est=3.9878676787492617, details={'was_impossible': False}), Prediction(uid=384, iid=4228, r_ui=3.0, est=2.9022954495269238, details={'was_impossible': False}), Prediction(uid=130, iid=785, r_ui=2.5, est=2.7112361779461103, 
details={'was_impossible': False}), Prediction(uid=431, iid=1719, r_ui=4.0, est=4.0930023847550325, details={'was_impossible': False}), Prediction(uid=23, iid=5956, r_ui=3.5, est=3.387130011803874, details={'was_impossible': False}), Prediction(uid=242, iid=3354, r_ui=2.0, est=3.450870498708784, details={'was_impossible': False}), Prediction(uid=220, iid=1517, r_ui=4.0, est=3.2600073163435055, details={'was_impossible': False}), Prediction(uid=268, iid=1, r_ui=5.0, est=4.023954809504452, details={'was_impossible': False}), Prediction(uid=592, iid=1372, r_ui=4.0, est=3.792934437183616, details={'was_impossible': False}), Prediction(uid=268, iid=1968, r_ui=5.0, est=4.030047442706778, details={'was_impossible': False}), Prediction(uid=150, iid=1036, r_ui=4.0, est=4.013450149816061, details={'was_impossible': False}), Prediction(uid=463, iid=2054, r_ui=3.0, est=2.496810553810907, details={'was_impossible': False}), Prediction(uid=598, iid=3751, r_ui=4.0, est=2.902740049872007, details={'was_impossible': False}), Prediction(uid=380, iid=3901, r_ui=2.0, est=3.27104289010266, details={'was_impossible': False}), Prediction(uid=287, iid=5480, r_ui=3.5, est=4.561807694982911, details={'was_impossible': False}), Prediction(uid=461, iid=10, r_ui=2.0, est=2.3887926032674685, details={'was_impossible': False}), Prediction(uid=452, iid=3967, r_ui=5.0, est=3.0566407735738705, details={'was_impossible': False}), Prediction(uid=15, iid=1682, r_ui=2.0, est=3.678765852364079, details={'was_impossible': False}), Prediction(uid=624, iid=5438, r_ui=2.0, est=2.722902950851413, details={'was_impossible': False}), Prediction(uid=386, iid=1100, r_ui=2.0, est=2.357515801340528, details={'was_impossible': False}), Prediction(uid=612, iid=2502, r_ui=3.0, est=3.6639822191514626, details={'was_impossible': False}), Prediction(uid=658, iid=2987, r_ui=5.0, est=4.23397153724175, details={'was_impossible': False}), Prediction(uid=299, iid=60384, r_ui=5.0, est=4.051075034500921, 
details={'was_impossible': False}), Prediction(uid=530, iid=527, r_ui=5.0, est=4.561743563391367, details={'was_impossible': False}), Prediction(uid=104, iid=2571, r_ui=3.5, est=4.512795875089235, details={'was_impossible': False}), Prediction(uid=624, iid=7076, r_ui=4.0, est=2.9299471488604962, details={'was_impossible': False}), Prediction(uid=41, iid=1356, r_ui=4.0, est=4.274398893143793, details={'was_impossible': False}), Prediction(uid=447, iid=153, r_ui=3.0, est=2.5925481677900324, details={'was_impossible': False}), Prediction(uid=105, iid=1278, r_ui=4.0, est=3.8423938343041364, details={'was_impossible': False}), Prediction(uid=564, iid=110, r_ui=1.0, est=3.314573308855339, details={'was_impossible': False}), Prediction(uid=564, iid=2084, r_ui=1.0, est=3.78960370032024, details={'was_impossible': False}), Prediction(uid=119, iid=3011, r_ui=4.0, est=3.5083934421060183, details={'was_impossible': False}), Prediction(uid=457, iid=107348, r_ui=1.5, est=2.3573667071356725, details={'was_impossible': False}), Prediction(uid=536, iid=47, r_ui=4.0, est=4.49303909628177, details={'was_impossible': False}), Prediction(uid=495, iid=589, r_ui=4.0, est=4.285447740176559, details={'was_impossible': False}), Prediction(uid=564, iid=95, r_ui=3.0, est=3.1529872306884643, details={'was_impossible': False}), Prediction(uid=195, iid=2918, r_ui=1.0, est=3.7186208761305215, details={'was_impossible': False}), Prediction(uid=199, iid=78544, r_ui=4.0, est=3.5068019333174423, details={'was_impossible': False}), Prediction(uid=159, iid=111, r_ui=4.0, est=4.038024295045636, details={'was_impossible': False}), Prediction(uid=73, iid=114060, r_ui=4.0, est=3.3428379751197426, details={'was_impossible': False}), Prediction(uid=353, iid=21, r_ui=3.0, est=2.6434949489444195, details={'was_impossible': False}), Prediction(uid=529, iid=2057, r_ui=4.0, est=3.0472029631471877, details={'was_impossible': False}), Prediction(uid=212, iid=1028, r_ui=3.0, est=3.5615171287108156, 
details={'was_impossible': False}), Prediction(uid=551, iid=1033, r_ui=4.0, est=3.385951903886168, details={'was_impossible': False}), Prediction(uid=624, iid=2002, r_ui=3.0, est=2.517931304975117, details={'was_impossible': False}), Prediction(uid=182, iid=60, r_ui=4.0, est=3.2605871906416004, details={'was_impossible': False}), Prediction(uid=654, iid=1371, r_ui=4.0, est=3.757858756322069, details={'was_impossible': False}), Prediction(uid=265, iid=2132, r_ui=5.0, est=4.1970120229109495, details={'was_impossible': False}), Prediction(uid=523, iid=648, r_ui=4.5, est=3.9301896321309013, details={'was_impossible': False}), Prediction(uid=575, iid=3330, r_ui=3.0, est=3.411915683218362, details={'was_impossible': False}), Prediction(uid=337, iid=1969, r_ui=2.0, est=2.759508499082403, details={'was_impossible': False}), Prediction(uid=157, iid=59814, r_ui=4.5, est=3.332247355419215, details={'was_impossible': False}), Prediction(uid=255, iid=7265, r_ui=5.0, est=3.9987700181124612, details={'was_impossible': False}), Prediction(uid=253, iid=333, r_ui=4.0, est=3.561286862179946, details={'was_impossible': False}), Prediction(uid=452, iid=1962, r_ui=2.0, est=3.5025167932270587, details={'was_impossible': False}), Prediction(uid=509, iid=47, r_ui=4.0, est=3.7668287575342996, details={'was_impossible': False}), Prediction(uid=457, iid=1517, r_ui=3.0, est=1.9727989932924646, details={'was_impossible': False}), Prediction(uid=306, iid=2013, r_ui=4.0, est=3.3818797391079536, details={'was_impossible': False}), Prediction(uid=283, iid=41571, r_ui=3.0, est=3.443994093487323, details={'was_impossible': False}), Prediction(uid=457, iid=1240, r_ui=3.0, est=2.8217766165903653, details={'was_impossible': False}), Prediction(uid=30, iid=1729, r_ui=5.0, est=3.6248674182092784, details={'was_impossible': False}), Prediction(uid=665, iid=3263, r_ui=4.0, est=3.5642720783335546, details={'was_impossible': False}), Prediction(uid=140, iid=72641, r_ui=3.0, est=3.5735833231739966, 
details={'was_impossible': False}), Prediction(uid=624, iid=103384, r_ui=1.0, est=2.9127825539590537, details={'was_impossible': False}), Prediction(uid=133, iid=7147, r_ui=0.5, est=2.7562888058957618, details={'was_impossible': False}), Prediction(uid=15, iid=61323, r_ui=1.0, est=2.4782415792196684, details={'was_impossible': False}), Prediction(uid=23, iid=5992, r_ui=5.0, est=3.6679107597621714, details={'was_impossible': False}), Prediction(uid=380, iid=87306, r_ui=2.5, est=3.28509361206476, details={'was_impossible': False}), Prediction(uid=88, iid=55269, r_ui=1.5, est=3.294383178727048, details={'was_impossible': False}), Prediction(uid=363, iid=1375, r_ui=3.0, est=3.7266733818391993, details={'was_impossible': False}), Prediction(uid=599, iid=56941, r_ui=4.0, est=3.9697701217960915, details={'was_impossible': False}), Prediction(uid=529, iid=3430, r_ui=1.0, est=2.8508422535429943, details={'was_impossible': False}), Prediction(uid=70, iid=780, r_ui=4.0, est=4.553974782294869, details={'was_impossible': False}), Prediction(uid=384, iid=47997, r_ui=3.0, est=3.042753175887506, details={'was_impossible': False}), Prediction(uid=152, iid=30749, r_ui=3.5, est=3.738577248695412, details={'was_impossible': False}), Prediction(uid=654, iid=2762, r_ui=5.0, est=4.479230322900117, details={'was_impossible': False}), Prediction(uid=205, iid=57528, r_ui=4.0, est=3.250629626645109, details={'was_impossible': False}), Prediction(uid=564, iid=3691, r_ui=3.0, est=3.636160844775937, details={'was_impossible': False}), Prediction(uid=461, iid=1792, r_ui=1.0, est=2.962712303701771, details={'was_impossible': False}), Prediction(uid=303, iid=2858, r_ui=4.0, est=4.120591886994359, details={'was_impossible': False}), Prediction(uid=30, iid=4482, r_ui=4.0, est=3.3474412514641485, details={'was_impossible': False}), Prediction(uid=280, iid=17, r_ui=5.0, est=4.218694811402652, details={'was_impossible': False}), Prediction(uid=564, iid=3506, r_ui=4.0, est=3.097909390692727, 
details={'was_impossible': False}), Prediction(uid=562, iid=282, r_ui=3.0, est=3.378343785039321, details={'was_impossible': False}), Prediction(uid=385, iid=160, r_ui=2.0, est=2.506479077929772, details={'was_impossible': False}), Prediction(uid=387, iid=2858, r_ui=5.0, est=4.76080798235652, details={'was_impossible': False}), Prediction(uid=531, iid=3448, r_ui=3.5, est=3.005403952594523, details={'was_impossible': False}), Prediction(uid=384, iid=1732, r_ui=3.5, est=3.646343361794225, details={'was_impossible': False}), Prediction(uid=73, iid=122886, r_ui=4.5, est=3.5324119660417184, details={'was_impossible': False}), Prediction(uid=418, iid=158, r_ui=4.0, est=3.14269297939323, details={'was_impossible': False}), Prediction(uid=251, iid=8371, r_ui=5.0, est=3.979869068811439, details={'was_impossible': False}), Prediction(uid=547, iid=30820, r_ui=2.5, est=3.3634361175045853, details={'was_impossible': False}), Prediction(uid=274, iid=2054, r_ui=3.0, est=2.7759657035470635, details={'was_impossible': False}), Prediction(uid=83, iid=141, r_ui=4.0, est=3.7994930015333033, details={'was_impossible': False}), Prediction(uid=547, iid=3199, r_ui=2.0, est=3.908797635106673, details={'was_impossible': False}), Prediction(uid=597, iid=293, r_ui=4.0, est=4.640410444547342, details={'was_impossible': False}), Prediction(uid=330, iid=1459, r_ui=3.0, est=2.890532380396153, details={'was_impossible': False}), Prediction(uid=468, iid=2010, r_ui=4.0, est=3.2999456967810716, details={'was_impossible': False}), Prediction(uid=72, iid=54286, r_ui=3.0, est=3.347588510434461, details={'was_impossible': False}), Prediction(uid=598, iid=2394, r_ui=4.0, est=3.343690823394277, details={'was_impossible': False}), Prediction(uid=178, iid=858, r_ui=4.0, est=4.28074021518286, details={'was_impossible': False}), Prediction(uid=132, iid=356, r_ui=4.5, est=4.533290717489962, details={'was_impossible': False}), Prediction(uid=384, iid=1466, r_ui=3.0, est=3.4965033210785936, 
details={'was_impossible': False}), Prediction(uid=574, iid=2380, r_ui=1.0, est=2.231833519183933, details={'was_impossible': False}), Prediction(uid=547, iid=3928, r_ui=2.5, est=3.3381118227059665, details={'was_impossible': False}), Prediction(uid=19, iid=1193, r_ui=4.0, est=4.221187211411299, details={'was_impossible': False}), Prediction(uid=440, iid=353, r_ui=3.0, est=3.773282487476707, details={'was_impossible': False}), Prediction(uid=575, iid=3361, r_ui=3.0, est=4.022617209574774, details={'was_impossible': False}), Prediction(uid=516, iid=9, r_ui=3.0, est=3.0325540214792976, details={'was_impossible': False}), Prediction(uid=418, iid=1407, r_ui=3.5, est=3.847758255679286, details={'was_impossible': False}), Prediction(uid=462, iid=3261, r_ui=3.5, est=3.566427620379165, details={'was_impossible': False}), Prediction(uid=531, iid=1148, r_ui=4.0, est=3.455105039146198, details={'was_impossible': False}), Prediction(uid=431, iid=2502, r_ui=2.0, est=4.042867519569799, details={'was_impossible': False}), Prediction(uid=213, iid=32296, r_ui=2.0, est=2.597859134341426, details={'was_impossible': False}), Prediction(uid=363, iid=2997, r_ui=5.0, est=4.322343906441752, details={'was_impossible': False}), Prediction(uid=643, iid=204, r_ui=2.0, est=3.4035072560166872, details={'was_impossible': False}), Prediction(uid=157, iid=8361, r_ui=2.0, est=3.150913107995968, details={'was_impossible': False}), Prediction(uid=646, iid=3157, r_ui=4.0, est=4.011503734711411, details={'was_impossible': False}), Prediction(uid=385, iid=597, r_ui=3.0, est=3.3855191944010725, details={'was_impossible': False}), Prediction(uid=262, iid=2918, r_ui=4.0, est=2.716104065769068, details={'was_impossible': False}), Prediction(uid=303, iid=520, r_ui=3.0, est=2.9885689600348755, details={'was_impossible': False}), Prediction(uid=86, iid=235, r_ui=4.0, est=3.752639470297213, details={'was_impossible': False}), Prediction(uid=558, iid=3983, r_ui=5.0, est=4.621545323648904, 
details={'was_impossible': False}), Prediction(uid=138, iid=103249, r_ui=2.0, est=2.782438140692895, details={'was_impossible': False}), Prediction(uid=585, iid=1198, r_ui=5.0, est=4.309243088511834, details={'was_impossible': False}), Prediction(uid=400, iid=225, r_ui=3.0, est=3.397070136518846, details={'was_impossible': False}), Prediction(uid=48, iid=56757, r_ui=3.0, est=3.3577656837138026, details={'was_impossible': False}), Prediction(uid=4, iid=2248, r_ui=4.0, est=4.6001698271979565, details={'was_impossible': False}), Prediction(uid=407, iid=25, r_ui=4.0, est=4.124261620576355, details={'was_impossible': False}), Prediction(uid=547, iid=6987, r_ui=4.5, est=3.3350504945287263, details={'was_impossible': False}), Prediction(uid=396, iid=382, r_ui=3.0, est=2.9625096447161354, details={'was_impossible': False}), Prediction(uid=187, iid=48385, r_ui=4.0, est=3.598141641427074, details={'was_impossible': False}), Prediction(uid=580, iid=2797, r_ui=3.5, est=3.380437955582977, details={'was_impossible': False}), Prediction(uid=639, iid=288, r_ui=3.0, est=3.1368569599191036, details={'was_impossible': False}), Prediction(uid=232, iid=2321, r_ui=5.0, est=3.7024961455418826, details={'was_impossible': False}), Prediction(uid=642, iid=800, r_ui=4.0, est=4.275269639870481, details={'was_impossible': False}), Prediction(uid=580, iid=1092, r_ui=3.5, est=3.144657202525888, details={'was_impossible': False}), Prediction(uid=547, iid=3420, r_ui=1.0, est=3.559539705843129, details={'was_impossible': False}), Prediction(uid=384, iid=778, r_ui=3.0, est=3.9871608037796897, details={'was_impossible': False}), Prediction(uid=73, iid=49526, r_ui=2.5, est=3.4507038300616157, details={'was_impossible': False}), Prediction(uid=534, iid=2840, r_ui=4.0, est=3.3855272505744103, details={'was_impossible': False}), Prediction(uid=373, iid=2463, r_ui=3.0, est=3.7510548260654324, details={'was_impossible': False}), Prediction(uid=550, iid=1603, r_ui=3.0, est=2.8588356589384714, 
details={'was_impossible': False}), Prediction(uid=75, iid=800, r_ui=3.0, est=3.7283458850349165, details={'was_impossible': False}), Prediction(uid=486, iid=109487, r_ui=2.5, est=4.047099252601667, details={'was_impossible': False}), Prediction(uid=326, iid=1197, r_ui=5.0, est=3.87048185983661, details={'was_impossible': False}), Prediction(uid=23, iid=1917, r_ui=2.0, est=2.7461720787744004, details={'was_impossible': False}), Prediction(uid=18, iid=100, r_ui=4.0, est=3.2197746222884174, details={'was_impossible': False}), Prediction(uid=15, iid=1635, r_ui=4.0, est=2.77341704733748, details={'was_impossible': False}), Prediction(uid=242, iid=3106, r_ui=4.0, est=4.153893589463109, details={'was_impossible': False}), Prediction(uid=306, iid=527, r_ui=5.0, est=4.2131834595166895, details={'was_impossible': False}), Prediction(uid=389, iid=253, r_ui=2.0, est=3.7243584903295357, details={'was_impossible': False}), Prediction(uid=254, iid=597, r_ui=3.0, est=3.500019414326104, details={'was_impossible': False}), Prediction(uid=83, iid=3198, r_ui=4.0, est=4.085210293864621, details={'was_impossible': False}), Prediction(uid=514, iid=1073, r_ui=4.0, est=3.9784022684389493, details={'was_impossible': False}), Prediction(uid=214, iid=1210, r_ui=4.0, est=4.512817376513809, details={'was_impossible': False}), Prediction(uid=105, iid=370, r_ui=3.0, est=3.013192333820518, details={'was_impossible': False}), Prediction(uid=113, iid=20, r_ui=5.0, est=3.433776023729496, details={'was_impossible': False}), Prediction(uid=213, iid=72701, r_ui=2.5, est=2.7594970740528617, details={'was_impossible': False}), Prediction(uid=161, iid=329, r_ui=3.0, est=3.4934535976830343, details={'was_impossible': False}), Prediction(uid=180, iid=1193, r_ui=4.0, est=4.247954409869766, details={'was_impossible': False}), Prediction(uid=333, iid=8529, r_ui=5.0, est=4.001170903815622, details={'was_impossible': False}), Prediction(uid=134, iid=59615, r_ui=4.0, est=3.428064262553417, 
details={'was_impossible': False}), Prediction(uid=353, iid=165, r_ui=3.5, est=2.534845327967523, details={'was_impossible': False}), Prediction(uid=605, iid=3793, r_ui=2.0, est=3.1540715084011, details={'was_impossible': False}), Prediction(uid=569, iid=2000, r_ui=4.0, est=3.734960163418113, details={'was_impossible': False}), Prediction(uid=92, iid=539, r_ui=3.0, est=3.287261210163128, details={'was_impossible': False}), Prediction(uid=547, iid=5434, r_ui=3.5, est=3.842270966052038, details={'was_impossible': False}), Prediction(uid=391, iid=280, r_ui=4.0, est=3.896753283303343, details={'was_impossible': False}), Prediction(uid=388, iid=70, r_ui=3.0, est=3.086780976809371, details={'was_impossible': False}), Prediction(uid=157, iid=367, r_ui=1.0, est=3.0238428645304065, details={'was_impossible': False}), Prediction(uid=468, iid=70927, r_ui=3.0, est=2.870895378361822, details={'was_impossible': False}), Prediction(uid=468, iid=80599, r_ui=4.5, est=2.870895378361822, details={'was_impossible': False}), Prediction(uid=405, iid=4896, r_ui=4.5, est=3.929279802034958, details={'was_impossible': False}), Prediction(uid=358, iid=829, r_ui=1.0, est=2.3794767806253145, details={'was_impossible': False}), Prediction(uid=195, iid=1946, r_ui=3.0, est=3.3294253432456307, details={'was_impossible': False}), Prediction(uid=624, iid=4881, r_ui=3.0, est=3.481778158258484, details={'was_impossible': False}), Prediction(uid=295, iid=2699, r_ui=3.5, est=3.6116948964085704, details={'was_impossible': False}), Prediction(uid=624, iid=1032, r_ui=4.0, est=3.6768718102050455, details={'was_impossible': False}), Prediction(uid=316, iid=3967, r_ui=3.5, est=3.8187193146643827, details={'was_impossible': False}), Prediction(uid=373, iid=2073, r_ui=3.0, est=3.737811988056938, details={'was_impossible': False}), Prediction(uid=243, iid=8368, r_ui=4.0, est=3.52441697000092, details={'was_impossible': False}), Prediction(uid=624, iid=134881, r_ui=4.0, est=3.2349864188248785, 
details={'was_impossible': False}), Prediction(uid=105, iid=5060, r_ui=4.0, est=3.4299755877411515, details={'was_impossible': False}), Prediction(uid=416, iid=587, r_ui=3.0, est=3.2070186131864116, details={'was_impossible': False}), Prediction(uid=518, iid=1526, r_ui=3.0, est=2.9977115050628127, details={'was_impossible': False}), Prediction(uid=15, iid=1729, r_ui=1.0, est=2.8627611118461442, details={'was_impossible': False}), Prediction(uid=468, iid=947, r_ui=4.5, est=2.7228921064550495, details={'was_impossible': False}), Prediction(uid=373, iid=1784, r_ui=4.0, est=3.679630216785331, details={'was_impossible': False}), Prediction(uid=585, iid=1220, r_ui=4.0, est=4.2445544338940255, details={'was_impossible': False}), Prediction(uid=73, iid=6659, r_ui=4.0, est=3.697409946573294, details={'was_impossible': False}), Prediction(uid=213, iid=2791, r_ui=1.0, est=3.058352079701251, details={'was_impossible': False}), Prediction(uid=335, iid=2739, r_ui=4.0, est=3.825154085247744, details={'was_impossible': False}), Prediction(uid=664, iid=71838, r_ui=4.5, est=3.8005100708624324, details={'was_impossible': False}), Prediction(uid=15, iid=3129, r_ui=3.5, est=3.4646368275683708, details={'was_impossible': False}), Prediction(uid=509, iid=2231, r_ui=4.0, est=3.9935460420342643, details={'was_impossible': False}), Prediction(uid=592, iid=4299, r_ui=4.0, est=4.013034177911878, details={'was_impossible': False}), Prediction(uid=284, iid=25, r_ui=3.0, est=3.7193920567477816, details={'was_impossible': False}), Prediction(uid=480, iid=68791, r_ui=4.0, est=4.13976342457746, details={'was_impossible': False}), Prediction(uid=481, iid=6287, r_ui=2.5, est=2.977286082273093, details={'was_impossible': False}), Prediction(uid=468, iid=34, r_ui=3.5, est=3.1811604965868665, details={'was_impossible': False}), Prediction(uid=271, iid=3578, r_ui=5.0, est=3.9485941132131637, details={'was_impossible': False}), Prediction(uid=409, iid=50, r_ui=4.0, est=4.75456071280094, 
details={'was_impossible': False}), Prediction(uid=105, iid=4673, r_ui=2.5, est=2.9913252661129675, details={'was_impossible': False}), Prediction(uid=452, iid=1124, r_ui=4.0, est=3.1675720936097504, details={'was_impossible': False}), Prediction(uid=15, iid=78469, r_ui=3.0, est=1.9462160701034004, details={'was_impossible': False}), Prediction(uid=309, iid=1288, r_ui=5.0, est=4.517513521244588, details={'was_impossible': False}), Prediction(uid=534, iid=2, r_ui=4.0, est=3.529039175903982, details={'was_impossible': False}), Prediction(uid=452, iid=5479, r_ui=3.5, est=2.9304206515587503, details={'was_impossible': False}), Prediction(uid=4, iid=357, r_ui=5.0, est=4.257246354853987, details={'was_impossible': False}), Prediction(uid=390, iid=592, r_ui=3.0, est=3.1252242631874805, details={'was_impossible': False}), Prediction(uid=313, iid=380, r_ui=4.0, est=3.4591379916059712, details={'was_impossible': False}), Prediction(uid=373, iid=1197, r_ui=5.0, est=3.9484524096859084, details={'was_impossible': False}), Prediction(uid=36, iid=144, r_ui=4.0, est=3.3476669299800297, details={'was_impossible': False}), Prediction(uid=184, iid=337, r_ui=4.0, est=3.934118164892716, details={'was_impossible': False}), Prediction(uid=199, iid=73321, r_ui=3.5, est=3.5337114867496506, details={'was_impossible': False}), Prediction(uid=452, iid=2870, r_ui=3.0, est=3.331477344877377, details={'was_impossible': False}), Prediction(uid=73, iid=5572, r_ui=1.5, est=3.1874139290773336, details={'was_impossible': False}), Prediction(uid=624, iid=56174, r_ui=3.0, est=2.84957006505632, details={'was_impossible': False}), Prediction(uid=665, iid=3798, r_ui=4.0, est=3.304995628315366, details={'was_impossible': False}), Prediction(uid=355, iid=51662, r_ui=3.5, est=3.6440907634730784, details={'was_impossible': False}), Prediction(uid=547, iid=945, r_ui=4.0, est=3.739034990694172, details={'was_impossible': False}), Prediction(uid=283, iid=8910, r_ui=3.5, est=3.818057815527424, 
details={'was_impossible': False}), Prediction(uid=564, iid=2271, r_ui=5.0, est=3.865687718522979, details={'was_impossible': False}), Prediction(uid=346, iid=319, r_ui=2.0, est=3.7719498429123517, details={'was_impossible': False}), Prediction(uid=45, iid=3052, r_ui=2.5, est=3.075649877838074, details={'was_impossible': False}), Prediction(uid=625, iid=4878, r_ui=4.0, est=4.488314483647398, details={'was_impossible': False}), Prediction(uid=339, iid=861, r_ui=4.5, est=3.7837274312712066, details={'was_impossible': False}), Prediction(uid=384, iid=50189, r_ui=2.5, est=3.40068523785417, details={'was_impossible': False}), Prediction(uid=126, iid=497, r_ui=4.0, est=3.8316210980647805, details={'was_impossible': False}), Prediction(uid=564, iid=2514, r_ui=1.0, est=3.681414549805816, details={'was_impossible': False}), Prediction(uid=547, iid=2088, r_ui=2.0, est=2.9559353113133437, details={'was_impossible': False}), Prediction(uid=212, iid=6534, r_ui=2.0, est=2.5770049413257725, details={'was_impossible': False}), Prediction(uid=48, iid=31878, r_ui=3.0, est=3.7325184712035293, details={'was_impossible': False}), Prediction(uid=234, iid=44788, r_ui=4.0, est=3.805389087643579, details={'was_impossible': False}), Prediction(uid=380, iid=3536, r_ui=3.0, est=3.438800354238291, details={'was_impossible': False}), Prediction(uid=247, iid=2643, r_ui=2.0, est=2.9633078667559007, details={'was_impossible': False}), Prediction(uid=122, iid=161, r_ui=3.0, est=3.090543327480771, details={'was_impossible': False}), Prediction(uid=625, iid=99114, r_ui=4.0, est=4.212924385229049, details={'was_impossible': False}), Prediction(uid=495, iid=595, r_ui=4.0, est=4.354398622019646, details={'was_impossible': False}), Prediction(uid=468, iid=2915, r_ui=2.0, est=3.1146436597042024, details={'was_impossible': False}), Prediction(uid=382, iid=1835, r_ui=2.0, est=3.3229753413545167, details={'was_impossible': False}), Prediction(uid=547, iid=4992, r_ui=0.5, est=3.6381389712008008, 
details={'was_impossible': False}), Prediction(uid=380, iid=51662, r_ui=4.5, est=3.4504105538718823, details={'was_impossible': False}), Prediction(uid=165, iid=27604, r_ui=2.0, est=2.359927649704046, details={'was_impossible': False}), Prediction(uid=152, iid=253, r_ui=2.0, est=3.1717310383726036, details={'was_impossible': False}), Prediction(uid=440, iid=367, r_ui=3.0, est=3.114467720596352, details={'was_impossible': False}), Prediction(uid=472, iid=1080, r_ui=4.0, est=4.2545001240106455, details={'was_impossible': False}), Prediction(uid=306, iid=3384, r_ui=4.0, est=3.2716141461187602, details={'was_impossible': False}), Prediction(uid=59, iid=7040, r_ui=4.0, est=2.5697801445800708, details={'was_impossible': False}), Prediction(uid=294, iid=4027, r_ui=4.0, est=3.87719410611625, details={'was_impossible': False}), Prediction(uid=624, iid=96110, r_ui=4.0, est=2.736275528600622, details={'was_impossible': False}), Prediction(uid=150, iid=3147, r_ui=4.0, est=3.6406170989441007, details={'was_impossible': False}), Prediction(uid=262, iid=4878, r_ui=5.0, est=3.107983752590661, details={'was_impossible': False}), Prediction(uid=546, iid=3555, r_ui=4.0, est=4.119994439611815, details={'was_impossible': False}), Prediction(uid=17, iid=1884, r_ui=5.0, est=3.3942968271525973, details={'was_impossible': False}), Prediction(uid=54, iid=30749, r_ui=3.5, est=4.324365675376033, details={'was_impossible': False}), Prediction(uid=501, iid=47, r_ui=5.0, est=4.071325004222353, details={'was_impossible': False}), Prediction(uid=358, iid=1801, r_ui=1.0, est=2.2652436087319994, details={'was_impossible': False}), Prediction(uid=475, iid=1562, r_ui=1.0, est=0.5155251353253967, details={'was_impossible': False}), Prediction(uid=19, iid=464, r_ui=4.0, est=2.990673884365344, details={'was_impossible': False}), Prediction(uid=376, iid=3095, r_ui=4.0, est=4.193748727043205, details={'was_impossible': False}), Prediction(uid=420, iid=661, r_ui=3.0, est=3.69370241244053, 
details={'was_impossible': False}), Prediction(uid=620, iid=74688, r_ui=1.5, est=2.9903736475639526, details={'was_impossible': False}), Prediction(uid=457, iid=100581, r_ui=2.5, est=2.4768286814807325, details={'was_impossible': False}), Prediction(uid=388, iid=3360, r_ui=4.0, est=3.474409691580559, details={'was_impossible': False}), Prediction(uid=427, iid=5685, r_ui=3.5, est=3.773480087073829, details={'was_impossible': False}), Prediction(uid=176, iid=88129, r_ui=4.0, est=3.3156328327817848, details={'was_impossible': False}), Prediction(uid=143, iid=39414, r_ui=5.0, est=3.5760474004511154, details={'was_impossible': False}), Prediction(uid=596, iid=1343, r_ui=3.5, est=3.723675796066683, details={'was_impossible': False}), Prediction(uid=262, iid=1343, r_ui=2.0, est=2.493462926844784, details={'was_impossible': False}), Prediction(uid=575, iid=1961, r_ui=4.0, est=3.1169789704237902, details={'was_impossible': False}), Prediction(uid=98, iid=4014, r_ui=5.0, est=4.107533350139764, details={'was_impossible': False}), Prediction(uid=294, iid=28, r_ui=4.5, est=3.836726462642353, details={'was_impossible': False}), Prediction(uid=118, iid=247, r_ui=2.0, est=3.842741773369415, details={'was_impossible': False}), Prediction(uid=452, iid=1955, r_ui=4.0, est=3.182940207746574, details={'was_impossible': False}), Prediction(uid=384, iid=481, r_ui=3.0, est=3.333319540978869, details={'was_impossible': False}), Prediction(uid=374, iid=5254, r_ui=5.0, est=3.0002978610293547, details={'was_impossible': False}), Prediction(uid=220, iid=4239, r_ui=4.0, est=3.4480481850468307, details={'was_impossible': False}), Prediction(uid=88, iid=160, r_ui=2.0, est=2.157684504995828, details={'was_impossible': False}), Prediction(uid=460, iid=3751, r_ui=4.5, est=3.596374199441004, details={'was_impossible': False}), Prediction(uid=587, iid=1393, r_ui=4.0, est=4.284035477427877, details={'was_impossible': False}), Prediction(uid=36, iid=457, r_ui=3.0, est=4.055774988136486, 
details={'was_impossible': False}), Prediction(uid=182, iid=360, r_ui=3.0, est=3.235794939373289, details={'was_impossible': False}), Prediction(uid=480, iid=6373, r_ui=3.0, est=3.698727116908044, details={'was_impossible': False}), Prediction(uid=457, iid=8371, r_ui=1.5, est=2.3115911453729527, details={'was_impossible': False}), Prediction(uid=596, iid=47, r_ui=4.0, est=3.9550275250408826, details={'was_impossible': False}), Prediction(uid=175, iid=6157, r_ui=3.0, est=2.9281879300845155, details={'was_impossible': False}), Prediction(uid=57, iid=800, r_ui=5.0, est=4.196056648564841, details={'was_impossible': False}), Prediction(uid=87, iid=1405, r_ui=2.0, est=2.958676562869366, details={'was_impossible': False}), Prediction(uid=283, iid=8464, r_ui=2.5, est=3.7691537464981417, details={'was_impossible': False}), Prediction(uid=165, iid=4532, r_ui=2.5, est=2.554675162774303, details={'was_impossible': False}), Prediction(uid=263, iid=2019, r_ui=3.0, est=4.015348542419786, details={'was_impossible': False}), Prediction(uid=420, iid=1250, r_ui=3.0, est=4.136453598144568, details={'was_impossible': False}), Prediction(uid=111, iid=527, r_ui=4.0, est=4.333345890525637, details={'was_impossible': False}), Prediction(uid=564, iid=3087, r_ui=3.0, est=3.4216275257222004, details={'was_impossible': False}), Prediction(uid=624, iid=1019, r_ui=3.0, est=2.6077601660757193, details={'was_impossible': False}), Prediction(uid=17, iid=3503, r_ui=4.5, est=3.471445287014736, details={'was_impossible': False}), Prediction(uid=268, iid=79132, r_ui=4.5, est=3.981281934732146, details={'was_impossible': False}), Prediction(uid=239, iid=2581, r_ui=4.0, est=3.5277605577352564, details={'was_impossible': False}), Prediction(uid=134, iid=2997, r_ui=4.5, est=4.401971504075811, details={'was_impossible': False}), Prediction(uid=30, iid=3671, r_ui=4.0, est=3.964635321576899, details={'was_impossible': False}), Prediction(uid=452, iid=3987, r_ui=4.0, est=3.2221844511934075, 
details={'was_impossible': False}), Prediction(uid=615, iid=122882, r_ui=5.0, est=3.50708884069035, details={'was_impossible': False}), Prediction(uid=509, iid=6339, r_ui=4.5, est=3.106034311303092, details={'was_impossible': False}), Prediction(uid=195, iid=1944, r_ui=3.0, est=3.4231178238107915, details={'was_impossible': False}), Prediction(uid=457, iid=4734, r_ui=2.0, est=2.146625595300363, details={'was_impossible': False}), Prediction(uid=454, iid=2571, r_ui=4.0, est=4.368851126781687, details={'was_impossible': False}), Prediction(uid=460, iid=497, r_ui=5.0, est=4.100413882358506, details={'was_impossible': False}), Prediction(uid=547, iid=1217, r_ui=5.0, est=4.08430747234788, details={'was_impossible': False}), Prediction(uid=468, iid=7407, r_ui=2.5, est=2.870895378361822, details={'was_impossible': False}), Prediction(uid=564, iid=1566, r_ui=3.0, est=4.119257959405848, details={'was_impossible': False}), Prediction(uid=547, iid=50685, r_ui=3.0, est=3.4427535652813974, details={'was_impossible': False}), Prediction(uid=57, iid=1265, r_ui=4.0, est=4.021135800862367, details={'was_impossible': False}), Prediction(uid=196, iid=3512, r_ui=4.0, est=3.6858070037115485, details={'was_impossible': False}), Prediction(uid=452, iid=1258, r_ui=4.0, est=3.1831171161706204, details={'was_impossible': False}), Prediction(uid=585, iid=1914, r_ui=4.0, est=4.049651377890558, details={'was_impossible': False}), Prediction(uid=486, iid=104841, r_ui=2.5, est=3.809556484290706, details={'was_impossible': False}), Prediction(uid=518, iid=2718, r_ui=3.0, est=3.595236222848604, details={'was_impossible': False}), Prediction(uid=386, iid=31, r_ui=2.0, est=2.7276562212633153, details={'was_impossible': False}), Prediction(uid=559, iid=1196, r_ui=5.0, est=4.647250950421666, details={'was_impossible': False}), Prediction(uid=166, iid=1676, r_ui=3.0, est=3.0781360042282637, details={'was_impossible': False}), Prediction(uid=496, iid=266, r_ui=5.0, est=3.8145989847932156, 
details={'was_impossible': False}), Prediction(uid=73, iid=42011, r_ui=3.0, est=2.666526235104105, details={'was_impossible': False}), Prediction(uid=213, iid=122886, r_ui=4.0, est=3.0806580576611067, details={'was_impossible': False}), Prediction(uid=15, iid=2167, r_ui=3.0, est=2.5920613771037284, details={'was_impossible': False}), Prediction(uid=552, iid=165, r_ui=3.0, est=2.8072033423455034, details={'was_impossible': False}), Prediction(uid=587, iid=1217, r_ui=4.0, est=4.332899549780133, details={'was_impossible': False}), Prediction(uid=472, iid=2701, r_ui=3.0, est=2.47945243233002, details={'was_impossible': False}), Prediction(uid=624, iid=33166, r_ui=4.0, est=3.353664109126735, details={'was_impossible': False}), Prediction(uid=536, iid=293, r_ui=4.0, est=4.580098939936356, details={'was_impossible': False}), Prediction(uid=282, iid=1381, r_ui=0.5, est=3.072707764768421, details={'was_impossible': False}), Prediction(uid=146, iid=1196, r_ui=4.0, est=4.0748238095832185, details={'was_impossible': False}), Prediction(uid=79, iid=1101, r_ui=2.5, est=2.3292234463122043, details={'was_impossible': False}), Prediction(uid=298, iid=81229, r_ui=5.0, est=4.710375751790934, details={'was_impossible': False}), Prediction(uid=468, iid=531, r_ui=3.0, est=3.170518579777271, details={'was_impossible': False}), Prediction(uid=544, iid=134130, r_ui=5.0, est=4.635899432512957, details={'was_impossible': False}), Prediction(uid=381, iid=3994, r_ui=3.0, est=2.800129503572793, details={'was_impossible': False}), Prediction(uid=454, iid=593, r_ui=4.0, est=4.461218641587384, details={'was_impossible': False}), Prediction(uid=233, iid=100, r_ui=3.0, est=3.2323795028689024, details={'was_impossible': False}), Prediction(uid=342, iid=911, r_ui=5.0, est=4.346428777890683, details={'was_impossible': False}), Prediction(uid=295, iid=5135, r_ui=3.0, est=3.806794877195198, details={'was_impossible': False}), Prediction(uid=57, iid=1663, r_ui=4.0, est=3.707650092687217, 
details={'was_impossible': False}), Prediction(uid=386, iid=2671, r_ui=3.0, est=2.9826014638794067, details={'was_impossible': False}), Prediction(uid=238, iid=115569, r_ui=4.5, est=3.800102413606228, details={'was_impossible': False}), Prediction(uid=173, iid=1047, r_ui=4.0, est=4.035437757247916, details={'was_impossible': False}), Prediction(uid=40, iid=1198, r_ui=4.5, est=4.6750300266373666, details={'was_impossible': False}), Prediction(uid=30, iid=419, r_ui=2.0, est=3.341498579921659, details={'was_impossible': False}), Prediction(uid=439, iid=1391, r_ui=4.0, est=2.12591737784365, details={'was_impossible': False}), Prediction(uid=33, iid=1198, r_ui=4.0, est=4.095535883992069, details={'was_impossible': False}), Prediction(uid=119, iid=1356, r_ui=3.0, est=3.6768873531300668, details={'was_impossible': False}), Prediction(uid=176, iid=1246, r_ui=3.5, est=3.212924267349518, details={'was_impossible': False}), Prediction(uid=575, iid=1073, r_ui=4.0, est=4.08749895010842, details={'was_impossible': False}), Prediction(uid=312, iid=2959, r_ui=5.0, est=3.6504009209069403, details={'was_impossible': False}), Prediction(uid=307, iid=296, r_ui=5.0, est=4.666052605395914, details={'was_impossible': False}), Prediction(uid=585, iid=1831, r_ui=1.0, est=2.9804271402111837, details={'was_impossible': False}), Prediction(uid=475, iid=115170, r_ui=3.0, est=2.7716099656571815, details={'was_impossible': False}), Prediction(uid=178, iid=551, r_ui=3.5, est=3.368759071946859, details={'was_impossible': False}), Prediction(uid=388, iid=1178, r_ui=5.0, est=4.405665710683846, details={'was_impossible': False}), Prediction(uid=216, iid=2716, r_ui=4.5, est=4.382747879191886, details={'was_impossible': False}), Prediction(uid=275, iid=60763, r_ui=4.5, est=4.232180170596997, details={'was_impossible': False}), Prediction(uid=56, iid=457, r_ui=4.0, est=3.3626932506890017, details={'was_impossible': False}), Prediction(uid=105, iid=3686, r_ui=3.0, est=3.4672222913396094, 
details={'was_impossible': False}), Prediction(uid=479, iid=72998, r_ui=5.0, est=3.7859014172376364, details={'was_impossible': False}), Prediction(uid=480, iid=49278, r_ui=4.5, est=3.9034452513477493, details={'was_impossible': False}), Prediction(uid=19, iid=780, r_ui=3.0, est=2.8947526224920863, details={'was_impossible': False}), Prediction(uid=619, iid=318, r_ui=5.0, est=4.520887601333758, details={'was_impossible': False}), Prediction(uid=303, iid=4022, r_ui=3.5, est=3.4872733329130012, details={'was_impossible': False}), Prediction(uid=251, iid=34405, r_ui=3.0, est=4.3683916789907435, details={'was_impossible': False}), Prediction(uid=345, iid=4034, r_ui=3.0, est=4.085630117331853, details={'was_impossible': False}), Prediction(uid=550, iid=1200, r_ui=4.0, est=4.068155306502851, details={'was_impossible': False}), Prediction(uid=522, iid=97921, r_ui=2.5, est=3.57804174371153, details={'was_impossible': False}), Prediction(uid=311, iid=897, r_ui=2.0, est=3.0027452633790976, details={'was_impossible': False}), Prediction(uid=452, iid=3034, r_ui=4.0, est=3.520638231113678, details={'was_impossible': False}), Prediction(uid=120, iid=1748, r_ui=3.0, est=3.6544222583017048, details={'was_impossible': False}), Prediction(uid=292, iid=1219, r_ui=4.0, est=4.560675563055594, details={'was_impossible': False}), Prediction(uid=274, iid=1199, r_ui=4.0, est=3.8264181204845134, details={'was_impossible': False}), Prediction(uid=615, iid=45722, r_ui=3.0, est=3.531384364940997, details={'was_impossible': False}), Prediction(uid=213, iid=88356, r_ui=2.5, est=2.544182764987896, details={'was_impossible': False}), Prediction(uid=388, iid=30812, r_ui=3.5, est=3.6301942395876736, details={'was_impossible': False}), Prediction(uid=178, iid=60684, r_ui=3.5, est=3.1297464733597455, details={'was_impossible': False}), Prediction(uid=547, iid=32203, r_ui=4.0, est=3.3517913588846695, details={'was_impossible': False}), Prediction(uid=472, iid=5810, r_ui=3.0, est=3.8209686224750263, 
details={'was_impossible': False}), Prediction(uid=468, iid=3524, r_ui=3.0, est=3.086957884145284, details={'was_impossible': False}), Prediction(uid=577, iid=1955, r_ui=4.5, est=4.39395188813922, details={'was_impossible': False}), Prediction(uid=313, iid=357, r_ui=3.5, est=3.674612337444749, details={'was_impossible': False}), Prediction(uid=285, iid=2092, r_ui=2.0, est=2.989603716982149, details={'was_impossible': False}), Prediction(uid=624, iid=4448, r_ui=3.0, est=2.301022142515073, details={'was_impossible': False}), Prediction(uid=654, iid=17, r_ui=4.5, est=4.481095159224209, details={'was_impossible': False}), Prediction(uid=461, iid=5952, r_ui=4.5, est=3.991488496023693, details={'was_impossible': False}), Prediction(uid=518, iid=2641, r_ui=3.0, est=3.4053308372423325, details={'was_impossible': False}), Prediction(uid=596, iid=440, r_ui=3.0, est=3.8301816567785885, details={'was_impossible': False}), Prediction(uid=81, iid=308, r_ui=3.5, est=4.263592345555828, details={'was_impossible': False}), Prediction(uid=70, iid=92, r_ui=5.0, est=4.073347592322688, details={'was_impossible': False}), Prediction(uid=232, iid=2353, r_ui=4.0, est=4.011423356225267, details={'was_impossible': False}), Prediction(uid=652, iid=89745, r_ui=5.0, est=4.597392936429152, details={'was_impossible': False}), Prediction(uid=597, iid=1097, r_ui=4.0, est=4.033044501679754, details={'was_impossible': False}), Prediction(uid=357, iid=158, r_ui=3.0, est=3.664397721872371, details={'was_impossible': False}), Prediction(uid=478, iid=4993, r_ui=5.0, est=4.67328259789084, details={'was_impossible': False}), Prediction(uid=347, iid=6539, r_ui=3.0, est=3.763908371617503, details={'was_impossible': False}), Prediction(uid=182, iid=485, r_ui=3.0, est=3.2067667686812134, details={'was_impossible': False}), Prediction(uid=285, iid=2596, r_ui=4.0, est=3.427223896318565, details={'was_impossible': False}), Prediction(uid=283, iid=36517, r_ui=4.0, est=3.3947019038770945, details={'was_impossible': 
False}), Prediction(uid=4, iid=2454, r_ui=5.0, est=4.334728101081797, details={'was_impossible': False}), Prediction(uid=564, iid=1841, r_ui=3.0, est=3.444882301514455, details={'was_impossible': False}), Prediction(uid=5, iid=4995, r_ui=4.5, est=4.257359498672199, details={'was_impossible': False}), Prediction(uid=450, iid=5995, r_ui=5.0, est=4.42361808302309, details={'was_impossible': False}), Prediction(uid=213, iid=1020, r_ui=1.5, est=2.876980480966721, details={'was_impossible': False}), Prediction(uid=212, iid=2394, r_ui=3.0, est=2.939045081985911, details={'was_impossible': False}), Prediction(uid=342, iid=902, r_ui=4.0, est=4.11362437444045, details={'was_impossible': False}), Prediction(uid=362, iid=3481, r_ui=4.5, est=3.6921841353256077, details={'was_impossible': False}), Prediction(uid=534, iid=3274, r_ui=3.0, est=3.821017467808397, details={'was_impossible': False}), Prediction(uid=321, iid=2384, r_ui=4.0, est=3.312521707104706, details={'was_impossible': False}), Prediction(uid=120, iid=377, r_ui=3.0, est=3.4487835184507514, details={'was_impossible': False}), Prediction(uid=220, iid=4973, r_ui=5.0, est=4.106766264710472, details={'was_impossible': False}), Prediction(uid=430, iid=41571, r_ui=4.0, est=3.5897779411966844, details={'was_impossible': False}), Prediction(uid=207, iid=1367, r_ui=0.5, est=1.4628916350169947, details={'was_impossible': False}), Prediction(uid=358, iid=3019, r_ui=4.0, est=4.054851887898878, details={'was_impossible': False}), Prediction(uid=254, iid=537, r_ui=3.0, est=3.3088348750685395, details={'was_impossible': False}), Prediction(uid=463, iid=2505, r_ui=2.0, est=2.596509881764181, details={'was_impossible': False}), Prediction(uid=146, iid=33493, r_ui=3.5, est=3.6137610697325577, details={'was_impossible': False}), Prediction(uid=594, iid=1288, r_ui=5.0, est=4.284034969253535, details={'was_impossible': False}), Prediction(uid=623, iid=2706, r_ui=3.5, est=3.3901782999260464, details={'was_impossible': False}), 
Prediction(uid=124, iid=41997, r_ui=4.0, est=3.6356498913646584, details={'was_impossible': False}), Prediction(uid=318, iid=778, r_ui=4.0, est=3.953841980402715, details={'was_impossible': False}), Prediction(uid=402, iid=7099, r_ui=3.0, est=4.017648979797716, details={'was_impossible': False}), Prediction(uid=615, iid=64957, r_ui=3.0, est=3.7980647609442304, details={'was_impossible': False}), Prediction(uid=128, iid=5693, r_ui=2.0, est=4.146888714087943, details={'was_impossible': False}), Prediction(uid=199, iid=1356, r_ui=3.0, est=3.7746659014495916, details={'was_impossible': False}), Prediction(uid=15, iid=66097, r_ui=3.0, est=2.713565951941702, details={'was_impossible': False}), Prediction(uid=236, iid=2201, r_ui=4.0, est=3.7202546779924655, details={'was_impossible': False}), Prediction(uid=104, iid=27611, r_ui=3.5, est=3.962908850028473, details={'was_impossible': False}), Prediction(uid=77, iid=5349, r_ui=4.0, est=3.431643635109522, details={'was_impossible': False}), Prediction(uid=382, iid=2324, r_ui=3.0, est=3.791190245282426, details={'was_impossible': False}), Prediction(uid=452, iid=3556, r_ui=2.0, est=3.076082564931817, details={'was_impossible': False}), Prediction(uid=433, iid=380, r_ui=3.5, est=3.330415632745842, details={'was_impossible': False}), Prediction(uid=128, iid=2687, r_ui=5.0, est=4.39809058345316, details={'was_impossible': False}), Prediction(uid=48, iid=96821, r_ui=4.0, est=3.766073572254885, details={'was_impossible': False}), Prediction(uid=26, iid=79132, r_ui=3.5, est=3.636973104249833, details={'was_impossible': False}), Prediction(uid=8, iid=1961, r_ui=5.0, est=3.9334240496560904, details={'was_impossible': False}), Prediction(uid=571, iid=52462, r_ui=4.0, est=3.6441429767752327, details={'was_impossible': False}), Prediction(uid=629, iid=24, r_ui=1.0, est=3.246617037120038, details={'was_impossible': False}), Prediction(uid=661, iid=1185, r_ui=5.0, est=3.908183736370407, details={'was_impossible': False}), 
Prediction(uid=570, iid=68358, r_ui=1.5, est=3.366102618482349, details={'was_impossible': False}), Prediction(uid=330, iid=924, r_ui=1.0, est=3.749287679059557, details={'was_impossible': False}), Prediction(uid=388, iid=8636, r_ui=4.5, est=3.79119590487797, details={'was_impossible': False}), Prediction(uid=242, iid=368, r_ui=4.0, est=4.429757886803024, details={'was_impossible': False}), Prediction(uid=152, iid=6934, r_ui=4.5, est=2.7748565595194434, details={'was_impossible': False}), Prediction(uid=150, iid=1409, r_ui=3.5, est=2.7228453131296684, details={'was_impossible': False}), Prediction(uid=384, iid=7147, r_ui=4.0, est=3.896742010614506, details={'was_impossible': False}), Prediction(uid=189, iid=145, r_ui=2.0, est=2.7993589814726363, details={'was_impossible': False}), Prediction(uid=212, iid=508, r_ui=4.0, est=2.981106528719187, details={'was_impossible': False}), Prediction(uid=632, iid=780, r_ui=4.0, est=3.9534539492401, details={'was_impossible': False}), Prediction(uid=56, iid=2617, r_ui=4.0, est=3.2448616236530037, details={'was_impossible': False}), Prediction(uid=253, iid=2, r_ui=4.0, est=3.338063864081899, details={'was_impossible': False}), Prediction(uid=187, iid=35836, r_ui=5.0, est=3.8681840239865295, details={'was_impossible': False}), Prediction(uid=647, iid=590, r_ui=3.0, est=4.119057232197308, details={'was_impossible': False}), Prediction(uid=380, iid=33166, r_ui=4.5, est=3.8423942833524443, details={'was_impossible': False}), Prediction(uid=564, iid=2016, r_ui=3.0, est=3.7832772591589046, details={'was_impossible': False}), Prediction(uid=119, iid=2012, r_ui=2.0, est=3.127240095619669, details={'was_impossible': False}), Prediction(uid=148, iid=913, r_ui=4.0, est=4.720872124978706, details={'was_impossible': False}), Prediction(uid=311, iid=2916, r_ui=3.0, est=3.114300387097803, details={'was_impossible': False}), Prediction(uid=137, iid=367, r_ui=5.0, est=3.2848152328463742, details={'was_impossible': False}), Prediction(uid=380, 
iid=2791, r_ui=4.0, est=4.073121326305772, details={'was_impossible': False}), Prediction(uid=452, iid=1213, r_ui=4.0, est=3.4572037186954225, details={'was_impossible': False}), Prediction(uid=453, iid=558, r_ui=2.0, est=3.6461219116449257, details={'was_impossible': False}), Prediction(uid=288, iid=593, r_ui=5.0, est=4.404826176750042, details={'was_impossible': False}), Prediction(uid=240, iid=5816, r_ui=3.5, est=4.017625446180455, details={'was_impossible': False}), Prediction(uid=254, iid=39, r_ui=2.0, est=3.43051185055739, details={'was_impossible': False}), Prediction(uid=595, iid=3793, r_ui=2.0, est=4.2326953380071695, details={'was_impossible': False}), Prediction(uid=422, iid=70, r_ui=3.0, est=3.623944219349915, details={'was_impossible': False}), Prediction(uid=651, iid=1250, r_ui=4.0, est=4.140017904777011, details={'was_impossible': False}), Prediction(uid=608, iid=4970, r_ui=4.0, est=3.707912783719789, details={'was_impossible': False}), Prediction(uid=68, iid=2716, r_ui=4.0, est=3.731865032132945, details={'was_impossible': False}), Prediction(uid=665, iid=2761, r_ui=4.0, est=3.7120667196662263, details={'was_impossible': False}), Prediction(uid=262, iid=5617, r_ui=5.0, est=2.477731174277064, details={'was_impossible': False}), Prediction(uid=582, iid=4734, r_ui=4.0, est=3.2553802569520824, details={'was_impossible': False}), Prediction(uid=307, iid=48394, r_ui=4.0, est=4.172662707589184, details={'was_impossible': False}), Prediction(uid=546, iid=216, r_ui=5.0, est=3.87289529204714, details={'was_impossible': False}), Prediction(uid=346, iid=3703, r_ui=4.0, est=3.7022668703828887, details={'was_impossible': False}), Prediction(uid=283, iid=8966, r_ui=4.5, est=3.6331824282645595, details={'was_impossible': False}), Prediction(uid=320, iid=541, r_ui=5.0, est=3.7202825727846025, details={'was_impossible': False}), Prediction(uid=465, iid=4011, r_ui=5.0, est=4.288635163840567, details={'was_impossible': False}), Prediction(uid=95, iid=3185, r_ui=4.0, 
est=3.7036563728757326, details={'was_impossible': False}), Prediction(uid=234, iid=2501, r_ui=4.0, est=3.9463253441698836, details={'was_impossible': False}), Prediction(uid=580, iid=52885, r_ui=3.0, est=3.55785114536153, details={'was_impossible': False}), Prediction(uid=388, iid=4047, r_ui=4.0, est=3.434249495493531, details={'was_impossible': False}), Prediction(uid=648, iid=1228, r_ui=3.5, est=4.288524991918247, details={'was_impossible': False}), Prediction(uid=385, iid=236, r_ui=3.0, est=3.3919536189594592, details={'was_impossible': False}), Prediction(uid=577, iid=173, r_ui=4.0, est=3.3707168441436526, details={'was_impossible': False}), Prediction(uid=73, iid=474, r_ui=3.5, est=3.58640294908461, details={'was_impossible': False}), Prediction(uid=427, iid=4722, r_ui=4.0, est=3.952545518346092, details={'was_impossible': False}), Prediction(uid=108, iid=34, r_ui=3.0, est=3.51677791850462, details={'was_impossible': False}), Prediction(uid=358, iid=344, r_ui=1.0, est=2.578974747950901, details={'was_impossible': False}), Prediction(uid=434, iid=300, r_ui=5.0, est=3.6960630813745166, details={'was_impossible': False}), Prediction(uid=353, iid=208, r_ui=0.5, est=1.78752635685026, details={'was_impossible': False}), Prediction(uid=313, iid=1608, r_ui=3.0, est=3.5291392129925994, details={'was_impossible': False}), Prediction(uid=461, iid=381, r_ui=1.0, est=2.526172527751118, details={'was_impossible': False}), Prediction(uid=311, iid=3111, r_ui=3.0, est=3.3615221922393776, details={'was_impossible': False}), Prediction(uid=509, iid=3201, r_ui=4.0, est=3.806182506800779, details={'was_impossible': False}), Prediction(uid=471, iid=26007, r_ui=4.5, est=3.5754547392700435, details={'was_impossible': False}), Prediction(uid=527, iid=1682, r_ui=3.5, est=3.4672446034228277, details={'was_impossible': False}), Prediction(uid=311, iid=8482, r_ui=2.5, est=3.379881633768542, details={'was_impossible': False}), Prediction(uid=236, iid=1036, r_ui=3.5, est=4.141898004305314, 
details={'was_impossible': False}), Prediction(uid=664, iid=98809, r_ui=4.5, est=3.8996579340168203, details={'was_impossible': False}), Prediction(uid=494, iid=1947, r_ui=2.0, est=4.288716780961025, details={'was_impossible': False}), Prediction(uid=73, iid=32587, r_ui=4.5, est=4.072832548744717, details={'was_impossible': False}), Prediction(uid=564, iid=2241, r_ui=3.0, est=3.2005068280478093, details={'was_impossible': False}), Prediction(uid=505, iid=1036, r_ui=3.5, est=3.410761934422719, details={'was_impossible': False}), Prediction(uid=547, iid=2634, r_ui=3.0, est=3.1413975203600666, details={'was_impossible': False}), Prediction(uid=262, iid=5151, r_ui=0.5, est=1.98420854892224, details={'was_impossible': False}), Prediction(uid=488, iid=7361, r_ui=4.5, est=4.277323643637988, details={'was_impossible': False}), Prediction(uid=427, iid=114028, r_ui=4.0, est=4.112328490890295, details={'was_impossible': False}), Prediction(uid=262, iid=8783, r_ui=2.5, est=2.416627051402127, details={'was_impossible': False}), Prediction(uid=665, iid=1888, r_ui=3.0, est=2.8745958415400836, details={'was_impossible': False}), Prediction(uid=491, iid=1171, r_ui=4.0, est=3.4570584127418575, details={'was_impossible': False}), Prediction(uid=468, iid=5349, r_ui=2.5, est=3.043101585382254, details={'was_impossible': False}), Prediction(uid=610, iid=50872, r_ui=3.5, est=3.870322786606117, details={'was_impossible': False}), Prediction(uid=547, iid=127178, r_ui=3.5, est=3.3517913588846695, details={'was_impossible': False}), Prediction(uid=358, iid=19, r_ui=1.0, est=1.8281462136064368, details={'was_impossible': False}), Prediction(uid=104, iid=88744, r_ui=4.0, est=3.8354204864544763, details={'was_impossible': False}), Prediction(uid=587, iid=1284, r_ui=4.0, est=4.141318620531142, details={'was_impossible': False}), Prediction(uid=195, iid=920, r_ui=4.0, est=3.34703538341098, details={'was_impossible': False}), Prediction(uid=202, iid=2012, r_ui=4.0, est=3.4198650402075264, 
details={'was_impossible': False}), Prediction(uid=470, iid=3264, r_ui=4.5, est=2.764833239963988, details={'was_impossible': False}), Prediction(uid=543, iid=810, r_ui=3.0, est=4.141741530526283, details={'was_impossible': False}), Prediction(uid=402, iid=47099, r_ui=4.5, est=3.7962209243007434, details={'was_impossible': False}), Prediction(uid=659, iid=446, r_ui=3.0, est=3.7996057025648136, details={'was_impossible': False}), Prediction(uid=168, iid=39, r_ui=4.0, est=3.443169022163895, details={'was_impossible': False}), Prediction(uid=547, iid=2918, r_ui=3.5, est=3.6814338241263322, details={'was_impossible': False}), Prediction(uid=431, iid=1197, r_ui=5.0, est=4.1923895485920335, details={'was_impossible': False}), Prediction(uid=164, iid=6331, r_ui=4.5, est=3.8650602917649675, details={'was_impossible': False}), Prediction(uid=624, iid=180, r_ui=3.5, est=2.772144565306129, details={'was_impossible': False}), Prediction(uid=560, iid=527, r_ui=4.5, est=4.891908747202386, details={'was_impossible': False}), Prediction(uid=313, iid=1201, r_ui=2.0, est=4.013530502766459, details={'was_impossible': False}), Prediction(uid=564, iid=68, r_ui=4.0, est=3.9282940588716415, details={'was_impossible': False}), Prediction(uid=468, iid=46974, r_ui=2.0, est=2.8330977576582894, details={'was_impossible': False}), Prediction(uid=294, iid=3911, r_ui=4.0, est=3.949713226761313, details={'was_impossible': False}), Prediction(uid=232, iid=1597, r_ui=5.0, est=4.284880699845584, details={'was_impossible': False}), Prediction(uid=370, iid=2617, r_ui=4.0, est=3.609145422752514, details={'was_impossible': False}), Prediction(uid=608, iid=1537, r_ui=4.0, est=4.002059320102567, details={'was_impossible': False}), Prediction(uid=30, iid=5544, r_ui=4.0, est=3.9470157695629675, details={'was_impossible': False}), Prediction(uid=561, iid=2791, r_ui=3.0, est=3.9500851790998337, details={'was_impossible': False}), Prediction(uid=443, iid=293, r_ui=5.0, est=5, details={'was_impossible': 
False}), Prediction(uid=471, iid=1387, r_ui=3.5, est=3.8828953186757365, details={'was_impossible': False}), Prediction(uid=77, iid=4446, r_ui=3.5, est=2.8931190364916355, details={'was_impossible': False}), Prediction(uid=57, iid=2088, r_ui=2.0, est=3.156489869876485, details={'was_impossible': False}), Prediction(uid=529, iid=1416, r_ui=3.0, est=3.317294135057128, details={'was_impossible': False}), Prediction(uid=648, iid=4388, r_ui=2.0, est=2.635752267542684, details={'was_impossible': False}), Prediction(uid=516, iid=529, r_ui=3.0, est=4.089178643984386, details={'was_impossible': False}), Prediction(uid=306, iid=2145, r_ui=4.0, est=3.2826495665971613, details={'was_impossible': False}), Prediction(uid=631, iid=1722, r_ui=3.0, est=3.5633052189776664, details={'was_impossible': False}), Prediction(uid=197, iid=1391, r_ui=4.0, est=3.7193576707864158, details={'was_impossible': False}), Prediction(uid=30, iid=1036, r_ui=4.0, est=4.139326533783728, details={'was_impossible': False}), Prediction(uid=15, iid=68319, r_ui=2.0, est=1.7263586479015198, details={'was_impossible': False}), Prediction(uid=509, iid=3996, r_ui=4.5, est=3.4738111112881613, details={'was_impossible': False}), Prediction(uid=587, iid=123, r_ui=3.0, est=4.286723071510112, details={'was_impossible': False}), Prediction(uid=421, iid=454, r_ui=4.0, est=3.0716282022824672, details={'was_impossible': False}), Prediction(uid=575, iid=953, r_ui=5.0, est=4.269775696100678, details={'was_impossible': False}), Prediction(uid=390, iid=1459, r_ui=3.0, est=2.822297773888152, details={'was_impossible': False}), Prediction(uid=496, iid=485, r_ui=3.0, est=3.544655055393824, details={'was_impossible': False}), Prediction(uid=270, iid=79132, r_ui=5.0, est=4.367072386697682, details={'was_impossible': False}), Prediction(uid=412, iid=837, r_ui=3.0, est=2.9026533504877987, details={'was_impossible': False}), Prediction(uid=397, iid=70862, r_ui=3.0, est=3.5717644312965215, details={'was_impossible': False}), 
Prediction(uid=547, iid=951, r_ui=3.0, est=3.9508415004513773, details={'was_impossible': False}), Prediction(uid=85, iid=709, r_ui=1.0, est=3.288220116531449, details={'was_impossible': False}), Prediction(uid=15, iid=77364, r_ui=2.5, est=2.7672576129451842, details={'was_impossible': False}), Prediction(uid=547, iid=2268, r_ui=3.0, est=3.728252797556469, details={'was_impossible': False}), Prediction(uid=213, iid=2126, r_ui=1.5, est=2.3352174114017044, details={'was_impossible': False}), Prediction(uid=664, iid=134170, r_ui=4.5, est=3.6645482916894623, details={'was_impossible': False}), Prediction(uid=664, iid=56156, r_ui=2.5, est=3.3755190532258617, details={'was_impossible': False}), Prediction(uid=185, iid=440, r_ui=3.0, est=3.5555775614165013, details={'was_impossible': False}), Prediction(uid=30, iid=1366, r_ui=5.0, est=3.9336692370051196, details={'was_impossible': False}), Prediction(uid=624, iid=1037, r_ui=4.0, est=2.474477807512768, details={'was_impossible': False}), Prediction(uid=19, iid=733, r_ui=3.0, est=3.642858209029197, details={'was_impossible': False}), Prediction(uid=466, iid=2916, r_ui=3.0, est=3.8481866878826616, details={'was_impossible': False}), Prediction(uid=131, iid=2357, r_ui=4.0, est=4.01691514668075, details={'was_impossible': False}), Prediction(uid=102, iid=3354, r_ui=1.0, est=3.014643873537412, details={'was_impossible': False}), Prediction(uid=547, iid=3386, r_ui=5.0, est=2.9833081630473024, details={'was_impossible': False}), Prediction(uid=306, iid=1517, r_ui=4.0, est=3.5578263321270795, details={'was_impossible': False}), Prediction(uid=624, iid=108188, r_ui=2.5, est=3.112279240029928, details={'was_impossible': False}), Prediction(uid=187, iid=52281, r_ui=3.5, est=3.7985758721822065, details={'was_impossible': False}), ...]
# computing RMSE on the testset
accuracy.rmse(predictions)
RMSE: 0.8948
0.8948400071658592
The RMSE for the baseline SVD-based matrix factorization collaborative filtering recommendation system is 0.89, which is the lowest RMSE of all the models looked at so far.
Predicting the rating for userId=10 and movieId=1240:
svd.predict(10, 1240, r_ui=4, verbose=True)
user: 10 item: 1240 r_ui = 4.00 est = 4.01 {'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=4.008799081154056, details={'was_impossible': False})
The estimate (4.01) is very close to the actual rating (4).
SVD predicts unknown ratings by minimizing the regularized squared error over the known ratings. It performs this minimization with stochastic gradient descent (SGD), which updates the user and item biases and factors one observed rating at a time, stepping toward lower error. See documentation here for some summary information.
The SGD pass over the ratings is repeated n_epochs times, which can be set as one of our hyperparameters (default = 20). Two other hyperparameters that we will adjust, from those noted in the above documentation, are lr_all (the learning rate for all parameters, default = 0.005) and reg_all (the regularization term for all parameters, default = 0.02). We adjust these three hyperparameters to see where we get our best results for the SVD-based matrix factorization recommendation system.
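To make the role of the three hyperparameters concrete, here is a minimal pure-Python sketch of SGD for a biased matrix factorization model, where the prediction is the global mean plus a user bias, an item bias, and a dot product of latent factors. It is an illustration under assumptions, not Surprise's implementation: the function names `sgd_matrix_factorization` and `rmse`, the toy ratings, and the choice of two latent factors are all hypothetical.

```python
import random


def sgd_matrix_factorization(ratings, n_factors=2, n_epochs=20,
                             lr_all=0.005, reg_all=0.02, seed=0):
    """Toy SGD for r_hat = mu + b_u + b_i + q_i . p_u (illustrative only)."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    bu = {u: 0.0 for u in users}  # user biases
    bi = {i: 0.0 for i in items}  # item biases
    pu = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    qi = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(p * q for p, q in zip(pu[u], qi[i]))

    for _ in range(n_epochs):          # n_epochs full passes over the ratings
        for u, i, r in ratings:        # one update per observed rating
            err = r - predict(u, i)
            # lr_all scales every step; reg_all pulls parameters toward zero
            bu[u] += lr_all * (err - reg_all * bu[u])
            bi[i] += lr_all * (err - reg_all * bi[i])
            for f in range(n_factors):
                p, q = pu[u][f], qi[i][f]
                pu[u][f] += lr_all * (err * q - reg_all * p)
                qi[i][f] += lr_all * (err * p - reg_all * q)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of `predict` on (user, item, rating) triples."""
    return (sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5


# Hypothetical toy data: (user, item, rating)
toy_ratings = [(1, 'a', 5), (1, 'b', 3), (2, 'a', 4), (2, 'c', 1),
               (3, 'b', 2), (3, 'c', 1), (3, 'a', 4)]
rmse_1_epoch = rmse(sgd_matrix_factorization(toy_ratings, n_epochs=1, lr_all=0.015, reg_all=0.1), toy_ratings)
rmse_40_epochs = rmse(sgd_matrix_factorization(toy_ratings, n_epochs=40, lr_all=0.015, reg_all=0.1), toy_ratings)
print(rmse_1_epoch, rmse_40_epochs)
```

On this toy data the training error after 40 epochs is lower than after one epoch, which is the effect n_epochs controls. Training error on seven ratings says nothing about generalization, which is why the hyperparameters are chosen by cross validation.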
# set the parameter space to tune
param_grid = {'n_epochs': [20, 30, 40], 'lr_all': [0.005, 0.01, 0.015],
'reg_all': [0.1, 0.2, 0.4, 0.6]}
# performing 4-fold grid search cross validation
gs = GridSearchCV(SVD, param_grid, measures=['rmse', 'mae'], cv=4, n_jobs=-1)
# fitting data
gs.fit(data)
# best RMSE score
print(gs.best_score['rmse'])
# combination of parameters that gave the best RMSE score
print(gs.best_params['rmse'])
0.8785920937002906
{'n_epochs': 40, 'lr_all': 0.015, 'reg_all': 0.1}
The optimal values for each of those hyperparameters are shown above. The cross-validated RMSE of 0.8786 is also the best RMSE of all of our models thus far.
Below we analyse our evaluation metrics (RMSE and MAE) at every split to see how the hyperparameters impact our results. Row #32 has our optimal results.
results_df = pd.DataFrame.from_dict(gs.cv_results)
results_df
row | split0_test_rmse | split1_test_rmse | split2_test_rmse | split3_test_rmse | mean_test_rmse | std_test_rmse | rank_test_rmse | split0_test_mae | split1_test_mae | split2_test_mae | ... | std_test_mae | rank_test_mae | mean_fit_time | std_fit_time | mean_test_time | std_test_time | params | param_n_epochs | param_lr_all | param_reg_all |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.889287 | 0.895684 | 0.896793 | 0.893364 | 0.893782 | 0.002875 | 12 | 0.687334 | 0.690880 | 0.692926 | ... | 0.002091 | 11 | 4.907424 | 0.035929 | 0.198053 | 0.009416 | {'n_epochs': 20, 'lr_all': 0.005, 'reg_all': 0.1} | 20 | 0.005 | 0.1 |
1 | 0.893410 | 0.899075 | 0.899628 | 0.896675 | 0.897197 | 0.002452 | 18 | 0.691434 | 0.694936 | 0.696137 | ... | 0.001870 | 18 | 4.643752 | 0.056184 | 0.203110 | 0.016370 | {'n_epochs': 20, 'lr_all': 0.005, 'reg_all': 0.2} | 20 | 0.005 | 0.2 |
2 | 0.902190 | 0.907390 | 0.907924 | 0.905125 | 0.905657 | 0.002261 | 27 | 0.700632 | 0.703894 | 0.704723 | ... | 0.001718 | 27 | 5.006993 | 0.026943 | 0.251520 | 0.034974 | {'n_epochs': 20, 'lr_all': 0.005, 'reg_all': 0.4} | 20 | 0.005 | 0.4 |
3 | 0.911375 | 0.916297 | 0.917027 | 0.914248 | 0.914737 | 0.002192 | 36 | 0.710195 | 0.713450 | 0.714080 | ... | 0.001667 | 36 | 4.866307 | 0.053226 | 0.247454 | 0.013095 | {'n_epochs': 20, 'lr_all': 0.005, 'reg_all': 0.6} | 20 | 0.005 | 0.6 |
4 | 0.884319 | 0.889818 | 0.888740 | 0.886292 | 0.887293 | 0.002140 | 7 | 0.682305 | 0.685766 | 0.685481 | ... | 0.001729 | 7 | 4.747752 | 0.047046 | 0.251120 | 0.010971 | {'n_epochs': 20, 'lr_all': 0.01, 'reg_all': 0.1} | 20 | 0.010 | 0.1 |
5 | 0.891541 | 0.896481 | 0.896100 | 0.893819 | 0.894485 | 0.001981 | 14 | 0.689554 | 0.692717 | 0.692900 | ... | 0.001661 | 16 | 4.932107 | 0.022482 | 0.262826 | 0.023376 | {'n_epochs': 20, 'lr_all': 0.01, 'reg_all': 0.2} | 20 | 0.010 | 0.2 |
6 | 0.900268 | 0.904835 | 0.904871 | 0.902560 | 0.903133 | 0.001901 | 23 | 0.698744 | 0.702013 | 0.702069 | ... | 0.001629 | 23 | 4.814769 | 0.015343 | 0.254261 | 0.025417 | {'n_epochs': 20, 'lr_all': 0.01, 'reg_all': 0.4} | 20 | 0.010 | 0.4 |
7 | 0.909512 | 0.913912 | 0.914248 | 0.911989 | 0.912415 | 0.001885 | 32 | 0.708387 | 0.711782 | 0.711854 | ... | 0.001644 | 32 | 5.084309 | 0.031786 | 0.245589 | 0.031464 | {'n_epochs': 20, 'lr_all': 0.01, 'reg_all': 0.6} | 20 | 0.010 | 0.6 |
8 | 0.878450 | 0.884841 | 0.884523 | 0.881793 | 0.882402 | 0.002571 | 5 | 0.676748 | 0.681305 | 0.681174 | ... | 0.002077 | 5 | 4.496002 | 0.059148 | 0.229115 | 0.040079 | {'n_epochs': 20, 'lr_all': 0.015, 'reg_all': 0.1} | 20 | 0.015 | 0.1 |
9 | 0.892244 | 0.896745 | 0.896276 | 0.894522 | 0.894947 | 0.001766 | 17 | 0.690150 | 0.693053 | 0.692870 | ... | 0.001526 | 17 | 4.722367 | 0.023287 | 0.228481 | 0.028373 | {'n_epochs': 20, 'lr_all': 0.015, 'reg_all': 0.2} | 20 | 0.015 | 0.2 |
10 | 0.900949 | 0.905131 | 0.904926 | 0.903382 | 0.903597 | 0.001672 | 24 | 0.699365 | 0.702518 | 0.702141 | ... | 0.001473 | 24 | 4.724121 | 0.033399 | 0.247074 | 0.027006 | {'n_epochs': 20, 'lr_all': 0.015, 'reg_all': 0.4} | 20 | 0.015 | 0.4 |
11 | 0.910023 | 0.914050 | 0.914133 | 0.912742 | 0.912737 | 0.001661 | 34 | 0.708852 | 0.712265 | 0.711960 | ... | 0.001545 | 34 | 4.811101 | 0.026821 | 0.231242 | 0.025073 | {'n_epochs': 20, 'lr_all': 0.015, 'reg_all': 0.6} | 20 | 0.015 | 0.6 |
12 | 0.886352 | 0.891981 | 0.891683 | 0.889412 | 0.889857 | 0.002254 | 8 | 0.684115 | 0.687621 | 0.688430 | ... | 0.001777 | 8 | 6.945273 | 0.037310 | 0.218898 | 0.017352 | {'n_epochs': 30, 'lr_all': 0.005, 'reg_all': 0.1} | 30 | 0.005 | 0.1 |
13 | 0.891145 | 0.896265 | 0.896691 | 0.893925 | 0.894506 | 0.002208 | 15 | 0.689179 | 0.692302 | 0.693169 | ... | 0.001694 | 15 | 7.130922 | 0.074298 | 0.282986 | 0.038113 | {'n_epochs': 30, 'lr_all': 0.005, 'reg_all': 0.2} | 30 | 0.005 | 0.2 |
14 | 0.899824 | 0.904809 | 0.905032 | 0.902426 | 0.903023 | 0.002110 | 22 | 0.698334 | 0.701653 | 0.702071 | ... | 0.001659 | 22 | 6.923961 | 0.093919 | 0.245914 | 0.036600 | {'n_epochs': 30, 'lr_all': 0.005, 'reg_all': 0.4} | 30 | 0.005 | 0.4 |
15 | 0.909080 | 0.913820 | 0.914362 | 0.911723 | 0.912246 | 0.002077 | 31 | 0.707998 | 0.711367 | 0.711677 | ... | 0.001641 | 31 | 7.255839 | 0.042217 | 0.208945 | 0.019479 | {'n_epochs': 30, 'lr_all': 0.005, 'reg_all': 0.6} | 30 | 0.005 | 0.6 |
16 | 0.877354 | 0.884722 | 0.883054 | 0.881908 | 0.881759 | 0.002733 | 4 | 0.676553 | 0.680857 | 0.679566 | ... | 0.001606 | 4 | 6.900545 | 0.017741 | 0.228924 | 0.017723 | {'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.1} | 30 | 0.010 | 0.1 |
17 | 0.890889 | 0.895224 | 0.895054 | 0.893435 | 0.893651 | 0.001741 | 11 | 0.688591 | 0.691556 | 0.691713 | ... | 0.001497 | 12 | 7.586886 | 0.073091 | 0.298611 | 0.008481 | {'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.2} | 30 | 0.010 | 0.2 |
18 | 0.899687 | 0.904074 | 0.903992 | 0.902021 | 0.902444 | 0.001791 | 20 | 0.698060 | 0.701373 | 0.701054 | ... | 0.001530 | 21 | 9.626847 | 0.019383 | 0.312823 | 0.020366 | {'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.4} | 30 | 0.010 | 0.4 |
19 | 0.908855 | 0.913046 | 0.913237 | 0.911357 | 0.911624 | 0.001758 | 30 | 0.707658 | 0.711117 | 0.710879 | ... | 0.001565 | 30 | 8.433409 | 0.082019 | 0.250213 | 0.013254 | {'n_epochs': 30, 'lr_all': 0.01, 'reg_all': 0.6} | 30 | 0.010 | 0.6 |
20 | 0.876642 | 0.882354 | 0.882829 | 0.879578 | 0.880351 | 0.002475 | 3 | 0.675280 | 0.678004 | 0.680010 | ... | 0.002022 | 3 | 7.032798 | 0.036823 | 0.216775 | 0.011125 | {'n_epochs': 30, 'lr_all': 0.015, 'reg_all': 0.1} | 30 | 0.015 | 0.1 |
21 | 0.892513 | 0.895730 | 0.895695 | 0.894369 | 0.894577 | 0.001312 | 16 | 0.690020 | 0.692230 | 0.692174 | ... | 0.001319 | 14 | 7.375063 | 0.051772 | 0.243165 | 0.023399 | {'n_epochs': 30, 'lr_all': 0.015, 'reg_all': 0.2} | 30 | 0.015 | 0.2 |
22 | 0.901343 | 0.905119 | 0.904806 | 0.903838 | 0.903776 | 0.001482 | 25 | 0.699496 | 0.702532 | 0.701867 | ... | 0.001321 | 25 | 7.453477 | 0.046341 | 0.245300 | 0.030009 | {'n_epochs': 30, 'lr_all': 0.015, 'reg_all': 0.4} | 30 | 0.015 | 0.4 |
23 | 0.910148 | 0.913905 | 0.913959 | 0.912923 | 0.912734 | 0.001549 | 33 | 0.708815 | 0.712205 | 0.711637 | ... | 0.001443 | 33 | 7.359293 | 0.039610 | 0.254271 | 0.025691 | {'n_epochs': 30, 'lr_all': 0.015, 'reg_all': 0.6} | 30 | 0.015 | 0.6 |
24 | 0.883119 | 0.887878 | 0.889380 | 0.885637 | 0.886504 | 0.002365 | 6 | 0.680979 | 0.683728 | 0.686224 | ... | 0.001994 | 6 | 9.819291 | 0.044159 | 0.215980 | 0.011752 | {'n_epochs': 40, 'lr_all': 0.005, 'reg_all': 0.1} | 40 | 0.005 | 0.1 |
25 | 0.889913 | 0.895138 | 0.895204 | 0.892408 | 0.893166 | 0.002191 | 9 | 0.687900 | 0.691328 | 0.691892 | ... | 0.001771 | 10 | 9.815194 | 0.035837 | 0.233569 | 0.023276 | {'n_epochs': 40, 'lr_all': 0.005, 'reg_all': 0.2} | 40 | 0.005 | 0.2 |
26 | 0.898868 | 0.903566 | 0.903780 | 0.901293 | 0.901877 | 0.001992 | 19 | 0.697328 | 0.700661 | 0.700842 | ... | 0.001618 | 19 | 9.699466 | 0.050049 | 0.229998 | 0.032528 | {'n_epochs': 40, 'lr_all': 0.005, 'reg_all': 0.4} | 40 | 0.005 | 0.4 |
27 | 0.908069 | 0.912672 | 0.913087 | 0.910608 | 0.911109 | 0.001990 | 28 | 0.706980 | 0.710466 | 0.710490 | ... | 0.001616 | 28 | 9.777815 | 0.041957 | 0.231125 | 0.012477 | {'n_epochs': 40, 'lr_all': 0.005, 'reg_all': 0.6} | 40 | 0.005 | 0.6 |
28 | 0.876931 | 0.880275 | 0.882167 | 0.878388 | 0.879440 | 0.001971 | 2 | 0.675817 | 0.677035 | 0.679125 | ... | 0.001749 | 2 | 9.772123 | 0.071905 | 0.194779 | 0.004813 | {'n_epochs': 40, 'lr_all': 0.01, 'reg_all': 0.1} | 40 | 0.010 | 0.1 |
29 | 0.890798 | 0.894802 | 0.894765 | 0.893061 | 0.893356 | 0.001636 | 10 | 0.688288 | 0.691205 | 0.691152 | ... | 0.001404 | 9 | 9.452569 | 0.019243 | 0.218454 | 0.021242 | {'n_epochs': 40, 'lr_all': 0.01, 'reg_all': 0.2} | 40 | 0.010 | 0.2 |
30 | 0.899917 | 0.903953 | 0.903880 | 0.902216 | 0.902491 | 0.001641 | 21 | 0.698114 | 0.701301 | 0.700807 | ... | 0.001417 | 20 | 9.314218 | 0.057738 | 0.222919 | 0.010521 | {'n_epochs': 40, 'lr_all': 0.01, 'reg_all': 0.4} | 40 | 0.010 | 0.4 |
31 | 0.908898 | 0.912813 | 0.913048 | 0.911433 | 0.911548 | 0.001650 | 29 | 0.707590 | 0.711005 | 0.710603 | ... | 0.001482 | 29 | 9.594017 | 0.061242 | 0.218691 | 0.012652 | {'n_epochs': 40, 'lr_all': 0.01, 'reg_all': 0.6} | 40 | 0.010 | 0.6 |
32 | 0.877153 | 0.879279 | 0.879857 | 0.878080 | 0.878592 | 0.001049 | 1 | 0.675511 | 0.676056 | 0.676701 | ... | 0.000748 | 1 | 9.374955 | 0.011734 | 0.217607 | 0.009979 | {'n_epochs': 40, 'lr_all': 0.015, 'reg_all': 0.1} | 40 | 0.015 | 0.1 |
33 | 0.892061 | 0.895199 | 0.895381 | 0.893628 | 0.894067 | 0.001344 | 13 | 0.689499 | 0.691623 | 0.691581 | ... | 0.001357 | 13 | 9.697708 | 0.035200 | 0.239734 | 0.019077 | {'n_epochs': 40, 'lr_all': 0.015, 'reg_all': 0.2} | 40 | 0.015 | 0.2 |
34 | 0.901978 | 0.905552 | 0.905299 | 0.904548 | 0.904344 | 0.001415 | 26 | 0.699956 | 0.702869 | 0.702148 | ... | 0.001242 | 26 | 9.914235 | 0.041879 | 0.208660 | 0.013370 | {'n_epochs': 40, 'lr_all': 0.015, 'reg_all': 0.4} | 40 | 0.015 | 0.4 |
35 | 0.910599 | 0.914152 | 0.914197 | 0.913370 | 0.913080 | 0.001469 | 35 | 0.709125 | 0.712419 | 0.711775 | ... | 0.001379 | 35 | 9.647428 | 0.022947 | 0.207708 | 0.018283 | {'n_epochs': 40, 'lr_all': 0.015, 'reg_all': 0.6} | 40 | 0.015 | 0.6 |
36 rows × 22 columns
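One pattern worth pulling out of the table is the effect of reg_all. In every (n_epochs, lr_all) group above, the mean RMSE rises as reg_all grows from 0.1 to 0.6. The short sketch below checks this for the best group, using the mean_test_rmse values copied from rows 32 to 35. The variable names are illustrative.

```python
# (reg_all, mean_test_rmse) for n_epochs=40, lr_all=0.015,
# copied from rows 32-35 of the results table above
rows = [(0.1, 0.878592), (0.2, 0.894067), (0.4, 0.904344), (0.6, 0.913080)]

# the combination with the lowest mean RMSE in this group
best_reg_all, best_rmse = min(rows, key=lambda row: row[1])

# True if RMSE strictly increases as reg_all increases
rmse_rises_with_reg = all(a[1] < b[1] for a, b in zip(rows, rows[1:]))

print(best_reg_all, best_rmse, rmse_rises_with_reg)  # 0.1 0.878592 True
```

Within the tested range, then, the smallest regularization value wins. This only describes the grid that was searched and does not imply that values below 0.1 would keep improving.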
# What if we randomized the grid search? Let's create a large number of possible hyperparameter setups that will
# be randomly selected:
param_dist = {'n_epochs': [30,35,40,45,50,55], 'lr_all': [0.005, 0.01, 0.015, 0.020, 0.025, 0.030],
'reg_all': [0.0025, 0.005, 0.075, 0.1, 0.2, 0.4, 0.6]}
# Instantiate RandomizedSearchCV:
# Note: n_iter chooses the number of iterations to try. Because n_iter=1 took a little over 1.25 minutes to complete,
# I set it to n_iter=20 for practical considerations.
gs_random = RandomizedSearchCV(SVD, param_distributions=param_dist, n_iter = 20, measures=['rmse','mae'], cv=4, random_state=1)
# fitting data
gs_random.fit(data)
# best RMSE score
print(gs_random.best_score['rmse'])
# combination of parameters that gave the best RMSE score
print(gs_random.best_params['rmse'])
0.876275522179301
{'n_epochs': 45, 'lr_all': 0.015, 'reg_all': 0.1}
The best hyperparameters selected by this randomized search cross validation gave almost the same RMSE as our prior exhaustive grid search (0.8763 versus 0.8786, so very slightly lower here). It also took less time, because we only had 20 hyperparameter combinations to try instead of 36.
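The difference between the two searches comes down to how many candidate combinations get evaluated. The sketch below counts the candidates in each search space and draws 20 at random from the larger one. It only illustrates the idea: `all_combinations` is a hypothetical helper, and seeding Python's `random` with 1 is not expected to reproduce the picks that RandomizedSearchCV made with random_state=1.

```python
import itertools
import random

# the two search spaces used in this section
param_grid = {'n_epochs': [20, 30, 40], 'lr_all': [0.005, 0.01, 0.015],
              'reg_all': [0.1, 0.2, 0.4, 0.6]}
param_dist = {'n_epochs': [30, 35, 40, 45, 50, 55],
              'lr_all': [0.005, 0.01, 0.015, 0.020, 0.025, 0.030],
              'reg_all': [0.0025, 0.005, 0.075, 0.1, 0.2, 0.4, 0.6]}


def all_combinations(space):
    """Enumerate every combination of values in a {name: [values]} space."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in itertools.product(*(space[k] for k in keys))]


grid_candidates = all_combinations(param_grid)   # what a grid search evaluates
dist_candidates = all_combinations(param_dist)   # what a random search samples from

rng = random.Random(1)
sampled = rng.sample(dist_candidates, 20)        # 20 distinct candidates

print(len(grid_candidates), len(dist_candidates), len(sampled))  # 36 252 20
```

So the randomized search covered a space of 252 combinations while fitting only 20 of them, compared with all 36 in the exhaustive grid.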
results_df = pd.DataFrame.from_dict(gs_random.cv_results)
results_df
row | split0_test_rmse | split1_test_rmse | split2_test_rmse | split3_test_rmse | mean_test_rmse | std_test_rmse | rank_test_rmse | split0_test_mae | split1_test_mae | split2_test_mae | ... | std_test_mae | rank_test_mae | mean_fit_time | std_fit_time | mean_test_time | std_test_time | params | param_n_epochs | param_lr_all | param_reg_all |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.958687 | 0.958557 | 0.948262 | 0.959766 | 0.956318 | 0.004675 | 19 | 0.740582 | 0.741152 | 0.733448 | ... | 0.003121 | 19 | 9.012010 | 0.394251 | 0.230763 | 0.058377 | {'n_epochs': 40, 'lr_all': 0.01, 'reg_all': 0.... | 40 | 0.010 | 0.0050 |
1 | 0.907121 | 0.909820 | 0.902198 | 0.919445 | 0.909646 | 0.006283 | 12 | 0.706435 | 0.707496 | 0.702126 | ... | 0.004543 | 12 | 9.202670 | 0.706988 | 0.215634 | 0.062542 | {'n_epochs': 40, 'lr_all': 0.02, 'reg_all': 0.6} | 40 | 0.020 | 0.6000 |
2 | 0.950210 | 0.953680 | 0.946164 | 0.961159 | 0.952803 | 0.005509 | 17 | 0.732112 | 0.734807 | 0.730986 | ... | 0.004106 | 17 | 12.252269 | 0.194457 | 0.224913 | 0.056340 | {'n_epochs': 55, 'lr_all': 0.015, 'reg_all': 0... | 55 | 0.015 | 0.0050 |
3 | 0.955102 | 0.954245 | 0.944826 | 0.960358 | 0.953633 | 0.005597 | 18 | 0.734523 | 0.737291 | 0.733171 | ... | 0.002938 | 18 | 6.540912 | 0.055646 | 0.201084 | 0.043778 | {'n_epochs': 30, 'lr_all': 0.02, 'reg_all': 0.... | 30 | 0.020 | 0.0050 |
4 | 0.899117 | 0.902282 | 0.894903 | 0.911412 | 0.901928 | 0.006069 | 10 | 0.697388 | 0.698880 | 0.694074 | ... | 0.004286 | 10 | 9.535329 | 0.097165 | 0.211461 | 0.045961 | {'n_epochs': 45, 'lr_all': 0.02, 'reg_all': 0.4} | 45 | 0.020 | 0.4000 |
5 | 0.880997 | 0.884281 | 0.877006 | 0.892107 | 0.883598 | 0.005547 | 5 | 0.678125 | 0.680950 | 0.676095 | ... | 0.003658 | 5 | 8.651881 | 0.074844 | 0.199561 | 0.053229 | {'n_epochs': 40, 'lr_all': 0.005, 'reg_all': 0... | 40 | 0.005 | 0.0750 |
6 | 0.874990 | 0.878430 | 0.871024 | 0.888693 | 0.878284 | 0.006556 | 2 | 0.672211 | 0.674450 | 0.670345 | ... | 0.004161 | 2 | 11.154367 | 0.331471 | 0.206102 | 0.044177 | {'n_epochs': 50, 'lr_all': 0.02, 'reg_all': 0.... | 50 | 0.020 | 0.0750 |
7 | 0.940037 | 0.946208 | 0.933736 | 0.951955 | 0.942984 | 0.006802 | 15 | 0.722916 | 0.731932 | 0.721874 | ... | 0.005654 | 15 | 8.775966 | 0.094439 | 0.218130 | 0.041578 | {'n_epochs': 40, 'lr_all': 0.03, 'reg_all': 0.... | 40 | 0.030 | 0.0050 |
8 | 0.932732 | 0.943317 | 0.932908 | 0.947845 | 0.939201 | 0.006579 | 14 | 0.719535 | 0.729927 | 0.721892 | ... | 0.004923 | 14 | 12.301071 | 0.270166 | 0.211972 | 0.059674 | {'n_epochs': 55, 'lr_all': 0.025, 'reg_all': 0... | 55 | 0.025 | 0.0050 |
9 | 0.888021 | 0.890684 | 0.883953 | 0.899347 | 0.890501 | 0.005642 | 8 | 0.685605 | 0.687233 | 0.683134 | ... | 0.003783 | 8 | 12.273054 | 0.263827 | 0.248644 | 0.087256 | {'n_epochs': 55, 'lr_all': 0.005, 'reg_all': 0.2} | 55 | 0.005 | 0.2000 |
10 | 0.943355 | 0.942323 | 0.935547 | 0.951422 | 0.943162 | 0.005634 | 16 | 0.727018 | 0.728850 | 0.722205 | ... | 0.004100 | 16 | 10.760119 | 0.092125 | 0.189821 | 0.040413 | {'n_epochs': 50, 'lr_all': 0.025, 'reg_all': 0... | 50 | 0.025 | 0.0050 |
11 | 0.886907 | 0.890778 | 0.883498 | 0.899486 | 0.890167 | 0.005965 | 6 | 0.684532 | 0.687203 | 0.682693 | ... | 0.004143 | 6 | 9.625142 | 0.037557 | 0.190503 | 0.041256 | {'n_epochs': 45, 'lr_all': 0.015, 'reg_all': 0.2} | 45 | 0.015 | 0.2000 |
12 | 0.887899 | 0.891053 | 0.884101 | 0.899792 | 0.890711 | 0.005792 | 9 | 0.685575 | 0.687650 | 0.683376 | ... | 0.003885 | 9 | 10.756162 | 0.032365 | 0.215820 | 0.049119 | {'n_epochs': 50, 'lr_all': 0.005, 'reg_all': 0.2} | 50 | 0.005 | 0.2000 |
13 | 0.887269 | 0.890685 | 0.883829 | 0.899973 | 0.890439 | 0.006015 | 7 | 0.685073 | 0.687218 | 0.683158 | ... | 0.004032 | 7 | 7.534344 | 0.038182 | 0.189802 | 0.041770 | {'n_epochs': 35, 'lr_all': 0.01, 'reg_all': 0.2} | 35 | 0.010 | 0.2000 |
14 | 0.924036 | 0.924825 | 0.917485 | 0.935007 | 0.925338 | 0.006267 | 13 | 0.712655 | 0.712865 | 0.707019 | ... | 0.004289 | 13 | 6.489825 | 0.049201 | 0.215146 | 0.046393 | {'n_epochs': 30, 'lr_all': 0.005, 'reg_all': 0... | 30 | 0.005 | 0.0050 |
15 | 0.899956 | 0.903093 | 0.895569 | 0.912286 | 0.902726 | 0.006132 | 11 | 0.698127 | 0.699525 | 0.694743 | ... | 0.004293 | 11 | 7.526610 | 0.010670 | 0.190042 | 0.042200 | {'n_epochs': 35, 'lr_all': 0.025, 'reg_all': 0.4} | 35 | 0.025 | 0.4000 |
16 | 0.961586 | 0.969874 | 0.961245 | 0.974632 | 0.966834 | 0.005675 | 20 | 0.742982 | 0.751158 | 0.743240 | ... | 0.004790 | 20 | 6.404530 | 0.044745 | 0.214719 | 0.050587 | {'n_epochs': 30, 'lr_all': 0.025, 'reg_all': 0... | 30 | 0.025 | 0.0025 |
17 | 0.872294 | 0.877399 | 0.869874 | 0.885535 | 0.876276 | 0.005997 | 1 | 0.669489 | 0.674021 | 0.669501 | ... | 0.004307 | 1 | 9.692392 | 0.038960 | 0.195712 | 0.051792 | {'n_epochs': 45, 'lr_all': 0.015, 'reg_all': 0.1} | 45 | 0.015 | 0.1000 |
18 | 0.877420 | 0.879946 | 0.874184 | 0.888323 | 0.879968 | 0.005238 | 4 | 0.674561 | 0.675305 | 0.672726 | ... | 0.002955 | 4 | 9.693545 | 0.044369 | 0.192091 | 0.040586 | {'n_epochs': 45, 'lr_all': 0.02, 'reg_all': 0.... | 45 | 0.020 | 0.0750 |
19 | 0.875402 | 0.880358 | 0.872699 | 0.888824 | 0.879321 | 0.006136 | 3 | 0.672417 | 0.676394 | 0.670220 | ... | 0.004267 | 3 | 9.622753 | 0.008613 | 0.214280 | 0.046178 | {'n_epochs': 45, 'lr_all': 0.015, 'reg_all': 0... | 45 | 0.015 | 0.0750 |
20 rows × 22 columns
We will build our final model using the optimal hyperparameters determined by our randomized search cross validation.
# rebuilding our SVD model using our optimized hyperparameters
svd_optimized = SVD(n_epochs=45, lr_all=0.015, reg_all=0.1)
# training the model
svd_optimized.fit(trainset)
<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7f8f78fa59a0>
# using the trained model to predict the testset
predictions = svd_optimized.test(testset)
# Calculating RMSE
accuracy.rmse(predictions)
RMSE: 0.8687
0.8687258138614241
svd_optimized.predict(10, 1240, r_ui=4, verbose=True)
user: 10 item: 1240 r_ui = 4.00 est = 4.22 {'was_impossible': False}
Prediction(uid=10, iid=1240, r_ui=4, est=4.224312327299615, details={'was_impossible': False})
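The optimized model's estimate (4.22) is still close to the actual rating (4), although for this particular user-movie pair it is slightly further off than the baseline SVD estimate of 4.01.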
get_recommendation(rating, 10, 20, svd_optimized)
[(83359, 4.855674026655343), (3030, 4.786627768371971), (83411, 4.762368437025483), (8535, 4.761581751952642), (9010, 4.748995948294785), (1859, 4.731637053342826), (5238, 4.729790688607328), (52767, 4.704635447060147), (26587, 4.674634954611091), (3310, 4.666520506954548), (7116, 4.644763240522547), (1192, 4.638970128124664), (3010, 4.633875057293273), (132333, 4.624080480097617), (41527, 4.6202561004903995), (3038, 4.617170171690977), (1860, 4.611100763448932), (83318, 4.61083247653086), (116897, 4.6092590599423815), (131724, 4.604838465427107)]
Compare the above values to the top 20 recommended films by our optimized user-based collaborative filtering model:
(3038, 5), (309, 4.999999999999999), (6669, 4.881355932203389), (98491, 4.821987480438185), (178, 4.784881983866148), (2920, 4.784530386740332), (1860, 4.7713154312585075), (6776, 4.738562091503268), (4783, 4.733784741814604), (5017, 4.733386572357538), (4263, 4.731378922557885), (26326, 4.723524337675515), (7075, 4.704496788008566), (3414, 4.677535050537985), (1192, 4.662620550158588), (41527, 4.649484536082474), (116, 4.646309855193815), (116897, 4.6424191994394315), (2938, 4.6364787840405315), (766, 4.633780069379941)
And compare to the top 20 recommended films by our optimized item-based collaborative filtering model:
(78321, 4.8175582990397805), (3158, 4.686746987951807), (3161, 4.666666666666667), (2801, 4.652173913043478), (2837, 4.594594594594595), (3357, 4.594594594594595), (3207, 4.565217391304349), (4972, 4.565217391304349), (26394, 4.565217391304349), (6268, 4.560975609756098), (1870, 4.5), (2388, 4.5), (3790, 4.5), (30883, 4.5), (43177, 4.449993480245142), (6598, 4.433333333333333), (8199, 4.414452709883103), (760, 4.411764705882352), (4568, 4.409090909090908), (6506, 4.409090909090908)
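A quick way to compare the three lists is to count how many movieIds they share. The sketch below does this with set intersections on the ids copied from the outputs above, assuming all three lists are for userId=10. The helper name `overlap` is illustrative.

```python
# Top-20 movieIds copied from the three recommendation lists above
svd_top20 = [83359, 3030, 83411, 8535, 9010, 1859, 5238, 52767, 26587, 3310,
             7116, 1192, 3010, 132333, 41527, 3038, 1860, 83318, 116897, 131724]
user_knn_top20 = [3038, 309, 6669, 98491, 178, 2920, 1860, 6776, 4783, 5017,
                  4263, 26326, 7075, 3414, 1192, 41527, 116, 116897, 2938, 766]
item_knn_top20 = [78321, 3158, 3161, 2801, 2837, 3357, 3207, 4972, 26394, 6268,
                  1870, 2388, 3790, 30883, 43177, 6598, 8199, 760, 4568, 6506]


def overlap(a, b):
    """Return the sorted movieIds that appear in both lists."""
    return sorted(set(a) & set(b))


shared_svd_user = overlap(svd_top20, user_knn_top20)
shared_svd_item = overlap(svd_top20, item_knn_top20)
shared_user_item = overlap(user_knn_top20, item_knn_top20)

print(shared_svd_user)   # [1192, 1860, 3038, 41527, 116897]
print(shared_svd_item)   # []
print(shared_user_item)  # []
```

The SVD and user-based lists share five of their twenty films, while the item-based list shares none with either. The three models therefore surface largely different recommendations for the same user.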
We define a function below that locates all of the actual ratings that a user has made, and places them next to the predicted rating in a Pandas dataframe. This will help us explore the accuracy of our predictions using the three optimized models we made for any user. We will visualize the predicted and actual ratings with distribution plots.
def predict_prior_interacted_ratings(data, user_id, algo):
    # empty list to store (movieId, actual rating, predicted rating) tuples
    recommendation_list = []
    # creating a user-item interactions matrix from our data
    user_item_interactions_matrix = data.pivot(index='userId', columns='movieId', values='rating')
    # creating list of movie ids which the user has already rated
    user_ratings = user_item_interactions_matrix.loc[user_id]
    prior_interacted_movies = user_ratings[user_ratings.notnull()].index.tolist()
    # loop over the movie ids that our selected user has already rated
    for item_id in prior_interacted_movies:
        # extract actual rating
        actual_rating = user_item_interactions_matrix.loc[user_id, item_id]
        # predict the rating for this movie id using our inputted algorithm
        predicted_rating = algo.predict(user_id, item_id).est
        # appending the movie id, the actual rating, and the predicted rating to our list
        recommendation_list.append((item_id, actual_rating, predicted_rating))
    # sorting by actual rating (index 1 of each tuple) in descending order
    recommendation_list.sort(key=lambda x: x[1], reverse=True)
    # create a dataframe of actual and predicted ratings for this user
    return pd.DataFrame(recommendation_list, columns=['movieId', 'actual_rating', 'predicted_rating'])
Similarity-based recommendation system that was user-based (userId=20):
actual_predicted_ratings = predict_prior_interacted_ratings(rating, 20, optimized_KNN_user)
df = actual_predicted_ratings.melt(id_vars='movieId', value_vars=['actual_rating', 'predicted_rating'])
sns.displot(data=df, x='value', hue='variable', kde=True);
# We overlay a kernel density estimate (KDE) plot on top ('kde=True') so that we can visualize a smoothed version
# of the histogram.
Similarity-based recommendation system that was item-based (userId=20):
actual_predicted_ratings = predict_prior_interacted_ratings(rating, 20, optimized_KNN_item)
df = actual_predicted_ratings.melt(id_vars='movieId', value_vars=['actual_rating', 'predicted_rating'])
sns.displot(data=df, x='value', hue='variable', kde=True);
SVD matrix factorization based recommendation system (userId=20):
actual_predicted_ratings = predict_prior_interacted_ratings(rating, 20, svd_optimized)
df = actual_predicted_ratings.melt(id_vars='movieId', value_vars=['actual_rating', 'predicted_rating'])
sns.displot(data=df, x='value', hue='variable', kde=True);
Analysis: We can see that the distribution of predicted ratings in each of these models generally follows the distribution of actual ratings for this user. The kernel density estimate overlaid on each histogram likewise shows the predicted ratings paralleling the actual ratings, but with more mass concentrated in the central region around ratings of 3-4. Of the three models, the SVD-based one predicts ratings across more of the spectrum, including values close to 1 and 5 as the actual ratings do, whereas the user-based and item-based collaborative filtering systems predict values closer to the center and avoid the extremes near 1 and 5.
While we utilized RMSE above to measure the accuracy of our models, two other popular metrics for evaluating recommendation systems are Precision@k and Recall@k.
After defining a threshold for what counts as a recommendation for a user, e.g., a rating of 3.5 stars or higher (as we will use below), we look at the top k recommendations (e.g., k=5 or k=10). A relevant item is any item whose real rating by the user is at or above our threshold of 3.5, while a recommended item is an item that our algorithm predicted would be at or above this 3.5 threshold for that user (whether or not the user actually rated it that highly).
Precision@k looks at the top k recommendations for a user (i.e. the k items with the highest predicted ratings), counts those predicted at or above our rating threshold as recommended, and determines what proportion of them is relevant to the user. It is calculated in the following way:
{# of recommended items @k that are relevant, i.e. actually at or above the threshold}/{# of total recommended items @k}
In other words, Precision@k tries to determine the proportion of top k recommendations that are relevant. This metric is helpful in making sure we minimize our model's false positives.
Recall@k looks at the number of recommended items @k that were in fact relevant (i.e. the user actually did rate them at or above our threshold), and finds what proportion those correct and relevant recommendations are out of all of the user's relevant items.
{# of recommended items @k that are relevant}/{total # of relevant items}
In other words, Recall@k tries to see how many relevant items actually end up in the top k recommendations. This is helpful if we want to make sure that movies that are in fact relevant are actually appearing in our top k recommendations and not ending up as false negatives by our model.
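A small worked example for a single user may help make the two formulas concrete. The sketch below uses six made-up (predicted, actual) rating pairs with k=3 and the 3.5 threshold. The function name `precision_recall_single_user` and the toy numbers are hypothetical.

```python
def precision_recall_single_user(est_true_pairs, k=3, threshold=3.5):
    """Precision@k and recall@k for one user's (predicted, actual) ratings."""
    # rank the user's items by predicted rating, highest first
    ranked = sorted(est_true_pairs, key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    # relevant: actual rating at or above the threshold (over all items)
    n_rel = sum(true_r >= threshold for _, true_r in ranked)
    # recommended: predicted rating at or above the threshold (within top k)
    n_rec_k = sum(est >= threshold for est, _ in top_k)
    # both recommended and relevant (within top k)
    n_rel_and_rec_k = sum(est >= threshold and true_r >= threshold
                          for est, true_r in top_k)
    precision = n_rel_and_rec_k / n_rec_k if n_rec_k else 0
    recall = n_rel_and_rec_k / n_rel if n_rel else 0
    return precision, recall


# Made-up (predicted rating, actual rating) pairs for one user
toy_pairs = [(4.8, 5.0), (4.2, 3.0), (3.9, 4.0), (3.6, 4.5), (2.5, 4.0), (2.0, 1.0)]
precision, recall = precision_recall_single_user(toy_pairs, k=3)
print(precision, recall)
```

Here the top 3 predictions are all recommended (each is at least 3.5), but only 2 of them are relevant, so precision@3 is 2/3. The user has 4 relevant items in total, so recall@3 is 2/4 = 0.5.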
The following function is taken from the surprise documentation FAQs here, and lets us compute precision@k and recall@k. It creates a dictionary for precisions and recalls, and assigns the computed precision and recall values to each user based on the formulas above.
# defaultdict is needed below (harmless if already imported earlier)
from collections import defaultdict

def precision_recall_at_k(predictions, k=10, threshold=3.5):
    """Return precision and recall at k metrics for each user"""
    # First map the predictions to each user.
    user_est_true = defaultdict(list)
    for uid, _, true_r, est, _ in predictions:
        user_est_true[uid].append((est, true_r))
    precisions = dict()
    recalls = dict()
    for uid, user_ratings in user_est_true.items():
        # Sort user ratings by estimated value
        user_ratings.sort(key=lambda x: x[0], reverse=True)
        # Number of relevant items
        n_rel = sum((true_r >= threshold) for (_, true_r) in user_ratings)
        # Number of recommended items in top k
        n_rec_k = sum((est >= threshold) for (est, _) in user_ratings[:k])
        # Number of relevant and recommended items in top k
        n_rel_and_rec_k = sum(((true_r >= threshold) and (est >= threshold))
                              for (est, true_r) in user_ratings[:k])
        # Precision@K: Proportion of recommended items that are relevant
        # When n_rec_k is 0, Precision is undefined. We here set it to 0.
        precisions[uid] = n_rel_and_rec_k / n_rec_k if n_rec_k != 0 else 0
        # Recall@K: Proportion of relevant items that are recommended
        # When n_rel is 0, Recall is undefined. We here set it to 0.
        recalls[uid] = n_rel_and_rec_k / n_rel if n_rel != 0 else 0
    return precisions, recalls
# KFold is needed below (harmless if already imported earlier)
from surprise.model_selection import KFold
# A cross-validation iterator that will be referenced in our for-loop below:
kf = KFold(n_splits=5)
# List of k values
K = [5, 10]
# List of our 6 models:
models = [knn_user, optimized_KNN_user, knn_item, optimized_KNN_item, svd, svd_optimized]
for k in K:
    for model in models:
        print('>> k={}, model={}'.format(k, model.__class__.__name__))
        p = []
        r = []
        # note: each model is refit on every fold, replacing its earlier fit
        for trainset, testset in kf.split(data):
            model.fit(trainset)
            predictions = model.test(testset, verbose=False)
            precisions, recalls = precision_recall_at_k(predictions, k=k, threshold=3.5)
            # Precision and recall can then be averaged over all users
            p.append(sum(prec for prec in precisions.values()) / len(precisions))
            r.append(sum(rec for rec in recalls.values()) / len(recalls))
        # average over the 5 folds
        print('-----> Precision: ', round(sum(p) / len(p), 3))
        print('-----> Recall: ', round(sum(r) / len(r), 3))
>> k=5, model=KNNBasic
-----> Precision: 0.767
-----> Recall: 0.412
>> k=5, model=KNNBasic
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
-----> Precision: 0.773
-----> Recall: 0.417
>> k=5, model=KNNBasic
-----> Precision: 0.613
-----> Recall: 0.328
>> k=5, model=KNNBasic
-----> Precision: 0.677
-----> Recall: 0.352
>> k=5, model=SVD
-----> Precision: 0.755
-----> Recall: 0.383
>> k=5, model=SVD
-----> Precision: 0.775
-----> Recall: 0.396
>> k=10, model=KNNBasic
-----> Precision: 0.749
-----> Recall: 0.549
>> k=10, model=KNNBasic
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
Computing the msd similarity matrix...
Done computing similarity matrix.
-----> Precision: 0.75
-----> Recall: 0.556
>> k=10, model=KNNBasic
-----> Precision: 0.597
-----> Recall: 0.478
>> k=10, model=KNNBasic
-----> Precision: 0.66
-----> Recall: 0.5
>> k=10, model=SVD
-----> Precision: 0.736
-----> Recall: 0.522
>> k=10, model=SVD
-----> Precision: 0.755
-----> Recall: 0.534
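Because the loop prints only the class name, four of the six entries in the output show up as KNNBasic and have to be told apart by their position in the models list. One way to make the headers self-describing is to key the models by a readable label. The sketch below uses a hypothetical `PlaceholderModel` class in place of the fitted surprise models so that it runs on its own. With the real models, `model.__class__.__name__` would take the place of `model.algo_name`.

```python
class PlaceholderModel:
    """Hypothetical stand-in for a surprise algorithm, used only to keep this sketch runnable."""

    def __init__(self, algo_name):
        self.algo_name = algo_name


# Map a readable label to each model instead of relying on the class name alone
named_models = {
    "knn_user": PlaceholderModel("KNNBasic"),
    "optimized_KNN_user": PlaceholderModel("KNNBasic"),
    "knn_item": PlaceholderModel("KNNBasic"),
    "optimized_KNN_item": PlaceholderModel("KNNBasic"),
    "svd": PlaceholderModel("SVD"),
    "svd_optimized": PlaceholderModel("SVD"),
}

headers = []
for k in [5, 10]:
    for name, model in named_models.items():
        # Same header format as the loop above, with the label added
        headers.append(">> k={}, model={} ({})".format(k, name, model.algo_name))

print("\n".join(headers))
```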
Summarizing the above outputs for each model along with our earlier RMSE scores:
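knn_user (RMSE: 0.9901): precision@5 = 0.767, recall@5 = 0.412; precision@10 = 0.749, recall@10 = 0.549
optimized_KNN_user (RMSE: 0.9529): precision@5 = 0.773, recall@5 = 0.417; precision@10 = 0.75, recall@10 = 0.556
knn_item (RMSE: 0.9908): precision@5 = 0.613, recall@5 = 0.328; precision@10 = 0.597, recall@10 = 0.478
optimized_KNN_item (RMSE: 0.9296): precision@5 = 0.677, recall@5 = 0.352; precision@10 = 0.66, recall@10 = 0.5
svd (RMSE: 0.8948): precision@5 = 0.755, recall@5 = 0.383; precision@10 = 0.736, recall@10 = 0.522
svd_optimized (RMSE: 0.8762): precision@5 = 0.775, recall@5 = 0.396; precision@10 = 0.755, recall@10 = 0.534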
Comments:
We see that no model was the best on every metric when we compared RMSE, precision, and recall at k=5 and k=10. Comparing just our 3 optimized models, the svd_optimized matrix factorization model did best overall: it had the lowest RMSE and the highest precision at both k=5 and k=10, though its recall was not the best of all models (it was still strong).
Our optimized_KNN_item did better than our optimized_KNN_user in terms of RMSE (0.9296 vs. 0.9529), but precision and recall at k=5 and k=10 are better for optimized_KNN_user. Our optimized_KNN_user system had the best recall of all models, and its precision was within 0.005 of svd_optimized.
While our RMSE scores tell us that svd_optimized has the lowest error across all of our predictions, optimized_KNN_user is the strongest on recall and nearly ties svd_optimized on precision. Which metric to prioritize should depend on practical considerations at deployment.
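The comparison can be checked mechanically from the numbers reported in this section. The sketch below hard-codes the RMSE, precision, and recall values copied from the outputs above and picks the best model for each metric (lowest for RMSE, highest otherwise). The names `results`, `metrics`, and `best` are my own choices for illustration.

```python
# (RMSE, precision@5, recall@5, precision@10, recall@10), copied from the outputs above
results = {
    "knn_user": (0.9901, 0.767, 0.412, 0.749, 0.549),
    "optimized_KNN_user": (0.9529, 0.773, 0.417, 0.75, 0.556),
    "knn_item": (0.9908, 0.613, 0.328, 0.597, 0.478),
    "optimized_KNN_item": (0.9296, 0.677, 0.352, 0.66, 0.5),
    "svd": (0.8948, 0.755, 0.383, 0.736, 0.522),
    "svd_optimized": (0.8762, 0.775, 0.396, 0.755, 0.534),
}
metrics = ["RMSE", "precision@5", "recall@5", "precision@10", "recall@10"]

best = {}
for i, metric in enumerate(metrics):
    # Lower is better for RMSE; higher is better for precision and recall
    pick = min if metric == "RMSE" else max
    best[metric] = pick(results, key=lambda name: results[name][i])

for metric, name in best.items():
    print(metric, "->", name)
```

The precision gap between svd_optimized and optimized_KNN_user is small (0.002 at k=5 and 0.005 at k=10), so the ranking on that metric may not be stable across different cross-validation splits.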
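In this case study, we saw three different ways of building recommendation systems: user-based collaborative filtering, item-based collaborative filtering, and matrix factorization (SVD).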
Additionally, we used manual tuning and randomized-search cross-validation to tune our hyperparameters and reduce the RMSE of our similarity-based and matrix factorization collaborative filtering models.
We also used our models to recommend top items for any user, and evaluated their precision and recall @k.
As was noted, there are advantages and disadvantages of these various recommendation systems, which may vary depending on the needs of the company deploying the recommender.