Neighbors-based models [1] predict perference from similar items or similar users. There are two kinds of neighbors-based models: user-based and item-based, which means whether predictions are made by similar users or items. In general, item-based model is better than user-based model since items’ characters are more consistent than users’ preferences.


In this section, KNN models are introduced by user-based forms. Item-based models could be got by transposing positions of users and items.

KNN for Explicit Feedback

For explicit feedback, neighbors-based models predict ratings from similar items or similar users.


Key Hyperparameter Type Description Default
lr Lr float64 learning rate (for baseline) 0.005
reg Reg float64 regularization strength (for baseline) 0.02
n_epochs NEpochs int number of epochs (for baseline) 20
type Type string type for KNN (basic, centered, z_score, baseline) basic
user_based UserBased bool user based if true. otherwise item based true
similarity Similarity string similarity metrics (pearson, cosine, msd) msd
k K int number of neighbors 40
min_k MinK int least number of neighbors 1
shrinkage Shrinkage int shrinkage strength of similarity 100


Items rated by two different users \(u\) and \(v\) are represented by \(I_u\) and \(I_v\). Users rated two different items \(i\) and \(j\) are represented by \(U_i\) and \(U_j\). The rating that user \(u\) gives to item \(i\) is \(r_{ui}\) and the predicted rating is \(\hat r_{ui}\).


Similarity metrics define nearest neighbors. There are several most used similarity functions:


\[\cos(u,v)=\frac{\sum\limits_{k\in|I_u\cap I_v|}r_{uk}\cdot r_{vk}}{\sqrt{\sum\limits_{k\in|I_u\cap I_v|}r_{uk}^2}\cdot\sqrt{\sum\limits_{k\in|I_u\cap I_v|}r_{vk}^2}}\]


Pearson similarity is similar to cosine similarity but ratings are subtracted by means first.

\[\text{pearson}(u,v)=\frac{\sum\limits_{k\in|I_u\cap I_v|}(r_{uk}-\tilde r_u)\cdot (r_{vk}-\tilde r_v)}{\sqrt{\sum\limits_{k\in|I_u\cap I_v|}(r_{uk}-\tilde r_u)^2}\cdot\sqrt{\sum\limits_{k\in|I_u\cap I_v|}(r_{vk}-\tilde r_v)^2}}\]

where \(\tilde r_u\) is the mean of ratings rated by the user \(u\):

\[\tilde r_u = \sum_{k\in I_u} r_{uk}\]

Mean Square Distance

The Mean Square Distance is

\[\text{msd}(u,v)=\frac{1}{|I_u\cap I_v|}\sum_{k\in|I_u\cap I_v|}(r_{uk}-r_{vk})^2\]

Then, the Mean Square Distance Similarity is

\[\text{MSDSim}(u, v) = \frac{1}{\text{msd}(u, v) + 1}\]


A rating could be predict by k nearest neighbors \(\mathcal N_k(u)\) (k users with top k similarities to user \(u\)).

\[\hat r_{ui}=\frac{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)r_{vi}}{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)}\]

The basic KNN prediction has some problem. There are more advanced methods which achieve higher accuracy.

KNN with Mean

Some users tend to give higher ratings but some users tend to give lower ratings. It’s resonable to substract ratings with the mean.

\[\hat r_{ui}=\tilde r_u+\frac{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)(r_{vi}-\tilde r_v)}{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)}\]

where \(\tilde r_v\) is the mean of \(v\)-th user’s ratings.

KNN with Z-score

Different users give ratings with different scales, the solution is to standardlize ratings.

\[\hat r_{ui}=\tilde r_u+\sigma(r_u)\frac{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)\frac{r_{vj}-\tilde r_v}{\sigma(r_v)}}{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)}\]

where \(\sigma(r_v)\) is the standard deviation of \(v\)-th user’s ratings.

KNN with Baseline

Ratings could be substract by biases from baseline model as well.

\[\hat r_{ui}=b_u+\frac{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)(r_{vi}- b_v)}{\sum_{v\in \mathcal N_k(u)}\text{sim}(u,v)}\]

where \(b_u\) is the bias comes from the baseline model \(\hat r_{ui}=b+b_u+b_i\). The KNN model with baseline is the best model since biases are used.

KNN for Implicit Feedback

For implicit feedback [2], neighbors-based models predict wheater a users will interact with a item from similar items or similar users.


There are no hyperparameters for implicit version KNN.


Items interacted with user \(u\) are represented by \(I^+_u\). Users interacted with item \(i\) are represented by \(U^+_i\). The confidence of predicting user \(u\) will interact with item \(i\) is \(\hat x_{ui}\).


The similarity for implicit feedback is slightly different.

\[c_{i, j}^{\operatorname{cosine}} :=\frac{\left|U_{i}^{+} \cap U_{j}^{+}\right|}{\sqrt{\left|U_{i}^{+}\right| \cdot\left|U_{j}^{+}\right|}}\]


The prediction is given by the sum of similarities.

\[\hat{x}_{u i}=\sum_{l \in I_{u}^{+} \wedge l \neq i} c_{i l}\]


[1]Desrosiers, Christian, and George Karypis. “A comprehensive survey of neighborhood-based recommendation methods.” Recommender systems handbook. Springer, Boston, MA, 2011. 107-144.
[2]Rendle, Steffen, et al. “BPR: Bayesian personalized ranking from implicit feedback.” Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, 2009.