Slope One

Slope One [1] predicts a user's rating of an item from the average deviations (rating differences) between each pair of items.

Definition

The ratings from a given user, called an evaluation, are represented as an incomplete array \(u\), where \(u_i\) is the rating this user gives to item \(i\). The subset of items that are rated in \(u\) is \(S(u)\). The set of all evaluations in the training set is \(\chi\). The number of elements in a set \(S\) is \(\operatorname{card}(S)\). The average of the ratings in an evaluation \(u\) is denoted \(\overline u\). The set \(S_i(\chi)\) is the set of all evaluations \(u\in\chi\) that contain item \(i\) (\(i \in S(u)\)). Likewise, \(S_{j, i}(\chi)\) is the set of all evaluations that contain both items \(j\) and \(i\). Predictions, which we write \(P(u)\), form a vector in which each component is the prediction for one item; predictions depend implicitly on the training set \(\chi\).

Training

The Slope One [1] scheme takes into account information both from other users who rated the same item and from the other items rated by the same user.

Given a training set \(\chi\) and any two items \(j\) and \(i\) with ratings \(u_j\) and \(u_i\) respectively in some user evaluation \(u\) (written \(u \in S_{j, i}(\chi)\)), the average deviation of item \(i\) with respect to item \(j\) is defined as:

\[\operatorname{dev}_{j, i}=\sum_{u \in S_{j, i}(\chi)} \frac{u_{j}-u_{i}}{\operatorname{card}\left(S_{j, i}(\chi)\right)}\]
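The training step above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `average_deviations`, the dict-of-dicts layout for \(\chi\) (user to item to rating), and the sample ratings are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import permutations


def average_deviations(evaluations):
    """Return {(j, i): (dev_ji, card_ji)} for every ordered item pair
    that is co-rated in at least one evaluation.

    `evaluations` is an assumed layout: {user: {item: rating}}.
    """
    sums = defaultdict(float)   # running sum of u_j - u_i over S_{j,i}(chi)
    counts = defaultdict(int)   # card(S_{j,i}(chi))
    for u in evaluations.values():
        # every ordered pair of distinct items rated in this evaluation
        for j, i in permutations(u, 2):
            sums[(j, i)] += u[j] - u[i]
            counts[(j, i)] += 1
    return {pair: (sums[pair] / counts[pair], counts[pair]) for pair in sums}


# Hypothetical training set: three users, three items.
training = {
    "alice": {"a": 5, "b": 3, "c": 2},
    "bob": {"a": 3, "b": 4},
    "carol": {"b": 2, "c": 5},
}
dev = average_deviations(training)
print(dev[("a", "b")])  # (0.5, 2): ((5 - 3) + (3 - 4)) / 2 over two users
```

Pairs that no user has co-rated are simply absent from the result, which corresponds to \(\operatorname{card}\left(S_{j, i}(\chi)\right)=0\). Storing the count next to each deviation is a design choice of this sketch; the deviation alone is enough for the predictors that follow.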

Predict

Given that \(\operatorname{dev}_{j, i}+u_{i}\) is a prediction for \(u_j\) given \(u_i\), a reasonable predictor might be the average of all such predictions:

\[P(u)_{j}=\frac{1}{\operatorname{card}\left(R_{j}\right)} \sum_{i \in R_{j}}\left(\operatorname{dev}_{j, i}+u_{i}\right)\]
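As a small worked illustration (the numbers are hypothetical, not from the source), suppose the training set holds three evaluations over items \(a, b, c\): the first rates them \(5, 3, 2\), the second rates \(a=3, b=4\), and the third rates \(b=2, c=5\). To predict the third user's rating of \(a\), only the first two evaluations contribute to the deviations involving \(a\):

\[\operatorname{dev}_{a, b}=\frac{(5-3)+(3-4)}{2}=0.5, \qquad \operatorname{dev}_{a, c}=\frac{5-2}{1}=3\]

With \(u_b=2\) and \(u_c=5\), averaging the two single-item predictions gives

\[P(u)_{a}=\frac{(0.5+2)+(3+5)}{2}=5.25\]

Note that nothing in the formula keeps the result inside the rating scale; if the scale here were 1 to 5, the value would need to be clipped by the caller.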

where \(R_{j}=\left\{i \mid i \in S(u), i \neq j, \operatorname{card}\left(S_{j, i}(\chi)\right)>0\right\}\) is the set of all relevant items. For a dense enough data set, that is, one where \(\operatorname{card}\left(S_{j, i}(\chi)\right)>0\) for almost all \(i,j\), most of the time \(R_{j}=S(u)\) for \(j \notin S(u)\) and \(R_{j}=S(u)-\{j\}\) when \(j \in S(u)\). Since \(\overline{u}=\sum_{i \in S(u)} \frac{u_{i}}{\operatorname{card}(S(u))} \simeq \sum_{i \in R_{j}} \frac{u_{i}}{\operatorname{card}\left(R_{j}\right)}\) for most \(j\), the prediction formula simplifies to

\[P^{S 1}(u)_{j}=\overline{u}+\frac{1}{\operatorname{card}\left(R_{j}\right)} \sum_{i \in R_{j}} \operatorname{dev}_{j, i}\]
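The two predictors can be compared side by side with a short Python sketch. Everything named here is an assumption for illustration: the helper names (`average_deviations`, `relevant_items`, `predict_basic`, `predict_slope_one`), the dict-of-dicts data layout, the sample ratings, and the choice to return `None` when \(R_j\) is empty (the formulas leave that case undefined).

```python
from collections import defaultdict
from itertools import permutations


def average_deviations(evaluations):
    """Return {(j, i): dev_ji} for every ordered pair co-rated at least once."""
    sums, counts = defaultdict(float), defaultdict(int)
    for u in evaluations.values():
        for j, i in permutations(u, 2):
            sums[(j, i)] += u[j] - u[i]
            counts[(j, i)] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def relevant_items(u, j, dev):
    """R_j: items rated in u, other than j, that were co-rated with j in training."""
    return [i for i in u if i != j and (j, i) in dev]


def predict_basic(u, j, dev):
    """P(u)_j: mean of dev_ji + u_i over R_j, or None if R_j is empty."""
    r_j = relevant_items(u, j, dev)
    if not r_j:
        return None
    return sum(dev[(j, i)] + u[i] for i in r_j) / len(r_j)


def predict_slope_one(u, j, dev):
    """P^S1(u)_j: mean rating of u plus mean of dev_ji over R_j, or None."""
    r_j = relevant_items(u, j, dev)
    if not r_j:
        return None
    u_bar = sum(u.values()) / len(u)  # average over all of S(u), not just R_j
    return u_bar + sum(dev[(j, i)] for i in r_j) / len(r_j)


# Hypothetical training set: three users, three items.
training = {
    "alice": {"a": 5, "b": 3, "c": 2},
    "bob": {"a": 3, "b": 4},
    "carol": {"b": 2, "c": 5},
}
dev = average_deviations(training)

carol = training["carol"]
print(predict_basic(carol, "a", dev))      # 5.25
print(predict_slope_one(carol, "a", dev))  # 5.25, since R_a == S(u) here
```

The two formulas agree exactly whenever \(R_j = S(u)\), as in the call above. They diverge when some rated item has no co-ratings with \(j\): for a hypothetical user `{"a": 4, "d": 1}`, where item `d` never appears in training, `predict_basic` for item `c` returns 1.0 while `predict_slope_one` returns -0.5, because \(\overline{u}\) still averages over the unrelated item. That gap is the cost of the \(\simeq\) approximation on sparse data.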

References

[1] Lemire, Daniel, and Anna Maclachlan. “Slope one predictors for online rating-based collaborative filtering.” Proceedings of the 2005 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2005.