Data

User feedback typically comes in two kinds: explicit feedback and implicit feedback.

Explicit Feedback

With explicit feedback, users state their preferences through ratings. For example, a user gives 5 stars to a movie they like and 1 star to a movie they dislike. A user’s taste can be represented by the ratings they give.

The dataset of explicit feedback is defined as:

\[D = \{ <u,i,r_{ui}> \}\]

The dataset contains multiple tuples \(<u,i,r_{ui}>\). Each tuple consists of a user ID \(u\), an item ID \(i\), and a rating \(r_{ui}\). In tabular form, such a dataset looks like:

User Item Rating
0 1 5
0 2 4
1 1 3
1 2 4

Example: MovieLens [1]
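
As a rough illustration of the definition above, the four example rows can be held as a list of tuples and indexed by \((u, i)\). This is only a sketch: the names `Rating`, `D`, `r`, and `mean_rating` are made up for this example and do not come from any particular library.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """One explicit-feedback tuple <u, i, r_ui> (hypothetical helper type)."""
    user: int
    item: int
    rating: float


# The four rows of the example table.
D = [Rating(0, 1, 5), Rating(0, 2, 4), Rating(1, 1, 3), Rating(1, 2, 4)]

# Index the ratings by (user, item) so that r_ui can be looked up directly.
r = {(t.user, t.item): t.rating for t in D}


def mean_rating(user: int) -> float:
    """Average of all ratings given by one user."""
    values = [t.rating for t in D if t.user == user]
    return sum(values) / len(values)
```

Because a rating carries both positive and negative signal, simple statistics such as a per-user mean are already meaningful here.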

Implicit Feedback

With implicit feedback, only positive feedback is collected. For example, an implicit feedback record may indicate that a user watched a movie. However, no record indicates that a user dislikes a movie. Worse, an implicit record does not even show that the user liked the movie, only that they interacted with it. Implicit feedback is inherently noisy, but it is what is commonly available in practice.

The dataset of implicit feedback is defined as:

\[D = \{ <u,i> \}\]

The dataset contains multiple tuples \(<u,i>\). Each tuple consists of a user ID \(u\) and an item ID \(i\). In tabular form, such a dataset looks like:

User Item
0 1
0 2
1 1
1 2

Example: Last.FM [2]
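
Since an implicit dataset records only which pairs were observed, one plausible in-memory form is a set of \((u, i)\) pairs. The sketch below uses made-up names (`D`, `items_by_user`, `observed`) and shows that a missing pair means "unknown" rather than "disliked".

```python
from collections import defaultdict

# The four rows of the example table, stored as a set of (user, item) pairs.
D = {(0, 1), (0, 2), (1, 1), (1, 2)}

# Group the observed items by user.
_grouped = defaultdict(set)
for u, i in D:
    _grouped[u].add(i)
items_by_user = dict(_grouped)


def observed(user: int, item: int) -> bool:
    """True if the interaction was recorded.

    False only means that no interaction was seen; it is not
    evidence that the user dislikes the item.
    """
    return (user, item) in D
```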

Weighted Implicit Feedback

With weighted implicit feedback, positive feedback is collected together with a weight. For example, a user \(u\) played a game \(i\) for \(r_{ui}\) hours. No rating states whether this user likes or dislikes the game. The weight \(r_{ui}\) is a measure of confidence rather than of preference.

The definition has the same form as that of explicit feedback; only the meaning of \(r_{ui}\) differs:

\[D = \{ <u,i,r_{ui}> \}\]

The dataset contains multiple tuples \(<u,i,r_{ui}>\). Each tuple consists of a user ID \(u\), an item ID \(i\), and a weight \(r_{ui}\). In tabular form, such a dataset looks like:

User Item Weight
0 1 100
0 2 20
1 1 5
1 2 50

Example: Steam Video Games [3]
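
To make the distinction between confidence and preference concrete, the sketch below separates each weight into a binary preference and a confidence value. The linear form `1 + alpha * r_ui` is one common convention in the implicit-feedback literature, not something this document prescribes, and `ALPHA = 0.1` is an arbitrary value chosen only for illustration. The names `D`, `split`, and `signals` are likewise hypothetical.

```python
# The four rows of the example table: (user, item, weight).
D = [(0, 1, 100.0), (0, 2, 20.0), (1, 1, 5.0), (1, 2, 50.0)]

# Assumed scaling factor; not taken from the document.
ALPHA = 0.1


def split(weight: float, alpha: float = ALPHA) -> tuple:
    """Turn a raw weight into (preference, confidence).

    preference is 1 for any observed interaction and 0 otherwise;
    confidence grows with the weight (assumed linear form).
    """
    preference = 1 if weight > 0 else 0
    confidence = 1.0 + alpha * weight
    return preference, confidence


# Map every observed (user, item) pair to its (preference, confidence).
signals = {(u, i): split(w) for u, i, w in D}
```

Under this assumption every observed pair has the same preference, and a larger weight only raises how strongly that observation is trusted.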