# CoClustering¶

CoClustering [1] predicts ratings by clustering users and items.

## Definition¶

Let $$\mathcal{U} = \{u\}^m_{u=1}$$ be the set of users such that $$|\mathcal{U}|=m$$ and $$\mathcal{P} = \{i\}^n_{i=1}$$ be the set of items such that $$|\mathcal{P}|=n$$. Let $$A$$ be the $$m\times n$$ ratings matrix such that $$A_{ui}$$ is the rating of the user $$u$$ and let $$W$$ be the $$m\times n$$ matrix corresponding to the confidence of the ratings in $$A$$. In the absence of explicit confidence information, we assume $$W_{ij} = 1$$ when the rating is known and $$0$$ otherwise.

Let $$\rho:\{1,\cdots,m\} \rightarrow \{1,\cdots,k\}$$ and $$\gamma:\{1,\cdots,n\} \rightarrow \{1,\cdots,l\}$$ denote the user and item clustering where $$k$$ and $$l$$ are number of user and item clusters.

$\hat{A}_{ui}=A^{COC}_{gh}+(A^R_u-A^{RC}_g)+(A^C_i-A^{CC}_h)$

where $$g=\rho(u)$$, $$h=\gamma(i)$$ and $$A^R_u$$, $$A^C_i$$ are the average ratings of user $$u$$ and item $$i$$, and $$A^{COC}_{gh}$$, $$A^R_g$$ and $$A^{CC}_h$$ are the average ratings of the corresponding cocluster, user-cluster and item-cluster respectively, i.e.,

$A_{g h}^{C O C}=\frac{\sum_{u^{\prime} | \rho\left(u^{\prime}\right)=g} \sum_{i^{\prime} | \gamma\left(i^{\prime}\right)=h} A_{u^{\prime} i^{\prime}}}{\sum_{u^{\prime} | \rho\left(u^{\prime}\right)=g} \sum_{i^{\prime} | \gamma\left(i^{\prime}\right)=h} W_{u^{\prime} i^{\prime}}}$
$A_{g}^{R C}=\frac{\sum_{u^{\prime} | \rho\left(u^{\prime}\right)=g} \sum_{i^{\prime}=1}^{n} A_{u^{\prime} i^{\prime}}}{\sum_{u^{\prime} | \rho\left(u^{\prime}\right)=g} \sum_{i^{\prime}=1}^{n} W_{u^{\prime} i^{\prime}}}$
$A_{h}^{C C}=\frac{\sum_{u^{\prime}=1}^{m} \sum_{i^{\prime} | \gamma\left(i^{\prime}\right)=h} A_{u^{\prime} i^{\prime}}}{\sum_{u^{\prime}=1}^{m} \sum_{i^{\prime} | \gamma\left(i^{\prime}\right)=h} W_{u^{\prime} i^{\prime}}}$
$A_{u}^{R}=\frac{\sum_{i^{\prime}=1}^{n} A_{u i^{\prime}}}{\sum_{i^{\prime}=1}^{n} W_{u i^{\prime}}} A_{i}^{C}=\frac{\sum_{u^{\prime}=1}^{m} A_{u^{\prime} i}}{\sum_{u^{\prime}=1}^{m} W_{u^{\prime} i}}$

Using $$\hat A$$, we can now pose the prediction of unknown ratings as a co-clustering problem where we seek to find the optimal user and item clustering $$(\rho,\gamma)$$ such that the approximation error of $$A$$ (which is a function of $$(\rho,\gamma)$$) with respect to the known ratings of $$A$$ is minimized, i.e.,

$\min _{(\rho, \gamma)} \sum_{u=1}^{m} \sum_{i=1}^{n} W_{u i}\left(A_{u i}-\hat{A}_{u i}\right)^{2}$

## Training¶

Input: Ratings Matrix A , Non-zeros matrix W , #row clusters $$l$$, #col. clusters $$k$$.

Output: Locally optimal co-clustering $$(\rho,\gamma)$$ and averages $$A^{COC}$$, $$A^{RC}$$, $$A^{CC}$$, $$A^R$$ and $$A^C$$.

Method:

• Randomly initialize $$(\rho,\gamma)$$.

• repeat

• Compute averages $$A^{COC}$$, $$A^{RC}$$, $$A^{CC}$$, $$A^R$$ and $$A^C$$.

• Update row cluster assignments

$\rho(u)=\underset{1 \leq g \leq k}{\operatorname{argmin}} \sum_{i=1}^{n} W_{u i} \left( A_{u i} - A_{g \gamma(j)}^{C O C} -A_{i}^{R}+A_{g}^{R C} -A_{j}^{C}+A_{\gamma(j)}^{C C} \right)^{2}, \quad 1 \leq i \leq m$
• Update column cluster assignments

$\gamma(j)=\underset{1 \leq h \leq l}{\operatorname{argmin}} \sum_{i=1}^{m} W_{i j}\left(A_{i j}-A_{\rho(i) h}^{C O C}-A_{i}^{R}+A_{\rho(i)}^{R C}-A_{j}^{C}+A_{h}^{C C} \right)^{2}, \quad 1 \leq j \leq n$
• until convergence

## Prediction¶

Input:  $$A^{COC}$$, $$A^{RC}$$, $$A^{CC}$$, $$A^R$$ , $$A^C$$, $$A^{glob}$$, co-clustering $$(\rho,\gamma)$$ user set $$\mathcal U$$, item set $$\mathcal P$$, user $$u$$, item $$i$$.

Output: Rating $$r$$

Method:

• Case $$u\in\mathcal U$$ and $$i\in\mathcal P$$: // old user-old item
$g \leftarrow \rho(i), h \leftarrow \gamma(j)$
$r \leftarrow A_{i}^{R}+A_{j}^{C}-A_{g}^{R C}-A_{h}^{C C}+A_{g h}^{C o C}$
• Case $$u\in\mathcal U$$ and $$i\notin\mathcal P$$: // old user-new item
$g \leftarrow \rho(i), r \leftarrow A_{i}^{R}$
• Case $$u\notin\mathcal U$$ and $$i\in\mathcal P$$: // new user-old item
$h \leftarrow \gamma(j), r \leftarrow A_{j}^{C}$
• Case $$u\notin\mathcal U$$ and $$i\notin\mathcal P$$: // new user-new item
$r \leftarrow A^{g l o b}$

## References¶

 [1] George, Thomas, and Srujana Merugu. “A scalable collaborative filtering framework based on co-clustering.” Data Mining, Fifth IEEE international conference on. IEEE, 2005.