Derivation of least squares containing vectors

In summary, the thread discusses how to derive a result from a paper: the closed-form update that minimizes a cost function in alternating least squares. The result is obtained by taking the derivative of the cost function with respect to the factor vector of one specific user, setting it to zero, and rearranging to solve for the vector x_u. The final solution is given in Equation \eqref{eq:xu-diff}.
  • #1
nobben
Hi,

I'm trying to derive a result from this paper:

http://www.research.yahoo.net/files/HuKorenVolinsky-ICDM08.pdf

given cost function

[tex]
\min_{x_* , y_* } \sum_{u, i} c_{ui} (p_{ui} - x^T_u y_i)^2 + \lambda \left( \sum_u ||x_u||^2 + \sum_i ||y_i||^2 \right)
[/tex]

Where both x_u and y_i are factor vectors in ℝ^f.

I want to find the minimum using alternating least squares. Therefore I fix the y_i and take the derivative with respect to x_u.

c_ui, p_ui and λ are constants.
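To make the setup concrete, here is a minimal Python/NumPy sketch of this cost function as I read it; the array names (X, Y, C, P), the small dimensions and the random data are made up purely for illustration.

[code]
import numpy as np

def implicit_als_cost(X, Y, C, P, lam):
    """Cost from the paper: sum_{u,i} c_ui (p_ui - x_u^T y_i)^2
    plus lambda * (sum_u ||x_u||^2 + sum_i ||y_i||^2).
    X: (m, f) user factors, Y: (n, f) item factors,
    C: (m, n) confidences, P: (m, n) preferences."""
    residual = P - X @ Y.T                      # (m, n) matrix of p_ui - x_u^T y_i
    fit = np.sum(C * residual**2)               # confidence-weighted squared errors
    reg = lam * (np.sum(X**2) + np.sum(Y**2))   # L2 regularization of both factor sets
    return fit + reg

# Tiny example with made-up dimensions.
rng = np.random.default_rng(0)
m, n, f = 4, 5, 3
X = rng.normal(size=(m, f))
Y = rng.normal(size=(n, f))
C = 1.0 + rng.random(size=(m, n))                   # confidences c_ui >= 1
P = (rng.random(size=(m, n)) > 0.5).astype(float)   # binary preferences p_ui
print(implicit_als_cost(X, Y, C, P, lam=0.1))
[/code]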

They derive the following:

[tex]x_u = \left( Y^TC^uY + \lambda I \right) ^{-1} Y^T C^u p \left( u \right)[/tex]

I'm unsure how to reproduce this result...

I don't know whether I should differentiate with respect to the whole vector x_u, or with respect to a single entry x_uk and then try to map the result back to an expression for the whole vector.

Any pointers are very much appreciated.
 
  • #2
I think I solved it. Including solution below if anyone is interested!




First we define a variable S_u below, which is the cost function from the first post but with u fixed, so that we are calculating the prediction errors for one specific user.

\begin{equation}
\label{eq:cost-u}
S_u = \sum_{i} c_{ui} (p_{ui} - x^T_u y_i)^2 + \lambda \left( || x_u ||^2 + \sum_i || y_i||^2 \right)
\end{equation}

We now find the derivative of S_u with respect to x_u.
To help clarify the process we introduce a variable \epsilon_{i} below, which is the prediction error for item i with respect to user u.
\begin{equation}
\label{eq:pred-error}
\epsilon_{i} = p_{i} - \sum_{j=1}^{f} x_{uj} y_{ij}
\end{equation}
We also calculate the derivative of \epsilon_{i} with respect to entry j of the user factor, x_{uj}.
\begin{equation}
\label{eq:error-deriv}
\frac{d \epsilon_{i}}{d x_{uj}} = - y_{ij}
\end{equation}
We now rewrite using the \epsilon_i variable.
There are m users and n items, and the factor vectors have length f.
\begin{equation}
S_u = \sum_{i=1}^{n} \left( c_i \epsilon_i^2 + \lambda \sum_{k=1}^f y_{ik}^2 \right) + \lambda \sum_{p=1}^f x_{up}^2
\end{equation}
Then we find the derivative.
\begin{equation}
\forall[1 \leq j \leq f] : \frac{d S_u}{d x_{uj}} = 2 \sum_{i=1}^n \left( c_i \epsilon_i \frac{d \epsilon_i}{d x_{uj}} \right) + 2 \lambda x_{uj}
\end{equation}
Substitute \epsilon_i and \frac{d \epsilon_i}{d x_{uj}} from Equations \eqref{eq:pred-error} and \eqref{eq:error-deriv}.
\begin{equation}
\forall[1 \leq j \leq f] : \frac{d S_u}{d x_{uj}} = 2 \sum_{i=1}^n \left( c_i \left( p_{i} - \sum_{k=1}^{f} x_{uk} y_{ik} \right) \left( - y_{ij} \right) \right) + 2 \lambda x_{uj}
\end{equation}
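As a quick sanity check on this expression, here is a small Python/NumPy sketch that compares it against a central finite-difference approximation of S_u for one fixed user; the names (c, p, Y, x) and dimensions are made up, and the constant \lambda \sum_i ||y_i||^2 term is dropped since it does not affect the derivative.

[code]
import numpy as np

rng = np.random.default_rng(1)
n, f = 6, 3                      # made-up numbers of items and factors
c = 1.0 + rng.random(n)          # confidences c_i for the fixed user u
p = rng.random(n)                # preferences p_i for the fixed user u
Y = rng.normal(size=(n, f))      # item factors y_i as rows
x = rng.normal(size=f)           # user factor x_u
lam = 0.1

def S_u(x):
    # Cost for the fixed user u (without the constant y-regularization term).
    eps = p - Y @ x
    return np.sum(c * eps**2) + lam * np.sum(x**2)

# Analytic gradient: entry j is 2 * sum_i c_i (p_i - sum_k x_uk y_ik)(-y_ij) + 2*lam*x_uj.
grad_analytic = -2 * Y.T @ (c * (p - Y @ x)) + 2 * lam * x

# Central finite differences for comparison.
h = 1e-6
grad_numeric = np.array([
    (S_u(x + h * e) - S_u(x - h * e)) / (2 * h)
    for e in np.eye(f)
])
print(np.allclose(grad_analytic, grad_numeric, atol=1e-5))  # expected: True
[/code]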
Set the derivative to 0.
\begin{equation}
\forall[1 \leq j \leq f] : 0 = 2 \sum_{i=1}^n \left( c_i \left( p_{i} - \sum_{k=1}^{f} x_{uk} y_{ik} \right) \left( - y_{ij} \right) \right) + 2 \lambda x_{uj}
\end{equation}
Divide by 2 and rearrange.
\begin{equation}
\forall[1 \leq j \leq f] : \sum_{i=1}^n \sum_{k=1}^{f} y_{ij} c_i y_{ik} x_{uk} + \lambda x_{uj} = \sum_{i=1}^n c_i p_{i} y_{ij}
\end{equation}
Rewrite using matrix notation.
\begin{eqnarray}
(Y^TC^uY + \lambda I) x_u & = & Y^T C^u p(u) \\
\label{eq:xu-diff}
x_u & = & (Y^TC^uY + \lambda I)^{-1}Y^T C^u p(u)
\end{eqnarray}
Y is the $n \times f$ matrix whose rows are the item factors $y_i^T$.
$C^u$ is the diagonal matrix containing the confidence values where $C^u_{ii} = c_{ui}$.
The vector $p(u)$ contains all the preference values $p_{ui}$ by user $u$.
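For completeness, here is a small Python/NumPy sketch of this final step: it builds Y, C^u and p(u) with made-up values, solves the linear system from Equation \eqref{eq:xu-diff}, and checks that the gradient of S_u vanishes at the solution. The variable names and data are only illustrative.

[code]
import numpy as np

rng = np.random.default_rng(2)
n, f = 6, 3                                 # made-up numbers of items and factors
Y = rng.normal(size=(n, f))                 # item-factor matrix Y (rows are y_i^T)
c_u = 1.0 + rng.random(n)                   # confidences c_ui for this user
p_u = (rng.random(n) > 0.5).astype(float)   # preferences p_ui for this user
lam = 0.1

Cu = np.diag(c_u)                           # C^u, diagonal confidence matrix
A = Y.T @ Cu @ Y + lam * np.eye(f)
b = Y.T @ Cu @ p_u
x_u = np.linalg.solve(A, b)                 # x_u = (Y^T C^u Y + lam I)^{-1} Y^T C^u p(u)

# The gradient of S_u should vanish at x_u (up to floating-point error):
grad = -2 * Y.T @ (c_u * (p_u - Y @ x_u)) + 2 * lam * x_u
print(np.allclose(grad, 0.0))               # expected: True
[/code]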
 

Related to Derivation of least squares containing vectors

What is the derivation of least squares containing vectors?

The derivation of least squares containing vectors is a mathematical method used to find the best fit line or plane for a set of data points. It minimizes the sum of the squared distances between the data points and the line or plane, providing a way to quantify the accuracy of the fit.

Why is least squares used for regression analysis?

Least squares is used for regression analysis because it provides a way to find the line or plane that best fits a set of data points. This line or plane can then be used to make predictions about future data points. Additionally, the sum of the squared distances is a common measure of error, making it easy to compare different models.

What are the assumptions made in the derivation of least squares?

The derivation of least squares makes several assumptions, including: the data follows a linear trend, the errors are normally distributed, the errors have a mean of zero, and the errors are independent of each other. Violations of these assumptions can affect the accuracy and reliability of the least squares method.

What is the difference between simple and multiple linear regression?

Simple linear regression involves finding the best fit line for a set of data points with only one independent variable. Multiple linear regression, on the other hand, involves finding the best fit plane for a set of data points with multiple independent variables. Multiple linear regression is more complex and can provide a more accurate model, but it also requires additional assumptions and calculations.

How are the coefficients calculated in the derivation of least squares?

The coefficients in the derivation of least squares are calculated using the method of ordinary least squares, which involves minimizing the sum of the squared distances between the data points and the line or plane. This is typically done using calculus and linear algebra. The resulting coefficients represent the slope and intercept of the line, or the coefficients and intercept of the plane.
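As a concrete illustration of this last point, here is a minimal Python/NumPy sketch of ordinary least squares for a line y ≈ slope·x + intercept; the data and variable names are invented for the example.

[code]
import numpy as np

# Made-up data roughly following y = 2x + 1 plus noise.
rng = np.random.default_rng(3)
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Design matrix with a column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Ordinary least squares: minimize ||A @ coeffs - y||^2.
coeffs, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs
print(slope, intercept)   # should come out close to 2 and 1
[/code]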
