
I am having trouble trying to show that this linear regression summation:

$J(w) = \sum^m_{i=1} u_i (w^T x_i - y_i)^2$

can be rewritten in the following matrix form:

$J(w) = (Xw - y)^T U(Xw - y)$

Apparently it's possible and I do not know where the $U$ comes from. Any help would be appreciated.

1 Answer

Here the subscript denotes the $i$-th column of the corresponding matrix. Rewrite $J(w)$ as $$ \sum^m_{i=1} (w^T x_i - y_i) u_i (w^T x_i - y_i) $$ Then notice that $w^Tx_i = x_i^Tw$ is a scalar quantity (the dot product of two vectors), so $$ \sum^m_{i=1} (x_i^Tw - y_i) u_i (x_i^Tw - y_i) $$ Each $u_i$ is also a scalar, which suggests collecting the weights into a diagonal matrix $U$. The structure then becomes $$ \pmatrix{x_1^Tw-y_1 &\ldots &x_m^Tw-y_m}\pmatrix{u_1 \\&u_2\\&&\ddots\\&&&u_m}\pmatrix{x_1^Tw-y_1 \\\vdots \\x_m^Tw-y_m} $$ which is exactly $$ J(w) = (X^Tw-y)^T U(X^Tw-y) $$ This matches the form in the question up to a typo: with $x_i$ taken as the $i$-th column of $X$, the residual vector is $X^Tw - y$ rather than $Xw - y$ (the latter is correct under the other common convention, where $x_i$ denotes the $i$-th row of $X$).
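As a quick sanity check (not part of the original answer), the identity can be verified numerically on random data with NumPy, using the answer's convention that $x_i$ is the $i$-th column of $X$:

```python
import numpy as np

# Verify sum_i u_i (w^T x_i - y_i)^2 == (X^T w - y)^T U (X^T w - y),
# where U = diag(u) and x_i is the i-th COLUMN of X.
rng = np.random.default_rng(0)
d, m = 3, 5                        # d features, m samples
X = rng.standard_normal((d, m))    # columns x_1, ..., x_m
w = rng.standard_normal(d)
y = rng.standard_normal(m)
u = rng.random(m)                  # positive weights u_1, ..., u_m

# Summation form
J_sum = sum(u[i] * (w @ X[:, i] - y[i]) ** 2 for i in range(m))

# Matrix form
r = X.T @ w - y                    # residual vector X^T w - y
J_mat = r @ np.diag(u) @ r

assert np.isclose(J_sum, J_mat)
```

Both expressions agree to floating-point precision, confirming that $U$ is simply the diagonal matrix of the weights $u_i$.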

  • The typo is the transpose, i.e. $(X^T w - y)^T$? (2011-09-23)
  • Also, how did you turn $(x_i^T w - y_i)$ into a row and a column vector? (2011-09-23)
  • It is an element-wise operation: to compute the $i$-th element of $X^Tw$ you simply take the $i$-th row of $X^T$, which is denoted by $x_i^T$. I hope it is slightly clearer now. (2011-09-23)
  • @gprime You are very welcome. (2011-09-28)