2

Well, I'm reading a paper and this simple least-squares equation is not making sense to me. This is the equation:

$\mathbf a=\underset{\mathbf a}{\arg\min}\sum_{i=1}^m\left(\mathbf a^T\mathbf x_i-y_i\right)^2$

A little background on this equation:

Basically, this equation is used to find a mapping that takes some high-dimensional data (x) and maps it to some low-dimensional data (y). The mapping transformation matrix (a) is what we have to find. The mapping should be such that we minimize the error between the original and projected data (as you can probably tell).

Inside the summation, y and x are being accessed serially, column by column, so $y_i$ and $x_i$ refer to individual columns of the matrices y and x, respectively. Also, as you can probably tell, y and x contain the same number of columns, m; only their numbers of rows differ.

Now here's my question. As far as I can tell (and I'm virtually sure I'm correct here), evaluating the expression inside the brackets results in a column vector, and then this column vector is being squared!? That is, this is what the equation seems to be saying:

$\left(\text{column vector}\right)^2$

How is this possible? Or is the equation maybe referring to a dot product here?

  • 0
    It's equation (9) in this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.9676&rep=rep1&type=pdf (2011-04-17)

2 Answers

5

Yes Ahmad, you are absolutely right: the equation is referring to the dot product. Given a (real) vector $\mathbf{v}$, one conventionally defines $\mathbf{v}^2 = \mathbf{v}\cdot \mathbf{v} = \sum_j v_j^2.$
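For instance (numbers of my own, just to illustrate the convention):

$\mathbf v = (1, -2, 3)^T \quad\Longrightarrow\quad \mathbf v^2 = \mathbf v\cdot\mathbf v = 1^2 + (-2)^2 + 3^2 = 14.$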

0

I recall someone posting a similar answer yesterday, but it seems to have been removed. Reading your PDF, I am pretty sure that $\mathbf{a}$ is in fact a vector, not a matrix (well, an $n\times 1$ matrix).

In Section 2.1 one introduces a matrix $X=[\mathbf{x}_1,\ldots,\mathbf{x}_m]$, so each $\mathbf{x}_i$ is a column vector. In Section 2.2 one takes the argmin of an expression involving $\mathbf{a}^T\mathbf{x}_i$, where $\mathbf{a}$ is constrained by $\mathbf{a}^TXDX^T\mathbf{a}=1$ (where $D$ is a diagonal matrix). For that constraint to make sense, $\mathbf{a}$ must be a column vector: $X$ is $n\times m$ and $D$ is $m\times m$, so $XDX^T$ is an $n\times n$ matrix, and sandwiching it between $\mathbf{a}^T$ and $\mathbf{a}$ to get the scalar $1$ forces $\mathbf{a}$ to be $n\times 1$. Hence $\mathbf{a}^T\mathbf{x}_i$ is the dot product of two vectors, i.e. a scalar. In Theorem 1, just before equation (9), $\mathbf{a}$ is explicitly called an eigenvector.

So one takes the vector $\mathbf{a}$ which minimizes the scalar expression $\sum_{i=1}^m\left(\mathbf a^T\mathbf x_i-y_i\right)^2$.
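If it helps to see this concretely, here is a minimal NumPy sketch of that minimization (the sizes and data are made up purely for illustration, and the variable names are mine, not the paper's; the paper's normalization constraint is ignored here). Writing the columns $\mathbf x_i$ as a matrix $X$, the objective $\sum_{i=1}^m(\mathbf a^T\mathbf x_i-y_i)^2$ equals $\|X^T\mathbf a-\mathbf y\|^2$, so an ordinary least-squares solver recovers the minimizing $\mathbf a$:

```python
import numpy as np

# Toy sizes, for illustration only: n-dimensional x_i, m samples.
n, m = 5, 20
rng = np.random.default_rng(0)
X = rng.standard_normal((n, m))   # columns are the x_i
y = rng.standard_normal(m)        # the corresponding y_i

# sum_i (a^T x_i - y_i)^2  ==  || X^T a - y ||^2,
# so the minimizer is the ordinary least-squares solution of X^T a ~ y.
a, *_ = np.linalg.lstsq(X.T, y, rcond=None)

# Each term in the sum is a squared scalar, not a squared vector:
objective = sum(float(a @ X[:, i] - y[i]) ** 2 for i in range(m))
print(a.shape, objective)
```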

  • 0
    Well, looking again into the paper, I agree that **a** is being called an eigen-_vector_ at one stage, and I'll look into that, but I don't get your logic in saying that **a** is a column vector because $\mathbf a^TXDX^T\mathbf a=1$. I don't get what you mean here. (2011-04-17)