2

How can one prove that if $x^* = \text{pseudoinverse}(A)\,b$ and $Ay = b$, then $$\Vert x^* \Vert_2 \leq \Vert y \Vert_2?$$

  • 0
    Some people define the pseudoinverse as the mapping that takes $b$ as input and returns the vector $x$ of least norm such that $Ax=\hat{b}$, where $\hat{b}$ is the projection of $b$ onto the range of $A$. Formulas for the pseudoinverse are then derived from this more conceptual definition (see the sketch after these comments).2012-10-27
  • 0
    @Marcus. You don't specify, but should it be assumed that $A$ has more columns than rows and has full row rank?2012-10-28
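To make the minimum-norm characterization mentioned in the first comment concrete, here is a small NumPy sketch (not part of the original question; the matrix $A$ and vector $b$ are made-up examples). It checks that $A^+b$ agrees with the minimum-norm least-squares solution returned by `np.linalg.lstsq`, and that $\hat{b}=AA^+b$ is the projection of $b$ onto the range of $A$.

```python
# Sketch: pinv(A) @ b is the least-norm x with Ax = b_hat, where b_hat is the
# projection of b onto range(A).  A and b here are arbitrary made-up data.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))      # tall matrix: b is generally not in range(A)
b = rng.standard_normal(5)

x_pinv = np.linalg.pinv(A) @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimum-norm least-squares solution

b_hat = A @ x_pinv                    # projection of b onto range(A)
print(np.allclose(x_pinv, x_lstsq))                   # True
print(np.allclose(A.T @ (b - b_hat), np.zeros(3)))    # residual is orthogonal to range(A)
```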

3 Answers

0

Let me try a one-liner solution.

$Ay=b$ $\Rightarrow$ $A(y-x^*)=0$ $\Rightarrow$ $\langle x^*,y-x^*\rangle=0$ $\Rightarrow$ $\|y\|^2=\|y-x^*\|^2+\|x^*\|^2\ge \|x^*\|^2$.

Remark. Here $R=A^t(AA^t)^{-1}$ (which is the pseudoinverse when $A$ has full row rank), so $x^*=Rb$. Then we have $\langle x^*,y-x^*\rangle=\langle Rb,y-x^*\rangle=\langle (AA^t)^{-1}b,A(y-x^*)\rangle=0$, which justifies the second implication above.
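A quick numerical check of this orthogonality argument (just a sketch, assuming $A$ has full row rank; the data below are made up):

```python
# For x* = pinv(A) b and any other solution y of Ay = b, x* is orthogonal to
# y - x*, so ||y||^2 = ||y - x*||^2 + ||x*||^2 >= ||x*||^2.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))        # full row rank (almost surely)
b = rng.standard_normal(3)

x_star = np.linalg.pinv(A) @ b         # minimum-norm solution
# Build another solution y by adding a null-space component to x*.
null_dir = rng.standard_normal(5)
null_dir -= np.linalg.pinv(A) @ (A @ null_dir)   # project onto null(A)
y = x_star + null_dir

print(np.allclose(A @ y, b))                                 # True: y solves Ay = b
print(np.dot(x_star, y - x_star))                            # ~0: orthogonality
print(np.linalg.norm(x_star) <= np.linalg.norm(y) + 1e-12)   # True
```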

4

You essentially want to solve the following optimization problem: $$\min_{x}\Vert x \Vert_2 \text{ such that } Ax = b.$$ Using a Lagrange multiplier $\lambda$, form the Lagrangian $$L(x, \lambda) = \dfrac{x^Tx}2 + \lambda^T (Ax - b).$$ Setting the derivatives with respect to $x$ and $\lambda$ to zero gives $$x^* = \underbrace{A^T(AA^T)^{-1}}_{\text{pseudoinverse}}b$$

Proof: $$\dfrac{d \left(\dfrac{x^Tx}2 + \lambda^T (Ax - b) \right)}{dx} = 0 \implies x^* + A^T \lambda = 0 \implies x^* = -A^T \lambda$$ We also have $$Ax^* = b \implies AA^T \lambda = -b \implies \lambda = - \left( AA^T\right)^{-1}b$$ Hence, $$x^* = \underbrace{A^T(AA^T)^{-1}}_{\text{pseudoinverse}}b$$
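As a sanity check (a sketch only, assuming $A$ has full row rank so that $AA^T$ is invertible; the example data are made up), the closed form derived above can be compared numerically with NumPy's `np.linalg.pinv`:

```python
# Verify that A^T (A A^T)^{-1} b matches the Moore-Penrose pseudoinverse
# solution when A has full row rank.  A and b are made-up examples.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)

x_lagrange = A.T @ np.linalg.solve(A @ A.T, b)   # x* = A^T (A A^T)^{-1} b
x_pinv = np.linalg.pinv(A) @ b                   # pseudoinverse solution

print(np.allclose(x_lagrange, x_pinv))           # True
print(np.allclose(A @ x_lagrange, b))            # True: constraint satisfied
```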

  • 0
    But how does that prove that it is less than the 2-norm of y?2012-10-27
  • 0
    @Marcus We have found the $x$ satisfying $Ax=b$ such that $\Vert x \Vert_2$ is minimum.2012-10-27
  • 0
    Of course, sorry for being a bit dense there. Thank you2012-10-27
  • 1
    Your pseudoinverse is only true if $A$ has full row rank, so that $AA^T$ has full rank. (I.e. it is a right inverse of $A$ only.)2012-10-28
  • 0
    @Daryl True. This is assuming that $A$ has full rank.2012-10-28
  • 0
    @Daryl, if $A$ does not have full rank like that, how could we solve the problem? Is it right that we should first decompose $A$ using the SVD or other techniques?2015-03-28
  • 0
    @mining If $A$ doesn't have full row rank, there are redundant constraints. As long as $b$ is in the column space of $A$ (i.e. $Ax=b$ is consistent), then the redundant constraints can be dropped so that $A$ has full row rank (see the sketch after these comments).2015-03-28
  • 0
    Hi, @Daryl, thank you very much for the reply and explanations! $A$ is commonly a design matrix or data matrix, so generally we can preprocess it (e.g. remove redundant samples) to make it full rank. Thank you!2015-03-28
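For the rank-deficient case discussed in the comments above, here is a small sketch (my own illustration with a made-up redundant matrix) showing that when $Ax=b$ is consistent, `np.linalg.pinv` still returns a minimum-norm solution even though $AA^T$ is singular:

```python
# Rank-deficient example: the third row of A is redundant, yet pinv(A) b is
# still a solution of Ax = b with norm no larger than the known solution.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((2, 5))
A = np.vstack([B, B[0] + B[1]])          # third row is redundant: rank 2
x_true = rng.standard_normal(5)
b = A @ x_true                           # consistent by construction

x_star = np.linalg.pinv(A) @ b
print(np.allclose(A @ x_star, b))                                # True
print(np.linalg.norm(x_star) <= np.linalg.norm(x_true) + 1e-12)  # True
```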
2

Let $A: \mathbb{U} \to \mathbb{V}$. In terms of the SVD, we can write $A$ as $$A=\sum_{n=1}^R\sigma_nv_nu_n^{\dagger},$$ where $\sigma_n$ is a nonzero singular value, $u_n$ and $v_n$ are the corresponding right and left singular vectors, respectively, and $R$ is the rank of $A$.

Since the right singular vectors $\{u_m\}_{m=1}^M$ (the $R$ vectors above, completed to an orthonormal basis of $\mathbb{U}$) form an orthonormal basis, we can expand $y$ as $$y=\sum_{m=1}^M\alpha_m u_m,$$ where $M\ge R$ is the dimension of $\mathbb{U}$.

So $$b=Ay=\sum_{n=1}^R\sigma_nv_nu_n^{\dagger}\sum_{m=1}^M\alpha_mu_m=\sum_{n=1}^R\alpha_n\sigma_nv_n.$$

Also, the pseudoinverse is $$A^+=\sum_{m=1}^R\frac{1}{\sigma_m}u_mv_m^{\dagger}$$

Then $$x^*=A^+b=\sum_{m=1}^R\frac{1}{\sigma_m}u_mv_m^{\dagger}\sum_{n=1}^R\alpha_n\sigma_nv_n=\sum_{m=1}^R\alpha_mu_m.$$

Finally, we can see that $$\|y\|_2=\Big(\sum_{m=1}^M\alpha_m^2\Big)^{1/2},$$ $$\|x^*\|_2=\Big(\sum_{m=1}^R\alpha_m^2\Big)^{1/2}.$$ Therefore, $$\|x^*\|_2 \le \|y\|_2,$$ where equality holds when $M=R$ or $\alpha_m=0$ for $m=R+1, \cdots, M$.

In other words, if we define $$\|y_{null}\|_2=\Big(\sum_{m=R+1}^M\alpha_m^2\Big)^{1/2},$$ we have $$\|y\|_2^2=\|x^*\|_2^2+\|y_{null}\|_2^2.$$
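A short numerical illustration of this decomposition (a sketch with made-up data; it uses NumPy's SVD convention $A=U\,\mathrm{diag}(s)\,V^T$, so the rows of $V^T$ play the role of the $u_m$ above):

```python
# Expand y in the right singular vectors: x* keeps only the first R components,
# so ||y||^2 = ||x*||^2 + ||y_null||^2.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 6))
y = rng.standard_normal(6)
b = A @ y

U, s, Vt = np.linalg.svd(A, full_matrices=True)
R = np.sum(s > 1e-12)                    # rank of A
alpha = Vt @ y                           # coefficients of y in the u_m basis

x_star = np.linalg.pinv(A) @ b
print(np.allclose(np.linalg.norm(x_star)**2, np.sum(alpha[:R]**2)))   # True
print(np.allclose(np.linalg.norm(y)**2,
                  np.linalg.norm(x_star)**2 + np.sum(alpha[R:]**2)))  # True
```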