7
$\begingroup$

In many applications that is not with high requirements, it is common to use $(A^{\text T}A+\lambda I)^{-1}A^{\text T}$ or $A^{\text T}(AA^{\text T}+\lambda I)^{-1}$ ($\lambda$ is small) to approximate the Moore-Penrose pseudoinverse $A^{\dagger}$. But from the properties of Moore-Penrose pseudoinverse, how do we know that $(A^{\text T}A+\lambda I)^{-1}A^{\text T}$ and $A^{\text T}(AA^{\text T}+\lambda I)^{-1}$ will be close to the exact solution?

It seems that $ \lim_{\lambda \rightarrow 0}(A^{\text T}A+\lambda I)^{-1}A^{\text T} = A^{\dagger} $ I want to know why this happens (You can try in Matlab for verification). Besides, it is also possible for $A^{\text T}A + \lambda I$ to be singular. The whole thing is confusing me.

And the question following this is, if we want to restrict the error of $(A^{\text T}A+\lambda I)^{-1}A^{\text T}$ approximating the $A^{\dagger}$ to a certain extent, can we know the upper bound for the $\lambda$ according to some criterions?

  • 1
    If you can use singular value decompositon, the proof is utterly clear. However, if you want to prove the statement "from the properties of Moore-Penrose pseudoinverse" (I take you mean the four *defining properties* of M-P inverse), I am also inclined to see such a proof.2012-11-29

1 Answers 1

5

What happens is that $ A^{\dagger}=\lim_{\lambda\searrow0}(A^TA+\lambda I)^{-1}A^T=\lim_{\lambda\searrow0}A^T(AA^T+\lambda I)^{-1} $

Proof:

Write $A=WDV$ the singular value decomposition (i.e. $D$ is diagonal with non-negative diagonal, $W,V$ are unitaries). Then $A^TA=V^TD^2V$, and $(A^TA+\lambda I)^{-1}=V^T(D^2+\lambda I)^{-1}V$. $ (A^TA+\lambda I)^{-1}A^T-(A^TA+\mu I)^{-1}A^T=V^T(D^2+\lambda I)^{-1}VV^TDW^T-V^T(D^2+\mu I)^{-1}VV^TDW^T=V^T(D^2+\lambda I)^{-1}DW^T-V^T(D^2+\mu I)^{-1}DW^T=V^T[(D^2+\lambda I)^{-1}D-(D^2+\mu I)^{-1}D]W^T=V^T(D^2+\lambda I)^{-1}D[(\mu-\lambda)I](D^2+\mu I)^{-1}DW^T. $ As $(D^2+\lambda I)^{-1}D=f(D)$ where $f_\lambda(t)=t/(t^2+\lambda)$ and $f(0)=0$, the norm of $(D^2+\lambda I)^{-1}D$ is bounded (by the inverse of the smallest nonzero singular value of $A$) irrespective of $\lambda$. So the estimate above yields that $ \|(A^TA+\lambda I)^{-1}A^T-(A^TA+\mu I)^{-1}A^T\|\leq c|\lambda-\mu| $ for an appropriate constant $c$. This proves that $ B=\lim_{\lambda\searrow0}(A^TA+\lambda I)^{-1}A^T $ exists. Now $ ABA=\lim_{\lambda\searrow0}A(A^TA+\lambda I)^{-1}A^TA=\lim_{\lambda\searrow0}A(I-\lambda(A^TA+\lambda I)^{-1})=A-\lim_{\lambda\searrow0}\lambda A(A^TA+\lambda I)^{-1}=A-W\lim_{\lambda\searrow0}A\lambda D(D^2+\lambda I)^{-1}V=A-W\lim_{\lambda\searrow0}\lambda f_\lambda(t)V. $ As $f_\lambda(0)=0$ and $\lambda f_\lambda(t)\to0$ when $t\ne0$, the expression in the limit goes to zero, so $ABA=A$.

In a similar way one shows that $BAB=B$.

Finally, $AB$ is symmetric as $ AB=\lim_{\lambda\searrow0}A(A^TA+\lambda I)^{-1}A^T=\lim_{\lambda\searrow0}WD(D^2+\lambda I)^{-1}DW^T=\lim_{\lambda\searrow0}WD^2(D^2+\lambda I)^{-1}W^T. $ The matrices $D^2(D^2+\lambda I)^{-1}$ are symmetric, and so are their conjugates by $W$, and so is the limit. Similarly, $BA$ is symmetric.

As the Moore-Penrose pseudoinverse is unique, $B=A^\dagger$.

  • 0
    Thanks very much for the help!2012-11-30