
We can find a left inverse $A^{L} = (A^T A)^{-1}A^T$ (provided $A^T A$ is invertible).

Given a system $Ax=b$, using this left inverse I can obtain $x=A^{L}b$.

This provides a "best-fit" solution for $x$: if I re-compute $b' = Ax$, in general I get $b \ne b'$.

This "best-fit" notion is based on the root-mean-square deviation $\sqrt{\frac{1}{n}\sum_i (b_i-b_i')^2}$.

Question: if I wanted certain values within $b$ to contribute less than others, I would want some "weight" associated with them. However, I do not want to simply evaluate a weighted RMSD value; I want to compute a "weighted" left inverse. How can this be done?

  • And when weighted least squares is used, one usually wants to make the weights proportional to the reciprocals of the variances (not the reciprocals of the standard deviations) of the errors, since that is what minimizes the mean-square error of the estimates.

2 Answers


Consider the problem of minimizing $f(x) = \sum_i w_i (A x - b)_i^2$ where $w_i$ are positive weights. You can write this as $f(x) = (A x - b)^T W (A x - b)$ where $W$ is the diagonal matrix with diagonal entries $w_i$. Assuming $A^T W A$ is invertible, the solution is $x = (A^T W A)^{-1} A^T W b$.
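A minimal NumPy sketch of this formula; `A`, `b`, and the weights `w` are made-up illustrative data:

```python
import numpy as np

# Made-up data for illustration.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])
w = np.array([1.0, 1.0, 1.0, 0.1])   # last observation contributes less

W = np.diag(w)
# Weighted normal equations: x = (A^T W A)^{-1} A^T W b.
# Solving the linear system is preferable to forming the inverse explicitly.
x_weighted = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
print(x_weighted)
```

In other words, $(A^T W A)^{-1} A^T W$ plays the role of the "weighted left inverse" asked about in the question: multiplying it by $A$ on the right gives the identity.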

  • Basically this is the same as the unweighted version if you replace $A$ by $W^{1/2} A$ and $b$ by $W^{1/2} b$. So if you know how to do unweighted least squares using SVD, you can use that with this adjusted matrix and vector (a sketch follows below).
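A sketch of this reduction, reusing the made-up `A`, `b`, `w` from the snippets above: scale the rows by $\sqrt{w_i}$ and hand the scaled system to an SVD-based unweighted solver.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])
w = np.array([1.0, 1.0, 1.0, 0.1])

# Scale rows by sqrt(w): minimizing ||W^{1/2}(Ax - b)||^2 is the weighted problem.
sqrt_w = np.sqrt(w)
A_scaled = A * sqrt_w[:, None]
b_scaled = b * sqrt_w

# Unweighted least squares (SVD-based) on the scaled system.
x_weighted, *_ = np.linalg.lstsq(A_scaled, b_scaled, rcond=None)
print(x_weighted)
```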

Here is a reformulation of the previous answers and comments which I hope will be somewhat helpful to the OP.

A. The problem you are interested in is the following: given an inner product $\langle \cdot, \cdot \rangle$ find $x$ such that $\langle b - Ax, b - Ax \rangle$ is minimized.

When $\langle \cdot, \cdot \rangle$ is the ordinary inner product, this is the ordinary least squares solution. When $\langle x, y \rangle = x^T W y$ where $W$ is some positive diagonal matrix, this the weighted case you are interested in.
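Spelled out, with $W = \mathrm{diag}(w_1, \ldots, w_m)$ and $w_i > 0$, the quantity being minimized is exactly the weighted sum of squared residuals from the first answer:

$$\langle b - Ax,\, b - Ax \rangle = (b - Ax)^T W (b - Ax) = \sum_i w_i\,(b - Ax)_i^2 .$$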

B. The solution will satisfy the following optimality criterion: the error must be orthogonal to the column space of $A$.

Formally, let $a_1, \ldots, a_n$ be the columns of $A$. Then the optimal $x^*$ will satisfy $ \langle a_i, b-Ax^* \rangle = 0 $ for all $i$.

Why? Because if the error could be orthogonally decomposed as

$ b - Ax = x_{R(A)} + x_{R(A)^\perp}$, where $x_{R(A)} \neq 0$ is the projection onto the range of $A$ and $x_{R(A)^\perp}$ is the projection onto its orthogonal complement, then we could pick a different $x$ to get a smaller error. Indeed, $ \langle b - Ax, b-Ax \rangle = \langle x_{R(A)}, x_{R(A)} \rangle + \langle x_{R(A)^\perp}, x_{R(A)^\perp} \rangle $ by the Pythagorean theorem. Now if $x_{R(A)} = Ay$, then $ \langle b-A(x+y), b-A(x+y) \rangle = \langle x_{R(A)^\perp}, x_{R(A)^\perp} \rangle$, which is smaller.

C. For the case of the ordinary inner product, the above optimality principle can be restated as $ A^T (b-Ax^*) = 0$, which immediately gives the ordinary least-squares solution; for the case of the weighted inner product, it can be restated as $ A^T W (b-Ax^*)=0$, which immediately gives the weighted solution.
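A quick numerical check of this criterion, with the made-up data from the sketches above: the weighted residual should satisfy $A^T W (b - Ax^*) \approx 0$.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])
W = np.diag([1.0, 1.0, 1.0, 0.1])

x_star = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

# Optimality condition: A^T W (b - A x*) should vanish (up to round-off).
print(A.T @ W @ (b - A @ x_star))   # ~ [0, 0]
```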