
I have an application where I have to minimize a cost function of the form:
$J(\mathbf x) = \| \mathbf A \mathbf x - \mathbf b \|^2 \quad (1)$

By calculating the gradient, I derived that I have to solve the system of equations:
$\mathbf A^T \mathbf A \mathbf x = \mathbf A^T \mathbf b \quad (2)$

Now my question is, when can I solve the following system instead?
$\mathbf A \mathbf x = \mathbf b \quad (3)$

From my point of view, this depends on whether $\mathbf A$ is invertible. In my application, $\mathbf A$ is a square matrix of the form $\mathbf A = \mathbf I - \mathbf W$, where $\mathbf I$ is the identity matrix and $\mathbf W$ is a square matrix with zeros on the main diagonal and small values on a few secondary diagonals. The values of $\mathbf W$ are arbitrary but normalized so that each row of $\mathbf W$ sums to 1; however, some rows of $\mathbf W$ may be entirely zero.

For example $\mathbf A$ may look like this:
$\mathbf A = \pmatrix{ 1 & -0.2 & 0 & 0 & -0.3 & -0.2 & -0.3 & 0 & 0 \cr -0.1 & 1 & -0.2 & 0 & 0 & -0.4 & -0.2 & -0.1 & 0 \cr 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \cr 0 & 0 & -0.2 & 1 & -0.1 & 0 & 0 & -0.6 & -0.1 \cr -0.2 & 0 & 0 & -0.2 & 1 & -0.5 & 0 & 0 & -0.1 \cr -0.4 & -0.3 & 0 & 0 & -0.1 & 1 & -0.2 & 0 & 0 \cr -0.1 & -0.1 & -0.1 & 0 & 0 & -0.6 & 1 & -0.1 & 0 \cr 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \cr 0 & 0 & -0.2 & -0.4 & -0.1 & 0 & 0 & -0.3 & 1 \cr }$
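As a quick numerical sanity check (a sketch in NumPy rather than the MATLAB/MKL setup used here), one can build the $9 \times 9$ example above and confirm that it is invertible, in which case (2) and (3) have the same unique solution; the right-hand side $\mathbf b$ below is an arbitrary random vector, not data from the application:

```python
import numpy as np

# The 9x9 example A = I - W from the question.
A = np.array([
    [ 1.0, -0.2,  0.0,  0.0, -0.3, -0.2, -0.3,  0.0,  0.0],
    [-0.1,  1.0, -0.2,  0.0,  0.0, -0.4, -0.2, -0.1,  0.0],
    [ 0.0,  0.0,  1.0,  0.0,  0.0,  0.0,  0.0,  0.0,  0.0],
    [ 0.0,  0.0, -0.2,  1.0, -0.1,  0.0,  0.0, -0.6, -0.1],
    [-0.2,  0.0,  0.0, -0.2,  1.0, -0.5,  0.0,  0.0, -0.1],
    [-0.4, -0.3,  0.0,  0.0, -0.1,  1.0, -0.2,  0.0,  0.0],
    [-0.1, -0.1, -0.1,  0.0,  0.0, -0.6,  1.0, -0.1,  0.0],
    [ 0.0,  0.0,  0.0,  0.0,  0.0,  0.0,  0.0,  1.0,  0.0],
    [ 0.0,  0.0, -0.2, -0.4, -0.1,  0.0,  0.0, -0.3,  1.0],
])
b = np.random.default_rng(0).standard_normal(9)

x3 = np.linalg.solve(A, b)              # system (3): A x = b
x2 = np.linalg.solve(A.T @ A, A.T @ b)  # system (2): normal equations

print(np.linalg.matrix_rank(A))  # 9, so this A is invertible
print(np.allclose(x2, x3))       # True: (2) and (3) agree here
```

At this small size both systems agree; the catch is that $\kappa(\mathbf A^T \mathbf A) \approx \kappa(\mathbf A)^2$, so for the large $n$ of the actual application the normal equations can lose roughly twice as many significant digits, which is consistent with the behavior reported below.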

Have I argued correctly? If so, how can I show that $\mathbf A$ is invertible? Or is there some other argument for solving (3) instead of (2)?

I tried to solve both systems (2) and (3) with MATLAB and the Intel MKL, but surprisingly only (3) gave me the expected result. I would expect (2) to work as well. Could this be a numerical problem?

  • What are the dimensions of your matrix and the vectors $x$ and $b$? (2012-07-20)
  • It's actually an image processing problem, where the dimension $n$ equals the number of pixels (e.g. $n = 2073600$ for a high-definition image). $\mathbf A$ is an $n \times n$ matrix, and $\mathbf x$ and $\mathbf b$ are $n \times 1$ column vectors. (2012-07-20)
  • FYI, you might want to use conjugate gradient to solve $A^TA x = A^Tb$ rather than solving $Ax=b$ directly. In image processing and the like, you often want an iterative method that applies the operator without actually forming the matrix. It's not necessarily the fastest algorithm, since you roughly square the condition number, but in my experience it's easy to implement and gets an answer quickly. (2012-07-20)
  • I also thought about doing that. However, since I don't get any useful result when solving $\mathbf A^T \mathbf A \mathbf x = \mathbf A^T \mathbf b$ with a direct solver, I haven't tried any iterative solvers yet. (2012-07-20)
  • Note: $A^TAx=A^Tb$ is better behaved and perhaps easier to solve. It is symmetric and positive definite (or perhaps semi-definite). (2012-07-21)
  • @Gabriel, "well-behaved", not necessarily. As p.s. says in his comment, forming the cross-product matrix (roughly) squares the condition number, which leads to some loss of significant figures in the process of forming the solution. Consider for instance the Hilbert matrix, which is both symmetric positive definite **and** ill-conditioned. (2012-07-21)

4 Answers