1
$\begingroup$

I need to find vector $\bf{p}$ in the following system:

$$\bf{0} \approx \bf{W} \left[ \bf{C}^2 \bf{p} - \bf{d} \right]$$

$$\bf{0} \approx \varepsilon \bf{p}$$

In the above, $\bf{0}$ is a vector, $\bf{W}$ is a matrix, $\bf{C}$ is a matrix, $\bf{d}$ is a vector, $\epsilon$ is a scalar, and all of these are known inputs.

Perhaps the first and easiest way to proceed would be to re-write the system as the following:

$$\bf{0} \approx {\bf{W}}\left[ \bf{C}^2 \bf{p} - \bf{d} \right] + \varepsilon \bf{p}$$

How would I proceed to be able to find a $\bf{p}$ vector that will minimize the system? I am prototyping this calculation in Matlab, but I need to re-write the calculation in C++. Could anyone suggest an optimization algorithm (with some sort of numerical library) that I can use to minimize? I believe that the optimization algorithms bundled with Matlab require a scalar output of a function to be minimized.

As a reference, this system of equations is given in the following paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.8057&rep=rep1&type=pdf

  • 1
    It is not clear what you are trying to do. What do you mean by the $\approx$ symbol?2012-05-11
  • 0
    @copper.hat: I want to find a vector $\bf{p}$ that will minimize the RHS of the expression. The notation follows this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.8057&rep=rep1&type=pdf.2012-05-11
  • 0
    The paper doesn't define what it means by $\approx$ other than 'minimizing the residual'. There are many ways of doing this. If you choose the $2$-norm, then you could use least squares. I can't imagine what $0 \approx \epsilon p$ means, other than some regularization, or attempt to limit $p$ in some manner.2012-05-11
  • 0
    @copper.hat: Sure, that sounds reasonable; thanks for suggesting this. How would I set up the math so that I can use least squares to find a minimum $\bf{p}$? Maybe adding the two equations together as I did in my question above is the way to get rid of the $\bf{0} \approx \varepsilon \bf{p}$? I don't know how to limit $\bf{p}$, but I have to say that much is left to our imaginations.2012-05-12
  • 0
    You could minimize $||W(C^2p-d)||_2^2+||\epsilon p||_2^2$. Look for least squares routines, or just solve the normal equations (not necessarily the best way, but easy to set up).2012-05-12
  • 0
    @copper.hat: Thanks again, I think that you are pushing me on the right track. I suppose that page 5 of this least squares tutorial (http://www.maths.lth.se/matematiklth/personal/fredrik/kombopt/f10eng.pdf) shows how to solve $||W(C^2p-d)||_2^2$, but how do I deal with the $||\epsilon p||_2^2$ term? I assume that $||W(C^2p-d)||_2^2+||\epsilon p||_2^2$ is a scalar, since this is the L-2 norm that we are calculating? (http://mathworld.wolfram.com/L2-Norm.html)2012-05-12
  • 0
    Solve $||\begin{bmatrix} W C^2 \\ \epsilon I \end{bmatrix}p - \begin{bmatrix} W d \\ 0 \end{bmatrix}||_2^2$.2012-05-12
  • 0
    @copper.hat: Thank you! OK, I will give it a go in Matlab. That's really nice to consider the solution in the form $\left\| {{\bf{{\rm A}p}} - {\bf{B}}} \right\|_2^2$, where $\bf{A}$ is a matrix and $\bf{B}$ is a vector.2012-05-12
  • 0
    @copper.hat: Thanks again for your help. I tried to solve in Matlab using the solution given on pg. 5 of the least squares tutorial (http://www.maths.lth.se/matematiklth/personal/fredrik/kombopt/f10eng.pdf). However, computing ${{\bf{A}}^{\rm{T}}}{\bf{A}}$ produces some elements in the matrix to be NaN, so the inverse ${\left( {{{\bf{A}}^{\rm{T}}}{\bf{A}}} \right)^{ - 1}}$ does not work well. What am I doing wrong here? Perhaps there is a good reference on how to solve such least-squares problems without using the method contained in the tutorial.2012-05-12
  • 0
    Computing $A^TA$ just involves additions and multiplications, so if an element is a Nan, you must be doing something wrong...2012-05-12
  • 0
    @copper.hat: I agree, and I'll investigate further. The Matlab command `Amat' * Amat` for computing $A^TA$ produces some NaN elements, so there might be something buggy with the software or otherwise. I'll try some other programming languages/techniques and post back a reply here. Many thanks.2012-05-12
  • 0
    You must have Nans in Amat then...2012-05-12
  • 0
    @copper.hat: You are very right. The issue with the NaN numbers occurred due to $\tau = 0$ when computing Equation (5) of the original paper (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.8057&rep=rep1&type=pdf). Minimization of the L2-norm seems to work well. Could you write a short answer so that I can mark it as the accepted answer to this question? I am also curious with respect to the theory behind why writing $||\begin{bmatrix} W C^2 \\ \epsilon I \end{bmatrix}p - \begin{bmatrix} W d \\ 0 \end{bmatrix}||_2^2$ works for this problem.2012-05-12
  • 0
    Also, what does the $||_2^2$ mean? I am not familiar with this notation for the L2-norm.2012-05-12
  • 0
    The upper 2 just means the norm squared, the lower 2 refers to the 2 (usual) norm, ie, square root of the sum of squares of the elements.2012-05-12
  • 0
    @copper.hat: Of course, this makes sense; thank you.2012-05-12

1 Answers 1

1

Try solving the least-squares problem:

$$ \min_{p} \; \; ||\begin{bmatrix} W C^2 \\ \epsilon I \end{bmatrix}p - \begin{bmatrix} W d \\ 0 \end{bmatrix} ||_2^2 $$

  • 0
    Thank you; do I have to use the squared norm to be able to find a solution using an algorithm such as the one given on pg. 5 of the least squares tutorial (http://www.maths.lth.se/matematiklth/personal/fredrik/kombopt/f10eng.pdf)? I believe that the solution given on pg. 5 of the tutorial does not square the norm.2012-05-12
  • 0
    Since the square root is strictly increasing (for $x\geq 0$), minimizing either will have the same effect. As an aside, the result on pg. 5 was derived using the norm squared.2012-05-12
  • 0
    Thanks again for all of your help. I think that I now understand what is going on here.2012-05-12